TO: Students in CJ 405
FROM: R. B. Taylor
DATE: March 12, 1999
RE: Comments on homeworks describing scatterplots

I have finished reviewing the homeworks where you described four scatterplots. The following points suggest some ways these descriptions could be improved. One or more of the following may apply to your own homework.

* Be sure to introduce each variable in specific terms. State the units in which each variable is measured. In the case of the alcohol variable, purchase does not equal consumption, so describe what the variable actually measures.

* Detailed descriptions of the scatterplots are essential. Tell me in detail how the data points are arrayed in the scatterplot.

* If you are describing a data point as an outlier, tell me: is it an outlier on X or Y or both? Be sure to identify exactly where in the plot you find the point, its identity, and its values on X and/or Y (as appropriate). Then do a "nearest neighbor" analysis: how big is the separation between the data point and its nearest neighbor on X or Y (as appropriate)? In other words, be as specific as you can about what makes it an outlier.

* Some of you commented that the data points were "clustered in the middle on the X (or Y) axis." Whether the data points cluster in the middle of an axis, or are spread all along it, is a function of the minimum and maximum values chosen for that axis when the scatterplot was constructed. Such a comment therefore describes how the plot was drawn, not the data themselves.

* In a couple of instances, the 1990 crime variable was referred to as a change variable. It is not change; it is just the 1990 level. We can compute change for this variable; more details later.
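
To illustrate the outlier and axis-range points above, here is a minimal Python sketch. The X values and the function names are made up for illustration; they do not come from the class data file. The first function measures how far a point sits from its nearest neighbor on one axis. The second shows how the share of the axis that the data occupy changes when only the axis minimum and maximum change.

```python
def nearest_neighbor_gap(values, index):
    """Distance from values[index] to the closest other value on the same axis."""
    target = values[index]
    others = [v for i, v in enumerate(values) if i != index]
    return min(abs(target - v) for v in others)


def share_of_axis_used(values, axis_min, axis_max):
    """Fraction of the drawn axis length that the data actually span."""
    return (max(values) - min(values)) / (axis_max - axis_min)


# Hypothetical X values for six cases; the last one sits far from the rest.
x = [10.0, 12.0, 13.0, 15.0, 16.0, 40.0]

# Separation between the suspected outlier (40) and its nearest neighbor (16).
outlier_gap = nearest_neighbor_gap(x, 5)

# For comparison, the separation for an ordinary point (13, nearest neighbor 12).
ordinary_gap = nearest_neighbor_gap(x, 2)

# Same data, two different choices of axis minimum and maximum.
share_narrow_axis = share_of_axis_used(x, 0, 50)
share_wide_axis = share_of_axis_used(x, 0, 300)

print("outlier gap:", outlier_gap)
print("ordinary gap:", ordinary_gap)
print("share of axis used (0 to 50):", share_narrow_axis)
print("share of axis used (0 to 300):", share_wide_axis)
```

Reporting the gap next to the gaps for ordinary points is one way to be specific about what makes a point an outlier. The two axis shares come from identical data, which is why "clustered in the middle" is a statement about the plot rather than about the cases.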

Perhaps the most difficult aspect of this exercise was the theorizing requested. It is going to take a while to get the hang of this enterprise, but I view it as an integral part of what this course is about: a key part of interpreting results is thinking about their theoretical implications. Some suggestions:

* With the ecological data file, the processes you describe should themselves be ecological, not individual. In other words, what attributes of states might drive rates up or down? Some of you approached this level when you began talking about varying features of state climates. But you do not want to be talking about individuals. We will be switching back and forth between ecological and individual level theorizing depending on the data set in question.

* Some of you approached the theorizing task by "going clinical," focusing on just the dynamics within a couple of states. Your focus instead should be on common attributes that vary across states, and you want to describe this variation and the dynamics.

* Your theorizing will refer to PROCESSES that connect the variables. In other words, what process is reflected in each variable, and how do those two processes connect?