TO: Students who took the final exam in CJ 405, 12/10/01
FROM: R. B. Taylor
DATE: 12/11/01
RE: Comments on the final exam
The text of the final exam itself is now posted on the website and can be
found at:
www.rbtaylor.net/405fa01_finalexam.pdf
Here are some comments on the specific questions, focusing on the ways more
than one student went awry.
- The R squared of .38 tells you the proportion of the variance in the
logged census-tract level property crime rates in Columbus, OH, averaged
from 1989-1991, explained by the two disadvantage index dummy variables, and
the other predictors (vacancy rate, rental rate, etc.). Several readers
thought the outcome was the disadvantage index. The article is clear and the
table is clear: the outcome is the logged property crime rate. Several
readers thought the R squared was just for the disadvantage indicators. The
R squared, as always in a multiple regression, is for all the predictors.
- The null hypothesis is that the partial slope of the logged property crime
rates on percent rental households is zero, after controlling for the other
predictors in the model, in the population of Columbus (OH) census tracts.
[Note for really advanced students: the tracts in the study were not sampled
per se, but rather are all Columbus tracts that met the researchers'
criteria. So we have almost the population of census tracts. Therefore, one
could argue that tests of statistical significance are inappropriate.]
- You will reject the null hypothesis.
- After controlling for the two dummies representing high and extremely high
scores on the structural disadvantage index, and for the other predictors in
the model, for each one percent increase in rental households in the tract,
the log of the property crime rate increases by .0083.
- After controlling for the two dummies representing high and extremely high
scores on the structural disadvantage index, and for the other predictors in
the model, for each one percent increase in the percent of vacant housing in
the tract, there is an increase of .6191 in the violent crime rate.
- The null hypothesis is that the partial slope of the violent crime rate on
percent vacant households is zero, after controlling for the other
predictors in the model, including the two dummies based on the disadvantage
index, in the population of Columbus (OH) census tracts.
- You will reject the above null hypothesis.
- Residents living in extremely high poverty tracts have higher predicted
violent crime rates than residents living in high poverty tracts. Since
there are two dummies, each b weight is a contrast with the predicted
average violent crime rate in the low poverty tracts -- the low tracts are
the reference group -- when all the other predictors are set to zero. Some
readers of this question did not realize I was asking you about the OR --
i.e., tell me which one, high or extremely high, scores higher. Some
readers did realize they could add the b weight and the constant and see how
much higher the predicted rates were in extreme vs. high poverty locations.
- The constant tells you the average imprisonment rate per 100,000 persons
in 1985 in Midwestern states.
- The b weight for South tells you how much higher Southern states are, on
average, in their 1985 imprisonment rate, as compared to Midwestern states:
1.0 prisoners per 100,000 residents. You can also add a + b to get the
average imprisonment rate in Southern states if you wish (2.29).
- For each case in the simple or multiple regression: Y-observed minus
Y-predicted.
- The adjusted R squared tells you the overlap between the predictor(s) and
the outcome in the population of cases from which the sample of cases was
representatively drawn; how much of the variance in the outcome is likely to
be explained when we refer back to the population.
- As Hamilton describes it on page 1: "errors have identical
distributions .... errors are independent ... errors are normally
distributed."
- Lots to say here; folks generally handled this well.
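
For anyone who wants to check the dummy-variable and residual answers by
hand, here is a minimal sketch in Python. The six state values are invented
for illustration (they are NOT the exam data); I chose them only so the
group averages line up with the b = 1.0 and a + b = 2.29 mentioned above.
The function name fit_simple_ols is just a label for this sketch. It shows
that with one dummy the constant is the reference-group average, a + b is
the other group's average, each residual is Y-observed minus Y-predicted,
and the adjusted R squared follows the usual formula
1 - (1 - R squared)(n - 1)/(n - k - 1).

```python
from statistics import mean

# Hypothetical imprisonment rates, invented for illustration only.
midwest = [1.09, 1.29, 1.49]   # reference group (dummy = 0)
south = [2.09, 2.29, 2.49]     # contrast group (dummy = 1)

y = midwest + south
x = [0] * len(midwest) + [1] * len(south)


def fit_simple_ols(x, y):
    """Return (a, b) for y = a + b*x using ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


a, b = fit_simple_ols(x, y)

# Residual for each case: Y-observed minus Y-predicted.
predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# R squared and adjusted R squared (k = number of predictors).
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((yi - mean(y)) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot
n, k = len(y), 1
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print("constant a (Midwest average):", round(a, 2))
print("b weight for South:", round(b, 2))
print("a + b (South average):", round(a + b, 2))
print("R squared:", round(r2, 3), " adjusted:", round(adj_r2, 3))
```

With more predictors in the model the same logic holds, except that the
constant and the dummy contrasts then refer to predicted averages when the
other predictors are set to zero.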