TO: Students in CJ 605
FROM: R. B. Taylor
DATE: 4/6/06
RE: Comment points in some methods sections that would benefit from attention
Attached is a list of some shortcomings found in one or more draft methods
write ups. I understand these are all works in progress, and that everyone has
to start somewhere, and that some of you have had more experience writing these
kinds of papers than others. Nonetheless, the most important point is that you
may want to pay attention to ALL these points, because if they are not relevant
today, they may be relevant some time in the future.
When you get your methods section draft back, it may have one or more of these
numbered comments on it. Here is the idea corresponding to each number.
- Some of you have, and some of you haven't, mentioned that missing data
replacement was done. You DO want to mention that, and how it was done.
Further, you also want to report, on an item by item basis (this can go in
your first descriptive table or in a footnote somewhere) how many cases had
missing data replaced with an estimate. You can no longer get by saying
"small amounts of missing cases were replaced with estimates." We've tried.
- Include a description of the weighting done for Philadelphia
respondents, as explained in the Dote document. This is hard to get right.
Here are the steps: start with PUMS 2000 data for Philadelphia; randomly
select one adult per household; figure out the distribution of these persons
in a 2 x 2 x 2 table (gender x white/nonwhite x hs ed/gt hs); figure out for
each cell the ratio of PUMS/Phila survey; take the inverse of that ratio to
make the first stage of the weight; at the second stage, control for multiple
phone lines; at the third stage, multiply the first by the second.
- Need a brief synopsis of demographic profile of weighted respondents -
cross reference with a descriptive Table 1 including min, max, mean, sd on
your variables, including your demographics.
- Be sure to include the survey response rate. Also, when you include it, do
NOT say it was low. It wasn't, given today's standards. Further, the response
rate needs to be considered in light of the match overall for metro area.
See ISR document.
- Avoid personal reference throughout - NO I/we/me/my/our
- Give in the text or in Table 1 the min and max and mean and median and
sd for n per neighborhood, both weighted and unweighted.
- TENSE: when this is done everything WAS done. The variable WAS recoded
(not will be). The distribution on the variable WAS etc.
- When you have in-line citations to authors, you need dates along with
authors.
- A figure is often a good way to summarize your theoretical model.
Consider it.
- Data ARE plural. A piece of data is a datum. You can say the
dataset shows but you would say the data show. PLEASE GET THIS RIGHT. This
is just part of being a social scientist. It is really really really bad
form to say the data shows.
- Some of you were missing one or more of the key elements describing the
survey procedure. These key elements include at the least: who did the
surveying; when did the surveying begin and end; what was the procedure on
callbacks; what was the response rate; how was the sampling frame
constructed; what was the mode of surveying; how was probability sampling
carried out at each level; and how much weighting was done and how was it
done. You end with a thumbnail demographic sketch of the Philadelphia
subsample.
- When you are describing a specific variable, unless the item is pretty
obvious, like age or gender (well, maybe not so obvious any more...), you
want to include exact question text, exact response categories, and the
numbers associated with each response category. [ e.g.,
Respondents were asked "How serious are each of the following problems
in your neighborhood:" followed by "skinheads on your lawn," "abandoned
cars," and "a plethora of pink flamingos," using the response categories
(3) "serious problem" / (2) "moderate problem" / (1) "small problem" / (0) "not a
problem"]. If just the ends were labeled, then report those labels and the
number of intervening categories.
- If you are group mean centering, explain in a sentence or so, not for
each variable, but just once, what it is doing theoretically.
Similarly for grand mean centering at L2. If you are group mean centering
just some of your variables at L1 and not others, explain the distinction to
the reader.
- How about centering at L2?
- LIKERT is turning over in his grave. The use of this term is undergoing
a resurgence, for reasons not clear to me, but it is often used erroneously.
Likert scaling refers to response categories that are evenly balanced
between the two sides. For example: strongly disagree, disagree,
agree, strongly agree. It does NOT refer to not a problem, somewhat of a
problem, etc. STRONGLY suggest you just avoid using this term altogether. If
you are describing each of your response categories anyway, as suggested
above (# 12, on describing a specific variable), you do not need to show you
know how to use this term. Plus, if
you use it, you will have to pronounce it correctly, which leads to even
more stress (tic).
- INDICES - a couple of things. For all your indices, report Cronbach's
alpha. Do not report the MSA alpha. Report the Philadelphia alpha. If your
index is based on z scored components, then you would use the standardized
alpha number. ALSO, explain which items were reversed, AND clearly state a
higher score means more _____ . ALSO, if there is z scoring involved, be
clear about when it happened. ALSO be sure it is clear exactly how the items
were put together. DON'T say the items were combined. Were they summed? Or
averaged? Be specific.
- At L2, many of you will want to do a multicollinearity analysis, and
report this. The report would include things like VIFs and tolerances.
- Demographic controls at L2: in the same way you have a full set of
demographic control variables at L1, you will want the same at L2. So you
want status, stability, and racial composition or racial heterogeneity. If
you are using crime at L2 (here is the exception) you cannot use both status
and crime in the same model.
- One detail on the weighting: by randomly selecting one individual over
18 from each PUMS household, Dote was attempting to mimic having a sample of
household heads, rather than having a sample of all adult household members.
The Census does not even use the term head of household. Household heads
were eligible respondents in the surveying procedure.
- INDEX vs. SCALE. An index combines several items. A scale describes a
series of response categories. You have multi-item indexes or indices.
- Don't forget key information on your DEPENDENT variables as well. Your
table 1 should have the same info you have for predictors: min, max, mean,
Md, sd. Also, don't forget somewhere in the text to tell the reader about
outcome variable skewness and how it was acceptable - if it was.
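
On the weighting comment above: a minimal Python sketch of the three stages may help you check your own description. It is an illustration only, not the procedure from the Dote document. The function names, the toy cell labels, and the phone_lines field are hypothetical. The sketch ASSUMES that the first-stage weight for a cell is the PUMS share divided by the survey share (the usual post-stratification direction, so under-represented cells are weighted up) and that the second stage is 1 divided by the number of phone lines. Check both assumptions, including which way the ratio runs, against the Dote document.

```python
from collections import Counter


def cell_shares(cells):
    """Return the proportion of cases falling in each demographic cell."""
    counts = Counter(cells)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def three_stage_weights(respondents, pums_cells):
    """Compute one weight per respondent (hypothetical helper).

    respondents: list of dicts with a 'cell' tuple and a 'phone_lines' count.
    pums_cells:  list of cell tuples, one per randomly selected PUMS adult.
    """
    pums_share = cell_shares(pums_cells)
    survey_share = cell_shares(r["cell"] for r in respondents)
    weights = []
    for r in respondents:
        # Stage 1 (assumed direction): PUMS share / survey share for the cell.
        stage1 = pums_share[r["cell"]] / survey_share[r["cell"]]
        # Stage 2 (assumption): households with more phone lines were more
        # likely to be reached, so weight by 1 / number of lines.
        stage2 = 1.0 / r["phone_lines"]
        # Stage 3: multiply the first stage by the second.
        weights.append(stage1 * stage2)
    return weights


# Toy data using only two of the eight cells (gender, race, education).
pums_cells = [("F", "white", "hs")] * 2 + [("M", "nonwhite", "gt_hs")] * 2
respondents = [
    {"cell": ("F", "white", "hs"), "phone_lines": 1},
    {"cell": ("F", "white", "hs"), "phone_lines": 2},
    {"cell": ("F", "white", "hs"), "phone_lines": 1},
    {"cell": ("M", "nonwhite", "gt_hs"), "phone_lines": 1},
]
weights = three_stage_weights(respondents, pums_cells)
```

In the toy data the female cell is over-represented in the survey relative to PUMS, so those respondents end up with weights below 1, and the respondent with two phone lines is weighted down further.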
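
On the centering comments above: group mean centering subtracts each neighborhood's own mean from its respondents' scores, so the L1 score reflects where a respondent stands relative to his or her neighbors. Grand mean centering at L2 subtracts the overall mean across neighborhoods. A small Python sketch, with hypothetical function names and made-up numbers, shows the difference.

```python
from statistics import mean


def group_mean_center(values, groups):
    """Subtract each group's (neighborhood's) mean from its members' scores."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [value - group_means[group] for value, group in zip(values, groups)]


def grand_mean_center(values):
    """Subtract the overall mean from every score."""
    overall = mean(values)
    return [value - overall for value in values]


# Made-up L1 scores for four respondents in two neighborhoods.
scores = [1.0, 3.0, 10.0, 30.0]
neighborhoods = ["A", "A", "B", "B"]
l1_centered = group_mean_center(scores, neighborhoods)

# Made-up L2 scores, one per neighborhood.
l2_scores = [2.0, 20.0]
l2_centered = grand_mean_center(l2_scores)
```

After group mean centering, every neighborhood has a mean of zero on the L1 variable, which is why the between-neighborhood differences have to be carried by something at L2 if you want them in the model.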
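
On the INDICES comment above: the sketch below shows one way to reverse an item, compute raw Cronbach's alpha (from item variances and the variance of the summed index), and compute standardized alpha (from the mean inter-item correlation, which is the version that applies when components are z scored). The function names and the item data are made up for illustration. Your statistics package will report both numbers, so treat this only as a check on what the two alphas mean.

```python
from itertools import combinations
from statistics import mean, variance


def reverse_code(values, low, high):
    """Reverse an item so a higher score points the same way as the others."""
    return [low + high - v for v in values]


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def raw_alpha(items):
    """Raw alpha: items is a list of columns, one list of scores per item."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]  # summed index per respondent
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


def standardized_alpha(items):
    """Standardized alpha from the mean inter-item correlation."""
    k = len(items)
    r_bar = mean([pearson_r(x, y) for x, y in combinations(items, 2)])
    return k * r_bar / (1 + (k - 1) * r_bar)


# Made-up responses from five respondents on three 0-3 items.
items = [
    [3, 2, 3, 1, 0],
    [2, 2, 3, 1, 1],
    [3, 1, 2, 0, 0],
]
alpha_raw = raw_alpha(items)
alpha_std = standardized_alpha(items)
```

Note that the sketch sums the items to form the index. If your index averages the items instead, say so in the text; raw alpha is the same either way, but the reader still needs to know which you did.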
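
On the Table 1 comments above: a short Python sketch of the row you would report for each variable (min, max, mean, median, sd) plus a skewness figure for outcome variables. The function name and the data are hypothetical, and the sketch is unweighted. The skewness here is ASSUMED to be the simple moment-based version (third central moment divided by the 1.5 power of the second); statistics packages often report a sample-adjusted version that differs somewhat in small samples, so say which one you used.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return the unweighted Table 1 statistics for one variable."""
    n = len(values)
    m = mean(values)
    # Second and third central moments (divided by n, not n - 1).
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return {
        "n": n,
        "min": min(values),
        "max": max(values),
        "mean": m,
        "median": median(values),
        "sd": stdev(values),        # sample standard deviation (n - 1)
        "skew": m3 / m2 ** 1.5,     # moment-based skewness (assumption)
    }


# Made-up outcome variable with one large value pulling the tail right.
outcome = [0, 0, 1, 1, 2, 8]
table1_row = describe(outcome)
```

A positive skew value, as in the made-up outcome here, means a long right tail; a symmetric variable comes out at zero.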