GUIDE FOR GRADUATE STUDENTS ASSEMBLING EMPIRICAL RESEARCH PAPERS
Ralph's (arbitrary) Rules for Paper Assembly
Ralph B. Taylor
Department of Criminal Justice
Temple University
3/24/06
(revised: 4/25/06)
(comments welcome: tuclasses@fastmail.fm)
PURPOSE
This page seeks to provide boiler-plate guidance for assembling and organizing quantitatively-based, empirical research papers of journal submission length. This is probably not helpful for theoretical review papers, or for empirical papers based on qualitative approaches.
It is oriented toward graduate students.
The format suggested here is provided as a guide only.
These suggestions do NOT cover formatting. Details on formatting can be found in many journals. For example, the Criminology guidelines can be found at
http://www.asc41.com/crim.guide.html
The guidelines for authors put out by the Journal of Criminal Justice have some interesting and thoughtful points. See:
http://www.elsevier.com/wps/find/journaldescription.cws_home/366/authorinstructions
INTRODUCTORY COMMENTS
The body of your paper, including title page and abstract, should be in the range of 25-30 pages of text. Endnotes, references and tables are on top of that. You should keep endnotes to a minimum, using these only to explain technical points not of interest to the general reader, or to describe results not shown in detail. If your paper is 15-20 pages it is probably leaving out a lot of important stuff. Ask someone to look at it.
If you are presenting a rather complex study you may find you need to stick to the main points, and summarize points that are not centrally relevant.
In addition to section headings use subheadings. For example, in a methods section you might have subheadings like: participants, procedures, data sources, analysis plan, and variables, to take just one example.
Remember, always use past tense. By the time you get to writing up the results, the study has been done: the data were collected and the results showed this and that.
DESCRIPTIONS OF SECTIONS
Title Page
Title page includes paper title and author(s) and their affiliations, followed by contact information. A note at the bottom of the page should acknowledge contributions from those reviewing earlier drafts, should acknowledge funding sources (if appropriate), data sources like ICPSR or elsewhere (if appropriate), and should report if earlier versions were presented at regional or national conferences. The note at the bottom ends with the first author's address and contact information including email address.
Think hard about your title. It should be specific and informative and pique the reader's interest. Have the title link in a memorable way to a question you are asking or a finding that you have produced.
Abstract
You want to work harder on this than on any other page of the paper. It should start with a key question posed. The body should explain how what you did linked to earlier work, what data sources you used, what your key findings were, and what they mean. All in fewer than 200 words. A now-deceased former professor, Clinton B. DeSoto, had a line about abstracts that went like this: try to have one line in there which is theoretical and thoughtful and moves the reader beyond the immediate questions of the study, helping him/her see bigger connections. Good advice. I should be more conscientious about following it.
Make your abstract good and you will avoid the following problem.
Introduction
Traditionally this section is the hardest for students to write. The reason it is hardest, I think, is because it asks you to do two things which are challenging: organize the work in an area by highlighting the main threads pursued and questions asked; and, drill down to very specific -- but not nitpicking -- criticisms of the work which has come before. It should be no more than 8 - 12 pages in length.
It requires, therefore, a lot of thought.
Organizing. You need to think enough about the area to organize it. A series of paragraphs starting
"Abel (2003) found that the widgets were over produced when times were hard...
"Cain (1999), interestingly, found that widgets are smaller than they used to be .....
"Horatio and Alger (2004) asked a slightly different question....
will send all your readers scrabbling for the scotch and soda. The above is nothing more than a series, with unclear connections between the different paragraphs, and no sense of leading the reader anywhere.
Once you have thought enough about an area to organize it, then use those ideas as subheadings. Begin each section with an opening point and end with a summary point.
Criticizing. Your introduction must contain some criticism of earlier work, and these criticisms must be specific, sensible, and non-trivial. Say you find that work in this area has failed to consider the effects of a spatially lagged outcome on the predictors of fear of crime. That is an important and cogent point. Point to examples of studies where it has been left out, and where it would have made sense to include it. Explain to the reader how, lacking that information, studies might be misleading.
Pointing out directions the work has not yet explored is part of the picture, and if your study is going to go in new and uncharted directions .... to boldly go .... then also tell the reader why this might be important.
All things methodological. If there are methodological limitations of past studies -- e.g., they forgot to think about error covariances -- and these are relatively trivial points, these can be included as a motivator of your study, but they should not be the main motivator. Beware the study that is primarily methodological. It is unlikely to get accepted. You need theoretical touchstones. Not a lot. But they need to be clear.
Theory, theory, theory. The most important point about your study will be how it elaborates or tests or expands current theory, unless your paper is grounded theorizing, which is rather different. But if it is not grounded theorizing, it is crucial that your study highlight what theory is being addressed, and how your study will add an important piece.
Sections. At the risk of being formulaic, your introduction should contain the following sections:
An initial page or two which sets the stage. Think of it as the prologue. You want an excellent opening line, an excellent opening paragraph. Not something hackneyed: "Most studies show..." A couple of recent lines we used for a paper under review were "Urban alleys are often avoided even during daylight hours. This was not always so." This page or two will identify key theoretical ideas, and outline the contribution of your study in broad terms. It can tell the reader what sections follow next in the introduction.
Using subheadings skillfully, take the reader through the literature. Begin and end each section with overview statements. You can end each section with what the implications are for your study: "Thus, since no one has yet examined the impact of sending delinquents on free trips to the rock and roll hall of fame, this study reports a three year follow up of those who made such a trip." Or: "As can be seen from the work in this area, almost all of the work on the connections between local crime rates at the community level and perceptions of local crime have been cross sectional. What remains unexamined are the effects of changing crime rates on changing perceptions of crime among residents." These sections can contain hypotheses and rationales but they should be blended in with the work, not stand out like street signs. If you are introducing hypotheses, be SURE each one has an accompanying rationale describing the PROCESSES behind the connection.
At the end, in a page or two, take the reader back through your key questions, and provide a re-statement in summary form of your model. The summary form means just that -- a summary. It can be extremely useful here to complement your model summary with a path diagram or some other type of heuristic device. Dan Stokols at UC-Irvine had several publications which used simple 2 X 2 tables to get his framework and heuristics across to readers. Back in graduate school I wrote a jejune research proposal draft in search of an organizing framework. With some help from a researcher at the university, a professor, Dr. Lois Verbrugge, was prevailed upon to read it. She did, and we sat down and the first thing she did was sketch out the questions using a path analytic model. Eye opening indeed. For an example, see the first figure in Robinson et al. (2003) available at http://www.rbtaylor.net/pubs.htm . For another example see Taylor and Hale (1986) also available in the same place. An organizing heuristic laid out schematically can be a useful way to summarize information, like in a literature review, or in a proposed model. These are highly recommended. These also help you think more clearly about your models.
Methods or Data
If your study is primary data analysis, tell the reader about
the sample or the respondents: who are they?
how did you get hold of them?
what specifically were they asked to do?
If your study is secondary data analysis, tell the reader
from what source were these data obtained?
who originally collected these data and for what purposes?
important features of the sample?
If these are administrative data:
who collected them?
for what purposes?
over what period of time?
using what types of categories?
with what types of reliability checks?
If these are survey data:
What was the sampling frame?
What was the sampling design?
When were the data collected, by whom, and under whose auspices?
What was the response rate?
Give the original wording of questions and the response categories.
Explain for each specific variable: what is it and how does it work?
If you are doing multivariate work, there may be a large number of side issues that need to be discussed in this section: skewness of variables, data transforms done if needed, missing data, checks on multicollinearity, and the like.
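As one illustration of the skewness and data-transform side issues just mentioned, here is a minimal sketch, not a prescribed procedure. It computes a simple moment-based skewness statistic before and after a natural-log transform. The variable name and the scores are hypothetical, and the log transform assumes all values are positive.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical, strongly right-skewed counts (all positive, so the log is defined).
incident_counts = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
logged = [math.log(v) for v in incident_counts]

print(f"skewness, raw scores:    {skewness(incident_counts):.2f}")
print(f"skewness, logged scores: {skewness(logged):.2f}")
```

If you do transform a variable, report both the reason and the transform itself in this section so the reader can follow what was analyzed.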
Some Specific Additional Comments for Graduate Students Working in the Statistics II Course and Doing Multilevel Papers with Secondary Data Analyses
You will want to either summarize in your own words, or quote directly from the primary source material, a description of the data collection procedures. People need to be able to understand: what was the sampling frame?; what was the sampling strategy?; what was the response rate?; how was the survey conducted -- telephone or in person for example?; what kinds of sections were there in the survey?
Be sure to explicitly describe your outcome variable, and each of your predictor variables. For the outcome, be sure the reader understands its specific distribution. For each of the other predictors, at the minimum you want a table with: mean, median, min, max, and sd, and n if there are varying Ns because of missing cases. Remember the ASC guidelines: each table goes on a separate page. You will have predictor variables to describe at L2 as well as L1. You will need to tell the reader about the L2 units: what were they?; how were they constructed? Each predictor should have its own subhead.
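The descriptive table described above can be sketched as follows. This is only an illustration using Python's standard library: the variable names and scores are hypothetical, and missing cases are assumed to be coded as None. It reports n for each variable because the Ns vary with missing cases.

```python
import statistics

def describe(values):
    """Return n, mean, median, min, max, and sd for one variable, skipping missing (None) cases."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "mean": statistics.mean(valid),
        "median": statistics.median(valid),
        "min": min(valid),
        "max": max(valid),
        "sd": statistics.stdev(valid),  # sample standard deviation (n - 1 denominator)
    }

def format_table(data):
    """Build a plain-text descriptive table, one row per variable."""
    lines = [f"{'Variable':<16}{'n':>4}{'Mean':>8}{'Median':>8}{'Min':>8}{'Max':>8}{'SD':>8}"]
    for name, values in data.items():
        d = describe(values)
        lines.append(
            f"{name:<16}{d['n']:>4}{d['mean']:>8.2f}{d['median']:>8.2f}"
            f"{d['min']:>8.2f}{d['max']:>8.2f}{d['sd']:>8.2f}"
        )
    return "\n".join(lines)

# Hypothetical Level 1 variables; None marks a missing case.
data = {
    "fear_of_crime": [2, 3, None, 4, 1, 3],
    "age": [34, 51, 27, None, None, 45],
}
print(format_table(data))
```

A second table of the same form would cover the Level 2 predictors.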
If there are any special analytical things you did before you got started with the analyses, either in terms of missing values, or special recoding, tell the reader about those.
If you have developed an index, tell the reader about each item that went into each index. Tell him/her about Cronbach's alpha. Be as specific as you can about how the index was constructed.
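For readers who want to see what goes into Cronbach's alpha, here is a minimal sketch using the standard formula: alpha = k/(k-1) times (1 minus the sum of the item variances divided by the variance of the summed index), where k is the number of items. The three items and their scores are hypothetical, and the sketch assumes complete data with every item scored in the same direction.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order in every list."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    item_variances = [statistics.variance(scores) for scores in items]
    # Each respondent's index score is the sum across items.
    totals = [sum(responses) for responses in zip(*items)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical three-item index, five respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```

In practice your statistics package will report this for you; the point is that the reader should be told which items went in and what alpha came out.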
If you are using the crime data, be absolutely clear about the time period covered, the offense in question, and the rate.
Your table needs to be a formatted, word processed document, not just a bunch of patched together SPSS printout. Each table needs to be totally self-sufficient - each variable is clearly and fully explained.
Be sure you can clearly explain the weighting variable; you also might want to report on the range of weights applied.
You want to provide the reader with a rationale for the analyses you use - what are the reasons that HLM is being used here, and why is it better than another approach - you do not need a lot here, just a short paragraph.
Results
Walk the reader through your results. Be specific but not tedious. Tell the reader what is significant, which direction impacts are going, and help the reader interpret these impacts. For example: "The difference between white and nonwhite respondents on sense of community, after controlling for other predictors, was .5, with whites reporting more sense of community."
When you are discussing a specific table, tell the reader in the text which table the results can be found in.
If impacts are not statistically significant, they need not be mentioned in the results section unless there is something really surprising about their being non-significant. Non-significant means essentially zero. Sometimes there is a temptation to make a big deal about things which did NOT come out. The mantra here is: it is always extremely hazardous to make inferences from negative results (null findings). The reason? Because there can be so many reasons why things did not come out.
You can end this section with a summary of key results.
Discussion - at least 5 pages
A discussion is NOT simply a re-hashing of results. Rather, what it does is look back and look ahead. Again: organize this section. Think about main points and use those to organize your material.
Start with a brief summary of your main findings, if you did not end the results section with such.
Look back. Return to each of the major theories or major theoretical questions introduced in the first part of the paper. Revisiting each one: how does each look different in light of the new information you have gathered? Imagine you are looking at characters at the beginning of the 4th act of a play. The 4th act, naturally, comes after the 3rd act, which contains all the dramatic action. Which characters look triumphant ("results seem to provide a robust expansion of the Gromit theory in the following ways")? Which ones look bedraggled ("Although the Wallace theory predicted large impacts of X on Y1 and Y2, those did not emerge here")? How or in what ways are each of these theories altered by the results which have been presented since the introduction?
Look ahead. Given what you know now, what are the next steps? Do NOT just end the discussion with a vague "clarion call for further research." Do NOT just give lists of ways the generalizability could be tested. Instead, be specific, and lay out specific avenues which need to be investigated in the future, and explain why each of these avenues may be important. Imagine you were going to be researching this problem for the next five years. In a nutshell, what will you be pursuing and why?
You also want to honestly acknowledge study limitations. My preference is to list the limitations, but then also remind the reader of the study strengths immediately following.
It is easy to think that the lack of external validity is a study limitation. If you say this, which many commonly do, you will show that you do not understand external validity. External validity is always an empirical question. Which means before you try and see whether or not the results replicate, and before you fail to replicate, there is no limitation. This is an important and widely misunderstood point.
End with a strong summary paragraph.
References
Formatted preferably using something like Endnote. If you are not using Endnote or Reference Manager you are wasting a lot of time. Learn how to use this tool. Follow the format of the journal to which you are submitting.
Tables
Do NOT use Word table templates. These just put in lines which are hard to get out. I STRONGLY recommend using Excel to build the table. The main advantage is you can alter things like how many decimal places to show. It also allows you to get results in easily and in a way you can edit them from there. For example, you can get SPSS tables into Excel, although it may take some fiddling. From HLM you can copy and paste lines from the printout.
From Excel, paste into Word, and using the table function in Word play with things like column widths, merging cells, setting row height and the like.
Each table should start on a separate page.
Some arbitrary rules:
NEVER re-key results from a table, with one exception. Your table should report anywhere from 1 to 5 p levels if these are being reported: p < .05; p < .01; p < .001 are the most typical. E.g., * = p < .05; ** = p < .01 and so on. You can re-key probability levels from tables so they match these levels. So .038 becomes < .05.
NEVER report .000 as a probability level. Think about it for a second.
ALWAYS include an informative table title telling the reader in some detail about what is happening.
First table should have solid descriptive information about respondents and/or key variables: minima, maxima, means and standard deviations at a minimum. If someone else seeks to replicate your study, this is one of the first things they will cross check.
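The two probability-level rules above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the three cutoffs are simply the typical ones named above. Note that a printout showing .000 is a rounded value, so the sketch reports it as p < .001 and never as .000.

```python
def p_label(p):
    """Map an exact probability to a conventional reporting level."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    if p < 0.01:
        return "p < .01"
    if p < 0.05:
        return "p < .05"
    return "ns"

def stars(p):
    """Return the table marker for the same levels: * = p < .05, ** = p < .01, *** = p < .001."""
    markers = {"p < .001": "***", "p < .01": "**", "p < .05": "*", "ns": ""}
    return markers[p_label(p)]

# .038 becomes p < .05; a printout value of .000 becomes p < .001.
for p in (0.038, 0.004, 0.000, 0.21):
    print(f"{p:.3f} -> {p_label(p)} {stars(p)}")
```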
Figures
Nothing special to say here - just be sure they are good and legible. You want clear titles, clear legends. Data source should be indicated on the figure legend. If the figure is based on weighted data, that should be indicated as well.
Some Specific Additional Comments for Graduate Students Working in the Statistics II Course and Doing Multilevel Papers with Secondary Data Analyses and Doing Maps
If you are going to include maps, be sure there is lots of information on the map so the reader can clearly see: what are the organizing spatial units, what is the region shown, what is the variable being mapped, and what are the groupings used for the variable being mapped.
The mapping program many of you are using seems to have a default option for grouping when you request a choropleth map such that "natural breaks" in the numbers are sought, and used to decide how to define the levels of the variable depicted. NOTE that this will RARELY result in groupings which have an equal number of neighborhoods or police districts in each grouping. For example, under the "natural breaks" option you may have only one neighborhood or police district in the highest group, or only one in the lowest group. The advantage of natural breaks is that there is at least some separation, on the variable, between the neighborhoods/districts in the different groups. So you do not have tiny differences between where one group leaves off and the next begins, in terms of scores on the variable. The disadvantages are twofold. First, the breaks are specific to the data set itself. There is nothing theoretically meaningful about the groupings chosen. Second, it is very unlikely under this scenario that you will obtain roughly equal numbers of neighborhoods or districts in each group. By contrast, if you were mapping the neighborhoods or districts by quartiles, which amounts to about four roughly equal groups, the middle two groups would contain about half of your neighborhoods (around 22) or districts (around 11 or 12). This would then correspond to the interquartile range -- scores from the 25th percentile to the 75th percentile -- at the ecological level. This is theoretically a pretty useful description. If you were mapping by quintiles (5 about equal size groups) then the middle three groups would correspond roughly to about 60% of your Level 2 units, which represents, roughly, the mean +/- one standard deviation. This also is theoretically useful. But most importantly: tell the reader which option you have chosen.
Whichever option you do choose, be sure to label the real limits for each interval.
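To make the quartile option concrete, here is a minimal sketch, not a description of what any particular mapping program does. It groups hypothetical district rates into four classes of roughly equal size and lists the lowest and highest observed score in each class, which is one reading of labeling the limits for each interval. The rates are invented, and with tied scores the group sizes may come out unequal.

```python
import statistics

def quantile_classes(values, n_groups=4):
    """Group values into n_groups classes of roughly equal size.

    Returns one (lowest observed, highest observed, count) tuple per class.
    """
    cuts = statistics.quantiles(values, n=n_groups, method="inclusive")
    groups = [[] for _ in range(n_groups)]
    for v in values:
        # Class index = number of cut points strictly below this value.
        groups[sum(v > c for c in cuts)].append(v)
    return [(min(g), max(g), len(g)) for g in groups if g]

# Hypothetical rates for 12 police districts; note the one extreme district.
rates = [3.1, 4.7, 5.0, 5.2, 6.8, 7.4, 8.0, 9.9, 12.5, 14.1, 20.3, 41.6]

for low, high, count in quantile_classes(rates):
    print(f"{low:5.1f} to {high:5.1f}  ({count} districts)")
```

Setting n_groups=5 gives the quintile version. Either way, the printed limits are what belong in the map legend.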