Crime data bonanza!!!

Mike Maltz writes,

A New Data Set Available through Ohio State University’s Criminal Justice Research Center

So you think you know how to analyze time series! Well, how would you like to test your mettle on over 400,000 time series, each with up to 540 data points? The time series in question are monthly data from 1960-2004, for over 17,000 police departments, for seven crime types (murder, rape, robbery, aggravated assault, burglary, larceny, vehicle theft), as well as their sum (the so-called Crime Index), and an additional 19 subcategories – e.g., robbery with a gun, knife, personal weapons (hands, feet, etc.), or other; attempted rape; auto, truck or bus, or other vehicle theft. Or you can just view the data in different cities over time and see whether it rises and falls with various tides (unemployment, immigration, poverty, age or ethnicity distribution, etc., whatever your pet theory is). I [Maltz] have put all of the files and a plotting utility (so you can see each agency’s crime history) in a zipped file. Download it from

The data consist of monthly counts of these crimes reported by police departments throughout the country to the FBI as part of its Uniform Crime Reporting (UCR) Program. Since reporting to the UCR Program is entirely voluntary, some agencies are less than diligent in doing so, but for the most part they comply. However, major gaps still remain; for a discussion of these gaps, see “Bridging Gaps in Police Crime Data,” published by the US Bureau of Justice Statistics. Under a series of grants from the US National Institute of Justice, Harry Weiss, a graduate student here at OSU, and I cleaned the data as best we could.

Some of the gaps are just inadvertent (or, as statisticians would say, MCAR, missing completely at random). These can usually be filled in using relatively simple algorithms. The more significant problems, however, are those that are not gaps but “underestimates,” as when the City of Atlanta was bidding (successfully) for the Olympics and lowered its crime statistics in a more, shall we say, “hands-on” way. New York, Philadelphia and Boca Raton have also had their own reporting scandals, and according to the creator of HBO’s “The Wire,” Baltimore is even better at it:

“In Baltimore, where over the last twenty years Times Mirror and the Tribune Company have combined to reduce the newsroom by forty percent, all of the above stories pretty much happened. A mayor was elected governor while his police commanders made aggravated assaults and robberies disappear.

“… It would not have been easy for a veteran police reporter to pull all the police reports in the Southwestern District and find out just how robberies fell so dramatically, to track each individual report through staff review and find out how many were unfounded and for what reason, or to develop a stationhouse source who could tell you about how many reports went unwritten on the major’s orders, or even further — to talk to people in that district who tried to report armed robberies and instead found themselves threatened with warrant checks or accused of drug involvement or otherwise intimidated into dropping the matter.”
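
To make the “relatively simple algorithms” for inadvertent gaps concrete, here is a minimal Python sketch. It is an illustration only, not the procedure used to clean this data set: the function name, the use of None for a missing month, and the two-month cutoff are all assumptions. It linearly interpolates short runs of missing months and leaves longer runs alone. Nothing of this kind can repair the deliberate “underestimates” described above, since those months are reported, just not truthfully.

```python
def fill_short_gaps(counts, max_gap=2):
    """Linearly interpolate runs of None that are at most max_gap months long.

    Hypothetical helper: longer runs, and runs touching either end of the
    series, are left as None because there is too little to anchor them.
    """
    filled = list(counts)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first reported month after the gap
        gap_len = end - start
        if start == 0 or end == n or gap_len > max_gap:
            continue  # no neighbor on one side, or gap too long to trust
        left, right = filled[start - 1], filled[end]
        step = (right - left) / (gap_len + 1)
        for k in range(gap_len):
            filled[start + k] = left + step * (k + 1)
    return filled


monthly = [10, None, 14, None, None, None, 20, None]
print(fill_short_gaps(monthly))
```

Only the single missing month between 10 and 14 is filled here; the three-month run and the trailing gap stay missing. Interpolated values are floats, so anyone who needs whole counts has to decide separately how to round them.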

Not all cities manipulate crime statistics. Even so, you might want to get rid of all of your preconceptions of how to deal with these types of data. It’s for that reason that a plotting utility is the centerpiece of the data set. You have to look at the data, not just throw it into the computerized maw and let Stata or SAS or SPSS give you some p values. By visually inspecting the data, you might see what the effect of a new policy, or police chief, or law has on crime. You might compare different cities with different characteristics. Whatever you do, it’s a relatively new data set that hasn’t yet been used much at all, so you’re getting in on the ground floor.
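
For readers who would rather script the looking than use the bundled plotting utility, a rough Python sketch follows. It is not that utility, and it assumes things the announcement does not specify: that one agency's series for one crime type has already been loaded as a list of monthly counts starting in January 1960, with None for unreported months, and that matplotlib is installed. The function names and the optional event marker are hypothetical.

```python
from datetime import date


def month_axis(start_year, n_months):
    """One date per month, starting in January of start_year."""
    return [date(start_year + i // 12, i % 12 + 1, 1) for i in range(n_months)]


def plot_agency(counts, start_year=1960, title="", event=None):
    """Draw one agency's monthly counts; None marks an unreported month."""
    # Imported here so the helper above works without matplotlib installed.
    import matplotlib.pyplot as plt

    months = month_axis(start_year, len(counts))
    # NaN makes matplotlib break the line, so reporting gaps stay visible.
    values = [float("nan") if c is None else c for c in counts]

    fig, ax = plt.subplots(figsize=(10, 3))
    ax.plot(months, values, linewidth=0.8)
    if event is not None:
        # Optional marker, e.g. the month a new police chief took office.
        ax.axvline(event, linestyle="--", color="gray")
    ax.set_xlabel("Month")
    ax.set_ylabel("Reported offenses")
    ax.set_title(title)
    return fig
```

A call such as `plot_agency(counts, title="Robbery, one agency", event=date(1995, 1, 1)).savefig("robbery.png")` would write the chart to a file. Leaving the gaps as breaks in the line, instead of filling them first, keeps missing months and suspiciously sudden drops in plain view.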

5 thoughts on “Crime data bonanza!!!”

  1. I haven't looked at crime data in decades; the old thought was that murder is best reported because it's serious and RELATIVELY unambiguous.

    Therefore, it might be possible to use relative rates of murder and other crimes as some sort of under-reporting indicator.

  2. That dataset is 150 MB, if anybody is interested in downloading it. I think I'll wait to use the campus high-speed lines instead.

  3. I've been working with Mike to get the data into a format that's more amenable to analysis with a statistics package (i.e., one CSV file for each state). Let me know if you're interested – my contact details are available at
