Conflict over conflict-resolution research

Mike Spagat writes:

I hope that this new paper [by Michael Spagat, Andrew Mack, Tara Cooper, and Joakim Kreutz] on serious errors in a paper on conflict mortality published in the British Medical Journal will interest you. For one thing I believe that it is highly teachable. Beyond that, I think that it’s important for the conflict field (if I do say so myself). Another aspect of this is that the BMJ is refusing to recognize that there are any problems with the paper. This seems to be sadly typical behavior of journals when they make mistakes.

Spagat et al’s paper begins:

In a much-cited recent article, Obermeyer, Murray, and Gakidou (2008a) examine estimates of wartime fatalities from injuries for thirteen countries. Their analysis poses a major challenge to the battle-death estimating methodology widely used by conflict researchers, engages with the controversy over whether war deaths have been increasing or decreasing in recent decades, and takes the debate over different approaches to battle-death estimation to a new level. In making their assessments, the authors compare war death reports extracted from World Health Organization (WHO) sibling survey data with the battle-death estimates for the same countries from the International Peace Research Institute, Oslo (PRIO). The analysis that leads to these conclusions is not compelling, however. Thus, while the authors argue that the PRIO estimates are too low by a factor of three, their comparison fails to compare like with like. Their assertion that there is “no evidence” to support the PRIO finding that war deaths have recently declined also fails. They ignore war-trend data for the periods after 1994 and before 1955, base their time trends on extrapolations from a biased convenience sample of only thirteen countries, and rely on an estimated constant that is statistically insignificant.

Here they give more background on the controversy. They make a pretty convincing case that many open questions remain before we can rely on survey-based estimates of war deaths. In particular, they very clearly show that the survey-based estimates provide no evidence at all regarding questions of trends in war deaths–the claims of Obermeyer et al. regarding trends were simply based on a statistical error. The jury is still out, I think, on what numbers should be trusted in any particular case.

Here’s a summary of the data used by Obermeyer et al.:


This graph is excellent, except that I don’t think they should label the axes in thousands! It would be much better to label the numbers directly (1 thousand, 10 thousand, . . ., 10 million). Also I’m surprised to see the low number for Guatemala; I’ve always heard that 200,000 people died in the civil war there.

Oddly enough, after making this graph, Obermeyer et al. fit a regression on the original scale. As Spagat et al. point out, fitting a linear regression on the untransformed data is not such a good idea; here’s their graph, which emphasizes that almost all the deaths in these data came from the single case of Vietnam, and, after you take out Vietnam, almost all the deaths came from the single country of Ethiopia:


This incredibly ugly plot indicates a bit of a conflict of goals, I think: On one hand, Spagat et al. are publishing a scholarly article and would like their conclusions to be clear. On the other hand their goal is to shoot down the analysis of Obermeyer et al. and so they have a motivation to make the data look as messy and inconclusive as possible. This second goal seems to have won out in this case.
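
To make the leverage problem concrete, here is a minimal sketch using made-up numbers (these are not the Obermeyer et al. data; the values, the variable names, and the helper functions `ols` and `leverage` are all hypothetical). It fits a least-squares line on the raw scale and on the log scale, with and without the single largest case, and computes each point's leverage. The idea is only to illustrate why a raw-scale fit to data like these is close to a line drawn through one or two points.

```python
import numpy as np

# Hypothetical counts in thousands of deaths, NOT the real data:
# eleven small conflicts plus two very large ones.
x = np.array([1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 150, 2000], dtype=float)
y = np.array([4, 3, 9, 6, 20, 10, 30, 15, 50, 25, 60, 500, 3800], dtype=float)

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def leverage(x):
    """Diagonal of the hat matrix for a simple regression of y on x."""
    d = x - x.mean()
    return 1.0 / len(x) + d**2 / np.sum(d**2)

# Raw-scale fit, with and without the largest case.
raw_all = ols(x, y)
raw_drop = ols(x[:-1], y[:-1])

# Log-scale fit, with and without the largest case.
lx, ly = np.log10(x), np.log10(y)
log_all = ols(lx, ly)
log_drop = ols(lx[:-1], ly[:-1])

h_raw = leverage(x)
h_log = leverage(lx)

print("raw scale (intercept, slope), all points:     ", raw_all)
print("raw scale (intercept, slope), largest dropped:", raw_drop)
print("log scale (intercept, slope), all points:     ", log_all)
print("log scale (intercept, slope), largest dropped:", log_drop)
print("largest leverage, raw scale:", h_raw.max())
print("largest leverage, log scale:", h_log.max())
```

With numbers shaped like these, the largest case has leverage close to 1 on the raw scale, so the fitted line (including the point where it crosses the y-axis) is essentially dictated by that one case. On the log scale the largest case is still influential, but noticeably less so.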

Here’s another ugly graph, this time with so many tick marks on the x-axis that it’s impossible to tell which year is which:


On page 939, Spagat et al. write of “‘battle-deaths’ in ‘state-based conflicts’–that is, those in which a government is one of the warring parties. Battle-death counts include deaths of soldiers from combat-related injuries and of civilians caught in the crossfire . . .” I’m not clear on how you count “battle-deaths” in a war like Guatemala’s, where the government was one of the warring parties, but most of the deaths were not of soldiers. Also, note that in the above graph, the two different estimates from Guatemala are pretty similar, and both are much lower than the 200,000 figure that is usually stated.

P.S. We last heard from Dr. Spagat when he shot down the notorious article in the Lancet by Burnham et al. that estimated post-invasion deaths in Iraq using a sample survey.

P.P.S. More here.

10 thoughts on “Conflict over conflict-resolution research”

  1. Andrew,

    Thank you very much for your post and apologies for the undeniably ugly pictures. I think that your description of the trade-off in the first picture is basically accurate but still might lead a few readers astray. The pretty picture, which is in the original article we criticize, does a great job of displaying all the datapoints, allowing everybody to see where each country lies.

    The goal of our picture was, rather, to expose the nature of the regression that the original authors ran and which formed the basis for their transformations of the trends shown in the second (ugly) picture you show. The whole point is that the authors fit a line through Vietnam, Ethiopia and the indistinct blob of points and then base their transformation of trends on where this line hits the y axis. Our data display was meant to capture the silliness of this procedure.

  2. Michael: I thought that was a wonderful graph and I don't quite see the conflict as Andrew does- as you say, part of your conclusion was that the particular regression is silly – and I think readers get that very quickly from that graph (plus a good idea of _why_ it's silly).
    In a way it's a classic diagnostic plot that shows that something is indeed wrong with that regression- so why not publish it?

  3. The cynic in me is suspicious. If the original authors were smart enough to do the first, log-scale graph, shouldn't they have been smart enough to do the regression that way?

    Who's this quote from? "Never assume malice where lack of competence is a possible explanation?" (That's not quite the right quote, either.)

  4. "Never ascribe to malice that which is adequately explained by incompetence". I thought this came from Napoleon, but Wikipedia has a close variant as "Hanlon's Razor" and the above version "attributed to Napoleon."

  5. Hi Prof. Gelman –

    I didn't work on the Spagat et al. piece, but I'm behind the original battle deaths data.

    Your readers might be interested in the "Documentation of Coding Decisions" available at the Battle Deaths Database website:… The complete definition of "battle deaths"–admittedly a tricky concept–starts on page 5. The discussion of Guatemala starts on page 219.

    The goal of the Documentation is to preserve all the sources we used and the logic of how we interpreted them. If you or any of your readers know of sources we haven't consulted, for any conflict, it would be terrific to hear about them: [email protected]


    Bethany Lacina

  6. This debate is missing a key part — namely, any sort of awareness that there are estimation methods out there that improve on both surveys (usual stalking horse of Spagat, et al.) and convenience data such as press reports (usual stalking horse of many other people).

    Spagat et al. are more or less correct about all of the many, many problems with survey data. They're right to criticize OMG (OMG!). But this isn't, or at any rate shouldn't be, a debate between survey and convenience methods.

    The authors dismiss (at page 936; again in footnote 2) estimation techniques other than retrospective mortality surveys and "collation of other reports". But while it's true that demographers often (usually? Help me out here, demographers) use retrospective survey data in their analyses, there's also a long-standing literature that uses census data instead, matching across sources in order to model (a) patterns of inclusion in convenience sources and (b) the number of uncounted cases. This method accurately counts deer, rabbits, residents of the United States, children with various genetic disorders, and HIV patients in Rome (to name a few examples I can think of) — and, yes, also conflict-related deaths.

    Bethany Lacina's link to the PRIO documentation is really interesting on this point. For El Salvador, the case with which I'm most familiar, PRIO's best estimate is 75,000 total deaths — 55,000 battle deaths and 20,000 "one sided" deaths. I think this is reasonable-ish (maybe the total is between 50,000 and 100,000?), but there's no actual evidence to support such a number. The sources PRIO cites are expert guesses, rather than statistical analyses of any sort.

    PRIO's El Salvador estimates are based on *neither* documented/documentable convenience data (e.g., press reports, NGO reports) *nor* survey data. The United Nations-sponsored Truth Commission for El Salvador's list of documented (and partially documented) deaths includes about 14,000 total deaths, many of which are duplicates. Two other NGO databases include about 6,000 and about 1,500 deaths, respectively. Again, there's significant overlap and many duplicates. Yet no one imagines that the total deaths in this conflict were 21,500. In the Salvadoran case as in many others, inclusion in the data is incredibly biased toward urban, educated, and politically active victims. (They're also biased in any number of other ways, of course.)

    Prof. Gelman is right to point out the discrepancy between the Guatemala survey numbers, the Guatemala convenience (PRIO) numbers, and the number that most people cite as the best approximation for Guatemala (200,000). Importantly, that "200,000" is based in large part on census numbers. (See and… statistical analyses from the Commission for Historical Clarification, Guatemala's Truth Commission.) So why ignore census correction methods?

    Given that discrepancies between survey and convenience data are very often dwarfed by discrepancies between those numbers and the numbers we believe to be correct, I worry that the surveys-versus-convenience-data fight is more about protecting academic projects and prerogatives than about actually finding the correct answer.

  7. Here's a small corrigendum.

    The claim that demographers often/usually use retrospective mortality surveys in their analyses is a bit off the mark. Looks like it is borne out of some confusion in some parts of the academy between the methods of demographers and epidemiologists…
    Broadly speaking, demographers use a wide array of sources including population censuses, vital registration systems, demographic surveillance systems, and surveys (of all flavors:  longitudinal, panel, and retrospective).
    In the field of conflict-related mortality, demographers have actually relied almost exclusively on sources other than surveys. For example, Patrick Heuveline and Beth Daponte have used population censuses (and voter registration lists) in Cambodia and Iraq, respectively, and demographers at the ICTY (Helge Brunborg and Ewa Tabeau) have used various types of "found data" which equate to (incomplete) registration lists alongside census correction methods. Distinguished demographers Charles Hirschman and Sam Preston were in the minority amongst demographers when they used a household survey to estimate Vietnamese military and civilian casualties between 1965 and 1975.
    The folks who routinely use surveys in the field of conflict-related mortality are epidemiologists, not demographers. The folks at Johns Hopkins, Columbia's Mailman School of Public Health, Harvard Humanitarian Initiative, Physicians for Human Rights, MSF, Epicentre, etc., who use variants of the SMART methodology with a 2-stage cluster design, are epidemiologists. This design and methodology have been coarsely adapted from a back-of-the-envelope method used to evaluate vaccination coverage in least developed countries. However, epidemiologists at the London School of Hygiene and Tropical Medicine have recently noted that this method “tends to be followed without considering alternatives” and “there is a need for expert advice to guide health workers measuring mortality in the field” (See

  8. as an aside, i guess i wonder if the comparisons between the nice graph and the ugly one suggests a 'deeper mechanism' involved, namely that violent conflict is actually (as some argue for utility of income) logarithmic. if so it would seem the nice graph is the correct one.

    eg it might be a 'large n' effect, such that life gets cheaper (and casualties greater) the more you get.

  9. I think the point that Bethany is trying to make above, and that Amelia is missing, is that the PRIO battle-deaths data is focused narrowly on *battle deaths* rather than all violent deaths that occur in the context of war. Because a considerable percentage of the fatalities in Guatemala were one-sided violence against civilians, the PRIO estimate will naturally be smaller than the Ball et al. estimate.

    Just an aside: the Uppsala Conflict Data Program (where I'm affiliated) uses press reports only as a first cut. They are then supplemented by all types of NGO and INGO reports (INSEC in Nepal, UN reports in DRC, etc.), human rights investigations, etc. to the extent that these sources exist and they provide enough detailed data to establish the extent to which the fatalities were battle-related or not. The various Benetech studies are excellent in this regard and have been incorporated into the fatality estimates we provide in our database ( The same is true of various surveys: to the extent that they provide data that fit our parameters (i.e. the estimate can be disaggregated into fighting-related deaths vs. victimization of civilians), we consider those estimates.

Comments are closed.