Update on estimates of war deaths

I posted a couple days ago on a controversy over methods of counting war deaths. This is not an area I know much about, and fortunately some actual experts (in addition to Mike Spagat, who got the ball rolling) wrote in to comment.

Their comments are actually better than my original discussion, and so I’m reposting them here:

Bethany Lacina writes:

I didn’t work on the Spagat et al. piece, but I’m behind the original battle deaths data. Your readers might be interested in the “Documentation of Coding Decisions” available at the Battle Deaths Database website. The complete definition of “battle deaths”–admittedly a tricky concept–starts on page 5. The discussion of Guatemala starts on page 219.

The goal of the Documentation is to preserve all the sources we used and the logic of how we interpreted them. If you or any of your readers know of sources we haven’t consulted, for any conflict, it would be terrific to hear about them: [email protected]

Amelia Hoover writes:

This debate is missing a key part — namely, any sort of awareness that there are estimation methods out there that improve on both surveys (usual stalking horse of Spagat, et al.) and convenience data such as press reports (usual stalking horse of many other people).

Spagat et al. are more or less correct about all of the many, many problems with survey data. They’re right to criticize OMG (OMG!). But this isn’t, or at any rate shouldn’t be, a debate between survey and convenience methods.

The authors dismiss (at page 936; again in footnote 2) estimation techniques other than retrospective mortality surveys and “collation of other reports”. But while it’s true that demographers often (usually? Help me out here, demographers) use retrospective survey data in their analyses, there’s also a long-standing literature that uses census data instead, matching across sources in order to model (a) patterns of inclusion in convenience sources and (b) the number of uncounted cases. This method accurately counts deer, rabbits, residents of the United States, children with various genetic disorders, and HIV patients in Rome (to name a few examples I can think of) — and, yes, also conflict-related deaths.
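
To make that logic concrete, here is a minimal sketch of the simplest two-list version of the census-correction idea, the Lincoln-Petersen estimator in Chapman's bias-corrected form. The counts are invented, and real analyses use more than two lists and model dependence between them; this only illustrates how the overlap carries the information.

```python
# Minimal two-list capture-recapture sketch (Chapman's bias-corrected
# version of the Lincoln-Petersen estimator). All counts are hypothetical.

def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Estimate the total population size from two overlapping lists.

    n1 -- deaths documented by source 1
    n2 -- deaths documented by source 2
    m  -- deaths matched on both lists (the overlap)
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Two hypothetical lists with a matched overlap of 240 records:
n1, n2, m = 1200, 800, 240
union = n1 + n2 - m                      # 1,760 unique documented deaths
print(f"Documented (union): {union}")
print(f"Estimated total:    {chapman_estimate(n1, n2, m):.0f}")  # ~3,991
```

The small overlap relative to the list sizes is what signals a large undocumented population; if the two lists overlapped almost completely, the estimate would collapse toward the union.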

Bethany Lacina’s link to the PRIO documentation is really interesting on this point. For El Salvador, the case with which I’m most familiar, PRIO’s best estimate is 75,000 total deaths — 55,000 battle deaths and 20,000 “one sided” deaths. I think this is reasonable-ish (maybe the total is between 50,000 and 100,000?), but there’s no actual evidence to support such a number. The sources PRIO cites are expert guesses, rather than statistical analyses of any sort.

PRIO’s El Salvador estimates are based on *neither* documented/documentable convenience data (e.g., press reports, NGO reports) *nor* survey data. The United Nations-sponsored Truth Commission for El Salvador’s list of documented (and partially documented) deaths includes about 14,000 total deaths, many of which are duplicates. Two other NGO databases include about 6,000 and about 1,500 deaths, respectively. Again, there’s significant overlap and many duplicates. Yet no one imagines that the total deaths in this conflict were 21,500. In the Salvadoran case as in many others, inclusion in the data is incredibly biased toward urban, educated, and politically active victims. (They’re also biased in any number of other ways, of course.)
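
Three overlapping lists of this kind are exactly the setting for multiple systems estimation. Below is a hedged sketch of the three-list case; the overlap counts are invented (the actual Salvadoran overlaps are not given in this post), and the independence model shown is the simplest variant, whereas serious analyses add interaction terms for dependence between sources.

```python
# Three-list multiple systems estimation (MSE) sketch via a log-linear
# model. The capture histories below are invented for illustration only.
import numpy as np
import statsmodels.api as sm

# (in list 1, in list 2, in list 3) -> number of matched unique records
hist = {
    (1, 0, 0): 9000, (0, 1, 0): 4000, (0, 0, 1): 900,
    (1, 1, 0): 1500, (1, 0, 1): 300, (0, 1, 1): 150,
    (1, 1, 1): 100,
}
X = sm.add_constant(np.array(list(hist.keys()), dtype=float))
y = np.array(list(hist.values()), dtype=float)

# Independence model: log E[count] = b0 + b1*x1 + b2*x2 + b3*x3.
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
unseen = np.exp(fit.params[0])  # expected count in the unobserved (0,0,0) cell
print(f"Documented (union of lists): {y.sum():.0f}")
print(f"Estimated total:             {y.sum() + unseen:.0f}")
```

With these invented counts the estimate lands well above the union of the lists, which is the point: the overlap pattern, not the raw list sizes, carries the information about the uncounted.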

Prof. Gelman is right to point out the discrepancy between the Guatemala survey numbers, the Guatemala convenience (PRIO) numbers, and the number that most people cite as the best approximation for Guatemala (200,000). Importantly, that “200,000” is based in large part on census numbers. (See http://shr.aaas.org/mtc/chap11.html and http://shr.aaas.org/guatemala/ceh/mds/spanish/toc.html, statistical analyses from the Commission for Historical Clarification, Guatemala’s Truth Commission.) So why ignore census correction methods?

Given that discrepancies between survey and convenience data are very often dwarfed by discrepancies between those numbers and the numbers we believe to be correct, I worry that the surveys-versus-convenience-data fight is more about protecting academic projects and prerogatives than about actually finding the correct answer.

Romesh Silva writes:

The claim that demographers often/usually use retrospective mortality surveys in their analyses is a bit off the mark. It looks like it is born of confusion, in some parts of the academy, between the methods of demographers and those of epidemiologists…

Broadly speaking, demographers use a wide array of sources including population censuses, vital registration systems, demographic surveillance systems, and surveys (of all flavors: longitudinal, panel, and retrospective).

In the field of conflict-related mortality, demographers have actually relied almost exclusively on sources other than surveys. For example, Patrick Heuveline and Beth Daponte have used population censuses (and voter registration lists) in Cambodia and Iraq, respectively, and demographers at the ICTY (Helge Brunborg and Ewa Tabeau) have used various types of “found data,” which amount to (incomplete) registration lists, alongside census-correction methods. The distinguished demographers Charles Hirschman and Sam Preston were in the minority amongst demographers when they used a household survey to estimate Vietnamese military and civilian casualties between 1965 and 1975.

The folks who routinely use surveys in the field of conflict-related mortality are epidemiologists, not demographers. The people at Johns Hopkins, Columbia’s Mailman School of Public Health, the Harvard Humanitarian Initiative, Physicians for Human Rights, MSF, Epicentre, etc., who use variants of the SMART methodology with a two-stage cluster design, are epidemiologists. This design and methodology were coarsely adapted from a back-of-the-envelope method used to evaluate vaccination coverage in least-developed countries. However, epidemiologists at the London School of Hygiene and Tropical Medicine have recently noted that this method “tends to be followed without considering alternatives” and that “there is a need for expert advice to guide health workers measuring mortality in the field” (see http://www.ete-online.com/content/4/1/9).
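
For readers who have not seen such surveys, the sketch below shows, with invented numbers, the kind of quantity a two-stage cluster mortality survey produces: a crude mortality rate whose confidence interval accounts for clustering. This is a simplification in the spirit of SMART-style analyses, not the actual SMART procedure.

```python
# Hypothetical crude mortality rate (CMR) from a cluster survey.
import math

# One (deaths, person-days of observation) pair per sampled cluster.
clusters = [(4, 4200), (0, 3900), (0, 4500), (0, 3600),
            (1, 4100), (3, 4400), (0, 3800), (0, 4000)]

deaths = sum(d for d, _ in clusters)
exposure = sum(p for _, p in clusters)
k = len(clusters)

cmr = deaths / exposure  # ratio estimator of deaths per person-day

# Linearized between-cluster variance of the ratio estimator.
var = (k / (k - 1)) * sum((d - cmr * p) ** 2 for d, p in clusters) / exposure**2
se = math.sqrt(var)

# Design effect vs. a simple random sample (Poisson approximation).
deff = var / (cmr / exposure)

# Normal-approximation CI; it can dip below zero with this few clusters.
lo, hi = cmr - 1.96 * se, cmr + 1.96 * se
print(f"CMR: {cmr * 1e4:.2f} per 10,000 person-days "
      f"(95% CI {lo * 1e4:.2f} to {hi * 1e4:.2f}), DEFF ~ {deff:.1f}")
```

When deaths concentrate in a few clusters, as in these invented data, the design effect exceeds one and the interval is much wider than a naive person-time calculation would suggest.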

I just thought it might help to put this all in one place.

4 thoughts on “Update on estimates of war deaths”

  1. War Deaths Revisited

    Amelia Hoover and Romesh Silva of the Benetech Initiative draw readers’ attention to the important research pursued at Benetech, as well as the work of various demographers in determining the death tolls resulting from gross violations of human rights and warfare. We are, of course, aware of both research endeavours and certainly agree that they are valuable. But their work differs in both scope and purpose from the battle-deaths research of the International Peace Research Institute, Oslo (PRIO). Moreover, Benetech's approach requires the approval, and sometimes the support, of governments; PRIO's does not. It would be difficult, for example, to imagine a Benetech investigation taking place in Chechnya.

    The PRIO dataset provides estimates of battle-deaths, i.e., violent deaths resulting directly from “contested combat”. Benetech’s focus is on deadly violations of human rights––primarily assaults against civilians. Although there is some overlap between these different categories, it is important to note that the PRIO dataset does not count the intentional and unopposed killing of civilians as battle deaths since they do not involve combat. Benetech's findings are thus of limited value to PRIO's research agenda.

    More importantly, the main purpose for creating the PRIO dataset in the first place was to study global and regional trends in the deadliness of warfare since the end of WWII, not particular cases of human rights abuse. Benetech's Multiple Systems Estimation methodology is far too time- and resource-intensive for this purpose. Data collection for particular studies can take years, and analysis and reporting even longer. There is no way that PRIO, or any other conflict research organization, could employ Benetech-type research methods for every country in conflict and repeat the exercise every few years. Nor is there any need to do so.

    The findings of PRIO's battle-death research are coarse-grained compared with Benetech’s long and painstaking investigations, but they are wholly appropriate for their primary task of tracking trends. And the results have been impressive. The PRIO dataset has revealed dramatic – though still not widely understood – changes in the deadliness of warfare. It has demonstrated a remarkable, though highly uneven, decline in that deadliness over the past fifty years––a finding that many find counterintuitive. The average war in 1950 generated more than 33,000 battle deaths; in 2007 the average was less than a thousand.

    The PRIO data are sufficiently accurate to demonstrate the broad contours of this decline––and where the violence takes place. In the 1950s, ’60s, and ’70s most battle deaths were in East and Southeast Asia, but by the 1990s sub-Saharan Africa had become the world’s greatest killing ground and East and Southeast Asia was relatively peaceful. Neither surveys, census-based estimates, nor Benetech can track these sorts of changes in a timely manner.

    So while Amelia is correct that Benetech’s methodology is superior to PRIO’s for Benetech’s particular purposes, it is neither practical nor sufficient for the very different task of mapping global and regional trends in the deadliness of contemporary warfare, which is one of PRIO's signal contributions to our understanding of modern conflict.

    Andrew Mack, Michael Spagat

  2. I wouldn't know how actually to assess the claim that PRIO's data are sufficient to get a handle on long-term casualty trends ("the results have been impressive"), given the total lack of global benchmark data. However, without speaking for my Benetech colleagues — as Andrew and Michael write, looking at big macro trends is decidedly not our usual work, so I don't know whether they'd agree — I suspect that PRIO does an OK job of tracking macro changes over time.

    However: if the task is estimating macro trends, divorced from specific magnitudes, then why bother taking issue with survey methodologies per se? This was (or should have been) my key point in the original comment, but Andrew and Michael's comment underlines it: as I read their remarks, PRIO estimates are designed to capture macro trends, and criticisms of particular PRIO numbers are therefore beside the point. Fair enough. If that's the case, though, I'd expect that the only pertinent criticism of survey numbers is that there aren't enough of them across time and space, or that surveys are difficult and expensive. Their accuracy in any given case (e.g., Iraq, much-cited in the JCR piece originally posted) should not be at issue.

    On the other hand, if specific magnitude estimates are under debate (as somehow they always seem to be), then (as I originally wrote) debates about surveys vs. expert guesses or convenience data are a little pointless. In the (admittedly few) cases for which there's evidence on this question, neither performs particularly well. Hence my original question: why ignore methods that actually perform better?

    I don't like the implications of arguing against MSE or other census-correction methods on practicality grounds. Macro trends are undoubtedly important — no debate there — but getting the "right answer" (i.e., a scientifically defensible estimate rather than a group of sort-of-converging guesses) is both morally and scientifically important as well.

    Statistics can be an invaluable aid to historical memory — there's a reason that many HRDAG partners have been truth commissions. Scientifically, it's increasingly clear that studying micro variation quantitatively can tell us things about conflict dynamics that macro, or qualitatively measured micro, variation cannot. Given that the technical underpinnings of MSE are increasingly quick and accessible, I don't think it's correct to imply that getting a defensible estimate isn't worth the effort.

    Last but not least, one specific and very important nitpick: we never ask for, and never get, government approval (that I know of). Truth commissions != governments. On occasion, governments have offered (and by offered I mean grudgingly allowed) access to some data. But the majority of our projects proceed without government approval, cooperation, or data.

  3. “the technical underpinnings of MSE are increasingly quick and accessible”

    I hope this doesn't refer to the cartoonish explanation given at HRDAG's website, http://hrdag.org/resources/mult_systems_est.shtml

    The final three bullet points put me in mind of nothing so much as this rather more famous cartoon: http://www.sciencecartoonsplus.com/gallery/math/m

    Can Benetech offer more compelling demonstrations of Multiple Systems Estimation “actually performing better” in determining war deaths?

  4. whoa there, curious cat. harsh indeed! believe it or not, that "cartoonish" explanation is — get this — an explanation! for, you know, non-statisticians. it gives a good sense of the intuition of the method. for a real understanding of the technical underpinnings, i'd suggest reading the entirety of _Data Mining_ by Witten and Frank, in order to understand the generalities of the data matching process. (we use an implementation of the Weka process.) for notes on the estimation procedures, you could read Fienberg and Manrique-Vallier (2009), or Fienberg et al. (1999) (here: http://www.jstor.org/pss/2680485). there's also the appendices to any of the HRDAG projects. good luck! (w/r/t the ease of the method, no, it's not easy. it is, however, increasingly manageable, as some of the more complex tasks have been incorporated into R packages like Rcapture and data-mining software like Weka.)
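
    As a toy illustration of the record-matching step mentioned in this comment, the sketch below scores name similarity using Python's standard library. The names and the cutoff are invented; a real pipeline (such as the Weka-based one described above) compares many fields and trains a classifier rather than thresholding a single string score.

    ```python
    # Toy record-matching sketch: flag probable duplicates across two lists.
    # Real matching compares many fields with a trained classifier; this
    # only scores raw name similarity. Names and threshold are invented.
    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    list_a = ["Jose Antonio Martinez", "Maria Elena Ramos", "Carlos Hernandez"]
    list_b = ["J. Antonio Martinez", "Maria E. Ramos", "Pedro Alvarado"]

    THRESHOLD = 0.8  # arbitrary cutoff for this toy example
    for a in list_a:
        for b in list_b:
            score = similarity(a, b)
            if score >= THRESHOLD:
                print(f"probable match ({score:.2f}): {a} ~ {b}")
    ```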
