War, Numbers and Human Losses

That’s the title of Mike Spagat’s new blog.

In his most recent post, Spagat disputes the claim that “at least 240,000 Syrians have died violently since the civil war flared up four years ago.”

I am not an expert in this area so I offer no judgment on these particular numbers, but in any case I think that the sort of open discussion offered by Spagat is useful.

7 thoughts on “War, Numbers and Human Losses”

  1. This issue comes up outside of war: a person is injured by another, hangs on for a while and eventually dies. Is it murder? This question doesn’t fit well into the law’s usual treatment of time and cause: the concept of proximate cause is generally taken to mean proximate by linked action and proximate in time. Or we end up with “butterfly effect” guesses.

  2. Thanks for the plug, Andrew.

    I just want to put a slight caveat on Andrew’s formulation above.

    I showed in my post how a casualty recording organisation (the Violations Documentation Center) had documented about 123,000 deaths in Syria, yet an article in the Guardian cited the VDC as documenting at least 240,000 deaths. All you have to do is click on the link in the Guardian article to see that this claim is wrong. Yet it wouldn’t surprise me if, in the future, people cite the Guardian as showing that the VDC documented more than 240,000 deaths.

    It is the Syrian Observatory for Human Rights that says there have been at least 240,000 violent deaths in Syria. I’m not saying that they are necessarily wrong (which is why I’m splitting hairs with Andrew here). However, if you visit the SOHR web page you will see that while they say they’ve documented more than 240,000 deaths, they don’t really show their documentation. So I’m skeptical of the SOHR.

    • @Mike:

      In Aug 2014, UN analysts at OHCHR reported their estimate of “191,369 people killed in Syria between March 2011 and the end of April 2014”. Further, they admit that their estimation protocols are “probably an underestimate of the real total number of people killed.”

      Given the passage of another 12 months, 240,000 does not seem too unlikely?

      • I’m in agreement with Mike on the UN numbers. He’s already addressed some of the possible problems (no validity assessment of the input data, transcription inaccuracies). The potential problems go beyond these though.

        The validity question is important. The UN approach did not explore validity questions and instead assumed 100% reliability for all datasets used. But some of the datasets are likely to be more or less reliable than others, and this can be very important. Say for example that 5% of the cases in each dataset were false or erroneous (which is not a big number). This wouldn’t be too much of a problem for any individual database, but the merge could compound it all, potentially making it 20% or more in the final merged number. More likely, however, is that the percentages vary from one dataset to another, possibly by a lot.
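        The compounding can be illustrated with a toy simulation (all numbers below are hypothetical, not taken from the Syrian datasets): true records that appear in several lists deduplicate in the merge, while false records, being unique to each list, never do, so the false share of the merged total can rise well above the per-list rate.

```python
import random

random.seed(1)

TRUE_POOL = 10_000   # hypothetical pool of real, documentable deaths
N_LISTS = 4          # number of casualty lists being merged
COVERAGE = 0.80      # each list captures 80% of the real deaths
FALSE_RATE = 0.05    # 5% of each list's entries are false or erroneous

merged_real = set()  # real records deduplicate: same victim across lists
n_false = 0          # false records are unique to a list, so they never match
for _ in range(N_LISTS):
    captured = random.sample(range(TRUE_POOL), int(TRUE_POOL * COVERAGE))
    merged_real.update(captured)
    # a list that is 5% false carries real * rate / (1 - rate) false entries
    n_false += round(len(captured) * FALSE_RATE / (1 - FALSE_RATE))

merged_total = len(merged_real) + n_false
false_share = n_false / merged_total
print(f"false share of merged total: {false_share:.1%}")
```

        In this toy setup the false share of the merged count lands around three times the per-list 5%; more lists, or less overlap between them, would push it higher still.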

        The merge can be problematic too, and beyond the issues of validity or transcription errors raised by Mike. Irregular combatants often use aliases or multiple names, so the same person could appear in multiple databases under two or more different names. A machine merge would always see these as different people. Human spot checks of the machine could also fail to notice such problems unless they were being very thorough and working with more information than just name, date and location variables.
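        A minimal sketch of the alias problem (the records here are invented): an exact-key merge counts the same person twice, and even a fuzzy string comparison scores an alias and a given name as unrelated, so it would not rescue the merge.

```python
from difflib import SequenceMatcher

# Invented records: one combatant recorded under an alias in one list
# and under a given name in another, same date and place of death.
record_a = ("Abu Omar", "2013-05-02", "Aleppo")
record_b = ("Omar al-Halabi", "2013-05-02", "Aleppo")

# An exact merge on (name, date, location) keys sees two distinct deaths.
merged = {record_a, record_b}
print(len(merged))  # 2 records survive the merge

# Fuzzy matching does not help here: the two name strings are too
# dissimilar to clear any reasonable similarity threshold.
similarity = SequenceMatcher(None, record_a[0].lower(), record_b[0].lower()).ratio()
print(f"name similarity: {similarity:.2f}")  # well below a 0.8 cutoff
```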

        Taken together, all of this means you certainly have the potential to overstate the number of unique documented deaths using this approach by a considerable amount. It is not justified to treat *coverage* (undercounting) as being the only potential problem, but that is what is required to interpret the results of this process as a minimum the way the UN report did.

        A report by Oxford Research Group in 2013 used a similar approach to the UN one, taking all the databases at face value and merging them together using a machine algorithm designed for this purpose. The results from this came out pretty similar to the UN numbers, but the claims ORG made about these results were a bit different, and I think more accurate:

        “…simple totals throughout this study and elsewhere should be treated with caution and be considered provisional: briefly put, it is too soon (and outside the scope of this study) to say whether they are too high or too low.”


        “It cannot be stated with certainty at this time whether these numbers should be considered too low or, owing to deficiencies in the original data or our merge process, too high. They should certainly not be taken as exact, definitive, or without scope for improvement.”

        Compare to the claims made by the UN report:

        “The total 191,369 can be understood as a minimum bound of the number of killings”

        This has been expanded on by the authors and journalists in other contexts, with the consistent message that this should be understood as the minimum possible number.

        In my opinion, this claim of the UN report is not really factual. It’s wishful thinking. Such claims assert a level of certainty and authority for the numbers which is not backed by the current evidence or the work and analyses done. Making such a claim is also not an “admission” of weakness. It is instead making a claim to strength, and one that’s not really justified. The statements of ORG on this same question above, however, do accurately reflect the current state of knowledge.

        None of this means that a number like 240,000 is necessarily wrong, but many of the claims made regarding how much support such a number has have been overstated. Claiming such a number is a minimum is not at all a cautious claim. It’s instead a very bold one.

        Getting back to the point Mike was making in the original blog post. You actually have documentation groups like VDC and a few others that are now showing around 120,000. You then also have one or two groups like SOHR claiming numbers over 250,000 or 300,000, and claiming that this too is documented, but where nobody is allowed to see this supposed documentation. This is made even more strange by the fact that in these earlier reports by the UN and ORG, it was always VDC that had the highest number of documented deaths among all the individual groups. So how exactly did SOHR’s documentation suddenly more than double VDC’s in the interim (and, for that matter, come more or less into alignment with the UN merge numbers, or extrapolations thereof)? Without any transparency in this documentation, it certainly seems some skepticism is warranted here.

  3. Hi Rahul.

    First of all I would not be surprised if the true number is bigger than 240,000. In fact, I could endorse your formulation that “240,000 does not seem too unlikely”. I wouldn’t even object if someone wanted to make 240,000 into a central guesstimate as long as he or she recognizes that there would be a huge plus or minus around that.

    People reading this thread may want to have a look at this article which explains where the idea of a 191,369 lower bound through April 2014 comes from:


    The research underlying this number is of great interest; however, I just can’t buy into treating the number as a reliable lower bound.

    The researchers start with five databases that document violent deaths on a case by case basis. They then integrate the databases, trying as hard as they can to eliminate double counting. In a semi-ideal world you could imagine that the end result of this work is a fairly accurate list of nearly all the violent deaths in the Syrian conflict that have been documented by someone. Since there are, undoubtedly, further deaths that no one has documented you might feel entitled to view the count of deaths in the integrated database as a lower bound.
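    The lower-bound logic can be sketched in a few lines (the lists and matching keys below are invented stand-ins for the five real databases): if every record is valid and deduplication is perfect, the merged unique count can only miss deaths, never invent them.

```python
# Invented toy lists standing in for the casualty databases; each record
# is keyed by (name, date, location), the fields typically used for matching.
db1 = {("A. Hassan", "2012-03-01", "Homs"), ("B. Khalil", "2012-04-10", "Daraa")}
db2 = {("A. Hassan", "2012-03-01", "Homs"), ("C. Said", "2013-01-05", "Aleppo")}
db3 = {("C. Said", "2013-01-05", "Aleppo"), ("D. Nasser", "2013-06-20", "Idlib")}

# A perfect merge is a set union: records shared across lists count once.
merged = db1 | db2 | db3
print(len(merged))  # 4 unique documented deaths, not 6

# Only under the assumptions that every record is genuine and every
# duplicate was matched is len(merged) <= the true toll, i.e. a lower bound.
```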

    Furthermore, the researchers threw out a fair number of deaths before they performed the integration because the details for these cases were too sparse to allow decent chances for matching against the other databases. This fact gives additional license for thinking that 191,369 is a lower bound.

    However, I’m still not convinced, mainly because the research team made no attempt to assess the validity of the five underlying databases. This amounts to saying that pretty much any tweet or Facebook posting someone has made about a death in Syria is accurate. This is a pretty bold assumption. It is likely that some reports are fabrications. It is possible that many reports are fabrications.

    Even leaving aside the fabrication issue, small inaccuracies in transcriptions of names, locations or dates of deaths can cause failures to match deaths across databases. The researchers think such problems are small but I’m not so sure.
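    A one-character transcription slip is enough to defeat an exact match (the names below are invented): the two spellings refer to the same person but would survive an exact-key merge as two records, even though a simple similarity score flags them as near-duplicates.

```python
from difflib import SequenceMatcher

# Invented example: the same victim transcribed two ways across databases.
name_a = "Mohammed Haddad"
name_b = "Mohamed Hadad"   # one letter dropped in each word

# Exact matching treats these as different people, double-counting the death.
print(name_a == name_b)  # False

# A similarity score flags them as likely duplicates, but the merge then
# needs a threshold, and any threshold trades missed duplicates against
# false merges of genuinely different people.
similarity = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
print(f"similarity: {similarity:.2f}")
```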

    So I resist treating 191,369 as a lower bound for the period through April 2014.

    • So what do *you* think is a good estimate of the number killed? For one, the OHCHR does sound like a dependable source that must employ qualified analysts (as opposed to some of the other highball estimates that come from some passionate lone activist sitting in a room collating data). OTOH I can imagine the OHCHR has some subtle incentives in overstating the crisis as well.

      But the OHCHR does outline their methodology in good detail and overall they seem to have taken a lot of pains to remove the double counting errors. Indeed, not every tweet is accurate but do we have reason to believe people are systematically false tweeting? And even if they are, I find it entirely believable that this is abundantly compensated by the deaths that just fell through the cracks.

      Most importantly, it is extremely easy to criticize any estimate in the abstract, but I’d find more weight behind your criticisms if you (say) took a random sample of the OHCHR datasets and actually showed the fabrication rate, etc. That is, your critique is far too vague, far too “easy,” especially when you are contradicting a number into which apparently went a lot of effort, thought and man-hours.

      PS. I’m sure estimating something like a fabrication rate is very hard, but then again so is estimating deaths in a crisis. Ergo, in such cases I’d prefer to just discount critiques unless they come packaged with their own estimate, and their own (better?) methodology of estimation.

      • But I’m a two-handed economist….

        OK, I’ll answer the question you ask although I have misgivings about throwing out any one number because such guesses acquire undeserved significance.


        As a side point I agree that the people OHCHR hired to do this are serious, took the job seriously and put a lot of effort into it. Errr…on the other hand….I wouldn’t assume this is true of any particular UN sub-contract.

        So far as I’m aware there isn’t an established methodology for screening out unreliable social media reports of war deaths. I have some thoughts on how you could go about doing this but I haven’t tried to implement them. I agree I’d be in a stronger position if I had.
