Selection bias in the reporting of shaky research: An example

On 30 Dec 2016, a reporter wrote:

I was wondering if you’d have some time to look at an interesting embargoed study coming out next week in JAMA Internal Medicine, which seeks to show that gun violence is a social contagion. I know that a few years ago, social contagion studies were controversial and I’m wondering if this work has any significant flaws – and in particular whether it controls for homophily or shared environment. If you have any time to look at this before its embargo lifts at Tues 11 am, and talk about this or offer a few thoughts by email, I’d greatly appreciate your time.

My response:

I don’t fully understand everything that’s going on in the paper.

For example, in the second column of page E3, they say, “We restricted our analysis to the network’s largest connected component, which contained 29.9% of all arrested individuals (n = 138 163) and 89.3% of all the co-offending edges (n = 417 635). Consistent with previous research on the concentration of gun violence within co-offending networks, the largest connected component contained 74.5% of gun-shot violence episodes of arrested individuals…” First, it’s not clear to me why they would want to throw out 70% of the people in their sample and 25% of their cases of gun-shot violence. That’s a lot of data to toss out, and I don’t see why they feel the need to do this. Second, it’s not clear to me how gun-shot violence episodes fit into the data. The network is defined by arrests, so what does it mean for an episode of gun-shot violence to be “contained” by a component? I’m not saying they did anything wrong here, I just am not clear on what they are saying. Figure 1 doesn’t help because it only shows arrests, not violence episodes.

On page E4 they write, “For each gunshot subject who was influenced primarily by contagion, we identified which peer (the infector) was most responsible for causing him or her to become infected (ie, a subject of gun violence).” It’s hard for me to believe that you can do this using a statistical analysis. How could you know that someone is “influenced primarily by contagion”? I don’t even know what this means.

I’m suspicious of this: “The results of these experiments suggested that homophily and confounding were insufficient explanations for the data, leaving social contagion as a more likely explanation.” It should be all three! Just because homophily and confounding don’t explain everything, that doesn’t mean they’re nothing, right? (See page 962 of this paper for more on the general point that there’s no need to choose among explanations.)

They write that 63% of the violence episodes “were attributable to social contagion.” I don’t know what they mean by this. It sounds weird to me. But maybe it all makes sense, I don’t know. They refer to eMethods and eFigures but those have not been included here.

Also, this isn’t the whole story, but . . . it’s bad news that they approvingly cite the discredited study of Christakis and Fowler on the contagion of obesity (that’s #23 in their reference list). Oddly enough, they cite some critics of Christakis and Fowler (references 50 and 51) but it appears they didn’t internalize these criticisms or else they wouldn’t have, with a straight face, written earlier in their paper that “social networks are fundamental in diffusion processes related to . . . obesity.”

In any case, the topic is important. I don’t really buy statements such as 63% of episodes being attributable etc., as written. Somehow this all has to be untangled. If you’re connected with someone in this network, it means you’ve been arrested at the same time as that other person. People in this dataset who have been involved in gun violence are disproportionately likely to have been arrested at the same time as someone else who’s been involved with gun violence—I guess this makes sense, but I don’t know why it has to be called contagion.

Also I don’t see why it’s published in an internal medicine journal! No big deal, it just seems a bit off-topic!

You might also want to ask Andrew Thomas at CMU, a statistician who’s looked critically at some of these social contagion issues in the past.

The reporter then replied:

Thanks for your very detailed response. I spoke with another public health person who focuses on gun violence research who raised many similar questions – am going back and forth with the authors, but may skip writing about this since it seems mostly to be perplexing, even to experts.

Remember that selection bias we were talking about a while ago, that when shaky science gets published, credulous reporters are more likely to just run the equivalent of the press release, while skeptical reporters might just skip the story entirely? The result is that what does get published is more likely to be positive and uncritical.


  1. Clyde Schechter says:

    “Also I don’t see why it’s published in an internal medicine journal! No big deal, it just seems a bit off-topic!”

    It’s not as off-topic as it seems. While the treatment of the victims of gun violence is primarily the responsibility of surgeons, prevention, through anticipatory guidance about the risks associated with gun ownership, is considered part of primary care. Consequently, much of the research about the epidemiology of gun violence appears in internal medicine, pediatrics, and family medicine journals.

  2. Elin says:

    Having tried to look at some network effects, I can say that one issue that happens is that you get this one big component that you can analyze and then you have all these singletons. As people always say, a lot of violence/murders involve family and are very different (narratively, demographically, in terms of CJ processing) than the ones that involve people embedded in co-offending networks. For example, they are not involved with the drug trade or gangs. Maybe there are network impacts for those people, but they are not defined by co-offending networks.

  3. Steve Sailer says:

    The number of homicides within a city tends to go up and down quite a bit from year to year, which may suggest that they aren’t wholly random, but are instead due to disturbances such as Gang B trying to push Gang A out of some turf, or Convict X being released from prison and going gunning for Snitch Y, or whatever.

    On the other hand, it’s difficult for outsiders to recreate chains of cause and effect, since criminals tend to be either close-mouthed or untruthful.

  4. Terry says:

    Everything in this study seems completely believable, even unsurprising. It also seems that the paper could be useful in identifying at-risk individuals. The authors’ interpretation as a contagion phenomenon, though, seems like it could be improved upon.

    First, the paper is very believable because the logic is pretty straightforward:

    1. Criminals are connected by social networks.

    2. Sometimes violent events cluster within a social network. A gang war is an obvious example. A gang war flareup can produce a cluster of violence among a small group of socially-connected people.

    3. These events are clustered in time as well – gang wars die down over time.

    4. During these flareups, the probability of violence against any individual in the cluster is higher than would be expected from factors that do not vary over time, such as demographics and homophily.

    Therefore, it is not surprising that the authors find an increase in the probability of violence within a social network after a violent event in the network, and that this increase falls off over time.

    Second, interpreting this as “contagion”:

    It doesn’t seem unreasonable to cast this result in terms of contagion because it is a natural way for epidemiologists to think about this.

    But, is contagion the best way to think about this? Contagion has a very specific mechanism because germs are physically spread from one person to another. But, in this study, are we instead seeing a temporary change in “background” risk, which is more like a temporary change in background radiation after something like the Fukushima reactor disaster? When we see one case of radiation poisoning, we expect to see other cases in people who were also exposed to the radiation, but it isn’t “contagion” in the sense that one victim gave it to the others.

    • Elin says:

      I agree that contagion is the wrong metaphor, but I don’t think this kind of effect is about background level either (I think that is what they mean when they say they looked at homophily and confounding; they are actually trying to separate out the background level from the network effects). What’s important though is that caring about co-offending means that you stop treating the individual actors as independent from other individual actors in the sample. Still, the decision making around being shot is really fundamentally different than catching strep throat. Sure there is individual decision making with strep throat too, but the mix is different. Just because you can model something as if it was a contagion process doesn’t mean it actually was. It could be that one gun incident triggered a conscious response, which triggered another, etc. Or it could be that one incident involving someone you know, which a person witnesses or hears about, lowers the barrier to engaging in the same thing. Or that basically it changes your micro-environment, as represented by your personal networks, to a more gun-using one (which could be thought of as changing the background environment). Or it changes your behavior in that now you carry a gun whereas before you didn’t.

      @Andrew, about one of your questions at the beginning: I’m pretty sure what they mean is that they had this large component of a co-offending network defined by co-arrest (which is not perfect, obviously, since not everyone involved may get arrested for a given crime and there is no arrest in lots of crimes), and then they are looking at which of those component members end up being shot, whether or not the shooting was related to the arrest that brought them into the component or involves other people in the component.

      What I think is amazing about this is the large component. About 20 years ago I tried matching up co-offenders in a city based on date, time, location of arrests and the biggest component was so large I really didn’t believe it, and I put the whole analysis aside. This makes me want to dig up that data again. Still I’ll be curious to understand how this was constructed.

      • Terry says:

        I’m not surprised at the size of the large component. What they have found is the core criminal component, many of whom have dozens of arrests. Such a network is like Six Degrees of Kevin Bacon. The network spreads quickly and captures most career criminals.

        • Elin says:

          I’m going to say that I wish I could see the supplement, but I’m curious how they defined incident. In particular, many arrests happen at police stations and hospitals, and I’d like to know how they operationalized incident.
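
To make the construction discussed in this thread concrete, here is a minimal sketch in Python of linking people who were arrested in the same incident and then pulling out the largest connected component. This is not the paper's method or data: the `(incident, person)` record layout, the toy arrests, and the `shot` set are all hypothetical, and how an "incident" is actually defined is exactly the open question raised above.

```python
from collections import Counter

def largest_component(arrests):
    """Return the set of people in the largest connected component.

    `arrests` is an iterable of (incident_id, person_id) pairs (a
    hypothetical layout). Two people are linked if they share an incident.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # incident_id -> first person recorded for it
    for incident, person in arrests:
        root = find(person)
        if incident in first_seen:
            other = find(first_seen[incident])
            if other != root:
                parent[root] = other  # merge the two components
        else:
            first_seen[incident] = person

    people = list(parent)
    sizes = Counter(find(p) for p in people)
    biggest_root = sizes.most_common(1)[0][0]
    return {p for p in people if find(p) == biggest_root}

# Toy data, invented for illustration only.
arrests = [("i1", "A"), ("i1", "B"), ("i2", "B"), ("i2", "C"),
           ("i3", "D"), ("i3", "E"), ("i4", "F")]
shot = {"A", "C", "F"}  # hypothetical gunshot victims among those arrested

component = largest_component(arrests)
everyone = {person for _, person in arrests}
share_people = len(component) / len(everyone)
share_shot = len(shot & component) / len(shot)
print(sorted(component), share_people, round(share_shot, 3))
```

Under this reading, a shooting is "contained" in the component simply when the person shot is a member of it, whether or not the shooting had anything to do with the arrests that put them there. The singletons (like F in the toy data) are the people dropped by restricting to the largest component.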

  5. Allan Cousins says:

    The world is a small place! I know of Andrew Thomas through his work on hockey analytics. I didn’t have the faintest clue about his work on contagions. In fact, until this post I hadn’t heard that term used in such a context before.

    Anyway, he’s not at CMU anymore; he’s the head hockey researcher for the Minnesota Wild.

  6. Alex says:

    IIRC there’s a standard result that a network with enough scale eventually has one giant component that incorporates all the mutually reachable nodes.
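
The result being recalled here is probably the giant-component threshold for random graphs: when edges are placed at random, a single component containing a large share of all nodes appears once the average number of links per node exceeds about one. The small simulation below is only a sketch of that idea. It assumes edges are placed uniformly at random, which a real co-arrest network certainly does not satisfy, and the function name and parameters are made up for illustration.

```python
import random
from collections import Counter

def largest_component_fraction(n, mean_degree, seed=0):
    """Fraction of nodes in the largest component of a random graph.

    Places about mean_degree * n / 2 edges between uniformly chosen
    node pairs (self-loops and repeated pairs are possible but rare).
    """
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(int(mean_degree * n / 2)):
        a = find(rng.randrange(n))
        b = find(rng.randrange(n))
        if a != b:
            parent[b] = a  # merge the two components

    sizes = Counter(find(x) for x in range(n))
    return max(sizes.values()) / n

# Sweep the average degree across the threshold.
for c in (0.5, 1.0, 2.0, 3.0):
    print(c, round(largest_component_fraction(5000, c), 3))
```

With few links per node the largest component stays a tiny fraction of the graph; with a few links per node it takes in most of the nodes. That makes a very large component among people with many arrests less surprising, though it says nothing about whether violence spreads along those links.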
