Alexey Guzey plays Stat Detective: How many observations are in each bar of this graph?

How many data points are in each bar of the top graph above? (See here for background.)

It’s from this article: Milewski MD, Skaggs DL, Bishop GA, Pace JL, Ibrahim DA, Wren TA, Barzdukas A. Chronic lack of sleep is associated with increased sports injuries in adolescent athletes. Journal of Pediatric Orthopaedics. 2014 Mar 1;34(2):129-33.

Here’s the information we have to work with:

This study was conducted at a combined high school/middle school in a large metropolitan area. . . . Eligible participants included any male or female student at the school who was entering grades 7 to 12 who had participated and planned to continue to participate in at least one sport during the previous and upcoming year. . . . Informed consent for participation in our study was obtained from 160 student athletes and their parents. All consenting students were sent a copy of the survey by school-registered email. Of the 160, 112 student athletes (54 male and 58 female athletes) completed the survey . . . Of the 112 athletes studied, 64 athletes (57%) sustained a total of 205 injuries; 48 athletes (43%) were not injured. . . . Sixty-five percent of athletes (56/86) who reported sleeping < 8 hours per night were injured, compared with 31% of athletes (8/26) who reported sleeping ≥ 8 hours per night.

This is like one of those logic puzzles they gave us when we were kids!

Let’s label the proportions in the bar graph as y5/n5, y6/n6, y7/n7, y8/n8, y9/n9, where the subscript is the reported hours of sleep per night, n is the number of athletes in that bar, and y is the number of them who were injured.

What do we know? From the text:

• n5 + n6 + n7 + n8 + n9 = 112

• n5 + n6 + n7 = 86

• n8 + n9 = 26 (this is redundant information but it’s good to check that 26 + 86 = 112 so we’re not missing anybody)

• y5 + y6 + y7 = 56

• y8 + y9 = 8

From the graph:

• y5/n5 = 60%

• y6/n6 = 75%

• y7/n7 = 62% (Guzey counted the pixels: 310/499 = 0.621)

• y8/n8 = 35%

• y9/n9 = 16.7%


Given all this information, Guzey solved the puzzle. I’ll paraphrase:

• The easiest place to start is with the last two bars, which represent a total of 8 injuries out of 26 kids. It’s gotta be 7/20 and 1/6, as those are the only numbers that give anything close to 35% and 16.7%. For example, 6/14 and 2/12 won’t work.

• Next you can figure out the first three bars, which represent a total of 56 injuries out of 86 kids. The only numbers that are consistent with the above information are 3/5, 15/20, and 38/61. You can play around with other possibilities and they just don’t work out. (A brute-force check is sketched below.)
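For readers who want to replay the puzzle, here is a minimal brute-force sketch in Python. It is my reconstruction, not Guzey’s actual procedure: the bar-height targets and the tolerance are assumptions read off the graph, and how many candidate splits survive depends on how tightly you set them.

    # Brute-force search for integer splits consistent with the bar heights and
    # the group totals quoted in the paper. The targets and tol are assumptions
    # read off the graph; tighten tol to see which splits survive.
    from itertools import product
    from math import ceil, floor

    def numerators(n, target, tol):
        """All y such that y/n is within tol of target."""
        lo = max(ceil((target - tol) * n), 0)
        hi = min(floor((target + tol) * n), n)
        return range(lo, hi + 1)

    def solve(n_total, y_total, targets, tol):
        """Yield one (y_i, n_i) pair per bar matching the bar heights and the group totals."""
        k = len(targets)
        for ns in product(range(1, n_total), repeat=k - 1):
            n_last = n_total - sum(ns)
            if n_last < 1:
                continue
            all_n = (*ns, n_last)
            for ys in product(*(numerators(n, t, tol) for n, t in zip(all_n, targets))):
                if sum(ys) == y_total:
                    yield list(zip(ys, all_n))

    groups = [
        ("< 8 hours (5h, 6h, 7h bars)", 86, 56, [0.60, 0.75, 0.621]),
        (">= 8 hours (8h, 9h bars)", 26, 8, [0.35, 0.167]),
    ]
    for label, n_tot, y_tot, targets in groups:
        print(label)
        for split in solve(n_tot, y_tot, targets, tol=0.005):
            print("  " + ", ".join(f"{y}/{n}" for y, n in split))

With these settings the two right-hand bars come back uniquely as 7/20 and 1/6; the three left-hand bars return a list of candidates that includes the 3/5, 15/20, 38/61 split above, and tightening the tolerances can only shrink that list.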

So now we can ask, why did Matthew Walker in his book Why We Sleep remove the left bar of that graph (3 injuries out of 5 kids) but keep the right bar (1 injury out of 6) so as to produce the second graph shown above?

The answer is clear. He cut off the left bar because it was based on N=5. He kept the right bar because it was based on N=6. Walker was following the well-known “remove all bars for which N is less than 6” rule.

Ok, just kidding. I think the real reason he removed the left bar and kept the right bar is that the left bar didn’t support his story that lack of sleep is bad for you, but the right bar did support his theory. He didn’t want to confuse the reader with the complexities of reality.

Not wanting to confuse the reader with the complexities of reality is ok if you know the underlying truth, but it’s not such a great idea if you don’t have a direct line to God and if, like everyone else, you have to rely on empirical evidence to make your conclusions.

To step back a moment, how much can you really conclude about sleep and injuries based on a one-time study at one school? So, from that point of view, you’d have to say that Walker must have already had strong beliefs about the danger of sleeping less than 8 hours a night, and these data provide confirmation rather than evidence. They’re an illustration of his point rather than evidence for his point. Fair enough; still, if you want to present data, you shouldn’t cheat.

The larger point

The larger point here is not about cheating or research misconduct; it’s about research more generally. It’s about learning from data.

We’ve been talking a lot about Walker because his book and TED talk have received a lot of attention. But the practice of misrepresenting evidence to make a clearer point when teaching . . . that happens all the time. Tomorrow I’ll post another example.

I think that the people who do this really don’t think they’re doing anything wrong: they’re just cleaning up the data to tell a better story. Just like Marc Hauser did when he insisted on coding all his data himself: he knew the story and he didn’t want any random variation to get in the way of it.

The trouble is, once you start mucking with the data, you move from what Thomas Basbøll and I called “stories” to what we called “parables.” Data, and good stories, are immutable—indeed, it’s this immutability that allows us to learn from data. When you start trimming your data to fit your preconceptions or to fit the story you want to tell, you’ve given up your ability to learn. You’re no longer doing science—and you might not be doing good education either. In the immortal words of John Clute, “End of novel. Beginning of job.”

24 thoughts on “Alexey Guzey plays Stat Detective: How many observations are in each bar of this graph?”

  1. I like the concept of parables! My field is fisheries science and I have an example of a published article that trimmed a 50-year data set to only the most recent 6 years or so because those years showed a trend that fit with the authors’ preconceptions and favoured story line. That it wasn’t pointed out in peer review is another debacle. I fully support the concept that messing with the data to fit your narrative means you have basically given up science. You are now a lawyer/advocate.

  2. I think part of the problem is the use of a simple bar chart. The lower rate for 5 hours is probably a fluke. Putting error bars would likely show that the 5hr injury rate is probably not smaller than the 6hr injury rate, and could easily be larger (higher upper bound on CI). And then he would be less likely to throw out the 5hr bar. (A rough interval check is sketched after this thread.)

    • +1

      I can understand leaving off the error bars in a popularization, but I do not understand how it’s allowed in a serious professional journal. I guess surgeons don’t talk to statisticians much?

    • “Putting error bars would likely show that the 5hr injury rate is probably not smaller than the 6hr injury rate”

      I don’t see the point in putting error bars on data that’s self-reported from memory. The whole chart is a fluke. It’s not even worth discussing in a scientific context.

      Talking about this chart in terms of the missing bar is also a waste of effort. The bar is just as meaningless as the whole chart. It’s all garbage from square one. Not science.

      If people want to do science, then they need to get reliable measurements, not self-reported and Mechanical Turk junk data.
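      To put rough numbers on the error-bar comment above: a quick sketch of 95% Wilson score intervals, using the counts reconstructed in the post (3 of 5 injured in the 5-hour bar, 15 of 20 in the 6-hour bar). The choice of interval is mine, just for illustration.

          # 95% Wilson score intervals for the reconstructed 5-hour and 6-hour bars
          # (an illustration, not the paper's analysis).
          from math import sqrt

          def wilson_interval(y, n, z=1.96):
              """Wilson score interval for a binomial proportion y/n at ~95% confidence."""
              p = y / n
              denom = 1 + z**2 / n
              center = (p + z**2 / (2 * n)) / denom
              half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
              return center - half, center + half

          for label, y, n in [("5 hours", 3, 5), ("6 hours", 15, 20)]:
              lo, hi = wilson_interval(y, n)
              print(f"{label}: {y}/{n} = {y/n:.0%}, interval roughly [{lo:.0%}, {hi:.0%}]")

      The two intervals (roughly 23% to 88% and 53% to 89%) overlap almost completely, which is the commenter’s point: with N = 5, the apparent dip at 5 hours carries very little information either way.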

  3. Concerning Andrew’s more general concern – I think the problem goes even deeper. I don’t like most textbooks (especially American ones – the British textbooks are often better). One of my main complaints is that the examples are mostly cleaned-up ones so that the point (e.g., conducting a t-test; interpreting a regression coefficient, etc.) is clear and students can show they have mastered the basic concept. While those goals are worthwhile, there is an unintended side effect – which is not so harmless in my opinion. By providing examples that remove the messiness of real data, we teach that conclusions are often unambiguous. But they never are. The “real” story is always ambiguous. So, even if you don’t muck up your data, when you provide clean data that leads to unambiguous results, the effect is similar. You can justify it by thinking you are just making things clearer for the reader/student. But you may inadvertently be creating a dangerous illusion of certainty.

  4. But when does using a simulation as a simplification of reality become a parable instead of a story? If one is clear that the simulation’s purpose is to simplify, then intentionally adding interventions to understand the results, is that a story or a parable? Perhaps a good parable?

    • Chris,

      I’m not at all clear on what you’re trying to say. Can you explain what you mean by “using a simulation as a simplification of reality”?

  5. Maybe I can make this fit an actual parable. Take The Good Samaritan. It’s a simple story of a guy lying in the road. The first two men to come along don’t do anything, but the 3rd one does. The story is often presented as the old ‘law’ being wrong, while the new law is good, as seen by the Samaritan’s actions, which takes the Jewish context and twists it into Christian meaning. If you restore the Jewish context, the point shifts to mirror closely the Story of Ruth, in which a foreigner, an actual Moabite, one of the few groups listed as an enemy, is accepted into the ‘people’ because she demonstrates her worth through devotion to her mother-in-law Naomi. The Jewish point would be that the Samaritan should be ‘accepted’ because he did the right thing. A key difference is that in the Jewish context, the first two did nothing wrong: they couldn’t touch the man because that would ritually defile them and their community. That left it up to the Samaritan to help, though he had no obligation other than as a person. So, by removing the data of the Jewish context, you can actually flip the meaning over so it becomes a condemnation of legalistic Jews instead of a plea to accept the non-Jew as a righteous person.

    This seems fairly similar: the story of the data, at least, says that 5 hours of sleep means you’re less likely to have an accident, in direct opposition to the claim that shorter sleep means more. (I’d love to see if staying up all night improved things! It never worked well for me in college.) Presentation of the data is thus like channeling a story into a version that says what you want it to say, even though that version is actually an, if not the, opposite of the original meaning.

    For non-Jews, in the Jewish context, the Samaritan story is similar to any number of stories about rules and who needs to abide by them and when. One of my favorites is Hassidic and it involves a great rabbi who doesn’t go to worship on Yom Kippur because he was taking care of someone who was ill. That’s an explicit ranking of obligations. Since obligation is equivalent to commandment, to mitzvot, the religion constantly focuses on the meaning of adherence. One of those meanings is how you treat people who act righteously though they may also disagree with you. Moabites were actual enemies, but of course Ruth as a woman wasn’t the same as a male Moabite. Samaritans are still around in small numbers, still fundamentalists who only accept a few parts of scripture. They’ve been at odds with ‘Judaism’ all along.

  6. Andrew said,
    “Not wanting to confuse the reader with the complexities of reality is ok if you know the underlying truth, but it’s not such a great idea if you don’t have a direct line to God and if, like everyone else, you have to rely on empirical evidence to make your conclusions.”

    Maybe “Direct Line to God” needs to go into The Lexicon?

  7. “I think that the people who do this really don’t think they’re doing anything wrong: they’re just cleaning up the data to tell a better story.”

    I’ve been calling this “story-first thinking,” contrasted with “data-first.” It’s widely practiced: the data are fitted to the story rather than the other way round.

    Nice example!

  8. My bad, I did not remember it well, it being from way back in 2007, but it is interesting anyway:
    https://science.sciencemag.org/content/318/5857/1772.abstract – original article
    https://salmonfarmscience.files.wordpress.com/2012/02/sealice_2008_sea_lice_extinction_hypothesis_fails.pdf – comment on original article
    https://www.tandfonline.com/doi/abs/10.1080/10641260802013692?journalCode=brfs20 – response to comment
    https://science.sciencemag.org/content/322/5909/1790.2 – another rebuttal to original article
    http://www.math.ualberta.ca/~mlewis/Publications%202009/Krkosek-Ford-Morton-Lele-Lewis_WildSalmon—.pdf – another rebuttal to comment

    Maybe too much? But it is another example of the back and forth and of digging in/defending the original conclusions as not wrong…
