“Boston Globe Columnist Suspended During Investigation Of Marathon Bombing Stories That Don’t Add Up”

I came across this news article by Samer Kalaf and it made me think of some problems we’ve been seeing in recent years involving cargo-cult science.

Here’s the story:

The Boston Globe has placed columnist Kevin Cullen on “administrative leave” while it conducts a review of his work, after WEEI radio host Kirk Minihane scrutinized Cullen’s April 14 column about the five-year anniversary of the Boston Marathon bombings, and found several inconsistencies. . . .

Here’s an excerpt of the column:

I happened upon a house fire recently, in Mattapan, and the smell reminded me of Boylston Street five years ago, when so many lost their lives and their limbs and their sense of security.

I can smell Patriots Day, 2013. I can hear it. God, can I hear it, whenever multiple fire engines or ambulances are racing to a scene.

I can taste it, when I’m around a campfire and embers create a certain sensation.

I can see it, when I bump into survivors, which happens with more regularity than I could ever have imagined. And I can touch it, when I grab those survivors’ hands or their shoulders.

Cullen, who was part of the paper’s 2003 Pulitzer-winning Spotlight team that broke the stories on the Catholic Church sex abuse scandal, had established in this column, and in prior reporting, that he was present for the bombings. . . .

But Cullen wasn’t really there. And his stories had lots of details that sounded good but were actually made up. Including, horrifyingly enough, made-up stories about a little girl who was missing her leg.

OK, so far, same old story. Mike Barnicle, Janet Cooke, Stephen Glass, . . . and now one more reporter who prefers making things up to doing actual reporting. For one thing, making stuff up is easier; for another, if you make things up, you can make the story work better, as you’re not constrained by pesky details.

What’s the point of writing about this, then? What’s the connection to statistical modeling, causal inference, and social science?

Here’s the point:

Let’s think about journalism:

1. What’s the reason for journalism? To convey information, to give readers a different window into reality. To give a sense of what it was like to be there, for those who were not there. Or to help people who were there, to remember.

2. What does good journalism look like? It’s typically emotionally stirring and convincingly specific.

And here’s the problem.

The reason for journalism is 1, but some journalists decide to take a shortcut and go straight to the form of good journalism, that is, 2.

Indeed, I suspect that many journalists think that 2 is the goal, and that 1 is just some old-fashioned traditional attitude.

Now, to connect to statistical modeling, causal inference, and social science . . . let’s think about science:

1. What’s the reason for science? To learn about reality, to learn new facts, to encompass facts into existing and new theories, to find flaws in our models of the world.

2. And what does good science look like? It typically has an air of rigor.

And here’s the problem.

The reason for science is 1, but some scientists decide to take a shortcut and go straight to the form of good science, that is, 2.

The problem is not that scientists don’t care about the goal of learning about reality; the problem is that they think that if they follow various formal expressions of science (randomized experiments, p-values, peer review, publication in journals, association with authority figures, etc.), they’ll get the discovery for free.

It’s a natural mistake, given statistical training with its focus on randomization and p-values, an attitude that statistical methods can yield effective certainty from noisy data (true for Las Vegas casinos where the probability model is known; not so true for messy real-world science experiments), and scientific training that’s focused on getting papers published.

Summary

What struck me about the above-quoted Boston Globe article (“I happened upon a house fire recently . . . I can smell Patriots Day, 2013. I can hear it. God, can I hear it . . . I can taste it . . .”) was how it looks like good journalism. Not great journalism—it’s too clichéd and trope-y for that—but what’s generally considered good reporting, the kind that sometimes wins awards.

Similarly, if you look at a bunch of the fatally flawed articles we’ve seen in science journals in the past few years, they look like solid science. It’s only when you examine the details that you start seeing all the problems, and these papers disintegrate like a sock whose thread has been pulled.

Ok, yeah yeah sure, you’re saying: Once again I’m reminded of bad science. Who cares? I care, because bad science Greshams good science in so many ways: in scientists’ decision of what to work on and publish (why do a slow careful study if you can get a better publication with something flashy?), in who gets promoted and honored and who decides to quit the field in disgust (not always, but sometimes), and in what gets publicized. The above Boston marathon story struck me because it had that same flavor.

P.S. Tomorrow’s post: Harking, Sharking, Tharking.

37 thoughts on “Boston Globe Columnist Suspended During Investigation Of Marathon Bombing Stories That Don’t Add Up”

      • It seems there are two kinds of cheaters. Lance Armstrongs who are genuinely very good but whose cheating pushes them to world-class level, and Rosie Ruizes who are laughably bad even with the advantage of cheating.

  1. Andrew, I’m not sure if the reason for science (or journalism) is as important to understanding any of this as the motivations of scientists (or journalists). I suppose scientists are motivated (at least in part) by a desire to “learn about reality” — whatever that entails — but even if we suppose that goal has shared meaning across a scientific community, it’s hardly the only motivator. I think you get at the important stuff near the end, where you suggest incentive structures that are likely to be influencing these people (in this case, scientists).

  2. I love “Gresham” used as a verb in the last part. The meaning is obvious (at least with an economics background) but it is also a term with no clear synonyms, so it’s a valuable neologism. Have you seen it used elsewhere?

  3. +1 for commending Gresham as a verb. I was about to.

    I heard about a natural language processing talk years ago where the speaker was said to have said in response to a question that “it’s well known that in English you can verb any noun.”

    • Ethan:

      “it’s well known that in English you can verb any noun.” I have the impression that category theory goes a long way in showing that the same can be done in mathematics, but this might be more a function of the amount of time that has passed since my encounter with that subject.

  4. Andrew wrote:

    “The problem is not scientists don’t care about the goal of learning about reality; the problem is that they think that if they follow various formal expressions of science (randomized experiments, p-values, peer review, publication in journals, association with authority figures, etc.) that they’ll get the discovery for free.”

    I guess that’s exactly what Feynman was referring to when he talked about cargo-cult science.

    (Source: http://calteches.library.caltech.edu/51/2/CargoCult.htm — Please note that the psychologist identified only as “Young” was in all likelihood P. T. Young, a pioneer of motivation science and one of the very few who dared use terms like emotion and affect during the behavioristic ice age.)

    • From Feynman, “I would like to add something that’s not essential to the science, but something I kind of believe, which is that you should not fool the layman when you’re talking as a scientist. I’m not trying to tell you what to do about cheating on your wife, or fooling your girlfriend, or something like that, when you’re not trying to be a scientist, but just trying to be an ordinary human being. We’ll leave those problems up to you and your rabbi. I’m talking about a specific, extra type of integrity that is not lying, but bending over backwards to show how you’re maybe wrong, that you ought to do when acting as a scientist. And this is our responsibility as scientists, certainly to other scientists, and I think to laymen.”

      Today it seems that many scientists are doing the exact opposite.

  5. Andrew said,

    “What struck me about the above-quoted Boston Globe article (“I happened upon a house fire recently . . . I can smell Patriots Day, 2013. I can hear it. God, can I hear it . . . I can taste it . . .”) was how it looks like good journalism. Not great journalism—it’s too clichéd and trope-y for that—but what’s generally considered good reporting, the kind that sometimes wins awards.

    Similarly, if you look at a bunch of the fatally flawed articles we’ve seen in science journals in the past few years, they look like solid science. It’s only when you examine the details that you start seeing all the problems, and these papers disintegrate like a sock whose thread has been pulled.”

    I think there may be something going on here that is in some ways analogous to the “continuous vs discrete” thinking preferences. I don’t really know how to describe the two preferences/styles/defaults involved well, but to make a rough stab at it: one type is strong on feelings and impressions, and the other is strong on details (including both factual details and details of reasoning). Looking at things this way, I can’t help but wonder if people with the first preference genuinely believe they are being accurate, or scientific, or whatever, because what they say focuses on what they believe to be the crux of the matter, whereas someone with the second preference instinctively focuses on the missing details (which they see as the crux of the matter).

    PS I’m in the second category, where “The devil is in the details”.

    • > I can’t help but wonder if people with the first preference genuinely believe they are being accurate, or scientific, or whatever, because what they say focuses on what they believe to be the crux of the matter

      I agree; I think something like this attitude underlies a lot of junky work in science and elsewhere. Of course, the problem that people with this attitude seem to miss is that dishonesty in the service of truth ultimately undermines that truth.

      I’m thinking of the guy from several years ago who made up the stories about witnessing horrible acts of child abuse in Chinese cell phone factories. But when his dishonesty came out, as it invariably does, it effectively gave cover to those factories which almost certainly were and are abusing children. So maybe he had altruistic motives, but even so probably wound up hurting the children he was trying to help.

      • gec said,
        “Of course, the problem that people with this attitude seem to miss is that dishonesty in the service of truth ultimately undermines that truth,” in response to my comment, “I can’t help but wonder if people with the first preference genuinely believe they are being accurate, or scientific, or whatever, because what they say focuses on what they believe to be the crux of the matter”

        I do not consider what I described as an “attitude,” but as a “perspective” — in particular, I don’t think they see what they are doing as dishonesty.

        • I see, apologies for my misinterpretation!

          I still don’t understand how they reconcile their belief that they are serving truth given that the ultimate consequence of their actions is undermining truth. But maybe I should hope I never understand that!

        • gec said, “I still don’t understand how they reconcile their belief that they are serving truth given that the ultimate consequence of their actions is undermining truth.”

          I think the point is that what you see as “given” is something that they don’t see as “given”.

  6. Martha:

    Also in this ecosystem I think it’s vitally important to have insider/outsiders: people who know enough about the system to see the problems and be disturbed by them, but who are disconnected enough that they can scream about the problems.

    In the above case, the Deadspin reporter Samer Kalaf was willing to talk about the problems with this junk news story, without feeling any social reason for going easy on the criticism.

    It’s a similar thing in science. There’s a web of people who rely on each other and don’t criticize others’ research, even really bad research (see here for an extreme example). It’s that alliance of celebrities. So it helps to have outsiders such as Nick Brown, Anna Dreber, James Heathers, etc., who are willing and able to speak up.

  7. These are arguments over the construction of representational accuracy. So the journalism argument is that ‘area’ of endeavor has a root in a form of veracity which says ‘must not make up too much’. You can, of course, make up plenty in journalism: you tell any story by selecting or de-emphasizing narrative threads that you may connect inventively, and that you may connect hypothetically, and as long as you note you are being hypothetical – or at least hinting at that – then your inventive connections may actually explain away what might otherwise be considered factual by viewing it in a larger explaining context. Like a 9/11 Truther does. In other words, all of ‘this’, meaning this form of representational construct, is some form of confabulation, a term borrowed from dementia. Journalism is a type of confabulation which roots in a specific form of ‘don’t make this part up without being clear you’re making this part up’. It wasn’t, for example, that Cullen wrote bad stuff but that he left out the few words necessary to root in journalism, that being some form of ‘as if’ statement. Nothing more than a bit flipped to set the story to ‘false’ so the journalism could read it as ‘true’ to type.

    Another construct of truth would be legal truth. I saw a preview on 60 Minutes of a young woman saying the justice system doesn’t process victims through ‘whole’, meaning they’re diminished. Outside the legal system, that probably sounds terrible. Inside the legal system, that’s a cost of making sure people are guilty. We as a society have for hundreds of years believed that the individual case of the defendant takes precedence, and thus that we impose a cost on victims to protect the rights of the accused. It’s not only normal but it’s a feature: a central reason this developed is so people avoid being victims. In other words, there is a deterrent effect in punishing victims. It’s tough to acknowledge because it’s uncomfortable but the point is: don’t put yourself in a position where you can be victimized. This is not the same as a judge saying a girl asked for it by dressing in tight clothes: that’s an idiotic blaming of a victim. I like to imagine it this way: you walk into court and say, ‘But that little boy was just so good looking, he provoked me into raping him.’ Abhorrent blaming of the victim when the law should be punishing the accused’s failure to comport as society wants.

    I noted a similarly interesting construct on the news tonight. I can phrase it this way: if a woman in college had a few to drink and lifted her shirt to show her breasts, should she be disqualified from being a judge later in life? Is drunk behavior excusable in a female because the female is considered sexually desirable just because she has breasts? Doesn’t that institutionalize a male perspective? That is, a man can take off his shirt but a woman can’t because her breasts are considered sexual. Does that then mean a woman exposing her breasts gets automatic forgetaboutit? It kind of evens things out, right? What if she lifted her skirt and wasn’t wearing underwear? What if it wasn’t welcome, if it was the kind of drunk moment that makes people cringe? It’s all just different constructions that fit to priors we generally don’t examine and more often don’t understand even if acknowledged.

    • There are two aspects to baring breasts while drunk: Baring the breasts, and being drunk. Let’s consider it without being drunk: Suppose on a very hot day in Texas, two women are running around the track outdoors. All the men are running topless. So the women decide to go topless too. How do people react? (I haven’t done that myself, but knew a woman who did, together with another woman who was running with her.)

      • seems like the discussion about “continuous thinking” is really just a reaction against NHST. That’s appropriate in a sense, but characterizing it as “continuous” vs. “discontinuous” thinking seems misleading to me. Really it’s just about using a cut-off value and/or NHST, it’s not the continuity or discontinuity of the processes being modeled. There are lots of discontinuous processes and boundaries in the physical world and presumably in the social world.

        • Are you obese or not, do you have type two diabetes or not, are your cholesterol levels above X or not, do you have high blood pressure or not, do you have a High School degree or not? Are you below the “threshold for poverty” or not….

          binary thinking, or classification into a small set of categories, is rampant everywhere.

          Rarely do we see something like “plug in your values for age, cholesterol, blood sugar, weight, height and number of cigarettes smoked per week, and get an absolute risk for death in the next 5 years, and a life expectancy…”

        • Daniel,

          Re: Rarely do we see something like “plug in your values for age, cholesterol, blood sugar, weight, height and number of cigarettes smoked per week, and get an absolute risk for death in the next 5 years, and a life expectancy…”

          —–
          You characterize the above as ‘continuous’?

        • Sure, age is “continuous”, cholesterol levels are “continuous”, blood sugar levels are “continuous”, weight is “continuous”, height is “continuous”, cigarettes per week is probably discrete but at least probably between 0 and 100, risk for death is a continuous number (probability between 0 and 1), and life expectancy is a continuous number…

          of course many of these measures are actually discrete (age is probably measured in years or rarely days, height is probably rounded to the nearest inch, etc.) but sufficiently finely graduated that the continuous approximation is much better than a binary or trinary or 4 way type classification.

        • You are comparing the description of some biological states [as continuous or discrete] with the results of some tests that are answered as ‘binaries’. Jim’s point is not negated: ‘There are lots of discontinuous processes and boundaries in the physical world and presumably in the social world.’ What I mean is that in some respects processes and boundaries can be both continuous and discrete, which, I think, is what you may be implying?

        • No, my point is that while the decision is often discrete (I should prescribe drug x, or I should not prescribe drug x), often the underlying questions are continuous (how much longer will you live on average if we prescribe the drug, given your specifics). However, rather than thinking about this underlying question and the tradeoffs associated, even the models that are used are discrete (if the patient has high blood pressure and cholesterol over x then prescribe drug y).

          there are many problems with thinking about the underlying problem discretely, including inappropriate choice of cutoffs, and amplification of noise in measurements etc

          yet this is extremely common in many fields.

        • In the earlier examples, I inferred that results are cast in binaries while the processes toward the results are cast discrete and continuous. I am probably also mindful of the terms ‘linear’ and ‘non-linear’, which then, at least for me, points out how nearly all terms have their limitations. I don’t see, for example, how someone can suggest that they think ‘continuously’. I really haven’t seen that proclivity to any great extent in anyone. Never mind how ‘continuous’ and ‘discrete’ are conceived in different domains.

          Finally, Do you think that doctors/patients would be benefitted by understanding and conveying ‘absolute risk’ to their patients?

        • Sameera: you’re thinking continuously if a small change to something produces a small change in something else. So if your cholesterol score goes from 194 to 195 and this changes your model from outputting “don’t Rx drug” to “do Rx drug” you don’t have a continuous thing… but if it changes your model’s output of absolute risk from 2% per year to 2.04% per year then you have a continuous model.

          If your thinking about how to approach a problem is already “we have 3 possible choices: no drug, drug X, or drug Y and we want a decision rule that outputs one of the 3 possibilities” you have short-circuited some of the most important parts of the model.

          >Finally, Do you think that doctors/patients would be benefitted by understanding and conveying ‘absolute risk’ to their patients?

          Yes they would. Every treatment has costs and benefits… if we don’t estimate the benefits, we can’t balance them against the costs. Suppose you have a model that chooses between “do nothing” and “Rx drug X” on the basis of some risk factors… it does nothing other than answer the question “which of these actions to take”.

          Now suppose the patient falls in the “Rx drug X” category, but they have unpleasant side effects. How do they decide whether to discontinue the drug or not? The right way to decide is to look at the benefits the drug is giving you, let’s say “reduced risk of stroke” vs the costs involved: “out of pocket dollar cost plus loss of life satisfaction due to side effects of drug”

          Suppose we have two of these patients, one of them just barely falls in the “Rx drug X” category, the other one is deep into the weeds of high risk…. both have the same unpleasant side effect.

          A continuous model says: “patient A has 25/100000 risk of dying of stroke per year which is about 10x the risk from driving a car, while patient B has 2500/100000 risk of dying of stroke per year which is about 1000x the risk from driving a car”

          isn’t this information the two patients should use to decide whether discontinuing the drug to avoid side effects is a good idea?

          But the discrete choice models will not give such information.
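To make the contrast concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the cutoff of 195, the function names `threshold_rule` and `annual_risk`, and the logistic curve with its parameters are assumptions, not a clinical model and not anything from the comment above.

```python
import math

CUTOFF = 195.0  # hypothetical cholesterol cutoff, chosen only for illustration


def threshold_rule(cholesterol: float) -> str:
    """Discrete model: the output flips at the cutoff and carries no risk size."""
    return "Rx drug" if cholesterol >= CUTOFF else "no drug"


def annual_risk(cholesterol: float) -> float:
    """Continuous model: a made-up smooth (logistic) curve for yearly risk.

    The midpoint (390) and scale (50) are arbitrary assumptions.
    """
    return 1.0 / (1.0 + math.exp(-(cholesterol - 390.0) / 50.0))


if __name__ == "__main__":
    for c in (194.0, 195.0, 300.0):
        print(f"cholesterol={c:.0f}  rule={threshold_rule(c)!r}  "
              f"risk={annual_risk(c):.2%}")
```

Going from 194 to 195 flips the threshold rule's answer while the continuous risk barely moves, and the rule gives the same answer at 195 and at 300 even though the modeled risk differs a lot. That second point is the one that matters for the two patients deciding whether side effects are worth it.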

        • Daniel said,
          “Suppose we have two of these patients, one of them just barely falls in the “Rx drug X” category, the other one is deep into the weeds of high risk…. both have the same unpleasant side effect.

          A continuous model says: “patient A has 25/100000 risk of dying of stroke per year which is about 10x the risk from driving a car, while patient B has 2500/100000 risk of dying of stroke per year which is about 1000x the risk from driving a car”

          isn’t this information the two patients should use to decide whether discontinuing the drug to avoid side effects is a good idea?

          But the discrete choice models will not give such information.”

          Good example.

        • Daniel, thanks for the response

          RE: ‘Now suppose the patient falls in the “Rx drug X” category, but they have unpleasant side effects. How do they decide whether to discontinue the drug or not? The right way to decide is to look at the benefits the drug is giving you, let’s say “reduced risk of stroke” vs the costs involved: “out of pocket dollar-cost plus loss of life satisfaction due to side effects of the drug”

          Suppose we have two of these patients, one of them just barely falls in the “Rx drug X” category, the other one is deep into the weeds of high risk…. both have the same unpleasant side effect.’
          ——
          I may be wrong. There seems to be an implicit assumption that a blood pressure prescription can reduce high blood pressure for the duration that it is taken. I am skeptical about that given what I have heard from those who have been on them. They do have detrimental effects on the heart and other parts of the body. So how the benefits and harms [as distinct from costs] are conveyed is critical. Comparing the risk of high blood pressure to the risk of driving a car doesn’t strike me as all that helpful. I just wonder how a doctor actually conveys the risks and what the doctor counts as good evidence for the efficacy of drug prescribed. Moreover, it would be curious to learn what are the financial incentives in prescribing a particular drug.

          Whenever there are conflicts of interest, the resort to these cutoffs persists, I venture. It’s the script that goes along with the prescribing.

        • My example was meant to discuss the usefulness of the difference between a continuous vs a discrete model, not to be a realistic description of how modern medicine, or a specific drug does or should work.

          When it comes to decisions, you can ask the question: what utility is implicit in the decision rule? For example you have some decision about whether to take a drug or not. Someone creates a decision tool for doctors to use… You could build a Bayesian model of outcomes and then ask under what kind of utility would the decisions come out the same? You could for example model the utility of the *drug manufacturer* who wants to maximize their sales of the drug while minimizing risk of being sued over side effects etc. Then you could model the utility of an insurer who wants to minimize sales of expensive drugs, while also minimizing costs of covering diseases the drug is designed to treat… Then you could model the utility of an uninsured consumer… who wants to trade off health vs dollars. Then you could model the utility of an insured consumer… who wants to trade health vs dollars but has already committed to some fixed premium payments, and whose marginal costs are relatively small (copays etc)….

          each one will result in different recommendations. Which one matches the actual decision tools published and in-use by doctors?
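A rough way to see how different utilities imply different cutoffs is sketched below in Python. This is a deliberately simple deterministic stand-in, not the Bayesian outcome model suggested above. The `Stakeholder` class, the `breakeven_risk` and `implied_decision` functions, the relative risk reduction of 0.25, and all dollar figures are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical: the drug removes a quarter of the baseline yearly risk.
RELATIVE_RISK_REDUCTION = 0.25


@dataclass
class Stakeholder:
    name: str
    value_of_avoided_event: float  # what one avoided event is worth to this party
    cost_of_treatment: float       # yearly treatment cost as this party bears it


def breakeven_risk(s: Stakeholder) -> float:
    """Baseline yearly risk above which treating has positive expected utility.

    Expected gain = risk * RELATIVE_RISK_REDUCTION * value - cost,
    which is positive when risk > cost / (value * RELATIVE_RISK_REDUCTION).
    """
    return s.cost_of_treatment / (s.value_of_avoided_event * RELATIVE_RISK_REDUCTION)


def implied_decision(s: Stakeholder, baseline_risk: float) -> str:
    """The decision rule implied by this stakeholder's utility."""
    return "Rx drug" if baseline_risk > breakeven_risk(s) else "no drug"


# Made-up numbers: the insured patient only sees a copay.
insured = Stakeholder("insured patient", 1_000_000.0, 200.0)
uninsured = Stakeholder("uninsured patient", 1_000_000.0, 5_000.0)

if __name__ == "__main__":
    for s in (insured, uninsured):
        print(f"{s.name}: cutoff={breakeven_risk(s):.4f}, "
              f"at 1% risk -> {implied_decision(s, 0.01)}")
```

Under these assumptions the same patient risk leads to opposite recommendations depending on whose costs are counted, so a published cutoff can be read backwards as a statement about someone's utility.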

        • Jim said, “seems like the discussion about “continuous thinking” is really just a reaction against NHST.”

          It’s not for me. I just tend to think of many things in continuous scales, but am aware that a lot of people don’t. So for me, this is an observation about individual differences in thinking (or in seeing the world) that had origins way before I ever heard of NHST. (I didn’t know anything about NHST until I was in my fifties, but remember thinking in continuous terms when I was in high school.)
