Freakonomics 2: What went wrong?

Following up on Kaiser’s death-by-a-thousand-cuts (see here and here),
Mark Palko adds an entry in the “What happened with Freakonomics 2?” sweepstakes.

Palko’s theory is that Levitt and Dubner’s most logical decision, from a cost-benefit perspective, was to avoid peer review (here I’m using the term generally, considering statisticians such as Kaiser Fung as “peers” whether or not the reviewing is done in the context of a formal journal submission) so as to get a marketable product out the door with minimal effort:

I [Palko] am not saying that Levitt and Dubner knew there were mistakes here. Quite the opposite. I’m saying they had a highly saleable manuscript ready to go which contained no errors that they knew of, and that any additional checking of the facts, the analyses or logic in the manuscript could only serve to make the book less saleable, to delay its publication or to put the authors in the ugly position of publishing something they knew to be wrong.

I think this theory has a lot going for it, although maybe it could be framed in a slightly more positive way. Consider my favorite of Kaiser’s comments on Freakonomics 2:

Would like to see a study of Internet substituting pimps. As it stands, this is an assertion without proof.

Palko’s reasoning definitely applies here. Levitt and Dubner already had a chapterful of material on prostitution, and they’re not preparing a treatise on the subject, so no point in going further. Similarly, the notorious offhand remark on drunk driving (which they attribute to “brilliant economist Kevin Murphy”) is worth a page. No point in looking at it in detail and ruining the story.

Overall, though, I’d still go with a slightly different theory, which is that Levitt and Dubner believe that everything in their book is correct. Well, not quite. At the end of chapter 1, they have a disclaimer that they are writing the book to start conversations, not to end them, and that they want to stir up controversy and discussion. Still, I think they’re a bit more credulous than professional statisticians such as Kaiser or myself. The problem, I think, is that they trust their friend/experts too much. Trusting friends and experts makes a lot of sense, I think, but if you’re not careful it can lead to some silly mistakes.

I have to be careful when making this claim, given the controversy that ensued after I speculated on how Dubner could’ve endorsed the ridiculous (to me) claim that “most (if not all!) deaths are to some extent ‘suicides.'”

More principles of economics

The other thing that’s going on, I suspect, is division of labor. To start with, I doubt Dubner would be checking the sorts of statistical points that Kaiser is raising–really, that would be the job of Levitt, who’s one of the most qualified people in the world to evaluate quantitative claims in social science. (For example, I think that if Levitt had ever carefully read the papers by Monica Das Gupta, he wouldn’t have been so quick to jump to conclusions about the causes of China’s missing girls.) But, to continue with the division-of-labor argument, I don’t think Levitt thinks it’s his job to check these sorts of statistical arguments either. He’ll trust his friends and colleagues–after all, they know the various topics better than he does, right?

Usually it makes sense to trust the experts–especially those that you personally know and trust–but not so much if you’re writing a whole book full of counterintuitive examples. It’s selection bias: people’s most controversial statements are the most likely to be mistaken. If these statements never get checked, then you have little more than a set of press releases linked with engaging commentary. Perhaps Kaiser Fung could be added as a coauthor for Freakonomics 3, to add some of that close reading that could make all the difference?

P.P.S. I better repeat my disclaimer from last year: I’ve been picking on Freakonomics a lot recently, but really this is the result of selection bias: when Freakonomics has material of its usual high quality, I don’t have much to add, and when there’s material of more questionable value, I notice and sometimes comment on it. Those of us who’ve contributed to the burgeoning “what went wrong with Freakonomics 2?” literature are doing so only because we believe its authors could do better, if they were to put in the effort.

6 thoughts on “Freakonomics 2: What went wrong?”

  1. Very interesting. Thank you.

    May I add:

    1. This is not academic writing, and I would easily conclude that any trained person would look at so much stuff in so many areas and say, "Let's go with what looks good. It won't be particularly accurate but we're trying to use these points – wrong as they may be – to illuminate a way of thinking."
    2. Money.

  2. I think I need to explain to you how "new social economics" (henceforth NSE) works, Mr. Gelman, and what it has done to my generation of economists. I'm technically only a sophomore so you might want to take my beliefs with a grain of salt, but I've been reading the economics literature since I was 14 so I've seen the evolution of the field. I'm also very much into macro (and think a lot of macro literature is also willfully lazy) and think it's terrible that my generation thinks that amateur-hour sociology crap (in general, if you aren't able to judge who is correct based on the merits, if an economist says something about cultural or societal norms and sociologists say something else – go with the sociologist) is the best way to get a career in the field.

    There are 5 problems with the NSE literature.

    (i) thinking that if you can run a regression you can open your mouth about any topic, especially if it confirms your priors or if it's contrarian.

    (ii) not particularly caring about evidence that counters yours. It's a hallmark of NSE literature to marginalize literature that disagrees with your point of view, even if you have no coherent rebuttal to that previous research.

    (iii) not even considering alternative hypotheses

    (iv) dramatically overstating the generality of your results.

    (v) plain laziness.

    You see each of these problems in the NSE literature, and in Superfreakonomics. Look at the drunk driving / drunk walking example: it reflects (i) beautifully. Levitt (and Murphy) never really thought about the situation. They just did some seemingly straightforward math and since the result was contrarian they went with it. They didn't bother to research the topic, or thoroughly think it through (I immediately saw the problem with their analysis so it really surprises me they didn't catch that). This is the NSE hallmark. Look at the global warming section. It wasn't well researched. It was lazy and contrarian, an NSE staple. Or his gang chapter. Studying the financial records of a single gang lets you talk about how gangs function? No, that's overstating the generality of their results. It lets you say that this gang acts this way.

    You can see the same type of trivial mistakes published in academic journals all the time by the NSE crowd. Look at the case of Emily Oster you cited. Her results were already on very, very shaky methodological foundations. The sample sizes were very small, not a single thing was done about confounding effects, the coefficient sizes were very small, the results were heavily dependent on particular countries that are known to show extreme gender bias, and her cross-country evidence does nothing to control for culture, etc. It was obvious it was a very weak foundation. Again, if you can do a regression, not understand what you are talking about, be lazy and contrarian – you have economics gold! And of course Oster and Levitt didn't carefully examine the work of Das Gupta (or Klasen or Coale or Sen); they addressed it only superficially without real consideration. It was literature that disagreed with the thesis, and that means it must be marginalized. Also, Oster overstated her results given her massive methodological flaws. Again, NSE literature at its finest.

    Superfreakonomics is just how the academic literature of the NSE school works, except it's even sloppier since it's made more accessible to the public.

  3. Asking "What happened with Freakonomics 2?" is probably the wrong question. The right question would be why was there a fad over Freakonomics 1?

    Was Freakonomics I all that great? Most of it was fairly trivial except for the famous part about abortion and crime … and, as Foote and Goetz showed a few months after the book's publication, that turned out to be based on a programming error!

    I skimmed SuperDuperFreakonomics (or whatever they called it) and it looked about as good as their first book.

    Both Freakonomics books are a lot better than their foremost competition, Malcolm Gladwell's books.

    A better question should be: Why doesn't Gladwell hire a statistician the way Levitt hired a writer?

  4. Jonathan:

    "Not academic" is important, I think. My impression is that Levitt is much more careful in his academic writing than in his blogging and book writing. Perhaps one reason is that in the blog or the book, he can rely on the work of others, but in a research paper he has to defend his own claims, so he's going to be more careful and critical.

    Ted: A related point is that, in many ways, peer review in econ is much more rigorous than peer review in other fields. I've published papers in econ, poli sci, sociology, and of course statistics, and I've found reviewers in econ journals to be generally the most critical and demanding. If you're Steven Levitt and have successfully published many peer reviewed papers in top econ journals, it's natural to think that your work must be correct–or at least more plausibly correct than the equivalently published work in sociology, public health, and other fields. This may also be a reason that Levitt assumed that Oster's work was correct–if it was published in (or even submitted to) an econ journal, that's a badge of approval, so why bother checking papers in the public health literature? Sadly, though, even the rigorous peer review of econ journals doesn't stop errors from finding their way into print.

    Steve:

    I agree that Freakonomics 1 had some problems; still, I think that its message (there are hidden statistical patterns in life) was more interesting, and better done, than the cruder messages of Freakonomics 2 (pretty much pure contrarianism with some facts sprinkled in).

    When comparing Levitt/Dubner to Gladwell, I think it can help to step back from the details of the correctness of their various claims about drug dealers, quarterbacks, and the like, and look at their larger messages. Consider Gladwell's big ideas: The Tipping Point. Blink. Etc. These really are Big Ideas. Maybe they're wrong, and they're not original to Gladwell (nor does he claim that they are) but they're paradigm-changers. In contrast, Freakonomics 1 has the Big Idea that Levitt is a genius, and Freakonomics 2 has the Big Idea that Up is Down (in whatever area you happen to look at). On the Big Idea scale, Gladwell beats Levitt/Dubner hands down.

    Finally, is it really true that Levitt hired a writer? My vague impression (not based on any particular information) was that Dubner was introduced to Levitt–I don't know how–and wrote a magazine article about his research. The magazine article was so popular it led to a book. But I thought this was mostly Dubner's doing, not Levitt's.

  5. Andrew, I'm with Ted. Economists may look carefully at the methods, but not so much at the data. And they tend to assume that the data has it right. In reading Freakonomics 1 (I haven't read 2) I recalled my impression of Charles Murray's book, Losing Ground. I felt his analysis of education and poverty policies was good, but in his analysis of crime-related policies he misused the data. It was as if he found two somewhat related data sets, and divided one by the other and called it a rate. [I haven't looked at his book in decades, so I can't cite chapter and verse.] Then I found out that colleagues versed in poverty/welfare policies felt the same way: Murray was spot on in his analysis of crime and education, but had mangled the poverty policies. Not that Murray is an economist, but he had the same macro approach that the two Steves have. In some sense, it may just be that they are falling afoul of Simpson's Paradox, and that analyses using smaller units of analysis would give the lie to their findings.

  6. This is Mankiw's fourth principle at work: "People Respond to Incentives. Behavior changes when costs or benefits change." One can speculate about the particular mechanisms at work (time pressure, indifference, hubris, whatever), but Levitt would not have written the F2 that got published absent the huge financial incentive to produce a sequel quickly.
