The winner’s curse

If an estimate is statistically significant, it’s probably an overestimate of the magnitude of your effect.

P.S. I think you all know what I mean here. But could someone rephrase it more pithily? I’d like to include it in our statistical lexicon.
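Here’s a minimal simulation sketch of the point (the true effect, standard errors, and the 0.05 cutoff are illustrative assumptions, not values from any particular study): among estimates that clear the significance bar, the average magnitude can be far larger than the true effect, and the exaggeration grows as power falls.

    import numpy as np

    rng = np.random.default_rng(0)
    true_effect = 1.0                    # assumed true effect size (illustrative)

    for se in (0.5, 2.0):                # standard error of the estimate; larger se = lower power
        est = rng.normal(true_effect, se, size=200_000)  # unbiased estimates of the effect
        significant = np.abs(est / se) > 1.96            # the statistical-significance filter
        exaggeration = np.mean(np.abs(est[significant])) / true_effect
        print(f"se={se}: power ~ {significant.mean():.2f}, "
              f"mean |significant estimate| ~ {exaggeration:.1f} x true effect")

In the low-power setting (se = 2.0), the only estimates that reach significance are several times larger in magnitude than the assumed true effect.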

21 thoughts on “The winner’s curse”

  1. I can't quite get this right, but something like: "If p is too small for our effect not to be true, our b is probably too big."

  2. I guess you're looking for something like "If it sounds too good to be true, then it probably is" or "Anything that can go wrong will go wrong." Well, you could always just use those as models. How about:

    Statistically significant estimates are usually overestimates.

  3. I really like that paper on sex ratios. It's good to be reminded to always check previous research, and see what effect sizes and levels of variation/precision can be expected, before you try to make sense of a new claimed result.

    Here's a paraphrase from the paper that may or may not be pithy enough for the lexicon:

    Large estimates often do not mean "Wow, I’ve found something big!" but, rather, "Wow, this study is underpowered!"

  4. But it's subtly different from the Winner's Curse in economics, right?
    http://econ.ucdenver.edu/beckman/Econ%204001/thal
    There, you're cursing the fact that you spent too many resources to win a competition against other people in order to win a prize that's worth less than you had thought.
    Here, you're cursing that you spent too few resources (i.e. had too small a sample) to know whether you can trust your "significant" effect. (Except that in many cases, you're cheering that you can slip this problem past reviewers, and then rigorous-minded people like Andrew are left cursing instead of you.)

  5. "Statistically significant estimates are usually overestimates."

    Since, on the whole, the residuals sum to zero, statistically insignificant estimates are usually underestimates.

  6. anon: Yes! But what percentage of people get this on their own, versus after it has been pointed out once, versus after it has been pointed out multiple, multiple times?

    Also, that's the reason, as Andrew once put it, you need to keep one eye on the power – the less power in the studies, the larger those two cancelling _biases_ become.

    (Even dressing it up in a topic like "beauty and sex" may not be enough.)

    K?

  7. The scandal of (low) power

    "Effects seen in the significance lens are (much) smaller than they appear"

    "Effects seen in the non-significance lens are (much) larger than they appear"

    K?

  8. In my mind, one of the issues is what is publishable (and how that shapes the literature). In epidemiology, an unexpectedly large association is immediately publishable, whereas many null studies are extremely hard to publish. If power is low (due to a rare outcome and the difficulties in getting data), then the published associations are almost certainly dramatic over-estimates.

    It's not an easy problem, as (for example) ignoring a safety signal sometimes leads to very unfortunate outcomes.

  9. I am curious about the fact that the sentence starts with significance testing and ends with effect estimation. Does that mean we can avoid the problem by avoiding testing (but still screening a lot of effects by effect estimation)?

  10. What about the "filter fallacy"? People pass studies through a filter that excludes small effects (fixed p, fixed n) and are then surprised that they've overestimated the effects …

    … or the "too big to be true" effect.

    There is also the "file drawer problem" in which excluding non-significant effects biases published study effects upwards (see the sketch below).
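A small simulation sketch of this filter / file-drawer point, and of the two cancelling biases mentioned in comments 5–7 (the true effect and standard error below are illustrative assumptions): the estimates are unbiased on average, but the significant ones overestimate, the non-significant ones underestimate, and a literature that publishes only the former is biased upward.

    import numpy as np

    rng = np.random.default_rng(1)
    true_effect, se = 1.0, 2.0                       # a low-power setting (illustrative values)
    est = rng.normal(true_effect, se, size=200_000)  # unbiased estimates of the effect
    sig = np.abs(est / se) > 1.96                    # the publication filter

    print(f"mean of all estimates:                 {est.mean():.2f}")   # close to the true effect
    print(f"mean of significant ('published'):     {est[sig].mean():.2f}")   # well above it
    print(f"mean of non-significant (file drawer): {est[~sig].mean():.2f}")  # somewhat below it

The unconditional average recovers the true effect; it's the screening, not the estimator, that creates the bias.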
