Archive of posts filed under the Miscellaneous Statistics category.

Harking, Sharking, Tharking

Bert Gunter writes: You may already have seen this [“Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data,” John Hollenbeck, Patrick Wright]. It discusses many of the same themes that you and others have highlighted in the special American Statistician issue and elsewhere, but does so from a slightly different […]

My math is rusty

When I’m giving talks explaining how multilevel modeling can resolve some aspects of the replication crisis, I mention this well-known saying in mathematics: “When a problem is hard, solve it by embedding it in a harder problem.” As applied to statistics, the idea is that it could be hard to analyze a single small study, […]

My talk at the Metascience symposium Fri 6 Sep

The meeting is at Stanford, and here’s my talk: “Embracing Variation and Accepting Uncertainty: Implications for Science and Metascience.” The world would be pretty horrible if your attitude on immigration could be affected by a subliminal smiley face, if elections were swung by shark attacks and college football games, if how you vote depended on […]

The methods playroom: Mondays 11-12:30

Each Monday 11-12:30 in the Lindsay Rogers room (707 International Affairs Bldg, Columbia University): The Methods Playroom is a place for us to work and discuss research problems in social science methods and statistics. Students and others can feel free to come to the playroom and work on their own projects, with the understanding that […]

What’s the origin of the term “chasing noise” as applying to overinterpreting noisy patterns in data?

Roy Mendelssohn writes: In an internal discussion at work I used the term “chasing noise”, which really grabbed a number of people involved in the discussion. Now my memory is I first saw the term (or something similar) in your blog. But it made me interested in who may have first used the term? Did […]

Beyond Power Calculations: Some questions, some answers

Brian Bucher (who describes himself as “just an engineer, not a statistician”) writes: I’ve read your paper with John Carlin, Beyond Power Calculations. Would you happen to know of instances in the published or unpublished literature that implement this type of design analysis, especially using your retrodesign() function [here’s an updated version from Andy Timm], […]
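For readers who have not seen this kind of design analysis, here is a rough Python sketch of the calculation. It is not the retrodesign() function itself: the name `retrodesign_normal`, its parameters, and the use of a normal approximation in place of a t distribution are all assumptions made for illustration. Given a hypothesized true effect and a standard error, it returns the power, the Type S error rate (the probability that a statistically significant estimate has the wrong sign), and the exaggeration ratio (the expected factor by which a significant estimate overstates the true effect).

```python
from statistics import NormalDist

import numpy as np


def retrodesign_normal(true_effect, se, alpha=0.05, n_sims=100_000, seed=0):
    """Hypothetical sketch of a design analysis under a normal approximation.

    true_effect: assumed true effect size (must be positive here)
    se: standard error of the estimate
    Returns (power, type_s, exaggeration).
    """
    if true_effect <= 0 or se <= 0:
        raise ValueError("true_effect and se must be positive")

    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value

    # Probability of a significant estimate with the correct (positive) sign
    p_hi = 1 - std_normal.cdf(z - true_effect / se)
    # Probability of a significant estimate with the wrong (negative) sign
    p_lo = std_normal.cdf(-z - true_effect / se)

    power = p_hi + p_lo
    type_s = p_lo / power

    # Simulate replicated estimates and keep only the significant ones
    rng = np.random.default_rng(seed)
    estimates = true_effect + se * rng.standard_normal(n_sims)
    significant = np.abs(estimates) > z * se
    exaggeration = float(np.mean(np.abs(estimates[significant])) / true_effect)

    return power, type_s, exaggeration


# A noisy study: the standard error is much larger than the assumed effect
power, type_s, exaggeration = retrodesign_normal(true_effect=2.0, se=8.1)
print(f"power={power:.3f}, Type S={type_s:.3f}, exaggeration={exaggeration:.1f}")
```

The exaggeration ratio is computed by simulation, so it will vary slightly with the seed and the number of simulations; the power and Type S rate are computed in closed form.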

More on the piranha problem, the butterfly effect, unintended consequences, and the push-a-button, take-a-pill model of science

The other day we had some interesting discussion that I’d like to share. I started by contrasting the butterfly effect—the idea that a small, seemingly trivial, intervention at place A can potentially have a large, unpredictable effect at place B—with the “PNAS” or “Psychological Science” view of the world, in which small, seemingly trivial, intervention […]

You should (usually) log transform your positive data

The reason for log transforming your data is not to deal with skewness or to get closer to a normal distribution; that’s rarely what we care about. Validity, additivity, and linearity are typically much more important. The reason for log transformation is that, in many settings, it should make additive and linear models make more sense. […]
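To make the additivity point concrete, here is a small simulated example. The data-generating process, the parameter values, and the helper names `simulate_multiplicative` and `fit_line` are all invented for illustration. The outcome is built multiplicatively, y = a · x^b · error, so on the log scale it becomes log y = log a + b · log x + noise, which is exactly the additive, linear form that a straight-line fit assumes.

```python
import numpy as np


def simulate_multiplicative(n=2000, a=2.0, b=0.7, sigma=0.3, seed=0):
    """Simulate positive data in which effects combine multiplicatively."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 100.0, size=n)
    # Multiplicative (lognormal) error keeps y strictly positive
    y = a * x**b * np.exp(rng.normal(0.0, sigma, size=n))
    return x, y


def fit_line(x, y):
    """Least-squares straight line; returns (intercept, slope)."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope


x, y = simulate_multiplicative()

# On the log scale the model is additive and linear, so the fitted
# intercept and slope estimate log(a) and b directly.
log_intercept, log_slope = fit_line(np.log(x), np.log(y))
print(f"log scale: intercept={log_intercept:.2f}, slope={log_slope:.2f}")
print(f"targets:   log(a)={np.log(2.0):.2f}, b=0.70")

# On the raw scale a straight line is the wrong functional form; its
# coefficients do not correspond to a or b.
raw_intercept, raw_slope = fit_line(x, y)
print(f"raw scale: intercept={raw_intercept:.2f}, slope={raw_slope:.2f}")
```

Nothing here depends on y looking normal after the transformation; the log-scale fit works because the structure of the model is additive on that scale.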

“The issue of how to report the statistics is one that we thought about deeply, and I am quite sure we reported them correctly.”

Ricardo Vieira writes: I recently came upon this study from Princeton published in PNAS, “Implicit model of other people’s visual attention as an invisible, force-carrying beam projecting from the eyes,” in which the authors asked people to demonstrate how much you have to tilt an object before it falls. They show that when a human […]

“I feel like the really solid information therein comes from non or negative correlations”

Steve Roth writes: I’d love to hear your thoughts on this approach (heavily inspired by Arindrajit Dube’s work, linked therein): This relates to our discussion from 2014: My biggest takeaway from this latest: I feel like the really solid information therein comes from non or negative correlations: • It comes before • But it doesn’t […]

What can be learned from this study?

James Coyne writes: A recent article co-authored by a leading mindfulness researcher claims to address the problems that plague meditation research, namely, underpowered studies; lack of or meaningful control groups; and an exclusive reliance on subjective self-report measures, rather than measures of the biological substrate that could establish possible mechanisms. The article claims adequate sample […]

Amending Conquest’s Law to account for selection bias

Robert Conquest was a historian who published critical studies of the Soviet Union and whose famous “First Law” is, “Everybody is reactionary on subjects he knows about.” I did some searching on the internet, and the most authoritative source seems to be this quote from Conquest’s friend Kingsley Amis: Further search led to this elaboration […]

Here are some examples of real-world statistical analyses that don’t use p-values and significance testing.

Joe Nadeau writes: I’ve followed the issues about p-values, signif. testing et al. both on blogs and in the literature. I appreciate the points raised, and the pointers to alternative approaches. All very interesting, provocative. My question is whether you and your colleagues can point to real world examples of these alternative approaches. It’s somewhat […]

You are invited to join Replication Markets

Anna Dreber writes: Replication Markets (RM) invites you to help us predict outcomes of 3,000 social and behavioral science experiments over the next year. We actively seek scholars with different voices and perspectives to create a wise and diverse crowd, and hope you will join us. We invite you – your students, and any other […]

Are supercentenarians mostly superfrauds?

Ethan Steinberg points to a new article by Saul Justin Newman with the wonderfully descriptive title, “Supercentenarians and the oldest-old are concentrated into regions with no birth certificates and short lifespans,” which begins: The observation of individuals attaining remarkable ages, and their concentration into geographic sub-regions or ‘blue zones’, has generated considerable scientific interest. Proposed […]

Causal Inference and Generalizing from Your Data to the Real World (my talk tomorrow, Sat., 6pm in Berlin)

For the Berlin Bayesians meetup, organized by Eren Elçi: “Causal Inference and Generalizing from Your Data to the Real World,” by Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University. Learning from data involves three stages of extrapolation: from sample to population, from treatment group to control group, and from measurement to the […]

I don’t have a clever title but this is an interesting paper

Why do we, as a discipline, have so little understanding of the methods we have created and promote? Our primary tool for gaining understanding is mathematics, which has obvious appeal: most of us trained in math and there is no better form of information than a theorem that establishes a useful fact about a method. […]

Just forget the Type 1 error thing.

John Christie writes: I was reading this paper by Habibnezhad, Lawrence, & Klein (2018) and came across the following footnote: In a research program seeking to apply null-hypothesis testing to achieve one-off decisions with regard to the presence/absence of an effect, a flexible stopping-rule would induce inflation of the Type I error rate. Although our […]

Swimming upstream? Monitoring escaped statistical inferences in wild populations.

Anders Lamberg writes: In my mails to you [a few years ago], I told you about the Norwegian practice of monitoring proportion of escaped farmed salmon in wild populations. This practice results in a yearly updated list of the situation in each Norwegian salmon river (we have a total of 450 salmon rivers, but not […]

What’s published in the journal isn’t what the researchers actually did.

David Allison points us to these two letters: Alternating Assignment was Incorrectly Labeled as Randomization, by Bridget Hannon, J. Michael Oakes, and David Allison, in the Journal of Alzheimer’s Disease. Change in study randomization allocation needs to be included in statistical analysis: comment on ‘Randomized controlled trial of weight loss versus usual care on telomere […]