“Yes, not only am I suspicious of the claims in that op-ed, I’m also suspicious of all the individual claims from the links in these two sentences”

Someone pointed me with suspicion to a newspaper article that reported a cool-looking social science result, and asked me for my thoughts.

I replied, Yes, not only am I suspicious of the claims in that article, I’m also suspicious of all the individual claims from these links. And I pointed to a bunch of links in the news article.

I continued by saying that I’m starting to get exhausted fighting these things. It takes some work to figure out the flaws in all these papers. I’m starting to see the appeal of preregistered replications, just in reducing the number of spurious findings that we have to waste time talking about.

My correspondent replied:

Yeah, it was those sentences that triggered my “alert response”. Then the rest of the article looked bogus too.

I agree, it’s like “whack-a-mole”.

The frustrating thing is that my correspondent and I don’t feel like spending the time going through each of these papers and figuring out exactly what made us suspicious, and figuring out alternative explanations for the published findings. These particular studies are on a topic I’ve never written about (at least, not that I remember).

The news article in question is here, and I was also suspicious of all the individual claims from the links in these two sentences:

For example, in France and the United States, a study showed that judges give shorter sentences when defendants show up to court on their birthdays. Judges are also less likely to grant asylum to a refugee if they have done so for the previous asylum applicant — or when it’s hot outside, or their alma mater has unexpectedly lost a football game.

Again, I don’t have the energy to read these papers carefully enough to figure out what went wrong—or, for that matter, to be convinced that the claims in question are actually backed by solid research—so let me be clear that I’m not here offering any specific arguments as to the flaws of these research claims; I’m just skeptical for the usual reasons that it’s all too easy to look at data and find dramatic patterns that don’t replicate. From a statistical standpoint, the general concerns are confounding (these are not randomly assigned treatments) and forking paths (lots of ways to find patterns).

But, again, don’t take this as a criticism of these particular claims; it’s just a general warning.
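To make the forking-paths concern concrete, here is a minimal simulation sketch. It is not based on any of the linked studies. Everything in it is an assumption for illustration: the outcomes are pure noise, each "comparison" splits the same cases by an arbitrary yes/no feature, and the function names (`any_significant`, `false_alarm_rate`) are made up.

```python
import math
import random

def two_sided_p(z):
    # Two-sided p-value for a standard normal test statistic.
    return math.erfc(abs(z) / math.sqrt(2))

def any_significant(n_cases, n_comparisons, rng, alpha=0.05):
    # Outcomes are pure noise, so no comparison has a true effect.
    outcomes = [rng.gauss(0, 1) for _ in range(n_cases)]
    for _ in range(n_comparisons):
        # Split the same cases by an arbitrary, irrelevant yes/no feature.
        flags = [rng.random() < 0.5 for _ in range(n_cases)]
        a = [y for y, f in zip(outcomes, flags) if f]
        b = [y for y, f in zip(outcomes, flags) if not f]
        if not a or not b:
            continue
        diff = sum(a) / len(a) - sum(b) / len(b)
        # Standard error assuming the (known) unit variance of the noise.
        se = math.sqrt(1 / len(a) + 1 / len(b))
        if two_sided_p(diff / se) < alpha:
            return True
    return False

def false_alarm_rate(n_comparisons, n_studies=500, n_cases=200, seed=0):
    # Share of simulated null studies with at least one "significant" split.
    rng = random.Random(seed)
    hits = sum(any_significant(n_cases, n_comparisons, rng)
               for _ in range(n_studies))
    return hits / n_studies

if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, "comparisons:", false_alarm_rate(k))
```

With a single prespecified comparison the false-alarm rate should sit near the nominal 5 percent. With 20 roughly independent looks at the same data, the chance of at least one "finding" is on the order of 1 − 0.95^20, about 64 percent, even though nothing is going on.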

38 thoughts on ““Yes, not only am I suspicious of the claims in that op-ed, I’m also suspicious of all the individual claims from the links in these two sentences”

  1. Not surprising that red flags would wave for readers (and writers) of this blog.

    “Judges are also less likely to grant asylum to a refugee if they have done so for the previous asylum applicant — or when it’s hot outside, or their alma mater has unexpectedly lost a football game.”

    This reads so much like this passage from https://statmodeling.stat.columbia.edu/2017/12/15/piranha-problem-social-psychology-behavioral-economics-button-pushing-model-science-eats/

    “…just about anything can have a large effect—from various literatures, there’s menstrual cycle, birth order, attractiveness of parents, lighting in the room, subliminal smiley faces, recent college football games, parents’ socioeconomic status, outdoor temperature, names of hurricanes, the grid pattern on the edge of the survey form…”

    that I clicked the link to convince myself that the Times article wasn’t listing these effects skeptically. (It wasn’t.)

    • Jeff:

      Yeah, it’s like there are two worlds out there: the world of people who have some understanding of social science, statistics, and the recent history of both and understand the problems of NPR/PNAS/Ted-style science, and the world of people who just have no clue. Too bad that some influential people are in that second category: people who can affect government policy and the education of future generations of leaders.

      On the plus side, if people in the second category are labeling those of us in the first category as “terrorists” and “the Stasi,” at least it tells us that they’re afraid. The next step is for them to realize they have nothing to be afraid of: they can relax, enjoy their comforts, and just have a more modest, less push-button view, of human attitudes and behavior.

      • In a world where BS beats honesty on a regular basis, how do people make political decisions? An uncomfortable question to contemplate for any length of time.

  2. Andrew said, “The frustrating thing is that correspondent and I don’t feel like spending the time going through each of these papers and figuring out exactly what made me suspicious, and figuring out alternative explanations for the published findings.”

    It was last month on this very website that I learned of Brandolini’s law, i.e., “The amount of energy needed to refute bullshit is an order of magnitude bigger than to produce it.”

      • Terry:

        I don’t know Sunstein nor do I know his legal scholarship, but just speaking in general terms: Maybe there’s no contradiction between being a brilliant legal scholar and a sloppy social scientist. One trait of a brilliant legal scholar, I’d think, is the ability and willingness to entertain bold ideas, going beyond what is known to envision new forms of social organization. One trait of a careful, non-sloppy social scientist is to be critical of claims made by oneself and others and to not overstate what is in the data. I could well imagine that sloppiness as a social scientist could enhance one’s ability to produce brilliant legal scholarship; or, conversely, that if you’re a very careful social scientist you’d have difficulty becoming a brilliant (or, at least, influential and brilliant) legal scholar, as you’d always be questioning yourself so you wouldn’t be inclined to offer strong, doubt-free prescriptions.

        • There is definitely a tension between being a good lawyer and a good social scientist. Lawyers are taught from day one to be vigorous advocates which can be problematic for social scientists. The advantage of the legal advocacy mindset is that it is out in the open and no one in the system expects cool neutrality, so the other side gets to speak too. Then spectators decide. If advocacy is so ingrained in human nature that social scientists cannot be cool and neutral, then one way to deal with it is to acknowledge the problem and require that the other side get to speak too.

          Legal scholarship is in a bit of a bind because it is pulled between the two. Legal scholars are expected to be scholarly and honest which goes against the grain of the profession.

  3. “Again, I don’t have the energy to read these papers carefully enough to figure out what went wrong—or, for that matter, to be convinced that the claims in question are actually backed by solid research—so let me be clear that I’m not here offering any specific arguments as to the flaws of these research claims; I’m just skeptical for the usual reasons that it’s all too easy to look at data and find dramatic patterns that don’t replicate. From a statistical standpoint, the general concerns are confounding (these are not randomly assigned treatments) and forking paths (lots of ways to find patterns).”

    Perhaps what we need now is to push toward requiring all statistics textbooks to include discussion of confounding and forking paths, including real examples, and with exercises that have students look at current research claims to find more instances.

  4. You don’t need to refute claims that are obviously ridiculous (whether alma mater won or not, etc). The fact that such foolish and unlikely claims are made so boldly with no other supporting evidence should be sufficient to discredit the paper.

    The problem is that this paper was published at all. While there’s nothing necessarily wrong with the data, such silly conclusions should inspire the authors to assess their methods to find the problem before it gets submitted, much less published.

  5. On average, about 40 percent of drivers who were ticketed for speeding up to 20 miles above the speed limit received a mercy ticket, defined as a ticket for driving just nine m.p.h. above the speed limit instead of a ticket for the speed they were actually driving. That probability rose significantly — by a full 2 percent — when a driver shared a first name with the ticketing officer.

    Was anyone else brought up short by this passage in the NYT article?

    It struck me as proving that the injustice being decried is quite small. When stopped 50 times, an oppressed driver will get only 19 mercy tickets while a privileged driver will get 20 mercy tickets. This is the magnitude of injustice today? That could be a rather comforting result.

    And that is taking the results at face value. The researchers were clearly looking for this type of result, so who knows what omitted variables there might be and what forking paths there were? These results are therefore an upper bound on this injustice.
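A quick check of the arithmetic in the comment above, as a sketch only. The 50 stops and the 40 percent baseline come from the comment and the quoted passage. Whether "a full 2 percent" means two percentage points or a 2 percent relative increase is not stated, so both readings are computed as assumptions, and the helper name `expected_mercy_tickets` is made up for illustration.

```python
from fractions import Fraction

def expected_mercy_tickets(stops, rate_percent):
    # Expected number of mercy tickets over a given number of stops.
    return Fraction(stops) * Fraction(rate_percent) / 100

stops = 50
baseline = Fraction(40)  # about 40 percent, per the quoted passage

# Reading 1 (assumption): "2 percent" means two percentage points.
points_reading = baseline + 2
# Reading 2 (assumption): "2 percent" is a relative increase on the baseline.
relative_reading = baseline * Fraction(102, 100)

base_tickets = expected_mercy_tickets(stops, baseline)
extra_points = expected_mercy_tickets(stops, points_reading) - base_tickets
extra_relative = expected_mercy_tickets(stops, relative_reading) - base_tickets

print("extra mercy tickets per 50 stops (percentage points):", extra_points)
print("extra mercy tickets per 50 stops (relative):", float(extra_relative))
```

Under the percentage-point reading the gap is one mercy ticket per 50 stops, which matches the size of the comment's 19-versus-20 comparison. Under the relative reading it is smaller still.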

  6. “For example, in France and the United States, a study showed that judges give shorter sentences when defendants show up to court on their birthdays.”

    It seems to me that, unless the sample of sentences-made-on-birthdays is quite large, it would be highly likely to find statistical significance for either (1) judges give shorter sentences on their birthdays or (2) judges give longer sentences on their birthdays. Either way, it shows that judges are influenced by factors besides the crime and the criminal’s prior history.

    Bob

    • Of course judges can be affected by external factors but in the aggregate, the main things that determine sentence length are conviction severity, plea deals (95% of all convictions are plea deals), prior criminal history, and gender. Anything else is just a quirk of the sample or is specific to the case at hand (i.e., a woman convicted of murder getting probation due to her being a domestic violence victim).

      And while I haven’t studied ticketing, I’d guess that severity of the offense, attitude of the driver towards the cop and the cop’s general likelihood for giving warnings instead of tickets are all gonna be more important than sharing a first name or birthday.

      I’m always amazed that people who should know better in a given field, like lawyers, can be seduced by such little idiosyncrasies.

  7. > unless the sample of sentences-made-on-birthdays is quite large, it would be highly likely to find statistical significance

    Why would it be more likely to find statistical significance for a small sample?

  8. Andrew – In case you harbor a molecule of discomfort over not having the time or energy to deconstruct published silliness, let me point out that “hard” sciences like physics and biology have their own sideshows. For example, after I learned all about special relativity I began to notice a population of special relativity skeptics for whom the “theory” was not “settled science.” For fun I began to collect the papers and books that sought to refute or dramatically amend SR in order to enjoy a good chuckle during my idle hours.

    Turns out none of those revolved around obvious howlers in the science or math that I could easily point to and laugh at. In fact I became impressed by the ingenuity of the thinking and the subtlety of the arguments. Rooting out their flaws was not going to be easy or fun, and in fact would require a deep dive into the primary literature of SR and its foundations and its “paradoxes.” Since SR was not how I made my living, I let it slide.

    Quantum mechanics is enormously fertile ground for alternate theories. Just to scratch the surface, check out the combination “quantum mechanics” and “vortices.” Again, refuting these is rarely straightforward.

    When it comes to the labor of repudiation nothing beats the example of evolution. This is not the place to rehearse the pros and cons, but let me point out here what’s required to enlighten a denier with more than “because.” You would need to marshal an array of essential information from biology, paleontology, genetics, molecular biology, and, well, you get the idea: If you’re going to attack a prejudice with facts, you’re going to need a lot of them. And again, my point is that this exercise is often way more trouble than it’s worth.

    I suggest that you, Andrew, are giving these subjects exactly the amount of attention they deserve, usually, in my opinion, because the stakes are low. Another silly sociology experiment, yawn. Another naive statistical analysis, it must be Tuesday. But with enough motivation, say a bogus analysis supporting anti-vaxxers, I expect to see thunder and lightning.

  9. Or it could be that you read something totally preposterous and, rather than spending hours detailing the problems, you, rightfully, judge it poorly and move on

      • Explanation please?

        Is this because SR and quantum theory are incompatible? Is this because reality itself is wacky and not amenable to our intuition? Is it because there are a lot of fudges on top of the basic theory to extend the theory?

        • Terry – All three of your observations are fine, in fact are majority opinions. But the thing about string theory that gives me a bigger chuckle than a Far Side cartoon is the volumes of world building and speculation that are unverified and, worse, unverifiable by any speculative measure in the foreseeable future.

          Now having said (boldly) that, I confess what keeps me chewing my fingernails at night is the prospect that fifty years from now that will change, and our descendants will point to that arrogant buffoon who shot his mouth off in a public forum and laugh with derision at his naive lack of vision.

    • Not quite. The problem is that most of the alternate theories are not obviously bogus. You know they have to be wrong, but their arguments are often sophisticated and subtle. Their authors are not “crackpots” in the usual sense. They frequently know as much about SR or QM as anyone, but somewhere along the line they developed a blind spot or a prejudice and ran down a rabbit hole.

      My original point was not about the transparently erroneous, but rather the kind of thing that makes you ask “I know this must be wrong, but I’m not sure why.” Answering that can be exhausting, and after a while you lose your enthusiasm.

      I felt that my personal experience along these lines mirrors Andrew’s, so I thought to add a word or two of solidarity.

        • Anoneuoid – The usual method in the hard sciences: boatloads of experimental evidence that square with the underlying theory and no evidence that contradicts it. The alternate ideas I have studied eventually produce a condition that is not natural. Your theory of electrons that denies their wave nature is, bottom line, simply wrong in spite of the fifty pages of closely argued propositions to the contrary. In fact, one of the clues that you’re about to read a load of hooey is an absence of experimental predictions.

          Andrew brings to our attention materials he suspects are flawed in some way. Given the stochastic nature of statistical analysis, the problems can range over a spectrum from fraud to simply sloppy. Given that you likely don’t know the exact problem going in, parsing every damn thing that crosses your desk is not in the cards.

          In other words, most of those papers do not have a bright standard as a basis of their validity so that ferreting out their weaknesses cannot follow a shining path that must lead to an irrefutable conclusion. For that reason alone Andrew’s occasionally leaving the heavy lifting to his readers is justified. One of my points in the original comment was to emphasize that that’s o.k. because even in disciplines where you know what the final answer must be, wading through the tall grass is still a drag, at least in my experience.

          Geez, all I wanted to do was drain the swamp.

        • The usual method in the hard sciences: boatloads of experimental evidence that square with the underlying theory and no evidence that contradicts it. The alternate ideas I have studied eventually produce a condition that is not natural.

          This doesn’t make much sense in terms of your original post. Why do you need to look and understand the details of a theory if the predictions don’t match observation? The argument is then about what could have possibly gone wrong with interpreting the observations, no one needs to care about the theory’s details. I thought you referred to theories without any known inaccurate predictions.

          Another point is I have heard people say “no evidence contradicts” general relativity, and “it is the most verified scientific theory” and such which is clearly false since it predicts the incorrect galactic rotation curves, etc…

        • Anoneuoid – This got tangential to my original intention, but I’m game. In order to keep my previous comments brief I have perhaps painted a picture that is too black and white. In my experience the unorthodox SR and QM papers and books I have studied do not try to boldly overthrow those subjects. Rather they try to subtly amend them according to the authors’ personal insights. In the end they are not shy about comparing their calculations to conventional results and experimental data. Their results are invariably second or third or fourth order perturbations on currently accepted values.

          In practice, most of these are beyond current experimental art. Thus the authors feel validated for two reasons. First, the new calculations lead to results undetectably congruent with standard values, even though based on alternate principles. And second, we cannot currently definitively refute them experimentally.

          The task (and the fun) of the debunker is to wade in and discover the (invariably) subtle flaw. Just one example I recall: An author concluded that the speed of light was not quite invariant. His mistake was an incorrect application of proper time in a third co-moving inertial frame. That’s the exercise that pretty much put me off pursuing these alternative theories (and they just keep coming). It became more work than fun.

          I cannot speak to general relativity other than to note that the consensus faith in GR is so strong that a minor industry has developed to find exotic reasons for galactic rotations. Tweaks to GR are also being explored, but these would be minor, and it would be ungenerous to describe GR as “clearly false.”

        • First post:

          boatloads of experimental evidence that square with the underlying theory and no evidence that contradicts it.

          New post:

          Their results are invariably second or third or fourth order perturbations on currently accepted values.

          In practice, most of these are beyond current experimental art. Thus the authors feel validated for two reasons. First, the new calculations lead to results undetectably congruent with standard values, even though based on alternate principles. And second, we cannot currently definitively refute them experimentally.

          Right, this is what I thought you were referring to. All the same “boatloads of evidence” must support both theories. When the (currently available) evidence cannot be used to simply distinguish between the theories, how do you “know they have to be wrong”?

          You give one example where you found a math/logic error. Great, but from that you concluded they all contain such errors? I can understand not spending time on it unless it makes a worthwhile prediction, but not “knowing” it is wrong.

          The way I see it is it is the job of the person who comes up with the new theory to derive a prediction that can distinguish it from the current theory. Once that happens, others may find it worthwhile to look deeper at the details of the theory.

          it would be ungenerous to describe GR as “clearly false.”

          I did not describe GR as clearly false. I said it was clearly false to say stuff like “no evidence contradicts GR”, something I see come up all the time in discussions of it.

        • Anoneuoid – I put this here because I find no “Reply to this comment” option at the end of your last post.

          I take your point. How DO we know if some particular theory is THE TRUTH? What began as lighthearted (I hoped) support of Andrew’s not taking the time to deconstruct in detail some papers that come to his notice has drifted into deeper waters. My contribution was to point out that even in the sciences where one would expect little wiggle room there is an abundance of “fringe” addenda. Once upon a time I found it diverting and stimulating to discover where their versions departed from the standard. I use the word “standard” by way of backpedaling from my hastily worded distinction between RIGHT and WRONG. I rarely care about those absolutes in real life.

          When parsing the other theories I was only interested in spotting the differences, and it helped to assume that the standard theory was the benchmark for comparison. In the end, for that particular time-waster, it didn’t matter if SR or QM was ultimate truth, they were merely the working assumptions. So you’re right, the alternate theories don’t “have to be wrong” in actuality (but the astute bettor would note the pattern). And of course I don’t “know” that an alternate concept must prima facie be wrong, but my personal system of justice summarily sentences them to “crackpot” jail until further notice.

          Your point that competing theories producing practically the same results (“boatloads of evidence”) should be allowed equivalency is a truism until a logical or unphysical blemish is discovered. If the underlying structure is defective then matching predictions are coincidence. I found this to be the case a surprising (to me) number of times.

          You say “The way I see it is it is the job of the person who comes up with the new theory to derive a prediction that can distinguish it from the current theory.” I agree, but there is a population of thinkers who don’t work that way, whose goal is to produce theories with INDISTINGUISHABLE results based on alternate assumptions. Finding where the speculations wandered off base got to be a chore rather than fun partly because the authors were plainly well educated and were not about to make an easy-to-spot blunder and because their tomes were approaching 50 to 200 pages, which was more wading than my puny thighs could tolerate.

          Now, having said all that, allow me another word on general relativity. You say that it is false to claim that no evidence contradicts GR (I think I’ve got that right now). The discussions I’ve heard say exactly the opposite, the one possible wrinkle being galaxy rotation. Now I’m not the one to say that galaxy rotation is not shining a spotlight on a flaw in the current version of GR, but consensus opinion is that GR is just fine and that something else, something new, is going on in the galaxies. Clearly our experiences in this area have produced contrary attitudes, but I’d like to stop now.

          (However, in a different forum, you would find me a vigorous disciple of SR, QM and GR as currently constructed.)

  10. The editorial doesn’t mention any results with regard to gender. Why? Is the NYT unconcerned about oppression of women by the police? Surely women receive far more tickets than men due to their marginalized status in the oppressive patriarchy. Quite a mystery.
