When you believe in things that you don’t understand

Rolf Zwaan gives an excellent discussion of how superstition can arise and perpetuate itself:

A social-behavioral priming experiment is like rolling a 20-sided die, an icosahedron. If you roll the die a number of times, 20 will turn up at some point. Bingo! You have a significant effect. In fact, given what we now know about questionable and not-so-questionable research practices, it is fair to assume that the researchers are actually rolling a 20-sided die where maybe as many as six sides have a 20 on them. So the chances of rolling a 20 are quite high.

Once the researchers have rolled a 20, their next interpretive move is to consider the circumstances that happened to coincide with rolling the die instrumental in producing the 20. The only problem is that they don’t know what those circumstances were. Was it the physical condition of the roller? Was it the weather? Was it the time of day? Was it the color of the roller’s sweater?

Indeed, we’ve been told first that it’s the color of her shirt and then that it’s the weather.

Zwaan continues:

Now suppose that someone else tries to faithfully recreate the circumstances that co-occurred with the rolling of the 20, from the information that was provided by the original rollers. They recruit a 23-year-old male roller from Michigan, wait until the outside temperature is exactly 17 degrees Celsius, make the experimenter wear a green sweater, have him drink the same IPA on the night before, and so on.

Then comes the big moment. He rolls the die. Unfortunately, a different number comes up: a disappointing 11. Sadly, he did not replicate the original roll. He tells this to the first roller, who replies: Yes, you got a different number than we did, but that’s because of all kinds of extraneous factors that we didn’t tell you about because we don’t know what they are. . . .

Or, as Nick Brown puts it, “Hold on a moment while I move these goalposts.”
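To put rough numbers on the analogy, here is a minimal Python sketch. The six loaded faces are Zwaan’s worst case; the number of rolls per lab is an assumption added purely for illustration.

```python
import random

SIDES = 20          # an icosahedral die: a fair 1/20 chance of any face
LOADED_FACES = 6    # Zwaan's worst case: QRPs put a "20" on six of the sides
N_LABS = 10_000     # hypothetical labs, each running the same study
N_TRIES = 5         # illustrative number of rolls (analyses) per lab

def rolled_a_20(loaded: bool) -> bool:
    """One roll: p = 1/20 for a fair die, p = 6/20 for the loaded one."""
    p = LOADED_FACES / SIDES if loaded else 1 / SIDES
    return random.random() < p

def lab_finds_significance(loaded: bool) -> bool:
    """A lab 'succeeds' if any of its rolls comes up 20."""
    return any(rolled_a_20(loaded) for _ in range(N_TRIES))

for loaded in (False, True):
    hits = sum(lab_finds_significance(loaded) for _ in range(N_LABS))
    label = "loaded (6 faces show 20)" if loaded else "fair icosahedron"
    print(f"{label:>25}: {hits / N_LABS:.1%} of labs roll at least one 20")

# Expected rates: fair, 1 - (19/20)^5 ~ 23%; loaded, 1 - (14/20)^5 ~ 83%.
```

Even the fair die hands roughly a quarter of the labs a “significant effect” after five tries; the loaded one hands it to most of them.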

21 thoughts on “When you believe in things that you don’t understand”

  1. This criticism isn’t particularly specific to social-behavioral priming experiments though, is it?

    I see stuff reminiscent of icosahedron die rolling in a wide variety of subject areas.

  2. As so often, Paul Meehl has already told us pretty much all we need to know, in this case in his 1967 article “Theory-testing in psychology and physics: A methodological paradox” (Philosophy of Science, 34, 101-115):

    “[T]here exists among psychologists … a fairly widespread tendency to report experimental findings with a liberal use of ad hoc explanations for those that didn’t “pan out.” This last methodological sin is especially tempting in the “soft” fields of (personality and social) psychology, where the profession highly rewards a kind of “cuteness” or “cleverness” in experimental design, such as a hitherto untried method for inducing a desired emotional state, or a particularly “subtle” gimmick for detecting its influence upon behavioral output. The methodological price paid for this highly-valued “cuteness” is … an unusual ease of escape from modus tollens refutation. For, the logical structure of the “cute” component typically involves use of complex and rather dubious auxiliary assumptions, which are required to mediate the original prediction and are therefore readily available as (genuinely) plausible “outs” when the prediction fails. … [A] zealous and clever investigator can slowly wend his way through a tenuous nomological network, performing a long series of related experiments which appear to the uncritical reader as a fine example of “an integrated research program,” without ever once refuting or corroborating so much as a single strand of the network.” (p. 114)

  3. A similar message is explored, almost simultaneously, by Terry Speed here. I was telling students just yesterday that there’s a trade-off between a really informative model that mimics the patterns in the data and a misleading model that… errr… mimics the patterns in the data. Disheartening for the young data analyst, but the real worry is that word counts and publication pressure and league tables and all sorts of other crap make it an issue for the established analyst too, when research is published in a short, fudged form. A friend and housemate from our undergraduate days once wrote up a chemistry assignment that had gone wrong for unknown reasons with the words “c’est la vie”. It didn’t go down well with the tutor (but it was true). He is now director of a major lab at a world-top-five institution.

  4. If there is a signal and you know where to look for it, then it can be recovered reliably from amidst a great deal of noise. But if there is no signal, then all techniques will recover nothing reliably, though some trials may be suggestive. Your ears can do this all by themselves when you tune a shortwave radio hunting for weak signals. But there is a difference between detecting a signal — no matter how weak, subtle, elusive, and buried in noise it is — and detecting suggestive patterns in the noise. And here we are, wondering how many of our fellow scientists have been spending decades listening to patterns in the noise.

    It seems like there should be a branch of statistics which specializes in synthesizing causal illusions. Models with specified effects or no effects at all which look like something else entirely when examined under various conditions. What are the parameters of audible noise which most tempt our ears to hear voices that aren’t really there? And how susceptible to hearing voices are we?
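    As an illustration of that distinction (every parameter here is invented for the demo): knowing where to look lets simple averaging pull a weak tone out of heavy noise, while an open-ended search over pure noise still crowns a “best” frequency in every record.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 1024, 200
    t = np.arange(n)
    true_freq = 37  # cycles per record, known to the "tuned" listener
    weak_signal = 0.1 * np.sin(2 * np.pi * true_freq * t / n)

    def peak_freq(x):
        """Frequency bin of the strongest nonzero Fourier component."""
        return int(np.argmax(np.abs(np.fft.rfft(x)[1:])) + 1)

    # A real but weak signal, recovered by averaging repeated records.
    avg = np.mean([weak_signal + rng.normal(0, 1, n) for _ in range(trials)],
                  axis=0)
    print("signal present, averaged records:", peak_freq(avg))  # finds 37

    # No signal at all -- yet every noise record still has a "winner".
    peaks = [peak_freq(rng.normal(0, 1, n)) for _ in range(trials)]
    print("pure noise, per-record peaks:", peaks[:10])
    # The per-record peaks are suggestive patterns in the noise:
    # a different story every time, none of them a signal.
    ```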

    • “wondering how many of our fellow scientists have been spending decades listening to patterns in the noise” INDEED!

      While doing my PhD work, I discovered that for 50 years people studying soil liquefaction had assumed that water migration was ignorable, and had spent years and years perfecting truly “undrained” tests of small soil samples. No one in this field had ever written down the equations for water motion, analyzed them in the context of shallow sandy soils, and determined what the balance of effects was. When I did this, it took about a month to work out how to do it, and maybe a couple more months to clean it up and make it clear and presentable. It then took 2 years to convince my advisors that it was correct and publishable. When we sent it for publication, one of the reviewers simply wouldn’t engage with the paper at all and rejected it outright, saying that it had ignored 50 years of laboratory experiments and should not be published.

      We put a section in the paper about how the results of those “undrained” laboratory experiments were probing only the interactions of a rubber membrane surrounding the sample with the grains and water inside. This was the ENTIRE effect being studied in tabletop experiments. For 50 years. Millions and millions of research dollars globally. A problem that causes BILLIONS of dollars in damages each year during earthquakes.

      People had noticed how the membrane could cause variations from one experiment to another, and some even designed feedback control systems to compensate for the “membrane effect”… They had been studying the properties of thin rubber membranes for 50 years, and the reviewer, who had clearly been one of these small-sample undrained testers, wanted to keep my paper out of the literature at essentially all costs…

      Needless to say, this whole experience did not improve my general sense of the value of academic science. I am happy to say, though, that the other main reviewer was extremely supportive and did essentially the opposite: provided line-by-line commentary that could improve the paper and make sure it was top notch. I am very grateful to that person, whoever they are.

      • Daniel,

        What has been the community response to your work so far? Was the paper ever published? Have you mentioned the work at any conferences?

        • The paper was published only a few months ago; it’s too early to say what the response to publication will be, other than the response by the referees. My personal belief is that it will be ignored by the soil mechanics community. My main hope is that eventually (before another 50 years) this view will become mainstream and we won’t see any more tabletop experiments.

          http://rspa.royalsocietypublishing.org/content/470/2165/20130453.short?rss=1

          There were 2 popular science articles published about the work:

          http://www.abc.net.au/science/articles/2014/02/19/3947389.htm??site=science/talkingscience&topic=tech

          and

          http://news.sciencemag.org/earth/2014/03/how-earthquakes-turn-ground-soup

        • Daniel,

          Thanks, I am interested in how this turns out. Although I am not familiar with soil mechanics, from what has occurred historically in biology research I would expect the limited/flawed experiments to continue until an alternative method is available. Alternatively, some completely different experiment could provide a new type of data, and the current one would fade into obscurity.

        • I would tend to agree with you, given a similar experiment I performed with my son. I had dismissed it as a known effect, but perhaps not. This is the method discovered by my son:

          Take a bucket to the beach. Fill the bucket with water. Pile in dry sand until the sand is at the top. Now the top is mostly dry, but water is still in the sand below. Now smack the top with your hand, bringing the water up from below and loosening the top layers. This can be repeated surprisingly often. It makes sense to me from a physical point of view.

          Is this a known phenomenon?

        • Yes it’s a known phenomenon, namely “soil liquefaction”. In your example what’s happening is that the soil grains are collapsing together, and the water is squeezing out to the surface.

          But liquefaction can happen at 10 to 30 m depths, for example, and the intuitive explanation in the soil mechanics literature (and in every soil mechanics textbook) is that at that depth the soil is undrained because it would take much, much longer for the water to “drain” through tens of meters of soil. Unfortunately, this intuition is based on the idea of draining large volumes of water all the way to the surface, whereas pressure generation takes only small volumes of water because water is so stiff (see the back-of-envelope sketch at the end of this comment).

          Also, probably most natural soils are heterogeneous enough that low-permeability layers will exist which prevent that drainage to the surface (a phenomenon I discuss in the paper also). However, sand fills, like those in many port facilities and bays, are basically homogeneous. I even have an example in the paper which shows that O(1) fluctuations in the permeability are completely averaged away by the diffusion phenomenon; only very low permeability layers have an effect.

          If you read the paper it shows that soil collapsing, water flow, and permeability gradients are all equally important on timescales of tenths of a second over depth scales of tens of meters.

          If you don’t have access to the royal society, I have a preprint available on my blog:

          http://models.street-artists.org/2013/09/04/whats-wrong-with-our-understanding-of-soil-liquefaction/

          The published paper becomes free access after a year if I remember correctly.
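          Following up the parenthetical above, here is a back-of-envelope version of the “water is so stiff” point; the soil density and depth are illustrative textbook-style values, not numbers from the paper.

          ```python
          # Rough arithmetic behind "pressure generation takes only small
          # volumes of water": water's bulk modulus is so large that a tiny
          # volumetric strain of the pore water produces pressures comparable
          # to the overburden stress at depth.

          K_WATER = 2.2e9   # bulk modulus of water, Pa (standard value)
          RHO_SOIL = 1.9e3  # saturated soil density, kg/m^3 (illustrative)
          G = 9.81          # gravity, m/s^2
          DEPTH = 10.0      # m, a depth where liquefaction is observed

          overburden = RHO_SOIL * G * DEPTH  # total vertical stress, Pa
          strain = overburden / K_WATER      # water strain to match it

          print(f"overburden at {DEPTH:.0f} m: {overburden / 1e3:.0f} kPa")
          print(f"volumetric strain of pore water to match it: {strain:.1e}")
          # ~1e-4: squeezing the pore water by ~0.01% of its volume raises
          # the pore pressure to the full overburden stress -- no large
          # volumes need to drain all the way to the surface.
          ```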

    • Roger:

      I do have a good case study if anyone has the time and funds.

      Most if not all researchers using correspondence analysis plots used SAS software and followed its suggested best way to make an adjustment, as vaguely indicated here: http://www.sciencedirect.com/science/article/pii/S0895435610001381

      The SAS programmer misunderstood different parameterizations of the same adjustment as being different adjustments and recommended the mistaken one as best. Even textbooks on the subject uncritically recommended the wrong adjustment.

      This wrong adjustment adds artifacts, which of course will have been interpreted and may well still be believed to be important.

      If there is enough information in the published paper (e.g., what is called a Burt matrix), the correct adjustment can be recovered and one will know what the artifact was.
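      For anyone attempting that recovery, here is a minimal numpy sketch (with invented data) of the bookkeeping it relies on: the Burt matrix is just ZᵀZ for the indicator matrix Z, so its diagonal blocks are the marginal counts and its off-diagonal blocks are the two-way contingency tables, which is why a published Burt matrix carries enough information to redo an adjustment.

      ```python
      import numpy as np

      # Two invented categorical variables observed on 6 respondents.
      color = ["red", "blue", "red", "red", "blue", "blue"]
      size = ["S", "S", "L", "S", "L", "L"]

      def indicator(values):
          """One-hot (indicator) coding of a categorical variable."""
          levels = sorted(set(values))
          return np.array([[v == l for l in levels] for v in values],
                          dtype=float)

      Z = np.hstack([indicator(color), indicator(size)])  # indicator matrix
      B = Z.T @ Z                                         # Burt matrix

      print(B)
      # Diagonal blocks: marginal counts of each variable. Off-diagonal
      # blocks: the color-by-size contingency table. Because B = Z'Z, the
      # principal inertias of a correspondence analysis of B are the
      # squares of those of Z -- the relationship that (Greenacre-style)
      # eigenvalue adjustments are built on.
      ```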

  5. One can find much larger studies, such as those done by the Princeton Engineering Anomalies Research lab (PEAR); see their website, or Wikipedia’s somewhat-different description (wikipedia.org/wiki/Princeton_Engineering_Anomalies_Research_Lab):
    “PEAR’s primary purpose was to engage in parapsychological exercises purporting to examine telekinesis and remote viewing.”

    See descriptions of experiments, usually involving people’s ability to modify the random behavior of computers or other devices by thought. After 28 years, they declared victory and shut it down, as described in “The PEAR Proposition,” Vol. 19.2 of the Journal of Scientific Exploration. For context, readers may peruse the list of articles there. My favorite is the one on dog astrology, but many articles might be useful for statisticians to give to students as exercises.

  6. OK, as a non-academic layperson, I’ll assume the onus of disrupting the flow and thank Andrew for advancing the discussion of this important topic and moving the conversation to higher ground.

    • For once in my life, the secret life of plants is relevant to this discussion too. As to “Jungle Fever” . . . I guess we’ll have to go back to the Nicholas Wade post for that one.

  7. Rolf Zwaan: “A social-behavioral priming experiment is like rolling a 20-sided die, an icosahedron. If you roll the die a number of times, 20 will turn up at some point. Bingo! You have a significant effect.”

    Well, no, yes, and no. Rolling a die some number of times and getting a 20 is not a significant effect. But let’s assume only one roll of the die. Then there are two questions here which we may distinguish.

    1) Is the die biased?
    2) If the die is biased, what is it biased towards?

    An unbiased die must come up something, so the fact that it comes up one number rather than another is not evidence that it is biased. It is irrelevant to question 1. But the number is relevant to question 2, because if the die is biased towards 20, then 20 will come up more often than other numbers. Roll the die a second time, and now if it comes up 20 we have evidence that the die is biased, as well as evidence of what that bias is.

    If we want a statistic to test question 1, we need it to remain constant, whatever the result of the roll of the die.
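    A small simulation of this point (the roll counts and bias strength are arbitrary choices): with many rolls, a chi-square statistic answers question 1 without privileging any particular face, while the most frequent face speaks to question 2. With a single roll, the chi-square statistic is identical whichever face comes up, so it stays constant, as required above.

    ```python
    import numpy as np
    from scipy.stats import chisquare

    rng = np.random.default_rng(1)
    SIDES = 20

    def rolls(n, bias_face=None, bias_p=0.3):
        """n rolls of a d20; optionally one face gets probability bias_p."""
        p = np.full(SIDES, 1 / SIDES)
        if bias_face is not None:
            p[:] = (1 - bias_p) / (SIDES - 1)
            p[bias_face] = bias_p
        return rng.choice(SIDES, size=n, p=p)

    for label, face in [("fair die", None), ("biased toward 20", SIDES - 1)]:
        counts = np.bincount(rolls(2000, bias_face=face), minlength=SIDES)
        stat, pval = chisquare(counts)    # question 1: is the die biased?
        top = int(np.argmax(counts)) + 1  # question 2: biased toward what?
        print(f"{label}: chi-square p = {pval:.2g}, "
              f"most frequent face = {top}")
    ```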
