Marshmallow update

Gur Huberman points us to this interesting article by Dee Gill about a posthumous research publication. It’s about 80 zillion times better than the usual science press release.

P.S. I did some quick googling and found some fun links showing past credulity on the marshmallow thing from the usual suspects: Sapolsky, Brooks, NPR. None of them could resist the immediate gratification associated with sharing that study.

No Gladwell or Easterbrook, though. They showed self-control. Let’s give credit where due.


  1. Christian Hennig says:

    This was a very good read indeed, and it was nice to read Mischel’s own nuanced interpretation of his results. He may have had a point about observing and analysing actual behaviour rather than questionnaire responses, and I like the emphasis on a correlation that, in his view, probably exists but is small, not very predictive, and overinterpreted by many.

    One aspect is missing in my view. This is a nice example illustrating that researchers should not idealise themselves as “neutral observers”. Given that the marshmallow test and its interpretations became very popular, one should assume that some parents of children tested in later studies knew of it, and that this influenced their behaviour when educating their children and, in turn, their children’s behaviour. This makes a difference compared to the original study. Looking for an effect in an environment where the reasons for the effect are not influenced by earlier studies or results is different from looking for it in a situation in which children may do or not do something because their parents consciously try to get them to show the behaviour that the earlier study presented as desirable (I have general behaviour in mind here, not specifically regarding the marshmallow test, but who knows?). For this reason alone a proper “replication” would have been impossible. Observing an effect is one thing; consciously trying to provoke a previously observed effect is another. It might have been interesting to differentiate results according to whether the parents knew of this research and tried to push their children in a certain direction, even if the only way to find this out would probably have been questionnaires of questionable reliability. (This issue probably also comes up in Economics a lot.)

  2. Symmetry in Asymmetry
    This marshmallow research is an extreme case of asymmetry and should have been criticized decades ago.
    Walter Mischel (1968) had written an influential book, “Personality and assessment”, and demonstrated that broad personality traits did not predict specific behavior well. The predictive correlation of .30 was dubbed the “personality coefficient”. After that, the consistency debate dominated personality research, e.g. Mischel and Peake (1982) against Epstein (1983), who wrote the counterattacks and demonstrated the importance of aggregating behavior to improve reliability. Icek Ajzen and Martin Fishbein wrote a wonderful book and demonstrated how to improve the predictability of behavior from attitudes, where there had been a similar debate about the utility of attitudes.
    Ajzen, I., & Fishbein, M. (1980). Understanding attitudes and predicting social behavior. Englewood Cliffs, NJ: Prentice-Hall.
    They made some important distinctions between single acts, multiple acts, behavioral outcomes and principles of compatibility. Single acts are chronically unreliable, so it is no wonder that predicting unreliable criteria is a waste of time and energy. In my own research I was heavily influenced by Ajzen and Fishbein and found many helpful similarities to my favorite concept from the Brunswik lens model, which I had termed Brunswik symmetry. In a book chapter titled “Multivariate reliability theory. Principles of symmetry and successful validation strategies” I wrote:
    “From the position of a distant observer of that heated debate over consistency (Epstein 1983b, Mischel and Peake 1982), I have sometimes had the impression that many eminent American psychologists either had no opportunity to learn their psychometric lessons or simply had not done so. This may sound unfair, arrogant, or inappropriate, but it is the only way to subjectively explain why, again and again, attempts were made to predict unreliable criteria.” (Wittmann, 1988, p. 507)
    Predictors and criteria must be conceptualized at the same level of generality. I distinguish four cases of asymmetry. Mischel was fooled by two of the four cases. In the first part of his career he tried to predict a specific (unreliable) criterion from a broad trait level, and in the second half of his career a broad lifetime criterion from a specific (unreliable) single act, the marshmallow test.
    It is an irony that there is so much symmetry in the asymmetry of Walt Mischel’s research.
    My critique does not mean that delay of gratification is a futile construct. Rather, it must be better assessed as a multiple-act predictor, averaged over many situations and occasions, to have a chance of predicting real-life outcomes at the symmetric level.
    In my research on intelligence, health, psychotherapy and program evaluation, Brunswik symmetry was the golden key.
    Most of my ruminations on Brunswik symmetry can be found at ResearchGate for download, see:
    Ajzen, I., & Fishbein, M. (1980). Understanding attitudes and predicting social behavior. Englewood Cliffs, NJ: Prentice-Hall.
    Epstein, S. (1983). Aggregation and beyond: Some basic issues on the prediction of behavior. Journal of Personality, 51(3), 360–392.
    Epstein, S. (1983b). The stability of confusion: A reply to Mischel and Peake. Psychological Review, 90(2), 179–184.
    Mischel, W., & Peake, P. K. (1982). Beyond déjà vu in the search for cross-situational consistency. Psychological Review, 89(6), 730–755.
    Wittmann, W. W. (1988). Multivariate reliability theory: Principles of symmetry and successful validation strategies. In J. R. Nesselroade & R. B. Cattell (Eds.), Handbook of multivariate experimental psychology (pp. 505–560). New York: Plenum Press.
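
The aggregation argument in the comment above can be sketched numerically with two classical psychometric formulas: the Spearman-Brown formula for the reliability of an average of several acts, and the attenuation formula for how unreliability shrinks an observed correlation. This is only an illustration under textbook assumptions (parallel, equally reliable acts). The function names are hypothetical, and the input values (single-act reliability 0.2, criterion reliability 0.9, true correlation 0.5) are made up for the example, not estimates from any marshmallow study.

```python
import math


def spearman_brown(single_act_reliability: float, n_acts: int) -> float:
    """Reliability of the mean of n_acts parallel acts (Spearman-Brown formula)."""
    r = single_act_reliability
    return n_acts * r / (1 + (n_acts - 1) * r)


def attenuated_validity(true_corr: float,
                        predictor_reliability: float,
                        criterion_reliability: float) -> float:
    """Expected observed correlation once measurement error attenuates true_corr."""
    return true_corr * math.sqrt(predictor_reliability * criterion_reliability)


# Illustrative, assumed values (not taken from the studies discussed here).
SINGLE_ACT_RELIABILITY = 0.2   # one observed act, e.g. a single test session
CRITERION_RELIABILITY = 0.9    # a broad, aggregated life outcome
TRUE_CORRELATION = 0.5         # correlation between the error-free constructs

for n_acts in (1, 5, 10, 20):
    reliability = spearman_brown(SINGLE_ACT_RELIABILITY, n_acts)
    validity = attenuated_validity(TRUE_CORRELATION, reliability, CRITERION_RELIABILITY)
    print(f"acts={n_acts:2d}  predictor reliability={reliability:.2f}  "
          f"expected observed r={validity:.2f}")
```

Under these assumed numbers, a single act yields an expected observed correlation of about .21, while averaging ten acts raises it to about .40, without any change in the underlying relationship.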

  3. Dzhaughn says:

    I always assumed that the Marshmallow Test was a valid predictor of future success, but not because it showed self-control. Rather, it showed willingness to trust arbitrarily in Authority’s ability to deliver in the future. (My counter study would have let children discover the Marshmallow Duplication System, which takes an hour, and spread word by gossip, leaving the adults out of it.)

    But apparently there’s no evidence for that either.

    It’s all just fluff.

  4. Dunigan says:

    Surprised to see Sapolsky included in the list of “usual suspects”. Does he commonly over-hype certain findings?

    • Wonks Anonymous says:

      I always think of him as the guy who wrote a book titled “Why Zebras Don’t Get Ulcers”, attributing it to their lack of chronic stress, years after Barry Marshall had shown (via dosing himself) that Helicobacter pylori is responsible, for which he would win the Nobel in medicine.

  5. Martin says:

    I was interested to see no correlation for adult body mass. I would think a willingness to make sacrifices to get two marshmallows would predict higher weight. ;)

