
“Persistent metabolic youth in the aging female brain”??

A psychology researcher writes:

I want to bring your attention to a new PNAS paper [Persistent metabolic youth in the aging female brain, by Manu Goyal, Tyler Blazey, Yi Su, Lars Couture, Tony Durbin, Randall Bateman, Tammie Benzinger, John Morris, Marcus Raichle, and Andrei Vlassenko] that’s all over the news.

Can one do a regression like shown in figure 1?

That strikes me as rather bold – no pun intended. It seems that there are distinct groups, with no clear relation within group.

Am I missing something?

The regression in question is here:

I replied that I can’t really follow all the details, but it does seem that this “predicted age” is just predicting young/old, not age within each group. So I feel like I’m missing something. I mean, they have this thing they call “metabolic age,” which, among young people and among old people, is not correlated with actual age, and then they say it’s different for men and women . . . but I don’t quite understand how they get off calling this thing “metabolic age.” Does it increase monotonically as people get older?
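
To see why a plot like this can look convincing even if a predictor only separates young from old, here is a minimal Python sketch on simulated data. Everything in it is hypothetical (the two age clusters, the noise level, the `pearson` helper); it does not use the paper's data and says nothing about how the paper's predictor actually behaves within each group.

```python
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)

# Hypothetical cohort: a young cluster and an old cluster, nobody in between.
young_age = [rng.uniform(20, 35) for _ in range(100)]
old_age = [rng.uniform(60, 80) for _ in range(100)]

# A "predicted age" that knows only the group: the group's center plus
# noise that is unrelated to actual age within the group.
young_pred = [27.5 + rng.gauss(0, 5) for _ in young_age]
old_pred = [70.0 + rng.gauss(0, 5) for _ in old_age]

r_all = pearson(young_age + old_age, young_pred + old_pred)
r_young = pearson(young_age, young_pred)
r_old = pearson(old_age, old_pred)

print(f"overall r      = {r_all:.2f}")
print(f"within young r = {r_young:.2f}")
print(f"within old r   = {r_old:.2f}")
```

In this toy setup the pooled correlation is high while the within-group correlations hover around zero, so a high overall r cannot by itself tell you whether the prediction tracks age within a group.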

I’m also confused about the word “persistent” in the title. How do you learn about “persistent” in a cross-sectional study that doesn’t follow people over time?

Maybe some neuroscientists in the audience can explain what’s going on here? Thanks in advance.

P.S. My correspondent wrote, “please do not quote me on this just yet. I don’t have tenure (yet) and have found out the hard way that not everyone in this business is as mature, professional and equipped to handle critical feedback as one would want.”

Tomorrow’s post: Junk science and fake news: Similarities and differences


  1. Alex says:

    My reading of it is that they did predict absolute age, not just age groups. They give deviation data (SD, range, absolute deviation) for the prediction in years.

    The study was largely between-subjects, but they did have 19 people with two scans, which is what you see in that Figure 1B. Running their algorithm on those people found that the ‘metabolic age’ went up by 1.1 years on average when the actual time between scans averaged 1.6 years, so they declare ‘metabolic age’ to be persistent enough.

    Then they ran the algorithm again, training it only on males but testing on females (and then vice versa) and found that female metabolic age was systematically lower than their actual age. Hence, women’s brains age a bit more gracefully than men’s.

    So the story all goes together well enough, but I think that (not surprisingly given this blog) the gender difference is something they just came across. The analysis is out of sample, which is nice, but I would be shocked if they went into this analysis with a real hypothesis that women would have ‘younger brains’ than men. I think it’s SOP for that group/institution to take the huge amount of imaging data they have and occasionally dig around to see what’s there.

    Disclaimer: I used to work at that school, but never with those people or on projects like that.

    • yyw says:

      Training in females and testing in males seems a terrible way of studying gender difference. Why not train the model on the entire dataset and examine the implication of sex in the resulting model?

    • Thomas says:

      1) About the “vice-versa” model, the authors say:
      “we also trained the algorithm with female data only and found the predicted metabolic brain age for males to be 2.4 y older compared with females (P < 0.038 t test, one-sided)”
      Note the one-sided test and the inequality for P, neither justified. “P<0.038” is more impressive than the corresponding two-sided “P=0.076”.

      2) I do not buy/understand the idea of training the model on one sample (men) and applying it to the second (women). It would be a miracle if the calibration were perfect in the second sample, especially when it's taken from a different population. And it wastes information: N is only 76 men. Why not test the sex effect directly in a global model?

      3) Their best predictor is something called AG: "aerobic glycolysis (AG)—the fraction of glucose use presumably not accounted for by oxidative metabolism—from these data, defined as the absolute molar difference between age-normalized CMRGlc and CMRO2."
      They use age-normalized variables to predict age. Does that seem odd? Wouldn't the prediction of age be better without age-normalizing the predictors? Yet these variables are not sex-normalized, and voilà, there is a sex difference.
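
Point 2 above can be sketched on simulated data. The Python below is only an illustration under stated assumptions: a single made-up metabolic feature that declines with age and carries a small sex offset, plain linear regression standing in for the paper's random forest, and a hypothetical `ols` helper. It compares "train on men, apply to women" with one pooled model that includes a sex indicator.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
                b[r] -= f * b[c]
    return [b[i] / A[i][i] for i in range(k)]

rng = random.Random(1)

def simulate(n, female):
    """Hypothetical data: feature declines with age, with a sex offset of 3."""
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        x = 100 - 0.5 * age + (3.0 if female else 0.0) + rng.gauss(0, 3)
        rows.append((age, x, 1.0 if female else 0.0))
    return rows

men = simulate(76, female=False)     # group sizes are illustrative
women = simulate(108, female=True)

# Approach A: fit age ~ feature on men only, then apply the fit to women.
b0, b1 = ols([[1.0, x] for _, x, _ in men], [age for age, _, _ in men])
gap = sum(b0 + b1 * x - age for age, x, _ in women) / len(women)

# Approach B: one pooled fit, age ~ feature + female, using everyone.
pooled = men + women
_, _, female_coef = ols([[1.0, x, f] for _, x, f in pooled],
                        [age for age, _, _ in pooled])

print(f"A: mean (predicted - actual) age in women = {gap:.2f} years")
print(f"B: pooled coefficient on female           = {female_coef:.2f} years")
```

In this toy setting the two numbers are close to mirror images of each other: women's predicted age comes out lower under A, and the female coefficient under B is positive by about the same amount. The pooled fit uses all of the data and reports the sex effect as a single coefficient that can be tested directly.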

    • Martha (Smith) says:

      “Hence, women’s brains age a bit more gracefully than men’s.”

      As the owner of an aging woman’s brain — I’m not sure what to think about this.

  2. Adede says:

    If you follow the link, it seems other people also had issues with the paper (I don’t know how much overlap with this post because the letters are behind a paywall).

  3. jd says:

    In terms of the plot, it looks like they had relatively few participants aged 40-55.

  4. Anoneuoid says:

    “Can one do a regression like shown in figure 1?”

    This is a simple plot of predicted vs actual values. What seems to be wrong with it?

    “A supervised machine learning algorithm, random forest regression with bias correction (15), was applied to the quantile normalized brain metabolism data and trained and tested against the actual chronological age of the participants. Ten-fold cross-validation demonstrates that the predicted age based on this algorithm—defined as metabolic brain age—closely matches the actual chronological age of the participants (Pearson’s r = 0.88–0.90 over 10 runs) (Fig. 1A)”

    They have no hold-out so it is pretty much guaranteed the skill of this model is being exaggerated. I’ve seen this in every single publication attempting to apply machine learning to biomed. They run a series of n-fold cross validations while tuning the model/data based on each result.
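
The worry about tuning against cross-validation results can be illustrated with pure noise. The Python sketch below is hypothetical throughout: the outcome and all 300 candidate features are independent random numbers, and "tuning" is reduced to picking the one-feature linear fit with the best 10-fold cross-validated correlation. It is not a reconstruction of what the paper did, only a demonstration of the general effect.

```python
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_line(x, y):
    """Intercept and slope of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1

def cv_predictions(x, y, k=10):
    """Out-of-fold predictions from k-fold cross-validation."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        b0, b1 = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(n):
            if i % k == fold:
                preds[i] = b0 + b1 * x[i]
    return preds

rng = random.Random(0)
n_train, n_hold, n_feat = 40, 2000, 300

# Outcome and features are all independent noise: there is nothing to find.
y_train = [rng.gauss(0, 1) for _ in range(n_train)]
y_hold = [rng.gauss(0, 1) for _ in range(n_hold)]
X_train = [[rng.gauss(0, 1) for _ in range(n_train)] for _ in range(n_feat)]
X_hold = [[rng.gauss(0, 1) for _ in range(n_hold)] for _ in range(n_feat)]

# "Tune" by keeping whichever candidate has the best cross-validated r.
cv_r = [pearson(cv_predictions(x, y_train), y_train) for x in X_train]
best = max(range(n_feat), key=lambda j: cv_r[j])

# Evaluate the chosen model on data that played no part in the selection.
b0, b1 = fit_line(X_train[best], y_train)
hold_r = pearson([b0 + b1 * v for v in X_hold[best]], y_hold)

print(f"best cross-validated r = {cv_r[best]:.2f}")
print(f"hold-out r             = {hold_r:.2f}")
```

Each individual cross-validation run here is honest, yet the selected model's cross-validated r looks respectable while its hold-out r sits near zero. The optimism comes entirely from choosing among many candidates based on the cross-validation scores.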

    • Anoneuoid says:

      Also, this main “finding” is ridiculous. All it means is that at least one of their features is weakly correlated with the sex of their subjects, so by adding sex as a feature their model would have made better predictions. Eg, they say:

      “Despite these sex differences in metabolic brain aging, random forest was, surprisingly, relatively poor at distinguishing males from females using the brain metabolism data (n = 184, gender predicted accurately in 66% of instances).”

  5. matt says:

    Why are they calling this exercise “machine learning”? They fit a model that doesn’t have causal implications; that’s all. No learning by a machine was accomplished here. I think the inaccuracy of the “machine learning” label has led the public to believe predictive models can do many things which they, in fact, cannot. Linear regression is now “machine learning”. What. A. World.

    • Andrew says:


      Linear regression is a form of learning by machine, no? It was my impression that:

      – “learning” in this context is a synonym for what in statistics is called “estimation,” and

      – “machine” refers to any automated or algorithmic process.

      Thus, algorithmic statistical estimation = machine learning. Arguably, the term “machine” is more accessible to people than “algorithmic,” and “learning” is more accessible than “estimation.” Thus, instead of “we fit a model to estimate beta,” it’s “we programmed a machine to learn beta.” I don’t see the problem with that.

      • Martha (Smith) says:

        I think using “learning” to mean “estimation” is misleading: “estimation” has connotations of uncertainty, while “learning” has connotations more in the direction of certainty.

  6. matt says:


    I don’t think “learning” is a synonym for estimation; people don’t use the phrase “machine learning” to describe statistical estimation in economics or psychology when the goal is parameter estimation and not pure prediction. The phrase “machine learning” seems to me to be interchangeable only with predictive statistical modelling.

    In articles on machine learning intended for the public, they almost always state that the “machine” is able to do things that it was not explicitly programmed to do. This is, apparently, what makes it machine “learning”. While this is technically true with linear regression (e.g. the authors above did not need to write a regression program that was specific to predicting age from metabolic brain PET data), I don’t think it matches up at all with most people’s notion of what “learning” constitutes. With linear regression in particular, you can just solve the normal equations to get beta; how is this learning? This is just math. If I keep applying the linear regression method to more and more problems, the method doesn’t get any better. The math just stays the same. In most formulations of learning, I would assume the subject needs to be improving at whatever task they are undertaking to be classified as having learned something. At least with numerical optimization, you can sort of imagine the machine “learning” as it tries different values for the optimal parameters (e.g. gradient descent), but that still seems like a stretch to me.
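
The contrast between solving the normal equations and "learning" by gradient descent can be made concrete. The Python sketch below uses made-up data and an arbitrary learning rate and step count (all assumptions for illustration); it fits the same one-predictor regression both ways.

```python
import random

rng = random.Random(0)

# Hypothetical data from y = 2 + 3x + noise.
x = [rng.uniform(0, 1) for _ in range(200)]
y = [2.0 + 3.0 * xi + rng.gauss(0, 0.5) for xi in x]
n = len(x)

# Route 1: closed form (the normal equations for one predictor).
mx, my = sum(x) / n, sum(y) / n
b_closed = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))
a_closed = my - b_closed * mx

# Route 2: gradient descent on the mean squared error, starting from zero.
a, b = 0.0, 0.0
learning_rate = 0.5
for _ in range(2000):
    residuals = [a + b * xi - yi for xi, yi in zip(x, y)]
    grad_a = 2 * sum(residuals) / n
    grad_b = 2 * sum(r * xi for r, xi in zip(residuals, x)) / n
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b

print(f"closed form:      a = {a_closed:.4f}, b = {b_closed:.4f}")
print(f"gradient descent: a = {a:.4f}, b = {b:.4f}")
```

Both routes land on the same coefficients. The iterative version only looks more like "learning" because it gets there in steps; neither version gets any better at regression by being applied to more problems.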

    I think when post people hear the phrase machine learning nowadays they think of it in the way researchers describe AGI.

    • Andrew says:


      I guess different people use these words in different ways. Many times I’ve heard logistic regression referred to as a simple or basic machine learning method, and many times I’ve heard people talk about estimating hyperparameters by saying they are “learning” them.

    • Martha (Smith) says:

      Matt said,

      “I think when post people hear the phrase machine learning nowadays they think of it in the way researchers describe AGI.”

      Am I correct in assuming that “AGI” means “artificial general intelligence”? (and that “post” is a typo for “most”?) If so, I agree with your statement.

    • gec says:

      In fairness, all “machine learning” methods amount to “just math”, it is just that the math is less transparent with respect to the relationship between the predictors and outcomes than in least-squares regression (a consequence of the definition of the problem and the cost function to be optimized).

      That said, I wholeheartedly agree with your opposition to the imperialism of the term “machine learning” applied to what we used to just call “statistics”. It is particularly bad from a pedagogical perspective because exposing students to a topic under the heading of “machine learning” tends to obscure the often long history of that topic. We even have graduate students now who genuinely believe that regression was invented in the last 10 years as a machine learning technique. This is bad not just because it leads to shallow understanding, but because when these students go off to try and solve things on their own, they won’t be able to draw from 100ish years of history that can result in useful inspiration and avoidance of dead ends.

      I agree that “learning” and “estimation” seem to emphasize different things. To me, “estimation” implies we know and/or are committed to a particular model structure and the question is about the parameters of the model. “Learning” implies that we do not know the structure of the model a priori and have to “learn” it from the data. For example, when I learned about PCA and MDS it was in a class called “statistical learning” (a term that seems to have fallen out of vogue?).
