Kevin Lewis points us to this recent paper, “Can invasive species lead to sedentary behavior? The time use and obesity impacts of a forest-attacking pest,” published in Elsevier’s Journal of Environmental Economics and Management, which has the following abstract:
Invasive species can significantly disrupt environmental quality and flows of ecosystem services and we are still learning about their multidimensional impacts to economic outcomes of interest. In this work, I use quasi-random US county detections of the invasive emerald ash borer (EAB), a forest-attacking pest, to investigate how invasive-induced deforestation can impact obesity rates and time spent on physical activity. Results suggest that EAB is associated with 1–4 percentage points (pp) (mean = 37.0%) annual losses of deciduous forest cover in infested counties. After EAB detection, obesity rates are higher by 2.5pp (mean = 24.7%) and daily minutes spent on physical activity are lower by 4.9 min (mean = 51.7 min), on average. I show that less time spent on outdoor sports and exercise is one possible, but not exclusive, mechanism. Nationwide, EAB is associated with $3.0 billion in annual obesity-related healthcare costs over 2002–2012, equivalent to approximately 1.2% of total annual US medical costs related to obesity. Results are supported by many robustness and falsification tests and an alternative IV specification. This work has policy implications for invasive species management and expands our understanding of invasive species impacts on additional economic outcomes of interest.
Seeing this sort of thing makes me feel that the causal revolution in econometrics has gone too far. The first part of the analysis involves invasive species and loss of forest cover. That part is ok, I guess. I don’t know anything about invasive species, but it sure sounds like loss of forest cover is the kind of thing they could cause. The problem I have is with the second part of the analysis, on obesity and time spent on outdoor sports and exercise. It just seems too much of a stretch, especially given that the whole analysis is on a county level.
To put it another way: there are lots and lots of things that could affect obesity and time spent on exercise, and invasive species reducing forest cover seems like the least of it.
From the other direction: the places where invasive species are spreading are not a random selection of U.S. counties. Places with more or fewer invasive species will differ in all sorts of ways, some of which might happen to be correlated with time spent on exercise, obesity, all sorts of things.
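To make the non-random-spread concern concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the county trait, the detection probabilities, and the obesity numbers are assumptions, not values from the paper. Detection is given exactly zero effect on obesity, yet a naive comparison of detected and undetected counties still shows a gap, because both depend on the same hypothetical trait.

```python
import random
import statistics

def naive_detection_gap(n_counties=5000, seed=1):
    """Mean obesity in detected counties minus mean obesity in undetected ones.

    All numbers are made up. Detection has NO effect on obesity in this toy
    world; both detection and obesity depend on one hypothetical county trait.
    """
    rng = random.Random(seed)
    detected, undetected = [], []
    for _ in range(n_counties):
        trait = rng.gauss(0.0, 1.0)            # hypothetical unobserved county trait
        p_detect = 0.7 if trait > 0 else 0.3   # pest shows up more where the trait is high
        # Obesity depends on the trait and on noise; there is no detection term.
        obesity = 25.0 + 2.0 * trait + rng.gauss(0.0, 3.0)
        if rng.random() < p_detect:
            detected.append(obesity)
        else:
            undetected.append(obesity)
    return statistics.mean(detected) - statistics.mean(undetected)

gap = naive_detection_gap()
print(f"naive gap in obesity rate: {gap:.2f} percentage points (true effect is 0)")
```

Controls and fixed effects help only to the extent that they capture whatever plays the role of `trait` here, which is exactly what is hard to know in the real data.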
In short, I see no reason to believe the causal claims made in the article. On the other hand, it says:
A multitude of fixed effects and controls for socioeconomic and demographic confounders are used in order to isolate the EAB effect. I also estimate a suggestive first-stage model showing EAB’s impact to county-level deciduous forest cover, in order to preliminarily investigate the suspected mechanism by which EAB spread may translate into biological effects on obesity and physical activity.
The causal interpretation of my findings is supported by several checks, including: (i) an event study plot showing increasing marginal impacts of EAB over time, consistent with the biologically delayed timing of EAB-induced deforestation; (ii) falsification tests showing no impact of EAB on being underweight, no impact of EAB in the years prior to actual detection, and no impact of EAB on non-ash coniferous forest canopy; (iii) a robustness check that accounts for spatial autocorrelation in EAB detection using a Spatial Durbin Model; (iv) an investigation of biological mechanisms using daily time use diary data from the American Time Use Survey (ATUS); (v) results showing that changes in economic activity are likely not driving my findings, and; (vi) an IV specification that uses EAB detections as an instrument for deciduous forest cover to validate a suspected deforestation pathway of effect.
Sorry, but all the multitudes and Durbins and specifications and pathways don’t do it for me. Again, the pattern of invasive species is non-random, and it can vary with just about anything. So, no, I don’t agree with the claim that “This work contributes to the literature on the economics of invasive species by broadening our understanding of invasives’ true indirect costs to society.”
What’s going on here?
Remember that quote from Tukey, “The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data”?
Another way to put it is that the story they’re trying to tell in this paper, starting with invasive species and forest cover and ending up with obesity and physical activity, is just too attenuated to be estimated from available data.
As I see it, there’s a misplaced empiricism going on here, an idea that by using proper econometric or statistical techniques you can obtain a “reduced-form” estimate. The trouble, as usual, is that:
1. Realistic effect sizes will be impossible to detect in the context of natural variation,
2. Forking paths allow researchers to satisfy that “aching desire” for a conclusive finding,
3. P-values, robustness tests, etc. help researchers convince themselves that the patterns they see in these data provide strong evidence for the stories they want to tell.
4. Given an existing academic tradition, researchers don’t notice 1, 2, and 3 above. They’re like the proverbial fish not seeing the water they’re swimming in.
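Point 1 can be illustrated with a short simulation sketch. The true effect and standard error below are hypothetical, chosen only so that the effect is small relative to the noise; they are not taken from the paper. The code draws many noisy estimates, keeps the ones that reach conventional statistical significance, and reports how often that happens and how exaggerated those estimates are (the ideas sometimes called type M, for magnitude, and type S, for sign, errors).

```python
import random
import statistics

def significance_filter(true_effect, std_error, n_sims=20000, seed=0):
    """Simulate estimates ~ Normal(true_effect, std_error) and inspect the
    ones with |estimate| > 1.96 * std_error. Assumes true_effect > 0."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_sims)]
    significant = [e for e in estimates if abs(e) > 1.96 * std_error]
    power = len(significant) / n_sims
    # Average size of a "significant" estimate relative to the truth.
    exaggeration = statistics.mean([abs(e) for e in significant]) / true_effect
    # Share of "significant" estimates that point in the wrong direction.
    wrong_sign = sum(1 for e in significant if e < 0) / len(significant)
    return power, exaggeration, wrong_sign

# Hypothetical setting: true effect is half a standard error.
power, exaggeration, wrong_sign = significance_filter(true_effect=0.5, std_error=1.0)
print(f"share significant: {power:.3f}")
print(f"exaggeration ratio among significant estimates: {exaggeration:.1f}")
print(f"wrong-sign share among significant estimates: {wrong_sign:.3f}")
```

In this setup any estimate that clears the threshold is at least 1.96/0.5, about four, times the true effect in absolute value, so a "significant" finding is an overestimate by construction.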
Criticism as a collaboration between authors and audience
At this point it’s time for someone to pipe up that we shouldn’t be criticizing a paper we haven’t read, that we’re being mean to the author who we should’ve contacted first, who’s either a working stiff who does not deserve to be criticized by a bigshot, or else a bigshot himself who should be able to ignore the pinpricks of the haters, etc etc etc.
To these (hypothetical) criticisms, I reply that, no, I don’t think we should be required to spend $24.95 in order to criticize published work.

More generally, publishing a work makes it public. If you don’t want work to be doubted in public, there’s no need to publish it. Just to be clear, I’m not saying the author of the above-discussed paper is bothered by this criticism. I’m speaking more generically here.
Also, I’m fine with people publishing in paywalled journals. I do it too! Publication is a pain in the ass, and we’ll usually go with whatever journal will take our paper. It’s a weird thing because we’re providing the content and doing all the effort, and they’re then taking possession of it, but that’s how things go, and we’re typically too busy with the next project to want to buck the system on this one.
So, to continue, I hope we can see this criticism as a collaborative effort between authors and audience. The authors do the service of publishing their work rather than merely spreading it on the whisper network, and the critics do the service of posting their criticisms publicly rather than keeping them on the Q.T. and contacting the authors in secret.
Doing this in public allows everyone to be involved—including any third parties who’d like to argue that my criticisms are misplaced and we should believe the claims in the above-discussed article. Those of you who disagree with me—you should be able to see what I have to say too, not just have this locked in an email to the authors which you’ll never see.
As to my comments being critical: Yeah, I don’t think the published analysis is saying what is claimed. That’s too bad. It’s nothing personal. There are some dead-end paradigms in scientific research. It happens. We have to be looking at the big picture. We’re not doing researchers any favors by politely accepting claims that aren’t supported by the data. Indeed, take enough such claims and you can put them together and you end up with an entire junk literature which can be meta-analyzed into junk claims.
What, then, to do?
The final question is what would I recommend authors of this sort of paper to do? If I don’t believe their claims—if, indeed, I think the connection between invasive species and obesity is too tenuous for such an analysis to “work” in the sense of telling us something about the effects of invasive species on obesity, as opposed to turning up some correlations in observational data—then, given that they’re interested in this topic and they have access to these data, what should they do?
I’m not sure (maybe there’s nothing useful they can do at all here!) but, if there is something to be gained here, my suggestion is to frame the problem observationally. These are the places with more invasive species, what’s been happening in these places, how do these places differ from otherwise-similar areas that did not have an invasive-species problem, etc. I’d say just drop the county-level obesity data entirely, but if you want to study it, look at the usual factors such as urban-rural, age, ethnic composition, etc. Learn what you can learn, forget about the big claims.
All the methodological stuff aside, the cited article does something that economists often do that drives me crazy: they take an event or factor that belongs to set S and frame their research question, “Does S cause blah blah blah?” But so often S is an immensely heterogeneous set, and even if this one element can be shown to maybe have a causal impact, it has little to no bearing on the rest of the set.
An example I recall vividly had to do with child labor in Vietnam. The labor in question was on rice farms, where hard-pressed parents bring their kids into the fields. Well, for a while Vietnam had export controls on rice, which depressed the price, which made the parents even more hard pressed and caused the kids to have to work more. Then the controls were eliminated, the price rose, etc. etc. Fine. But the article was framed as “The effect of trade liberalization on child labor” (Edmonds and Pavcnik, Journal of International Economics, 2005, 65/2), as if this were a typical example of movement toward free trade. Well yes, it was a form of trade liberalization, but quite singular, and the findings have nothing to say about what might be expected from import liberalization, abolishing trading boards, etc.
Can invasive species affect obesity? Well maybe sometimes. But there are lots of kinds of invasive species with all sorts of environmental impacts. Hell, I can remember joining volunteer work teams pulling Scotch Broom, beautiful but invasive, and I suppose for the less active members of my team this was obesity-reducing.
What is it that makes economists want to plaster huge categorical pronouncements on their work?
Oh, the broom, the bonnie bonnie broom, the broom of the Cowdenknowes …
This drives me crazy too. The example I always think of is Acemoglu Johnson and Robinson (2001). Their coefficient of interest is effectively “the effect of passing a law that causes Political Risk Services to increase a country’s “institutional quality” score by 1 point on that country’s income per capita”. Surely there is no such fact as this “effect”! It depends on what kind of law you passed – different laws which lead to the same PRS score might have wildly different effects on income per capita. For “the effect of X on Y” to be useful, surely we need some kind of cohesion among all the different counterfactuals that we summarize by a particular value of X – I don’t mean heterogeneous effects of different individuals to the same “treatment”, but heterogeneity in the treatment itself. Probably there is a good approach to thinking about this problem (and interpreting results such as those of AJR in light of it) out there somewhere – I would appreciate references if anyone has them! – but I don’t think it has reached economics, at least not the economics I’m exposed to.
The problem is that there are multiple versions of treatment, which violates the SUTVA assumption required in Rubin’s potential outcomes framework. This is the same problem with meta-analysis where people try to estimate an overall treatment effect that is some weighted average of different treatments. Data Colada recently had some posts arguing that these treatment effects are often meaningless. There’s a lot of older literature on this though from say the 80s.
Thanks for these references!
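A small arithmetic sketch of the multiple-versions-of-treatment point discussed above. The three "reforms", their effects, and the mixes are all hypothetical, made up purely to show the mechanism: if several different interventions are coded as the same one-unit change in X, the estimated "effect of X" is a weighted average whose value depends on which versions happen to be in the sample.

```python
def pooled_effect(version_effects, version_shares):
    """Average 'effect of treatment' when 'treatment' is a mix of distinct versions."""
    if abs(sum(version_shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(version_effects[v] * version_shares[v] for v in version_effects)

# Hypothetical effects of three different reforms that all move the same score by 1 point.
effects = {"reform_a": 2.0, "reform_b": 0.0, "reform_c": -1.0}

# Two hypothetical samples with different mixes of the three versions.
mix_in_sample = {"reform_a": 0.6, "reform_b": 0.3, "reform_c": 0.1}
mix_elsewhere = {"reform_a": 0.1, "reform_b": 0.3, "reform_c": 0.6}

effect_in_sample = pooled_effect(effects, mix_in_sample)
effect_elsewhere = pooled_effect(effects, mix_elsewhere)
print(f"pooled effect in the studied sample: {effect_in_sample:+.1f}")
print(f"pooled effect under a different mix: {effect_elsewhere:+.1f}")
```

The same three underlying effects produce a positive pooled number in one mix and a negative one in the other, which is why a single "effect of X" may not carry over to a setting with a different mix of versions.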
I write as someone who lives deep in the heart of emerald ash borer (EAB) country. Never mind the esoteric ins and outs of EAB/obesity data collection; I am personally concerned with the huge and lovely ash tree planted many decades ago by my wife in our backyard: the last snowstorm of the season caused a wayward branch to bring down our power line, resulting in thousands of dollars of damage.
Therefore, so to speak, I have skin in this game and the data collection problem that really exists. Namely, does treatment of an ash tree do any good whatsoever regarding EAB? As far as I can tell, the only entities that claim treatment is worthwhile are those trying to sell a particular treatment. I have never been able to find an independent laboratory not in the sales business which says any treatment works.
However, it could well be that I am entirely mistaken. So, I look forward to seeing if anyone posts something which indicates I should be more hopeful and less suspicious.
I don’t know the answer to your questions about the emerald ash borer, but my colleague, Professor Melissa Fierke, almost certainly does. See https://www.esf.edu/faculty/fierke/index.php
Thanks for the tip! I just sent an email to her asking for her advice.
Very interesting post. Could you point us to resources discussing in more detail the problem that “Realistic effect sizes will be impossible to detect in the context of natural variation”? Thank you!
Andrew –
I’m a bit confused by this post.
I respect your “authority” for assessing statistical analyses resulting in portrayal of causality (since I lack a knowledge of statistics sufficient to make an assessment independently) – and certainly the putative causality here strikes me intuitively as pretty tenuous, causally linking together phenomena which seem many steps apart from one another.
But here the authors seem to describe an association, theorize a causal link, and then set about systematically to test that causal link from a number of angles – postulating about measures that would falsify their theory and then testing them.
Seems to me that there’s a kind of continuum here, with seemingly implausible causality asserted with no meaningful robustness checks at one end of the continuum and what seems like implausible causality turning out to be validated through robust checks against falsifiability at the other. I’m not sure how to see where some kind of dividing line between those two ends of the spectrum can be described based on some kind of inclusion/exclusion criteria matrix.
My biggest problem with this paradigm is that “reduced form + robustness checks” is a very poor substitute for an actual Bayesian model with a hypothesized mechanism that predicts a multitude of outcomes and is compared to predictions in a multitude of dimensions.
If you’ve got a mechanistic model about how certain areas people rely on outdoor activities, and forestation is important and beetles cause deforestation, and that alters local temperatures and pollution and aesthetic amenities and that drives people indoors and they do less exercise… Then build the model and collect the relevant data and show the fit and make some predictions for the future and then come back and follow up to show how well the future was predicted etc.
Daniel –
That’s an interesting response (as is Blackthorne’s below), and I think overlaps with our discussion downstairs…
It got me to thinking about how something like the causal chain in this paper strikes me instinctively as kind of nutty. I mean really, how could these little insects eating trees affect health outcomes in humans?
But on the other hand, I’m less instinctively sceptical that minute differences in the vastly complex human chemistry, so as to differently affect the genes of one person compared to another, could lead to differences in personality or propensity to earn a lot of money or have career success as a computer programmer. I mean doesn’t that also seem like pretty big leaps across domains of phenomena?
Which makes me wonder if people use Bayesian modeling to inform the nature versus nurture debates.
A quick Google suggests there’s not much out there.
I really don’t have a problem with something like the hypothesis I laid out. Why would it be controversial to say “ecology affects human choice of activities?”. It doesn’t sound nutty to me. I’d much rather play soccer if I have a nice green park than a heat-islanded overheated fake-turf field. I’d much rather hike 40 minutes to get to a nice trout stream than 40 minutes to get to a paved parking lot.
I just think people need to take their models seriously and go for it. Economists don’t want to do that because a p value and robustness checks get them publications and promotions and etc, whereas a messy model that shows you that there are a lot of plausible mechanisms for all of this stuff which are consistent with the data … doesn’t.
Still, there are LOTS of situations where we should just go for it, work on building those models, because if we do understand what’s going on we can really benefit.
Let’s step back and get a broader perspective. How does plant and small animal ecology affect urban quality of life, energy consumption, local temperature variation (heat islands etc), air quality, excess mortality, traffic, etc.?
Most of the time I see someone addressing these questions it’s to take some tiny slice of the question, fit some regression, claim there’s a notable important effect of A on B and B on C as Blackthorne mentioned, and then publish, put it on their CV, claim victory, request promotion!
Compare that to the systems perspective of Donella Meadows and The Limits to Growth. I don’t know that I’ve ever read social science that’s anywhere near as convincing to me as that book. Sure it doesn’t have specific “the future will be like X” predictions; what it has is “if scenario A plays out, then the future would be more like X than if scenario B plays out where it will look more like Y, however in all scenarios we investigate we always see things Q,R,S eventually.” We have tens of thousands of times more computing horsepower now… our limits are that we don’t have the scientists who know how to do that kind of science and the institutional/cultural systems that encourage it and fund it. Everyone wants a slam dunk little bullshit piddly asinine paper, no one wants to actually understand things deeply.
Right, the problem isn’t a nutty hypothesis, it is that we know it is true beforehand.
If A happened before B, then A influenced (partly “caused”) B. It is interesting to discover when this principle is violated, verifying it is true once again is not of interest.
Of course, the precise nature of the influence could be interesting. But in this case we know whatever effect is going to be negligible and will have trouble generalizing anyway.
Daniel
Regarding The Limits to Growth, it is worth noting that economists panned that book completely. Rather than appreciate the modeling approach, they focused on projected trends that lacked the market adjustments that would alter those trends. Since Andrew’s post concerns econometric modeling, I think it is worth noting that the example you find so compelling is one that economists found lacking. What ties together the example in this post and the economists’ reaction to The Limits to Growth (in my mind) is the lack of any sense of when a proposed causal chain is or is not worth studying. Economists love an obscure chain with unintended effects, like invasive species leading to obesity. But they find more tangible chains, such as continued economic growth (as traditionally measured) leading to ecological collapse, worthy of ridicule. What is the underlying methodology used to determine when such causal chains are worth taking seriously and when they are not?
Sometimes it seems like the underlying methodology depends on whether it is proposed by a well known economist or not. Often it seems like it depends on whether it conforms to some pre-established notions, such as the sanctity of markets or the irrationality of liberal policy advocates. But it is rarely stated explicitly or open to serious discussion.
Dale. As far as I’m concerned the Economists got it 100% wrong. Sure, Meadows et al. couldn’t build in all the feedback loops that economics could adjust things with, but all that could possibly do is alter some timescales a bit. Meadows built in the important structure and the limitations that physics imposed etc.
China’s population growth has cratered over the last 5 years and is presently negative. India surpassed China by number of people but its population growth is cratering as well and will go negative in a few years. The US is below replacement birth rate if I remember correctly but still growing due to immigration, slightly. The western United States has been in a long drought caused very likely by climate change and CA lost something like 50% of the trees living in the state over the last 10 years. Europe is in the middle of another land war with Russia and barely able to heat their homes last winter. The college graduates in China are lying down and taking pictures of themselves as zombies rather than celebrating. The population of fish in the oceans is something like 10% of what it was when Meadows wrote her book. Economic growth like we saw in the 1940-1970 period will not be coming back. Meadows had the last laugh.
Of course as far as I know Economists are still smoking the same hallucinogens they were back then.
Dale –
> What is the underlying methodology used to determine when such causal chains are worth taking seriously and when they are not?
This is what I was asking.
Seems to me the ‘seriousness test’ in many instances is pretty subjective. And I find that my own evaluation may be inconsistent. At first blush it might seem unlikely to me that invasive beetles “cause” obesity but then am I applying a consistent standard, say, if at first blush it seems plausible that relatively tiny % changes in genetic makeup might predict certain behaviors or psychological attributes? I make these calls but I don’t know what my own methodology is.
I usually consider checks against falsifiability as a pretty good filter but here Andrew argues they aren’t very informative.
So what’s the underlying methodology to use?
Daniel
No, economists are not using the same hallucinogens – the current ones are AI created. Seriously, I mostly agree with you – Meadows, et al made a significant contribution methodologically and economists dismissed it because it did not conform to their prejudices. We might debate particular examples about what limits are or are not binding: in many ways, the declining birth rates are just the type of feedback mechanism that neoclassical economics might predict and was not part of the Limits scenarios. My personal favorite would be species diversity (and whatever ecological changes it might be a good indicator of). You might find this analysis (https://www.brookings.edu/wp-content/uploads/1992/06/1992b_bpea_nordhaus_stavins_weitzman.pdf) of interest – notice that it was written 20 years after the Limits and 30 years ago. But it illustrates how economists view the idea of Limits, both then and now.
I had to go look up the CA tree thing because I didn’t want to be way wrong. There are regions where 50% or more of trees have died, but that’s unlikely to be the correct overall figure. It’s probably closer to 20% but I couldn’t find any kind of official estimate. The US Forest Service estimates 36 million trees lost last year alone. Here’s some info that includes historical estimates as of 2017 https://www.fs.usda.gov/detail/catreemortality/trees/?cid=fseprd569608
That page estimates that 129M trees were lost from 2010 to 2017. In the following 6 years at least 36M more were lost, because that was last year alone, but most likely more like another 100-150M. Interestingly, a big part of this is bark beetle attacks. I’m not sure what percentage it works out to, but let’s guess something like 15-30% range.
Dale, and Joshua,
Thanks for the article Dale, I’ll take a look. IMHO the problem with Limits to Growth was that it became political fodder rapidly, and that meant people looked to tear it apart from specific poor numerical values rather than looking at the qualitative results. It should be read 100% qualitatively, but probably wasn’t.
I don’t even think I’ve read the whole thing. It’s online, so I skimmed through sections, looked at graphs, read some discussions here and there. I got into it about a year ago having never heard of it before (I was born after it was published, and by the time I could read it and understand it, it was “old history”). A lot of the book just tries to teach “dynamical systems theory” to essentially people with a high school education.
What struck me about it was that they tried HARD to examine different scenarios and see what the changing scenarios meant for the end results. Dale for you to say “declining birth rates are just the type of feedback mechanism that neoclassical economics might predict and was not part of the Limits scenarios” seems wrong. Their chapter V is all about “negative feedback loops”. “Either the birth rate must be brought down to equal the new, lower death rate, or the death rate must rise again. All of the “natural” constraints to population growth operate in the second way — they raise the death rate. Any society wishing to avoid that result must take deliberate action to control the positive feedback loop–to reduce the birth rate.” (Meadows et al pg 159)
Of course how “deliberate” that needs to be is up for question. Ask 20 people age 30 today in the US whether they’re planning to have kids and you’ll find that a lot of them aren’t (https://www.newsweek.com/gen-z-millennials-put-off-having-children-same-reason-1794231). It’s not because they’re saying “I deliberately decided that the future needs fewer people” it’s because they feel they can’t afford to take care of their kids. But that’s exactly the kind of thing Meadows was thinking of! To imagine she didn’t understand that there would be economic pressures to reduce childbearing would be asinine! She then tries purely mathematical things to adjust birth rates, and finds out what the scenarios look like, but she’s very explicit about it being a pure device not any kind of political or mechanistic policy suggestion (pg 159, 160). She shows how fixing the population constant leads to constant population, but out of control growth in capital / capita, followed by depletion of physical resources. Her figure 46 shows what happens where in essence we wind up with major controls on growth through environmental regulation, technology production, etc. She still winds up with 3x the GDP/capita of 1970 world avg but it’s sustainable because it specifically prevents resource depletion.
We know that growth doesn’t go exponential forever. If it did eventually every atom on the planet would be used just in the production of golf balls. That’s just a mathematical fact. The question Meadows posed was **which path do we take, and what do the different feasible paths look like?**. Economists are smoking those AI hallucinogens which treat the whole thing as path-independent: “It doesn’t matter which path we take, because we can’t put any judgement on different paths, we’ll automatically choose the path that approximates the one most beloved by the varied interests of the market” etc… Just like the battles in the Pacific Theater in WWII were the optimal and desired outcome according to the general equilibrium theory aggregating global preferences or whatever (heavy sarcasm here). The failure to judge different scenarios is imho a major problem in Economics. Even if your goal is to judge only by aggregated utility of some kind, you need to decide “is this future path the one people WANT, or just the one they HAPPENED TO GET?” You can’t do that unless you examine the system at a global level.
Joshua. “So what’s the underlying methodology to use?”
I argue it’s a Bayesian model of a dynamical system. A dynamical system because there are feedbacks and interactions, the state of the world next year is dependent on the rules for how things work and the state of the system this year. That’s just a fact of life. That isn’t even arguable! A fact that Economists appear to be desperate to ignore, but they’re hardly the only ones. Bayesian because while we may be able to model a reasonable structure there’s no way we can know a priori the values of parameters for such a system, we can only decide which ones are more or less plausible given the observed trajectory. A proper analysis of forestry and human health would say “we think forest cover affects physical activity and that affects human health, and the system is somewhat complex, but we’ve simplified out the basic structure as follows… maybe 20 or 30 feedback effects would be reasonable. Fitting to data we find the following sample of 100 different parameter vectors for our dynamical system, rerunning the system under counterfactuals, we find the range of changes in obesity expected if deforestation had not occurred to be X, actual observed changes in obesity were Y, the distribution of differences was D, therefore it’s plausible that deforestation in these cases was responsible for somewhere between A and B change in obesity…”
The thing is, if you do this, given the actual available data, you’re going to find the distribution D is very broad because there are so many factors where you don’t know the proper parameter value and the data you have don’t actually inform you much. If you do this with some regressions and confidence intervals, you’ll find “reduced form estimates” that are completely bogus. The underlying assumptions of the model that lead to the reduced form are 100% wrong, starting with the classical-statistics assumption that data are drawn from a random number generator, not a dynamical system with feedbacks.
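For readers who want to see the shape of the workflow described in this comment, here is a deliberately tiny sketch. It is a toy under loud assumptions: the two-variable dynamics, the priors, the starting values, and the fake "observed" data are all invented, and rejection-style approximate Bayesian computation is swapped in as the simplest stand-in for a real Bayesian fit. It is not the commenter's model or anything from the paper. The steps are: simulate a dynamical system, keep parameter draws whose trajectories are close to the data, then rerun each kept draw under a no-deforestation counterfactual.

```python
import random

FOREST0 = 0.37    # initial forest share (hypothetical)
OBESITY0 = 25.0   # initial obesity rate in percent (hypothetical)
YEARS = 10

def simulate(beta, drift, loss_rate):
    """Toy dynamics: forest cover decays; obesity drifts and responds to lost cover."""
    forest, obesity = FOREST0, OBESITY0
    path = [obesity]
    for _ in range(YEARS):
        forest *= 1.0 - loss_rate
        obesity += drift + beta * (FOREST0 - forest)
        path.append(obesity)
    return path

def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def abc_rejection(observed, loss_rate, n_draws=20000, tol=0.5, seed=0):
    """Keep prior draws whose simulated path lies within tol (RMSE) of the data."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, 5.0)    # assumed prior: sensitivity to lost forest
        drift = rng.uniform(0.0, 1.0)   # assumed prior: background yearly drift
        if rmse(simulate(beta, drift, loss_rate), observed) < tol:
            kept.append((beta, drift))
    return kept

def attributable_change(beta, drift, loss_rate):
    """Final-year obesity with forest loss minus the no-loss counterfactual."""
    return simulate(beta, drift, loss_rate)[-1] - simulate(beta, drift, 0.0)[-1]

# Fake "observed" data: known parameters plus measurement noise.
LOSS = 0.03
noise = random.Random(42)
observed = [y + noise.gauss(0.0, 0.2) for y in simulate(1.0, 0.3, LOSS)]

kept = abc_rejection(observed, LOSS)
changes = sorted(attributable_change(b, d, LOSS) for b, d in kept)
low = changes[int(0.05 * (len(changes) - 1))]
high = changes[int(0.95 * (len(changes) - 1))]
print(f"{len(kept)} draws kept")
print(f"central 90% of attributable change: {low:.2f} to {high:.2f} percentage points")
```

Even in this toy, a faster background drift and a stronger forest effect produce similar trajectories, so many very different parameter pairs survive and the range of attributable change stays wide. A realistic version would have far more parameters and correspondingly less information per parameter.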
Dale,
To avoid some of the complexity of replying to replies, I’m starting over here. I skimmed through the 1992 article by Nordhaus. I liked it, though I confess to not knowing some of the jargon (Cobb-Douglas and Hicks-neutral for example).
I liked his point about chaotic dynamics (figure 1). But the one thing he fails to come out and say properly is simply this: “Exponential growth of human population and wealth per capita will not and can not continue FULL STOP”. Any economist that doesn’t acknowledge this can be safely pointed to and laughed at. I think he acknowledges implicitly, but I’m not sure economists in general do.
This is an indisputable and simple fact of physics. If it did continue at let’s say 1% per year for 1000 years then the number of people on the planet would be over 10^14, which is about 2×10^4 times as many people as today (1.01^1000 ≈ 21,000). Currently there are about 53 people per km^2 of land according to Wikipedia. At 2×10^4 times as many there would be about 1.1M people per km^2. The density of Tokyo is currently about 1076/km^2 so every square kilometer of the earth would be covered at roughly 1000 times the density of Tokyo. But long before that happened there would be zero oxygen on earth due to loss of any plant life.
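The compounding arithmetic here is easy to check directly. In the sketch below, the 1% rate, the 1000-year horizon, and the 53 people per km^2 come from the comment; the present-day population of roughly 8 billion is an added assumption. The order of magnitude is what matters for the argument, not the exact figures.

```python
import math

growth_rate = 0.01        # 1% per year, from the comment
years = 1000              # horizon, from the comment
density_today = 53        # people per km^2 of land, figure quoted in the comment
population_today = 8e9    # assumption: roughly 8 billion people today

# Compound growth over the whole horizon.
multiplier = (1 + growth_rate) ** years
# Years needed for one doubling at this rate.
doubling_time = math.log(2) / math.log(1 + growth_rate)

print(f"growth multiplier after {years} years: {multiplier:,.0f}x")
print(f"doubling time: {doubling_time:.1f} years")
print(f"implied population: {population_today * multiplier:.2e}")
print(f"implied density: {density_today * multiplier:,.0f} people per km^2")
```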
The question is not “can exponential growth continue forever”, it’s not even “how long can exponential growth continue”, the real question is… how/in what way does exponential growth stop?
The 1992 Nordhaus paper still has hints of stuff like “we can potentially rejigger technology and etc and extend things a long time” in it. Maybe you can for some technology etc, but some stuff you can’t. Food for example, for every person in existence you need at a minimum with no waste about ~ 2000 calories per day of food production. This is linear in the number of people and will always be so. Energy consumption is similar to some extent. For a given living standard you need to consume some number of kWh per day even without climate change just to move thermal energy into or out of living spaces. For a given living standard you need ~ 1000 sqft of living space per person to protect from sun, wind, rain, etc. It’s no good if the world is entirely apartment buildings and grassland covered in cows, so we need species diversity that we’ve already lost, if only for stability of ecology so that we don’t have massive booms and busts like mouse plagues in Australia.
The stupidest thing about arguments against Donella Meadows et al is the attempt to attack just the basic inevitable incontrovertible physics of the situation, and to do so with a “ha ha what stupidity and ignorance” that seems to exude from the original 1970’s economic critiques.
Sure, you can do lots of stuff to say “hey we can ultimately come to a sustainable equilibrium with N people instead of K people and with a living standard of $G/day instead of $H/day or whatever.” Those are all up for discussion. The idea that we’ll have “growth forever” is just stupid.
Nordhaus goes so far as to say that “Ultimately the debate about future economic growth is an empirical one…these are empirical questions that can not be settled solely by theorizing” but one thing we can do through theorizing is say that there is some Bayesian uncertainty about the upper bound of the number of people who can ever exist on earth, but it’s way more likely to be around 10-20B than it is around 1 Trillion. And there’s an upper bound on the number of kWh total we can utilize in a day, and it’s certainly less than 10% of the sunlight that falls on the world in a day… It’s possible for us to use a certain amount of Lithium, but it’s gotta be less than N times the known reserves at the moment. etc.
The question to ask ever since 1971 is “which policies lead to a stable prosperous future, and which lead to chaotic cycles of growth and despair or crashes from which we can’t recover?” It’s possible to discuss a wide variety of models for the future, the thing that’s not really possible to argue with is:
1) The model needs to be dynamic
2) There are real physical limits that no amount of technology can overcome (such as energy usage, and food usage per person).
3) There are real feedbacks that allow us to rejigger things but they must all stay within the limits of (2) and they all take time to rejigger.
sigh. apparently I didn’t select the right place to put the reply :-(
Daniel
Yes, you have messed up the order of these posts. I’ll bury my comment here. I think I agree with your larger point, but not your examples. I don’t think there is a fixed amount of energy or calories or land required per person – these can all change with technology. Also, I think economic growth is the wrong target to criticize – if humans valued meditation and performance art rather than material objects, then I think economic growth would have very different meanings. But what I do agree with is the failure of economics to deal with physical realities. Ecologists talk about cycles and systems – economists “solve” environmental problems by taxing particular outputs (CO2, nitrous oxide, etc.). I don’t think taxing an unwanted output can be reconciled with the fact that these are physical systems. The economics approach has always seemed to me to be partial at best.
I’ll agree with the species example. And the idea that the real question is how we adapt to the limits that do exist – there are many scenarios and some are much worse than others.
Dale, there’s no amount of technology that will change the average basal metabolic rate of a human. You’re always going to need to eat something on the order of 2000 calories a day (1500 for some, 3000 for others, or whatever). If there are 10B humans you will need to produce 2x as much food as if there are 5B humans.
When it comes to energy usage for climate control of human living spaces you might reduce this to some level, but a person living in the UK will need to heat their house in the winter, and a person living in 110F climate will need access to cooling spaces (consider how many laborers die each year in Qatar due to overheating).
“Economic growth” in the context of meditation and so forth could look very different, but *population* growth wouldn’t, in the sense that you’d need space to put the people, and to produce food, and provide for their basic living requirements. You can get by with some pretty spare living spaces, but let’s face it, no one wants to starve, die of dysentery, freeze to death, die of overheating, die from hurricanes, or be hit by grapefruit-size hail, or tornados, or drown, no matter how much they like meditation.
The existence of population limits simply shouldn’t be a controversial question. The precise value of the limit and the tradeoffs it produces of course are very up for discussion.
I was trying to make a point in a way that doesn’t always come across on the interwebs. Yes, it’s not nutty to say that an influx of an invasive species can affect the environment, which in turn can impact human health. (The point being that it’s not really the EABs that “cause” human obesity but that their invasion of an area serves to moderate the background relationship between the natural environment and human health outcomes, a fairly non-controversial causality.) I’ve traveled in parts of the Rockies devastated by EABs. Seeing the devastation had a strong emotional impact to the extent it still has resonance for me some 15 years later. The magnitude of the impact felt dramatic and supports a predisposition for me to believe that the effects of EAB invasion are powerful.
But at another level it seems implausible that insects moving into new areas and then laying eggs under the bark of a particular species of trees “causes” people to gain weight. The distance between the cause and the effect seems too far, crossing too many seemingly distinct domains of phenomena.
I think one way to assess what’s really too thin of a thread of causality is to look for hypotheses of what would falsify the theory and then test for those. That’s why I like Bradford Hill’s criteria for causation. In this case, you could hypothesize that if the association between EABs and weight gain occurred in some areas newly invaded by the EABs but not others, a causal link would be falsified.
So I can’t just go on my instinctive reaction and by Andrew’s argument, looking for tests of falsifiability won’t suffice either. I like your focus on Bayesian modeling, but I note that in something like the causality of genetics there’s widespread acceptance even though from one angle the causality could seem implausible, and tests of falsifiability don’t really come readily to mind and I suspect there’s little Bayesian modeling. So I’m mostly just ruminating on how cognitive biases can easily affect how I approach this issue.
> But at another level it seems implausible that insects moving into new areas and then laying eggs under the bark of a particular species of trees “causes” people to gain weight.
I don’t see anything implausible about that. You’ve got this fine-tuned equilibrium where stuff is going on regularly, and then you eliminate some of it (hiking in the forest). The equilibrium will shift. You know what’s really crazy? Putting ink on paper in Washington DC can one year later completely change the amount of money everyone in the US has, but only if one PARTICULAR person in the whole world does it! It’s totally unreplicable for anyone else to do it!
I’m being a little facetious obviously, but there are lots of way less plausible causal associations.
The thing about this one is that it seems implausible that you could get the right answer using the methods and data available for these guys. Also it’s somewhat implausible that you’d get the same size effect for each location the bugs invaded. If outdoor activities are not a major part of the local culture, the effect should be smaller. If they are, it should be bigger. If there are limitations on activities due to something like COVID, or wild-fires, or excessive snowfall, or whatever it’ll also modulate the effect size. There’s just a lot of stuff going on.
I won’t speak for Andrew but I’ve always viewed these types of papers as employing a sort of motte-and-bailey tactic. The causal link is something like A -> B -> C, A -> B is more robust but relatively uninteresting and usually not novel, B -> C is more interesting but tenuous/unproven. The paper spends most of its time on A -> B, but then at the end relies on B -> C for the headline results/policy implications. The papers then get picked up by journalists/agencies/justices who say things like ‘using causal identification methods x was able to show that A has this effect on C’. I’m not sure how much the causal revolution is to blame for this, but it has certainly made that A -> B robustness necessary for publication in certain subfields of Economics, and there are only so many situations out there that provide the robustness and involve questions people are actually interested in. As a result, I think it has encouraged Economists to tack on the B -> C part, but it’s hard to say whether the causal revolution or the entire way academia operates is more to blame.
There is good news, you are correct, and that method is known as instrumental variables, which revolutionized inferring causal effects from observational data.
It is very hard, perhaps impossible, to have a convincing study when the conclusion is a priori dubious. The best example would be Bem’s ESP stuff. He used frequentist stats to show that if people were guessing with random chance, the probability we’d have seen their success rate was really low. He never actually showed P(ESP | data), only P(data | random guessing). We’d have to do Bayesian stats to get the probability of the hypothesis, and that requires enumerating all the hypotheses and assigning priors. With Bem, hypotheses included: random guessing, ESP, fraud, p-hacking. Unfortunately for Bem, p-hacking both explained the observations better than ESP and had a higher prior. But even if the data was so extreme that p-hacking couldn’t explain it, we’d sooner believe it was fraud than ESP. Bem couldn’t win. I would only believe it was ESP if the experiments were replicated by a lot of good researchers. And frankly, I’d probably need to replicate it myself to really believe it.
So here, the conclusion we’re asked to swallow is that the invasive species is having a large measurable effect on obesity. Andrew and I put a low prior on this. Our prior on some residual correlations among counties affected by the EAB is much higher, and such correlations can account for the observed data reasonably enough. Sure, robustness checks try to show that confounds can’t explain the observations that well, but robustness checks just aren’t thaaat strong. At least, they aren’t strong enough to convince us to go along with the authors’ preferred conclusion when they are asking us to accept such an extreme conclusion.
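To make the P(hypothesis | data) versus P(data | hypothesis) distinction concrete, here is a minimal sketch of Bayes’ rule over the four hypotheses named above. Every prior and likelihood below is a made-up illustrative number, not anything estimated from Bem’s experiments; only the ordering of the results matters.

```python
def posterior(priors: dict, likelihoods: dict) -> dict:
    """Bayes' rule over a finite set of mutually exclusive hypotheses."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


# Hypothetical priors (illustrative only); they sum to 1.
priors = {
    "random guessing": 0.90,
    "p-hacking": 0.09,
    "fraud": 0.0099,
    "ESP": 0.0001,
}

# Hypothetical P(data | hypothesis) for a "significant" published result.
likelihoods_published = {
    "random guessing": 0.01,
    "p-hacking": 0.5,
    "fraud": 0.9,
    "ESP": 0.3,
}

# Hypothetical P(data | hypothesis) for data too extreme for p-hacking.
likelihoods_extreme = {
    "random guessing": 1e-9,
    "p-hacking": 1e-6,
    "fraud": 0.9,
    "ESP": 0.9,
}

post_published = posterior(priors, likelihoods_published)
post_extreme = posterior(priors, likelihoods_extreme)

for label, post in [("published", post_published), ("extreme", post_extreme)]:
    ranked = sorted(post.items(), key=lambda item: item[1], reverse=True)
    print(label, [(h, round(p, 4)) for h, p in ranked])
```

With these assumed inputs, p-hacking dominates in the first scenario and fraud dominates in the second, while ESP stays far behind in both even though the data raise its probability above its prior. Different priors would change the numbers, which is the point: the conclusion depends on what else could have produced the data.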
I think from your post that you really think the revolution hasn’t gone far enough. ;)
Andrew,
I agree with the title “The causal revolution in econometrics has gone too far,” and I’d love to see you discuss economics papers more often. It seems to me that the field is chasing more and more speculative claims and somehow believing that econometrics can overcome the garden of forking paths, small effects, measurement error, etc. For example, one of the top economics journals recently published a paper that claims Trump rallies during 2015-2016 caused police to stop a higher proportion of Black drivers. Maybe this is true, but it seems quite unlikely.
But we do know that Trump causes bridge players to defend worse in No Trump contracts… with a robustness test! (But no peer review.) https://www.stat.columbia.edu/~gelman/research/unpublished/notrump_falk_gelman_icml.pdf
Those conference proceedings — like most in computer science — are peer-reviewed.
https://icml.cc/Conferences/2022/ReviewerTutorial
I remember looking into the Trump rallies paper a while back. The authors really went hard on pushing their preferred explanation that Trump rallies were making racist cops act more racist against blacks. I remember the published paper, “Inflammatory Political Campaigns and Racial Bias in Policing”, didn’t really provide much evidence for their preferred mechanism, they mostly just showed there was an increase in the arrest of black drivers after a Trump rally. Kinda baffled me that they would talk so much about the mechanism without supporting it.
Turns out there’s a reason for this though! You can google the older version of the paper, “Whistle the Racist Dogs: Political Campaigns and Police Stops”. In this version, the authors did provide “evidence” to support their mechanism, and boy, they really went wild with misrepresenting evidence. Their main evidence was a survey (see figures 3–5, and A5–A9) that showed prejudiced people who read about a prejudiced Trump speech found blacks to be more violent (but not Hispanics or Asians). Beyond hanging so much on a single barely significant 3-way interaction effect, the survey results were taken from a different paper that was looking into racism against Latinos. So the “prejudiced people” were in fact selected based on stated negative stereotypes against Latinos, and the Trump speech was about immigration policy. How this would cause readers with prejudice against Hispanics to have worse attitudes towards blacks and not Latinos is beyond me, but the authors swept all this and more away to support their preferred mechanism.
Anyway, I guess they were thankfully forced to remove this garbage during peer review (and rename the paper), but now the reader is left with a paper where the authors seem insistent on a specific mechanism and yet provide no support for it over alternative hypotheses. Weird stuff.
The authors of the paper you’ve referenced devote considerable time to establishing support for their preferred mechanism. They examine how their effects vary as a function of three proxies for racial bias:
1. Explicit references to race in Trump rallies
2. County-level measures of racial bias (e.g. historical frequency of lynching of Black civilians)
3. Officers who stop a much greater share of Black civilians than their peers in the same county
They find that the effects of racial references in Trump rallies are most pronounced for those officers who stop a much greater share of Black civilians than their peers. The authors argue that this evidence supports a “priming” hypothesis: namely, that Trump rallies induce already-biased officers to engage in greater degrees of bias than they currently do.
Perhaps our priors are different, but this seems like a fairly believable mechanism.
One problem is that Trump rallies affect the local policing in a lot of ways. There are protests & counter protests, vandalism, an influx of people from outside the area, media and police presence, traffic, etc. It’s really hard to isolate the effect we’re interested in, and the authors seem content with just a bunch of weak downstream proxies. They don’t directly measure the mechanism.
The mechanism appears to be: Trump gives a speech racist towards blacks -> Racist cops are primed by the speech to be more racist -> more blacks are pulled over.
1) How racist is a Trump speech? The authors count words like “crime”, “race”, and “drugs”. Context doesn’t matter, since the authors say in 5.3 that even anti-Latino rhetoric leads to more blacks—but not Latinos—being pulled over. That’s apparently “surprising”. I’d use “concerning” and “contradictory”.
2) How racist is a cop? Another proxy with the rate of warnings given to blacks vs. whites while controlling for county and time of day. Verifying this proxy is deserving of a paper by itself!
3) Did cops even listen to Trump’s speech? We don’t know! I’d have thought most would be on-duty during it given the circumstances.
4) Do Trump’s speeches prime racism towards blacks? Well the evidence for that was mangled and had to be removed in the final version.
5) And to show just how robust their results are, they decide to look at how racist each county is. I’m unsure what value this even provides, given they already have how racist individual cops are. But anyway, here they use: Questions CC442a and CC422b from 2012 and 2014 Cooperative Congressional Election Surveys, the presence of slaves in 1860, cotton suitability as an exogenous predictor of slavery, and the local number of lynchings and executions of Black people between the Civil War and World War II.
The central problem I have with these kinds of econ papers is that the authors substitute weak proxies for actual knowledge of what’s going on at the ground level. It’s how you end up with labeling individual cops racist based on nothing more than a regression of how often they give warnings to blacks vs. whites. And how you conclude that local politicians didn’t affect policing decisions just because a comparison of traffic stops between state troopers vs. local police came back with p>0.05 (see 4.2). This stuff doesn’t work, and stacking a dizzying array of dubious foundations doesn’t magically create firm ground.
Also, these authors seem to be trying to prove their case, rather than actually understand what’s going on. I doubt they would include any metrics that don’t support their conclusion. I feel everything is cherry-picked.
That is not science. Such papers are versions of number crunching.
The strategy in empirical economics is often: find a subject which resembles one of the known identification strategies (IV, diff-in-diff etc.). Or better, study a method, then find a topic which may fit in. Fit the linear regression and count the stars. Add “control” variables one by one to “check” alternative mechanisms. If the stars are still there, then you are done: discovery happens. I read so many such papers (well, I do not read them to the end) these days.
Reading this blog and Bayesian stats, I try in my work to think seriously about model selection/comparison, measurement issues, existing knowledge about the data generation process, effect size, collider bias etc.
Deep down, the academic industry incentivizes mimicking the leaders, not exploring. People optimize for publishing as many papers as possible. If you really care about the deep issues, you are seen as being too pedantic. Result: everybody writes, nobody reads, nobody believes.
That is why I am all in for “slow science”.
Why are economists even studying this? They should be figuring out stuff like how to measure the money supply. Also how it flows through the economy causing inflation here, then there, finally reaching low-wage workers (price of labor) at which point the fed tries to stem the flow.
That type of thing is what people think economists are trying to figure out. Like, was this funded by an economics grant?
Can we take seriously the criticism from people that haven’t even bothered to read the paper? Sure it’s behind a paywall but Andrew’s institution surely has access. Is there evidence in the paper for Andrew’s claimed issues of forking paths or correlations fished out of a multitude of potential variables? Does the paper lack substantive causal theory (i.e. plausible mechanisms)? Does the paper claim anything about the number of invasive species (no, and Andrew seems to be very confused about this)? I have at least skimmed the paper and, given the related studies that find similar effects of deforestation on various aspects of outdoor activity and health outcomes (along with other ecosystem services) it seems fairly plausible actually from a mechanistic basis. The sample units here are providing a before-after comparison and the stochastic nature of spatial spread of the defoliating beetle species seems to provide a fairly effective natural experimental design providing good contrast between impacted and not-impacted locations. I’m sure there are likely to be some problems with the study design and analyses, as there are with all studies, but are they big enough to discount the work as merely finding random correlations? I’d be curious if a closer look would dampen some of the speculative knee-jerk reaction.
Skeptical:
1. Regarding forking paths, you’re getting it backward. The usual, completely normal pattern is that coding and analysis decisions are made in light of the data. Every once in a while a study is preregistered, which doesn’t usually eliminate the forking paths but it reduces them. If the study is not preregistered, there’s just no reason to assume that all coding and analysis decisions had been made ahead of time; indeed, it would be a bit bizarre for someone to make all coding and analysis decisions ahead of time and then not preregister this. Just to be clear: I’ve published hundreds of applied statistics analyses, and in only one of these did I make all coding and analysis decisions ahead of time. So it’s not like I think there’s something wrong with forking paths; they characterize just about everything I’ve ever done.
2. Regarding criticism more generally: One heuristic I often suggest is to suppose that a paper being discussed is just a “preprint” that has not been published in an official journal. In that case there is no presumption of correctness. More generally, I don’t think it’s appropriate for us to believe things that are published unless there’s a strong and detailed argument against. If a paper is making claims that are implausible and is justifying these claims using this sort of reduced-form analysis, this is a problem, and I see no reason to believe it.
3. I agree with your point that a more careful look at the problem would reveal more. In the above post, I shared some general concerns from a starting point of ignorance. I have no doubt that a subject-matter expert could have lots to add, and I’m also open to the idea that I could be convinced by further analysis.
Careful, those bugs also appear to commit more crimes, https://www.fs.usda.gov/nrs/pubs/jrnl/2017/nrs_2017_kondo_001.pdf. Apparently they are very bad social influences.
I really wonder if anyone, including the authors, takes a paper like this seriously.
I also wonder about “I show that less time spent on outdoor sports and exercise is one possible… mechanism.” Obviously it’s *possible* — it doesn’t violate laws of physics. Does the author actually quantify the linkage? I’m highly skeptical that the borer-induced loss of trees suddenly induces such a large change in obesity rates — what fraction of the population was going on hikes *and* suddenly stopped, drowning their sorrows in Pepsi and Cheetos? 2.5%? I’m certainly not going to bother looking at the paper.
I take a different view. What fraction of people were joggers, dog walkers, golfers, and then the number of days where it was too hot to jog, and/or too much smoke from fires to jog, or other disturbances caused them to stop doing their preferred moderate exercise? How many people were employed in forestry and then became unemployed? How many people were employed in transportation of forest products and then moved away elsewhere? Etc etc
Environment is an important aspect of life and big changes to environment make a difference in lots of ways. It would be great to actually study ways in which environmental factors affect health and activity level. Running a couple regressions just doesn’t do it.
Also note that the paper claims 2.2 percent of the population was the amount of increased obesity, pretty much spot on with your 2.5% estimate 😅
My 2.5 wasn’t an estimate; it was taken from the paper’s abstract! (It’s the number I’m skeptical about!)
Other than that, note that both of us routinely point out that “there’s always an effect” — certainly the amount of exercise, for example, that’s reduced when there are fewer trees is nonzero. The questions to ask then are (i) how much, and (ii) what are the mechanisms, as you wrote.
Oh gotcha, I thought you were just throwing out a back of the envelope guess and happened to get the same value :-)
Yes, we’re on the same page. There’s always an effect, but without investigation of mechanism you are never going to get something meaningful. And there are lots of mechanisms. Even things like population flux (people moving out of the area and more obese people moving in) would have to be investigated to figure out whether the effect is actually to cause *people* to get more obese, or just cause different people to want to live in the area.
@Daniel. Agreed.
I also wonder, given the vagueness of these “models” whether some intrepid economist is at work constructing the opposite story: that rising obesity causes less time spent outdoors, which causes less support for wilderness management, which leads to more ash-borer damage. One can monkey with the temporal order by adjusting how “detection” and actual onset are related, whether for invasive species or obesity.
Then one can write meta-analyses combining the two approaches. The papers practically write themselves!
“There’s always an effect, but without investigation of mechanism you are never going to get something meaningful.”
And very, very often no amount of investigation of mechanisms is going to get you anything meaningful! To paraphrase Andrew, there are questions where the signal is so strong that you don’t need statistics, and there are questions where the signal is so weak statistics isn’t going to help you. Seems to me this question is very solidly in the second category.
Joshua,
It all depends on the available data and modelling really. If you start with obesity data and forest cover data and add a couple time use diaries, you’re unlikely to get anywhere. If you have daily movement data on a sample of 100M people from continuous cell phone surveillance location data and detailed weather sensor and air quality sensor data at 5 minute resolution across wide swaths, and daily satellite estimates of forest cover, local temperatures, and chemical concentrations in the surface air for decades, it’s a different story.
Raghu:
You write, “I really wonder if anyone, including the authors, takes a paper like this seriously.” Check out the comment by Skeptical ecologist above, who does seem to take the paper seriously. First, I don’t know the literature, but maybe there are reasons there to take it seriously. Second, lots and lots of researchers are trained in these econometric techniques and really seem to believe in this sort of thing. Remember, Brian Wansink was hired by the U.S. government!
You wrote: “More generally, publishing a work makes it public. If you don’t want work to be doubted in public, there’s no need to publish it. Just to be clear, I’m not saying the author of the above-discussed paper is bothered by this criticism. I’m speaking more generically here.”
That almost sounds like a variation on Christopher Hitchens’s “What can be claimed without evidence can also be dismissed without evidence” (which in turn is derived from an earlier saying in Latin).
So to make the parallel explicit and to recoin a phrase: What can be claimed without public access can also be dismissed without detailed discussion in a public paper.
Economist here. The problem is economics is not exactly a science. The methodologies used in econometrics are sort of a “cloaking mechanism” that conceals the actual truth.
Each paper has a lot of equations, statistically significant variables, robustness checks, and therefore its conclusions must be true.
That is the paradigm that economics has morphed into over the past 30+ years.
I am “lucky” to have access and skimmed the paper to see if I could give the author benefit of the doubt. I think what is most annoying about the paper to me is that there are so many important & interesting questions related to the serious threat of EAB but the concern for obesity through the causal link of EAB -> less tree canopy -> less outdoor exercise is one of the least interesting and important ones (and IMO very under motivated compared to other concerns about deforestation, environmental valuation, and health). I would say that uncertainties about first-order concerns from deforestation (why EAB induced deforestation instead of deforestation in general??) are more important to quantify and understand than this line of inquiry. I have a lot of the same concerns as Andrew about the actual econometrics, but I think it’s the problem framing here that is the biggest problem.
What do you think about the causal inference using Bayesian analysis, like in this paper that I have been reading recently: https://arxiv.org/abs/2306.14230.
There are claims in this paper that I find so dubious and magical, like this Limitation section:
This study has some limitations. Although The Violence Project has made considerable efforts in data collection and fact-checking, there are certain limitations due to data privacy laws in the U.S. Some cases in the past might not have been well-reported or influenced by biases from media agencies. Another limitation is the small sample size due to the nature of mass shootings (it is a good thing that this number is not too big). Despite this drawback, Bayesian analysis aided by the MCMC technique allows for relatively reliable prediction when working with this small sample size.
Indeed, the whole paper is a mind-boggling jumble of words with a theory and claims. Tracing the authors’ profile, similar papers with similar techniques are everywhere. I feel like they are just correlations, but this time, Bayesian was used rather than the frequentist approach. A lot of your criticisms in the blog post are applicable here as well, but this is also another dimension that I am struggling to formulate my thoughts around.
I’d bet the paper is weak. Bayesian estimation doesn’t magically make for good science. I found Bayesian estimation to be a gateway drug. By accepting uncertainty in estimates instead of dichotomous thinking, I started caring more about effect sizes and the huge role of model error. I had to accept that uncertainty exists and that science is hard.
Andrew has been clear that Bayesianism isn’t some silver bullet for bad science. The whole workflow and system needs fixing. Ultimately, you need to be the strongest critic of your research and work to destroy it, rather than be its biggest advocate as the system pushes academics to do. The biggest red flag in a paper is when the authors seem to be presenting procedures just to buttress their point of view vs. really working through what’s going on in the data.
Totally agree with this. The real point of Bayesian methods isn’t that they’re Bayesian for its own sake, it’s that Bayes is a procedure that is compatible with whatever your model is. It’s a “generalized logic” so it’s something you can apply to an algebraic model, a diffeq model, an agent based model, a model derived from dimensional analysis, a purely computational “random forest” or whatever.
Frequentist models require that you accept a view of the world that it spits out stable random numbers.
Skeptical:
Your criticism seems pretty harsh and personal. Maybe the guy just likes his job and his life and has no desire to move to Ann Arbor or wherever.
Yikes. You are right. I definitely should not have posted that. My apologies. Why don’t you go ahead and delete it and I’ll try again?