
A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency

Austin Kelly writes:

While reading your postings [or here] on the subject of testing your model by running fake data I was reminded of the fact that I got one of these kinds of tests actually published in a GAO report back in the day. Reading your posts on Unz and political vs. economic discourse made me think of that work again. I thought I’d actually drop you a line on the subject.

Back in 2003 GAO was asked to look at Farmer Mac, including a look at the Farm Credit Agency’s regulation of Farmer Mac. As the resident mortgage econometrician back then I was asked to look at FCA’s risk-based capital stress test for Farmer Mac. The work was pretty easy. I found a lot of oddities, but the biggest one was that they were using a discrete choice setup (loan goes bad or doesn’t) instead of a hazard model (loan goes bad this period or survives to the next). Not necessarily a problem – lots of mortgage models run that way. But you have to be really careful with your independent variables. FCA’s academic consultants weren’t. They defined as an “independent” variable the largest drop in farmland prices from mortgage origination to now, or to the date the mortgage went bad, whichever came first. I always get a little suspicious when the event you are trying to predict gets incorporated as part of the definition of the variable that’s supposed to explain the event. As a student of Jim Heckman’s I recalled being taught in a labor econometrics class back in the 1980s that you really couldn’t do that. [We discuss related issues in section 9.7 of ARM (link to chapter here), under the heading, “Do not control for post-treatment variables.”]

I searched through Heckman’s old reading list, JSTOR, etc. but couldn’t come up with a proof of why that doesn’t work. The best I could do was Yamaguchi’s book on event history analysis, which gave a verbal example of this kind of technique failing, but no proof. So I decided that the easiest thing to do was to simulate data. Loan failure in any discrete time period was determined by SAS’s Ranuni function, with no reference to farmland price change; farmland price changes were taken from historical data; and the independent variable was calculated as it was in the model (change in farmland price from origination to current or fail, whichever comes first). Then I ran the regressions.
Even though the true effect was zero by construction, I got significant and negative coefficients over 90% of the time. That was the “proof” that got into the appendix of the GAO report. Oddly enough, about the same time that I did this, someone in Michigan’s B-School was doing the hard slog of writing out the likelihood functions and formally proving that FCA’s setup was inconsistent (I don’t have access to JSTOR anymore so don’t have an easy way of finding key facts, like his name or the citation). Generating some fake data was a lot easier, and apparently more persuasive to my non-quant colleagues. The report is here.
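[A minimal sketch of this kind of fake-data check, for readers who want to try it. It is not the code from the GAO report. It makes several assumptions that are not in the letter: Python's `random` module stands in for SAS's Ranuni, a simulated random walk stands in for the historical farmland prices, a linear probability model stands in for whatever regression was actually run, and all function names and parameter values are hypothetical. The `stop_at_failure` switch is added here only to show the contrast with a price variable measured over the same fixed window for every loan.]

```python
import math
import random


def simulate_loans(n_loans, n_periods, hazard, price_sd, rng, stop_at_failure=True):
    """Return (largest_drop, failed) pairs. Failure never depends on prices."""
    rows = []
    for _ in range(n_loans):
        log_price = 0.0  # log price index, relative to origination
        lowest = 0.0     # lowest level seen inside the observation window
        failed = 0
        for _ in range(n_periods):
            log_price += rng.gauss(0.0, price_sd)  # assumed random-walk prices
            lowest = min(lowest, log_price)
            # Failure is a plain uniform draw, unrelated to the price path.
            if not failed and rng.random() < hazard:
                failed = 1
                if stop_at_failure:
                    break  # the flawed definition: the window ends at failure
        rows.append((-lowest, failed))
    return rows


def ols_slope_t(rows):
    """Slope and t-statistic from regressing the failure indicator on the drop."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in rows)
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


def share_negative_significant(n_reps=100, n_loans=300, n_periods=20,
                               hazard=0.05, price_sd=0.05,
                               stop_at_failure=True, seed=1):
    """Fraction of replications with a slope below zero at roughly the 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        rows = simulate_loans(n_loans, n_periods, hazard, price_sd, rng,
                              stop_at_failure)
        _, t_stat = ols_slope_t(rows)
        if t_stat < -1.96:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    print("window ends at failure:", share_negative_significant())
    print("fixed window for all:  ",
          share_negative_significant(stop_at_failure=False))
```

[The mechanism in this sketch is that the largest drop can only grow as the window gets longer, and loans that fail have shorter windows by definition. Failed loans therefore tend to show smaller drops even though prices play no role in failure, which pushes the estimated slope negative. With `stop_at_failure=False` every loan is measured over the same window and that artifact should go away.]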

Reading your post on academic vs political, my first thought was that just about every time I’ve engaged an academic in a “political” sphere they’ve adopted “political” discourse. I remember another case where academics had estimated the impacts of Economic Development Administration grants on county-level employment without controlling for the size of the county, ignoring the fact that a county with 1,000,000 workers was more likely to get a grant in the first place than a county with 10,000 workers. Their coefficient implied that every ten thousand dollars in EDA spending created a permanent job! Their main response to criticism was to tell us about how many PhDs they had.

Over my career I could point to many cases where the response from an academic was political discourse. But I can also think of many cases where it was academic. It’s just that the political responses are the ones that stick in the craw and are most easily remembered.

My reply: Regarding academic vs political discourse, I agree completely that academics often seem to care more about short-term winning than about getting things right. My point in that blog post was not that academics are better or more honorable than politicians, but rather that the rules are different. We would like an academic to engage in open discourse and not use the truth as negotiation chits, and when they behave in political ways we are unhappy. In contrast, a politician is supposed to negotiate. If a politician makes concessions without getting anything in return, we respect him less.

7 Comments

  1. Steve Sailer says:

    One obvious example of Politics Uber Alles in the news recently was Jason Richwine: look how few academics stood up for him.
