Do we trust this regression?

Kevin Lewis points us to this article, “Do US TRAP Laws Trap Women Into Bad Jobs?”, which begins:

This study explores the impact of women’s access to reproductive healthcare on labor market opportunities in the US. Previous research finds that access to the contraception pill delayed age at first birth and increased access to a university degree, labor force participation, and wages for women. This study examines how access to contraceptives and abortions impacts job mobility. If women cannot control family planning or doing so is heavily dependent on staying in one job, it is more difficult to plan for and take risks in their careers. Using data from the Current Population Survey’s Outgoing Rotation Group, this study finds that Targeted Restrictions on Abortion Providers (TRAP) laws increased “job lock.” Women in states with TRAP laws are less likely to move between occupations and into higher-paying occupations. Moreover, public funding for medically necessary abortions increases full-time occupational mobility, and contraceptive insurance coverage increases transitions into paid employment.

Here’s what they did:

[Image in the original post: the paper’s regression equations and model specification.]

My reactions:

Of course I think they’re making a (common) error when they say that certain coefficients “should be statistically significant.”
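
One way to see the issue, with a minimal simulation (all numbers here are invented): even when an effect is real, whether its estimate clears the significance threshold is a noisy, sample-dependent event, so there is no coefficient that “should” be significant.

```python
# Simulate many replications of a study with a real but modest
# effect: the estimate is "statistically significant" only about
# half the time, even though the effect exists in every replication.
import numpy as np

rng = np.random.default_rng(0)
true_effect, n, reps = 0.2, 100, 1000

hits = 0
for _ in range(reps):
    x = rng.normal(size=n)
    y = true_effect * x + rng.normal(size=n)
    # OLS slope with no intercept (x and the errors are mean zero)
    beta = np.sum(x * y) / np.sum(x ** 2)
    resid = y - beta * x
    se = np.sqrt(np.sum(resid ** 2) / (n - 1) / np.sum(x ** 2))
    hits += abs(beta / se) > 1.96

print(f"significant in {hits / reps:.0%} of replications")
```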

Setting that aside, it’s hard for me to say without a more careful look. The effect seems plausible, but the analysis has lots of moving parts, so there are lots of ways things could go wrong. I’d like to see some graphs of raw data building up to their final analysis, rather than just seeing the results presented as a fait accompli.

One thing we don’t really train researchers to do is to understand and explain fitted models: all those steps that should take you from the raw data, through simple comparisons, to more complicated inferences. We did some of that in our stop-and-frisk paper, but we never set it up as a general method.
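
As a minimal sketch of what that build-up could look like, with fake state-year data (every variable below is invented for illustration, not taken from the paper):

```python
# Build up from raw data to a model: look first, fit later.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
rows = []
for s in range(20):
    adopts = s < 10                       # half the states adopt a law
    base = 0.10 + 0.01 * rng.normal()     # state-level baseline
    for t in range(2010, 2018):
        treated = int(adopts and t >= 2013)
        rate = base - 0.02 * treated + 0.005 * rng.normal()
        rows.append((s, t, treated, rate))
df = pd.DataFrame(rows, columns=["state", "year", "treated", "job_change"])

# Step 1: raw group means by year, before fitting anything
print(df.groupby(["year", "treated"])["job_change"].mean().unstack())

# Step 2: the simplest estimate, a plain difference in means
diff = (df.loc[df.treated == 1, "job_change"].mean()
        - df.loc[df.treated == 0, "job_change"].mean())
print(f"raw difference in means: {diff:.3f}")

# Step 3 (only after steps 1 and 2): the full regression with
# fixed effects and controls, checked against the simple
# comparisons above.
```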

18 thoughts on “Do we trust this regression?”

  1. There is a lot of irrelevant fluff preceding the actual claim. The fluff lays out the rationale for the claim, but while it makes the claim look sensible, it doesn’t say anything about what’s actually being assessed.

    The claim is that restricting abortion rights prevents women from *changing* jobs. Why should that be true? They’re afraid that they might lose the better job because of a pregnancy? But wouldn’t a better job make them better able to deal with unplanned events? Can someone help me out on this?

    • Better the devil you know than the devil you don’t.

      If you have a good rapport with your boss and co-workers, then they are more willing to give time off or cover the work when a pregnant woman has medical appointments or pregnancy/birth complications. Changing jobs means you discard all the goodwill that you have built up with your co-workers.

  2. No. The coefficients don’t mean anything unless you accept that this model is correctly specified.

    Does anyone really think they included all the relevant variables/interactions and no irrelevant ones? Or is it more likely they just included the variables they happened to have data for? There is a whole multiverse of model specifications just as acceptable as this one, each of which will yield a different coefficient.
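
    A minimal sketch of that multiverse, with simulated data and hypothetical controls:

    ```python
    # Fit the same outcome on the same simulated data under every
    # subset of four optional controls: sixteen "acceptable"
    # specifications, sixteen different treatment coefficients.
    from itertools import combinations
    import numpy as np

    rng = np.random.default_rng(2)
    n = 500
    controls = rng.normal(size=(n, 4))
    treat = (controls @ np.array([0.5, -0.3, 0.0, 0.2])
             + rng.normal(size=n)) > 0
    y = (0.1 * treat + controls @ np.array([0.4, 0.4, 0.0, -0.2])
         + rng.normal(size=n))

    coefs = []
    for k in range(5):
        for subset in combinations(range(4), k):
            X = np.column_stack([np.ones(n), treat]
                                + [controls[:, j] for j in subset])
            beta = np.linalg.lstsq(X, y, rcond=None)[0]
            coefs.append(beta[1])        # coefficient on "treatment"

    print(f"{len(coefs)} specifications; treatment coefficient ranges "
          f"from {min(coefs):.2f} to {max(coefs):.2f}")
    ```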

    • I share your skepticism. But – have you ever seen a study that included “all the relevant variables/interactions”? Don’t we usually include the variables we have data for? I’m afraid that this criticism, taken at face value, would render most, if not all, empirical research in the social sciences irrelevant. Isn’t there a middle ground? I think it would call for a great deal more humility from authors, and much more careful writing of “conclusions” – perhaps even eschewing conclusions in favor of “speculations” or some other less authoritative noun. But if we reject any research that fails to include all possible relevant variables and interactions, then none of my own research would pass that test. Indeed, I am not aware of any research in economics that would pass the test, except perhaps those rare RCTs.

      Again – I am not suggesting that research like this be given a free pass. I am not denigrating your skepticism or concerns. But I do think requiring inclusion of all relevant variables/interactions is too severe a standard. The variables/interactions that are relevant from a theoretical point of view should be included, and when they are not, the authors need to be very careful about their interpretation of their results. But I’d stop short of saying “the coefficients don’t mean anything.”

      • “I’m afraid that this criticism, taken at face value, would render most, if not all, empirical research in the social sciences irrelevant.”

        Unfortunately that is what seems to be the case (at least the stated goal of the research is impossible to achieve with the methods chosen). Lots of medical research as well. These arbitrary models can be useful for prediction, but interpreting the coefficients is, as far as I can tell, no different from interpreting the weights in a neural network.

        I think this is actually an even bigger problem than NHST. I still read papers that include these types of models and report p-values, but I pretty much just ignore that part and all conclusions that depend on it.

        If you want to interpret the parameters of a model it needs to be derived from some assumptions. Then the values all have meaning assuming those assumptions are correct (+/- some “perturbation” for any simplifications made).
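
        For example, a minimal sketch with invented decay data: if you assume the outcome decays exponentially at rate k, then the slope from regressing log(y) on t is an estimate of -k, a parameter whose meaning comes from the assumption.

        ```python
        # When the model is derived from an assumption (here,
        # exponential decay), the fitted coefficient has a built-in
        # interpretation: the decay rate. Data below are simulated.
        import numpy as np

        rng = np.random.default_rng(4)
        t = np.linspace(0, 5, 50)
        k_true = 0.7
        y = np.exp(-k_true * t + 0.05 * rng.normal(size=t.size))

        # Log-linear OLS: under the decay assumption, slope estimates -k
        A = np.column_stack([np.ones_like(t), t])
        b = np.linalg.lstsq(A, np.log(y), rcond=None)[0]
        print(f"estimated decay rate: {-b[1]:.2f} (true value {k_true})")
        ```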

      • “I’m afraid that this criticism, taken at face value, would render most, if not all, empirical research in the social sciences irrelevant.”

        Yes, it probably should.

        I think it’s worthwhile to forget NHST and interactions and all these complicating concepts for a second. The real question is replicability. If a study claims some effect, other studies should also see that effect. They should be able to see it using other methods. Whatever the method of replication – it doesn’t have to be NHST or p-values – the effect has to be consistently identifiable.

        That’s the real question, right? So just like in physical science, it’s not a question of having every bit of information specified correctly. It’s a question of having the *relevant measurements* identified and properly measured. If that’s done, then the result should be replicable, regardless of everything else. Exactly what choices people make to get to that point don’t really matter. When an effect is consistently identifiable through multiple means, it’s real.

        So where the middle ground is, I’m not sure, but the dividing line in the real world is replicability – not in the NHST/p-value sense but in the larger sense of demonstrating through multiple means that an effect is real.

  3. I mentor wanna-be data science students.

    One thing I do is try to get them to employ the kind of bottom-up exposition you extol.

    But, to your point, I’ve never made this formulaic.

    I’ll be thinking about that next time.

  4. If the bar you’re setting is that every conceivable parameter or influence is correctly measured and correctly specified, and all potential interactions are included in the model, then there’s no reason to ever model anything in the mechanistic or explanatory sense. The only statistics allowed under that formulation are predictive models.

    It seems to me the problem with that position comes back to the old adage about everyone having a model of some kind. If you eliminate all knowledge derived from explanatory statistical models, then the only “models” we’re left with are data-free assumptions and speculations, no?

    Do you feel there is no statistical model which can be interpreted usefully in spite of incorporating only, say, 10 or 20 of the hundreds of potential factors that might conceivably bear on the problem at hand?

    • Brent:

      I’m not setting any bar; I’m just saying that I’d like to see some graphs of raw data building up to their final analysis, rather than just seeing the results presented as a fait accompli. I’m not gonna believe this sort of claim as is. There are too many things that can go wrong. I have no problem with imperfect models. I fit imperfect models all the time. And I spend a lot of time checking the fit of these models, understanding how they work, and comparing them to simpler models.

    • I don’t think it’s a matter of setting the bar. It’s just not a productive use of time. If you want interpretable coefficients, you derive a model from assumptions. If you want to quickly make some predictions, you use a statistical model. They are just different tools for different problems.

  5. Andrew,

    The indentation of my reply appears incorrect. I was trying to respond to Anoneuoid’s comment as follows:

    “No. The coefficients don’t mean anything unless you accept that this model is correctly specified.

    Does anyone really think they included all the relevant variables/interactions and no irrelevant ones? Or is it more likely they just included the variables they happened to have data for? There is a whole multiverse of model specifications just as acceptable as this one, each of which will yield a different coefficient.”

  6. I don’t have access to the paper.

    The link below is to data from 2017: 49% of abortions are obtained by women below the US poverty line, and another 25% by women between 1 and 2x the poverty line. The claimed “higher-paying occupations” is relative, but given the concentration among poor women, I’m not sure exactly what is being compared in the equations Andrew shows. In order to have an abortion, you need access to one… so the relative pay increases must be small.
    https://www.guttmacher.org/news-release/2017/abortion-common-experience-us-women-despite-dramatic-declines-rates

  7. Just using this example to ask a question about models and predictors, especially in my main field (public health). I tend to believe that a model including several predictors is better than one or two restricted models. I assume that restricted models could yield significant results which are, in fact, type 1 errors. However, looking at this example, I have the impression that these predictors are nonsense. I don’t know if I’ve made myself clear here. Thank you.

      • Right. I would at least start with a fairly standard logistic regression rather than blindly run OLS on a binary dependent variable.

        Of course, the idea that the model is correctly specified is suspect as well. But the point that OLS generally shouldn’t be used with binary data is… well, let’s just say it’s not a new thing.
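
        A minimal sketch of the point with simulated data (nothing here is from the paper): OLS on a 0/1 outcome can produce fitted “probabilities” outside [0, 1], while the logistic fit cannot.

        ```python
        # Linear probability model vs. logistic regression on a
        # simulated binary outcome.
        import numpy as np

        rng = np.random.default_rng(3)
        n = 1000
        x = rng.normal(size=n)
        p_true = 1 / (1 + np.exp(-(-1.0 + 2.0 * x)))
        y = rng.binomial(1, p_true)

        X = np.column_stack([np.ones(n), x])

        # Linear probability model: OLS on the 0/1 outcome
        b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
        lpm = X @ b_ols
        print(f"LPM fitted values: [{lpm.min():.2f}, {lpm.max():.2f}]")

        # Logistic regression via a few Newton-Raphson steps
        b = np.zeros(2)
        for _ in range(15):
            mu = 1 / (1 + np.exp(-(X @ b)))
            W = mu * (1 - mu)
            b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - mu))
        fit = 1 / (1 + np.exp(-(X @ b)))
        print(f"logistic fitted values: [{fit.min():.2f}, {fit.max():.2f}]")
        ```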
