Robust logistic regression

Posted on June 7, 2013 4:32 PM by Andrew

Corey Yanofsky writes:

In your work, you’ve robustificated logistic regression by having the inverse logit function saturate at, e.g., 0.01 and 0.99, instead of 0 and 1. Do you have any thoughts on a sensible setting for the saturation values? My intuition suggests that it has something to do with the proportion of outliers expected in the data (assuming a reasonable model fit).

It would be desirable to have them fit in the model, but my intuition is that integrability of the posterior distribution might become an issue.

My reply: It should be no problem to put these saturation values in the model. I bet it would work fine in Stan if you give them uniform(0, 0.1) priors or something like that. Or you could just fit the robit model.
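To make the saturating link concrete, here is a minimal sketch in plain Python (not Stan). It assumes the saturation is parameterized by a lower offset `lo` and an upper offset `hi`, so that predicted probabilities live in [lo, 1 - hi]. All function and parameter names are illustrative assumptions, not part of any library, and the linear predictor is treated as given, so priors on the regression coefficients are left out.

```python
import math


def inv_logit(eta):
    """Standard logistic function, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def saturating_inv_logit(eta, lo=0.01, hi=0.01):
    """Inverse logit squeezed into [lo, 1 - hi] instead of [0, 1]."""
    return lo + (1.0 - lo - hi) * inv_logit(eta)


def log_lik(y, eta, lo=0.01, hi=0.01):
    """Bernoulli log-likelihood for 0/1 outcomes y and linear predictors eta."""
    total = 0.0
    for yi, ei in zip(y, eta):
        p = saturating_inv_logit(ei, lo, hi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def log_post(y, eta, lo, hi, upper=0.1):
    """Log-likelihood plus independent uniform(0, upper) priors on lo and hi.

    The uniform density is constant inside its support, so it only
    contributes by ruling out values outside (0, upper).
    """
    if not (0.0 < lo < upper and 0.0 < hi < upper):
        return -math.inf
    return log_lik(y, eta, lo, hi)


# One surprising point: y = 0 where the linear predictor says "almost surely 1".
outlier_y, outlier_eta = [0], [20.0]
plain = log_lik(outlier_y, outlier_eta, lo=0.0, hi=0.0)     # ordinary logistic
robust = log_lik(outlier_y, outlier_eta, lo=0.01, hi=0.01)  # saturating version
print(plain, robust)
```

With the ordinary logistic link the single outlier costs roughly -20 on the log scale and the cost grows without bound as the linear predictor grows. With saturation at 0.01 and 0.99 the cost can never be worse than log(0.01), about -4.6, which is what keeps a few wild points from dominating the fit.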

And this reminds me . . . I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm or bayesglm in R. This suggests to me that we should have some precompiled regression models in Stan. Then we could run all those regressions that way, and we could feel free to use whatever priors we want.

9 thoughts on “Robust logistic regression”

Fernando on June 7, 2013 at 4:54 pm said:

“robustificated”

My heart bleeds for the English language.
- Corey on June 7, 2013 at 5:21 pm said:
  
  What can I say? When I’m verbing a noun, I like to go big.
  - Fernando on June 7, 2013 at 5:27 pm said:
    
    That was funnyfied, it smiled me all over.
  - Corey on June 7, 2013 at 6:53 pm said:
    
    Adjective, not noun… derp…
Thinkling on June 7, 2013 at 5:35 pm said:

Enough with the neologery!
Mitzi Morris on June 7, 2013 at 9:18 pm said:

“I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm”
Really? I’ve been told that’s not happening until Stan 2.0.
- Marcus on June 8, 2013 at 11:55 am said:
  
  The new optimization work won’t be released until Stan 2.0; however, it is already in the development branch of the git repository.
Peter Meilstrup on June 8, 2013 at 1:03 am said:

Wichmann and Hill’s two papers on fitting psychometric data (they come up if you google “wichmann hill psychometric”) explore this question in some detail.

The R package “psyphy” has some prebuilt saturating logit and probit link functions that plug into R’s glm function.
- Corey on June 10, 2013 at 4:14 pm said:
  
  These articles look ideal. Thanks Peter!
