How do you summarize logistic regressions and other nonlinear models? The coefficients are only interpretable on a transformed scale. One quick approach is to divide logistic regression coefficients by 4 to convert to the probability scale (this works for probabilities near 1/2), and another is to compute changes with the other predictors held at their average values (as we did for Figure 8 in this paper). A more general strategy is to average over the distribution of the data, which makes more sense, especially with discrete predictors, where the "average" value may not correspond to any actual case. Iain Pardoe and I wrote a paper on this, which will appear in Sociological Methodology:
In a predictive model, what is the expected difference in the outcome associated with a unit difference in one of the inputs? In a linear regression model without interactions, this average predictive comparison is simply a regression coefficient (with associated uncertainty). In a model with nonlinearity or interactions, however, the average predictive comparison in general depends on the values of the predictors. We consider various definitions based on averages over a population distribution of the predictors, and we compute standard errors based on uncertainty in model parameters. We illustrate with a study of criminal justice data for urban counties in the United States. The outcome of interest measures whether a convicted felon received a prison sentence rather than a jail or non-custodial sentence, with predictors available at both individual and county levels. We fit three models: a hierarchical logistic regression with varying coefficients for the within-county intercepts as well as for each individual predictor; a hierarchical model with varying intercepts only; and a non-hierarchical model that ignores the multilevel nature of the data. The regression coefficients have different interpretations for the different models; in contrast, the models can be compared directly using predictive comparisons. Furthermore, predictive comparisons clarify the interplay between the individual and county predictors for the hierarchical models, as well as illustrating the relative size of varying county effects.
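To make the three summaries concrete, here is a minimal sketch in Python comparing the divide-by-4 shortcut, the comparison with the other predictor held at its average, and the comparison averaged over the data. It is a simplified illustration, not the weighted definition or the standard-error calculation from the paper: the coefficients, the simulated predictor, and all function names are made up for the example, and the input of interest is treated as a simple 0-to-1 change.

```python
import math
import random


def inv_logit(x):
    """Inverse logit: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(coefs, u, v):
    """Pr(y = 1) for a logistic model with intercept, input u, and other input v."""
    b0, b_u, b_v = coefs
    return inv_logit(b0 + b_u * u + b_v * v)


def divide_by_4(coefs):
    """Quick shortcut: upper bound on the change in probability per unit of u."""
    return coefs[1] / 4.0


def comparison_at_mean(coefs, v_values, u_lo=0.0, u_hi=1.0):
    """Change in Pr(y = 1) from u_lo to u_hi with v held at its average."""
    v_bar = sum(v_values) / len(v_values)
    return predict(coefs, u_hi, v_bar) - predict(coefs, u_lo, v_bar)


def average_predictive_difference(coefs, v_values, u_lo=0.0, u_hi=1.0):
    """Change in Pr(y = 1) from u_lo to u_hi, averaged over the observed v's."""
    diffs = [predict(coefs, u_hi, v) - predict(coefs, u_lo, v) for v in v_values]
    return sum(diffs) / len(diffs)


# Hypothetical inputs, for illustration only (not from the paper's data).
rng = random.Random(0)
v_values = [rng.gauss(0.0, 2.0) for _ in range(1000)]
coefs = (-1.0, 1.2, 0.9)  # assumed (intercept, coefficient on u, coefficient on v)

print("divide by 4:        ", divide_by_4(coefs))
print("at average of v:    ", comparison_at_mean(coefs, v_values))
print("averaged over data: ", average_predictive_difference(coefs, v_values))
```

With the other predictor spread widely, many cases sit far from probability 1/2, so the averaged comparison should come out smaller than either shortcut. To attach uncertainty, one could presumably repeat the averaged calculation for each simulation draw of the coefficients and summarize the results.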
The next step is to program this as a general-purpose routine in R.