Colliders are not only “post-treatment” variables; they can be pre-treatment as well.

Of course it’s a predictor: a predictor is whatever predicts.

You can have pre-treatment colliders as well, but the same considerations about causal inference vs. predictive performance apply.

Is there an example of interpreting a stacked model, say an explanation of how changes in the inputs can be expected to change the predictions?

Corey:

Or, to be even more careful: It’s ok to include post-treatment variables in your regression, but then you can’t directly interpret your regression coefficient as a causal effect. You need to do more modeling.

Yeah, it’s a predictor. The right caveat for Richard’s concern is something like: “Be aware of the difference between predictive inference and causal inference. If no intervention into the system is planned or possible and the aim is simply to predict as accurately as possible, then it’s appropriate to use all available predictors; but if the investigation aims to tease out causal connections, then you need to avoid conditioning on colliders (for Neyman-Rubinites, post-treatment variables).”

I would say yes. A variable / node can be a collider on one path but not on another. So, it is possible that there is an unblocked path from a variable / node to the outcome, in which case it is a predictor in the usual statistical sense, while at the same time it is a collider on another path in the DAG sense if you condition on it.

On the other hand, the utility function for projpred or stacking is defined solely in terms of predictive ability, not in terms of interpreting an estimate causally. So, if you decide to go that route, I think you have already set aside your DAG.
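The collider point above is easy to see in a quick simulation. A minimal sketch (variable names are illustrative, not from the thread): two independent variables with a common effect are marginally unrelated, but conditioning on their common effect, here by selecting a stratum of it, induces a spurious association.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# x and y have no causal connection; c is their common effect (a collider)
x = rng.normal(size=n)
y = rng.normal(size=n)
c = x + y + rng.normal(scale=0.5, size=n)

# Marginally, x tells you nothing about y
r_marginal = np.corrcoef(x, y)[0, 1]

# Conditioning on the collider (selecting a narrow stratum of c)
# induces a spurious negative association between x and y
stratum = np.abs(c) < 0.5
r_conditional = np.corrcoef(x[stratum], y[stratum])[0, 1]

print(f"marginal r = {r_marginal:.2f}, within a stratum of c: r = {r_conditional:.2f}")
```

Note that `c` here would still be a useful predictor of either variable, which is exactly the predictive-vs-causal tension discussed above.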

Sure. A question about terminology: if you know a variable is a collider, would you still also call it a predictor?

– in the case of predictors there is no need to do model combination; just include all potentially relevant predictors

– the regularized horseshoe prior is good if you have a lot of predictors

– projection predictive variable selection is better than other approaches for choosing a smaller set of predictors

– the rstanarm + projpred packages make all this easy (projpred at https://cran.r-project.org/package=projpred)

– you can use projpred also with rstan

– all this is computationally feasible on a laptop for at least up to 10,000 predictors

Belay:

1. DIC is not a good idea; better to use LOO; see here and here.

2. When model averaging, don’t use weights based on any information criterion, use stacking; see here.
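For intuition about point 2: stacking chooses weights on the simplex that maximize the leave-one-out log score of the weighted mixture of predictive distributions, rather than transforming a single information criterion into weights. A minimal Python sketch (assuming the matrix of LOO log predictive densities is already available; in R the `loo` package computes these weights directly via `loo_model_weights`):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def stacking_weights(lpd):
    """Log-score stacking: find simplex weights w maximizing
    sum_i log(sum_k w_k * exp(lpd[i, k])), where lpd[i, k] is the
    leave-one-out log predictive density of observation i under model k."""
    n, K = lpd.shape
    # Subtracting each row's max only shifts the objective by a constant,
    # so it stabilizes the exponentials without changing the optimum
    dens = np.exp(lpd - lpd.max(axis=1, keepdims=True))

    def neg_log_score(w):
        return -np.sum(np.log(dens @ w))

    res = minimize(
        neg_log_score,
        np.full(K, 1.0 / K),           # start from equal weights
        bounds=[(0.0, 1.0)] * K,
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
    )
    return res.x

# Toy example: data generated from model 0, so stacking should
# put most of the weight there
rng = np.random.default_rng(1)
y = rng.normal(size=200)
lpd = np.column_stack([norm.logpdf(y, 0.0, 1.0),   # model 0: correct mean
                       norm.logpdf(y, 2.0, 1.0)])  # model 1: shifted mean
w = stacking_weights(lpd)
```

This is only a sketch under idealized inputs; in practice the LOO log densities come from PSIS-LOO on fitted models, and the optimization is handled for you by the `loo` package.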
