Meng (2022) pops up a lot here: “it is the people” (the launch of this blog series a year ago !), “probability samples vs epsem samples vs SRS samples”, “divine probabilities”, and last week’s “GREG”. Like a lot of Meng’s papers, it deserves several rereads.

(The polar bear celebrated the blog series birthday with a rainy hike on the PA AT. Here he is attempting to dry off.)
Let’s zoom in on the part about the Generalized REGression estimator (that doesn’t specifically say “GREG”). Green anotations are mine:

Meng (2022)‘s (5.2) is the first way of writing GREG in our post “GREG”, from Särndal, Swensson, Wretman (1992):

That book goes on to say that GREG often takes a super simple form:

Meng (2022) doesn’t mention this as far as I can tell ? Although I think Meng’s example satisfies the conditions the book Särndal, Swensson, Wretman (1992) goes on to describe: the regression model assumes constant variance and has an intercept.
Anyways, back to the title of this post. Meng emphasizes that GREG is not only “double robust” (consistent if either the outcome model or response model are correct), but “double-plus robust” (consistent if what is left of the outcome model and response model are uncorrelated). I’m interested in the practical implications of this, such as the suggestion to include the estimated response probabilities in the outcome regression model. Thoughts ?
My first experience with double robustness was in our project on estimating the incumbency advantage, where we recognized that our estimate would be unbiased if the regression model were true (even if the treatment were not randomly assigned) or if the treatment were randomly assigned (even if the regression model were not true). I don’t recall if we explicitly said this in the paper, and I know we didn’t come up with the slogan, “doubly robust,” but we did look at both aspects of the model: the regression model for the outcome and the probability of receiving the treatment. Also I wasn’t sure what to do with this robustness property, given that in reality both assumptions are false–the outcome was not actually generated by the regression model and the treatment was not actually assigned at random.
Thanks, Andrew !
To see if I understand:
On p.1153 of your project on estimating the incumbency advantage, in the section “Control Variables” you write:
v_1 = votes in previous election, a control variable i.e. covariate, “X” in the causal literature
I_2 = decision of the incumbent to run, the treatment, “Z”
v_2 = votes in next election, the outcome, “Y”
outcome model:
E(v_2 | v_1, P_2, I_2) equation (6) assumes linearity in v_1
But it is ok if that’s wrong if the treatment I_2 is random and doesn’t depend on v_1.