Survey Statistics: GREG

I just got to chat with Andrew and some of the authors of the MrPlew paper: Ryan Giordano, Erin Hartman, and Avi Feller. Lots more I have to digest here ! The paper came out while the polar bear and I were crossing from TN into VA.

We talked about using a model for response R, a model for outcome Y, or both. So GREG came up, and Andrew asked “what’s GREG ?” Good question.

GREG is Generalized REGression estimator. Särndal, Swensson, Wretman (1992) has a nice section that writes it in a few alternative ways:

1. Adjust an estimate based on the model with a Horvitz-Thompson estimate of the error:

2. Or on the flip side, you can see it as adjusting the Horvitz-Thompson estimate with the model:

It’s called GREG for Generalized REGression estimator, what is being generalized ?

Lumley 2010 made me think we were generalizing to continuous X variables:

Preview

Sharon Lohr’s book made me think we were generalizing beyond simple random samples:

Sampling Design and Analysis: Third Edition — Sharon Lohr

Särndal, Swensson, Wretman (1992) made me think we were generalizing to multiple X  variables:

Amazon.com: Model Assisted Survey Sampling (Springer Series in Statistics): 9780387406206: Särndal, Carl-Erik, Swensson, Bengt, Wretman, Jan: Books

Regardless of the exact origin of the name, GREG has connections to the Doubly Robust literature in causal inference (as Coston et al. (2020) note in a footnote). Any favorite references making these connections ?

6 thoughts on “Survey Statistics: GREG

  1. Firth D, Bennett KE. Robust models in probability sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 1998;60(1):3-21. doi:10.1111/1467-9868.00105

    (Connections in the discussion.)

    • Thank you, Leon ! Skimming the paper, it looks super useful. Am I understanding correctly that they assume known inclusion probabilities pi_i, so the model for inclusion is by definition correct ? They say that GREG is consistent, so it satisfies one side of the DR property. The other side would be if only the outcome model is true, not the inclusion model ?

      Joseph D. Y. Kang. Joseph L. Schafer. “Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data.” Statist. Sci. 22 (4) 523 – 539, November 2007. https://doi.org/10.1214/07-STS227

      The distinguishing feature of DR estimates is that they remain asymptotically unbiased if one of the two
      models is misspecified—that is, they consistently estimate their targets if either model is true (Robins and
      Rotnitzky, 1995).

      • I was thinking of the comment by Robins and Rotnitzky but I realize they don’t actually describe double robustness per se. I still recommend the paper as an exposition of GREG-type results but I now don’t think the “other direction” (outcome model correct, design model incorrect) is explicitly discussed.

Leave a Reply

Your email address will not be published. Required fields are marked *