Survey Statistics: adjusting for interest in politics

Last week we entered a new paradigm: whether you respond to a survey (R) may depend on the outcome (Y), even after controlling for covariates (X). This is called Missing Not At Random (MNAR), in contrast to Missing At Random (MAR). Michael Bailey’s post pushes beyond MAR. In contrast, replies to the post focus on improving the plausibility of MAR by adding covariates Z to the set of adjustment covariates X.

EDIT: annoying notation clash here between Bailey 2025’s randomized response instrument (“Z”) and Kuriwaki et al. 2024’s additional adjustment covariates (also “Z”). This post focuses on the latter.
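
In symbols, the MAR/MNAR distinction can be sketched as follows. This uses the post's notation (R, Y, X, Z, with Z meaning the additional adjustment covariates); the exact formalization is an assumption for illustration, not taken from the cited papers:

```latex
\begin{align*}
\text{MAR given } X:     &\quad \Pr(R = 1 \mid Y, X) = \Pr(R = 1 \mid X) \\
\text{MNAR given } X:    &\quad \Pr(R = 1 \mid Y, X) \neq \Pr(R = 1 \mid X) \ \text{for some values of } (Y, X) \\
\text{MAR given } (X, Z): &\quad \Pr(R = 1 \mid Y, X, Z) = \Pr(R = 1 \mid X, Z)
\end{align*}
```

The third line can hold even when the first fails, which is the sense in which adding Z makes MAR more plausible.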

In the comment section of our last post, Andrew pointed to his and Gustavo’s response: Challenges in Adjusting a Survey That Overrepresents People Interested in Politics. They begin:

We agree with the general point of Bailey (2023) that random sampling is a distant benchmark for real-world polls…

Rod Little reminds us that random sampling is still relevant in the world beyond opinion polling:

might be defensible for opinion polling, but not for the field of survey sampling as a whole… government statistical agencies… survey research organizations…. strive to conduct high-quality probability surveys.

Andrew and Gustavo continue with a focus on adjustment for differences between sample and population:

Adjusting for party identification or voting history can be challenging because these variables are not tabulated in the census…

For one way to address this, see our post on 2 flavors of calibration. There we discussed Kuriwaki et al. 2024, who add variable Z to the auxiliary data X using known totals of Z (e.g. past election results).
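
As a rough illustration of calibrating to known totals, here is a minimal raking (iterative proportional fitting) sketch over one X margin and one Z margin. It is plain raking, not necessarily the method of Kuriwaki et al. 2024, and the toy sample, the category codes, and the target shares are all made up:

```python
import numpy as np

def rake(x, z, x_target, z_target, n_iter=200, tol=1e-12):
    """Rake weights so the weighted X and Z margins match target shares.

    x, z: integer category codes per respondent (0, 1, ...).
    x_target, z_target: population shares per category, each summing to 1.
    Assumes every category appears at least once in the sample.
    Returns weights that sum to 1.
    """
    x = np.asarray(x)
    z = np.asarray(z)
    w = np.full(len(x), 1.0 / len(x))  # start from equal weights
    for _ in range(n_iter):
        w_old = w.copy()
        for codes, target in ((x, x_target), (z, z_target)):
            # Current weighted share of each category on this margin.
            current = np.bincount(codes, weights=w, minlength=len(target))
            # Scale each respondent by target share / current share.
            w = w * (np.asarray(target) / current)[codes]
        if np.max(np.abs(w - w_old)) < tol:
            break
    return w

# Hypothetical toy sample: x = age group, z = past vote (0 or 1).
x = np.array([0, 0, 0, 1, 1, 1, 1, 1])
z = np.array([0, 1, 1, 0, 1, 1, 1, 1])
x_target = [0.50, 0.50]  # made-up census shares for X
z_target = [0.48, 0.52]  # made-up past election result for Z

w = rake(x, z, x_target, z_target)
print("weighted X shares:", np.bincount(x, weights=w))
print("weighted Z shares:", np.bincount(z, weights=w))
```

The point of the sketch is only that Z enters the same way X does once its population totals are known.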

Then Andrew and Gustavo pivot to interest in politics, which we can call Z_2 (it’s all the rage in training theory):

It is not clear, though, what to do about this oversampling of people who are interested in politics, given that the distribution of this variable is not known in the general population.

One option for population data on this variable comes from the American National Election Study (ANES), which asks (see the 2020 ANES questionnaire):

(This seems to be a different question than the one Andrew and Gustavo use ?)

The ANES is similar to the organizations that Rod Little says still “strive to conduct high-quality probability surveys”. The 2020 ANES methodology report says the response rate was 36.7%. (Why does their data quality page only report response rates until 2000 ?) While this is lower than in past years, Sharon Lohr reminds us:

starting with a probability sample has several advantages even when response rates are low:
– The sampling frame for a probability sample is well defined, and many frames used in practice have high coverage….
– The probability sample often has more information available that can be used for weighting, imputation, or other types of nonresponse modeling…

The 2020 ANES methodology report says the sampling frame was the US Postal Service Computerized Delivery Sequence File. Their nonresponse adjustment included single vs multi-family dwelling, whether the address has a telephone number, and census division.

It is unclear how severely the ANES’s non-response mechanism impacts its ability to provide reliable population data. So adjustment for Z_2 (political interest) may not be as good as adjustment for Z (past vote), but perhaps still worth doing ?
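
One way to get a feel for this is a sensitivity check: post-stratify on Z_2 using a range of plausible benchmark shares instead of a single number, and see how much the estimate moves. The sketch below is only illustrative; the toy poll, the outcome, and the benchmark shares are all hypothetical, not ANES figures:

```python
import numpy as np

def poststratified_mean(y, z2, benchmark):
    """Mean of y after reweighting so Z_2 shares match `benchmark`.

    z2: integer category codes (here 0 = low interest, 1 = high interest).
    benchmark: assumed population share of each Z_2 category.
    Assumes every category appears at least once in the sample.
    """
    y = np.asarray(y, dtype=float)
    z2 = np.asarray(z2)
    sample_share = np.bincount(z2, minlength=len(benchmark)) / len(z2)
    w = (np.asarray(benchmark) / sample_share)[z2]
    return np.sum(w * y) / np.sum(w)

# Hypothetical toy poll: 60% of respondents are highly interested in politics.
z2 = np.array([1] * 6 + [0] * 4)
y = np.array([1, 1, 1, 1, 0, 0, 1, 0, 0, 0])  # made-up binary outcome

print("unadjusted mean:", y.mean())
# The benchmark share of high interest is itself uncertain, so scan a range.
for p_high in (0.3, 0.4, 0.5):
    est = poststratified_mean(y, z2, [1 - p_high, p_high])
    print(f"assumed high-interest share {p_high:.1f}: adjusted mean {est:.3f}")
```

If the adjusted estimate barely moves across the plausible range, an imperfect benchmark for Z_2 matters little; if it swings widely, the quality of the benchmark becomes the main question.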

Andrew and Gustavo conclude:

we should recognize the potential importance of going beyond conventional adjustment variables.

So while Michael Bailey pushes beyond MAR, Andrew and Gustavo are focused on improving the plausibility of MAR by adding Z and Z_2 to the set of adjustment covariates X.

4 thoughts on “Survey Statistics: adjusting for interest in politics”

  1. This is another one of those contexts in which case studies have a lot to offer. A case — in this case — would take the form of canvassing people in a participating organization, such as a church, a gym or yoga center, a recreation group (I once stumbled onto a large group of people who dress up as pirates and have meetups periodically), etc. Talk to as many members as possible, asking about how much and when politics interests them, where they get their information from, impacts from their social networks, and political participation and choices. No, it’s not generalizable, but it’s a case, and it can help you identify patterns and potential markers you can then use in a more representative probability sample.

    I have to confess I don’t understand why population sampling and case studies reside in such separate worlds. I see them as complementary, each shoring up weaknesses in the other.

  2. I do think using ANES — or any other political survey — as a benchmark for political interest distribution gets shaky pretty quickly, for the reasons Bailey argues. These are long surveys with many questions about politics. Given Bailey’s findings, it would be interesting to know more about how much interest in politics respondents of marketing surveys (Ipsos, YouGov) generally express.

    On the other hand, Jackman and Spahn (2021, https://doi.org/10.1017/S1049096521000639) have shown that the ANES sample does seem to capture at least some voters who are not registered or not listed in commercial lists.

  3. Thanks, Shiro ! Super helpful as always.

    Indeed, Bailey writes:

    If people who are more interested in politics are more likely to answer a poll about politics—which hardly seems unreasonable—then the ANES may have too many people interested in politics… weighting-type adjustments are not feasible for a variable like political interest, which does not have a known population-level distribution…even though only 50% of adults turned out to vote at that time, 70% of ANES respondents voted. ANES turnout declined in the weighted data to around 60%.

    So I like your idea to compare:
    1. ANES: starts with random sample, pays respondents $10, $40, $100, $200 per interview, 36.7% response rate, able to do some nonresponse weighting.
    2. a political poll that doesn’t do any of these
    3. a marketing survey which maybe pays something small

    Probably if someone is analyzing data from scenario 2, using the ANES interest in politics benchmark could be an improvement ? But maybe for scenario 3 it isn’t ?
