Struggles with surveying nonvoters and young voters

Posted on May 16, 2025 3:34 PM by shira

This post is by Connor Gilroy, Shira Mitchell, David Shor, and Jonathan Tannen.

Our 2024 retrospective report has generated some response. Three recent blog posts about the 2024 election find different results than we’ve seen at Blue Rose Research. Here we explore what might account for these differences, demonstrating with publicly-available voterfile and survey data. At the end, we also bring in our own private survey data, and look forward to further conversations as other organizations do similar explorations with their own data.

Two of the recent blog posts are about party preference among registered 2024 nonvoters:

Bonica et al. Part 1: Does Higher Turnout Now Help Republicans? A Data-Driven Analysis of Partisan Turnout Dynamics (Part 1)
Bonica et al. Part 2: Did Non-Voters Really Flip Republican in 2024? The Evidence Says No.

Andrew has blogged about the second post here.

And a third post is about party preference among young people:

Soler et al.: Have young voters really abandoned the Democrats?

These posts are authored by academics, who are right to be skeptical of claims that don’t come with full replication code and data. However, releasing our data and models would violate legal agreements, and would also give the Republican party a competitive advantage. We want to be as collaborative as possible within these constraints, so we’ll first use publicly-available data to show inconsistencies with the claims in these posts, and then turn to our own survey data.

The inconsistencies appear to come from measurement error and nonresponse bias. Measuring current party preference using voterfile party registration requires care, since older registrations may not reflect current preferences. Using survey data requires adjusting for differences between sample and population. This can be done with regularized prediction models as Andrew discusses here, or with survey weights, which Andrew discusses in his 2007 “Struggles” paper that connects both approaches.
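To make the adjustment idea concrete, here is a minimal sketch of poststratification on a single variable. Every number in it is invented for illustration, and the cell names and field names (`pop_share`, `n_sample`, `dem_share`) are hypothetical; it is not the weighting scheme of any survey discussed in this post.

```python
# Toy poststratification example. Every number here is invented for illustration.
# Each cell holds its share of the target population, the number of survey
# respondents in it, and the Democratic two-way share among those respondents.
cells = {
    "18-29": {"pop_share": 0.20, "n_sample": 100, "dem_share": 0.55},
    "30-64": {"pop_share": 0.55, "n_sample": 500, "dem_share": 0.50},
    "65+":   {"pop_share": 0.25, "n_sample": 400, "dem_share": 0.45},
}


def unweighted_estimate(cells):
    """Pool all respondents, ignoring how the sample differs from the population."""
    n_total = sum(c["n_sample"] for c in cells.values())
    return sum(c["n_sample"] * c["dem_share"] for c in cells.values()) / n_total


def poststratified_estimate(cells):
    """Weight each cell's estimate by its (assumed known) population share."""
    return sum(c["pop_share"] * c["dem_share"] for c in cells.values())


print(f"unweighted:     {unweighted_estimate(cells):.4f}")
print(f"poststratified: {poststratified_estimate(cells):.4f}")
```

In this toy sample the oldest cell is over-represented, so the unweighted estimate (48.5%) sits below the poststratified one (49.75%). The correction only works for variables that are actually in the cells; anything left out, such as engagement, stays unadjusted.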

In particular, we are concerned that the analyses in the recent posts do not sufficiently account for political engagement. Political engagement is correlated both with answering surveys and with voting. Failing to account for it will lead to implausible results.

Voterfile party registration as a proxy for current party preference

Bonica et al. Part 1 uses voterfile party registration as a proxy for current party preference. They assert:

We can reasonably assume that at most 10% of registered Democrats would have voted for Trump, and at most 10% of registered Republicans would have voted for Harris—though these crossover rates were likely even lower.

This would be wonderful news for Democrats, but it is not based on data. We present survey evidence later, but we think even public voterfile data shows that registered Democrats are more likely to cross over. The problem is that voterfile party registration is a snapshot taken at the time the person registered. Party registration is thus a less accurate proxy for current vote choice the older the registration is. Figure 1 below shows that among 2024 nonvoters, older registrations are much more Democratic than newer ones. It’s unlikely all of those registrants would still be Democrats if they reregistered. We can’t rule out that “at most 10%” would vote for Trump, but it’s clear the decisions in party registration have fundamentally changed, and assuming these old registrations would reregister as Democrats (and vote for Harris) is risky.

Note on Figure 1: The states with voterfile party registration lean more Democratic than the country overall, with a 51.4% Harris two-way vote share in 2024.
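The kind of tabulation behind a figure like this can be sketched as follows. The records are a handful of invented rows, not real voterfile data, and the tuple layout and party codes ("D", "R", "U") are assumptions made for the example.

```python
from collections import defaultdict

# Toy voterfile rows: (registration_year, party, voted_2024). All invented.
records = [
    (2008, "D", False), (2008, "D", False), (2008, "D", False),
    (2008, "R", False), (2008, "D", True),
    (2023, "D", False), (2023, "D", False), (2023, "R", False),
    (2023, "R", False), (2023, "U", False), (2023, "R", True),
]


def dem_share_by_reg_year(records):
    """Democratic share of two-party registrations among 2024 nonvoters, by year."""
    counts = defaultdict(lambda: {"D": 0, "R": 0})
    for reg_year, party, voted_2024 in records:
        # Keep only 2024 nonvoters registered with one of the two major parties.
        if voted_2024 or party not in ("D", "R"):
            continue
        counts[reg_year][party] += 1
    return {
        year: c["D"] / (c["D"] + c["R"])
        for year, c in sorted(counts.items())
    }


shares = dem_share_by_reg_year(records)
for year, share in shares.items():
    print(year, f"{share:.0%}")
```

Restricting to two-party registrations mirrors the two-way shares used throughout the post; unaffiliated registrants would need separate treatment.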

Nonresponse bias in survey data due to political engagement

Bonica et al. Part 2 and Soler et al. use the 2024 Cooperative Election Study (CES) and AP VoteCast data.

Different surveys (CES, VoteCast, Blue Rose) give conflicting answers about whether the youngest voters shifted more Republican in 2024, and about the size of young voters’ gender gap. VoteCast shows similar results to Blue Rose for young white men, while CES estimates far higher Democratic support for this group (see Soler et al.). These differences may be due to different sampling methods (e.g. VoteCast uses some probability samples) and different adjustment variables (e.g. VoteCast also adjusts for Catalist’s vote choice index, in addition to the demographic and vote choice variables used by CES).

At Blue Rose we collected 26 million survey responses, asking many more questions than the CES or VoteCast to adjust for ways that survey-takers are not representative of the population. We adjust for hundreds of variables (see our StanCon 2020 talk), which we think produces more plausible estimates. As Andrew says here, adjusting for many more variables can make a big difference for survey accuracy.

We, like many researchers, are particularly worried about nonresponse bias from political engagement. Suppose politically engaged people are more likely to take political surveys. Suppose they are also more Democratic. Then surveys that fail to adjust for political engagement will be biased, especially for groups with the biggest gap in Democratic support by political engagement.
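The mechanism in the paragraph above can be worked through with a short calculation. The sketch below assumes a single group split into engaged and unengaged people, each with its own Democratic share and survey response rate. All parameter values are hypothetical and chosen only to show the direction of the bias.

```python
def true_dem_share(engaged_frac, dem_engaged, dem_unengaged):
    """Democratic share in the full group (the quantity we want)."""
    return engaged_frac * dem_engaged + (1 - engaged_frac) * dem_unengaged


def expected_respondent_dem_share(engaged_frac, dem_engaged, dem_unengaged,
                                  resp_engaged, resp_unengaged):
    """Expected Democratic share among respondents when response rates differ."""
    w_engaged = engaged_frac * resp_engaged            # engaged respondents
    w_unengaged = (1 - engaged_frac) * resp_unengaged  # unengaged respondents
    return (w_engaged * dem_engaged + w_unengaged * dem_unengaged) / (
        w_engaged + w_unengaged
    )


# Hypothetical group: 30% engaged; the engaged are more Democratic (65% vs 40%)
# and five times as likely to answer a survey (10% vs 2%).
truth = true_dem_share(0.3, 0.65, 0.40)
observed = expected_respondent_dem_share(0.3, 0.65, 0.40, 0.10, 0.02)
print(f"true share:       {truth:.3f}")
print(f"respondent share: {observed:.3f}")
print(f"bias (points):    {100 * (observed - truth):+.1f}")
```

With these made-up inputs the respondents look roughly nine to ten points more Democratic than the group really is. The bias disappears if either the response rates or the Democratic shares are equal across engagement levels, which is why groups with the largest engagement gap are the most exposed.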

We see many examples in our data of political engagement being increasingly correlated with vote choice. One (very imperfect) administrative proxy for political engagement is whether someone voted in 2020. In Figure 2, we see that among 2024 nonvoters who registered in recent years, those who voted in 2020 are more likely to register as Democrats than those who didn’t.

In addition, we’ve invested heavily in linking individuals’ records between the 2020 and 2024 voterfiles. We can then examine this subset of linked records for change over time. We limit to people who registered in 2020 and then re-registered in 2023 or 2024, and registered as either Democrats or Republicans in both time periods. (This group skews young and Democratic-leaning.) In Figure 3, again we see a difference by our proxy for engagement: 2020 voters went from 65.6% to 63.0% Democratic, while 2020 nonvoters fell from 61.3% to 54.9%.
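The arithmetic behind those four numbers is spelled out below. The percentages are the ones quoted in the paragraph above; the dictionary layout and variable names are just for this illustration.

```python
# Democratic share of two-party registrations (percent), as quoted in the text.
shares = {
    "2020 voters":    {"2020": 65.6, "2023-24": 63.0},
    "2020 nonvoters": {"2020": 61.3, "2023-24": 54.9},
}

# Change within each group between the two registrations, in percentage points.
change = {
    group: round(s["2023-24"] - s["2020"], 1) for group, s in shares.items()
}

# Gap between 2020 voters and 2020 nonvoters at each point in time.
gap_2020 = round(shares["2020 voters"]["2020"] - shares["2020 nonvoters"]["2020"], 1)
gap_recent = round(
    shares["2020 voters"]["2023-24"] - shares["2020 nonvoters"]["2023-24"], 1
)

print(change)
print(gap_2020, gap_recent)
```

The 2020 voters moved 2.6 points away from the Democrats and the 2020 nonvoters moved 6.4 points, so the gap between the two groups roughly doubled, from 4.3 to 8.1 points.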

In Figure 4, among people aged 19-29 who registered in recent years, we again see a difference by our proxy for engagement that is widening over time.

Given that these political engagement gaps in Democratic party registration have widened in recent years, we encourage researchers to be hesitant to rely on survey analyses that don’t adjust for political engagement. We do not believe that adjusting for 2020 vote (our very imperfect proxy above) is sufficient.

The weighting variables in the 2024 CES are few. According to the CES documentation, the weighting variables are registration status, age, race, gender, education, “born again” status, 2020 Presidential vote choice, 2024 Presidential vote choice, and some subset of interactions. None of these adequately accounts for potential differential response rates by political engagement. If for some group (e.g. young voters or nonvoters) political engagement is positively correlated with both survey response and vote preference after conditioning on these weighting variables, the CES will overestimate Democratic shares of these groups (except where poststratification has adjusted to election results directly).

Other CES issues

The CES does not show a gender gap among young white respondents. Soler et al. present AP VoteCast results with a 20+ point gap between young white women and men (larger than ours), and then, oddly, a CES analysis claiming no gap. Even voter registration (in states with party registration) shows a 20 point gap between young women and men. We think the absence of a gap is extremely unlikely, and encourage researchers to examine whether their weights are failing to account for survey bias.

Results from Our Survey Data

Finally, we analyze our own private survey data, and we hope others can do similar explorations with their own data.

Youth nonvoter findings hinge on political engagement

Below in Figure 5, we present the 2024 vote choice among respondents who say their political identity is “very important” or “not at all important” to them. The key takeaway here is that among young registered voters, low-engagement voters were vastly more supportive of Trump than highly engaged ones. Failing to account for political engagement (which correlates with both voting and answering surveys) will miss this sharp difference and lead to incorrect statements about nonvoting young people.

Registered-Democrat nonvoters were disproportionately Trump-y

Above, we’ve shown that registration patterns have changed sharply, which suggests those nonvoting Democrats might register differently if they registered today. The result of that decline is we have more registered Democrats who don’t vote for Democrats than we have registered Republicans who don’t vote for Republicans.

In our survey data, we see that registered Democrats who did not vote in 2024 have two-way Harris support of 69%, down from 76% recalled Biden 2020 support (this is the number that Bonica et al. Part 1 assume is at least 90%). On the other side, nonvoting registered Republicans have two-way Trump support of 87% (and 85% recalled Trump 2020 support). And when we ask those survey takers for their current self-reported party and ideology, nonvoting registered Democrats are substantially more moderate and conservative than their voting peers (see the table below). Nearly half of nonvoting Democrats identified as moderate, and 20% as conservative. This explains how a Democratic lead in party registration among nonvoters becomes a Trump lead in actual voting preference.

When we aggregate across registered party, we find that Democrats held only a slight lead among people who voted in 2020 but not 2024: 51% of the two-way vote, far from the 62% that just party registration would suggest. And this group moved rightward between 2020 and 2024 *the most* out of any row in the table below.

Collapsing all who didn’t vote in 2024 (rows 1 and 2 above) gives a cohort with only 44.5% two-way Harris support, very different from the idea that this was somehow a 60% Harris group, as claimed in Bonica et al. Part 2.
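For readers unfamiliar with the term, a two-way share drops third-party and undecided responses and divides one candidate's support by the two-candidate total. Collapsing rows means pooling the underlying counts, not averaging the row percentages. The sketch below uses invented counts and placeholder row labels (`row_1`, `row_2`); they are not the numbers in our table.

```python
def two_way_share(dem, rep):
    """Democratic share of the two-party total (others and undecided excluded)."""
    return dem / (dem + rep)


# Hypothetical weighted counts for two rows of a table. All invented.
rows = {
    "row_1": {"harris": 540, "trump": 460},
    "row_2": {"harris": 700, "trump": 1300},
}

row_shares = {
    name: two_way_share(r["harris"], r["trump"]) for name, r in rows.items()
}

# Collapse by pooling counts, so larger rows count for more.
collapsed = two_way_share(
    sum(r["harris"] for r in rows.values()),
    sum(r["trump"] for r in rows.values()),
)

print(row_shares)
print(f"collapsed: {collapsed:.3f}")
```

Because the pooled share is a size-weighted average, it always falls between the row shares and is pulled toward the larger row.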

And it isn’t just our results. A recent analysis by the Harris for President Analytics Team found a sharp decline in Democratic support among low-engagement voters, for the first time in recent cycles showing below 50% support (though this finding is of voters and not nonvoters, the trend based on engagement is stark).

A closing note

We want to once again acknowledge it’s not ideal for us to discuss results without releasing data and code. As academically-minded researchers, we deeply appreciate the transparency of folks working in academia. However, we cannot publicly release data and code that would risk the aims and work of the campaigns we serve. We hope our analyses of publicly-available data help people engage with our discussion, and that our survey results suggest ways for other surveys to be analyzed. We look forward to further conversations about the difficult and important topics of measurement error, nonresponse bias, and understanding the American public.

3 thoughts on “Struggles with surveying nonvoters and young voters”

Joshua on May 16, 2025 at 5:58 pm said:

Seems to me an important question about this analysis is whether or how much it is predictive, particularly since (I guess it’s likely) Trump won’t be running again. And maybe the Dems will stop catering in public support and actually run some more compelling candidates. And maybe the economy will take off, or we’ll run into stagflation.

I had always read that party ID is pretty fickle, and I would imagine it might still be so. Maybe it still is, and these data are predictive mostly to the extent that nothing changes much in the political landscape.

- Joshua on May 16, 2025 at 5:59 pm said:
  
  Er…cratering.

Josh on May 17, 2025 at 5:41 pm said:

Hi all, really appreciate this post. On the point of transparency and trust, I wonder if there’s an intermediate position where you share predictions before elections that will only be revealed afterwards. Otherwise, it’s hard for an outsider to assess the quality of your survey data and methods.


Statistical Modeling, Causal Inference, and Social Science