Last week we dove into “Disentangling Bias and Variance in Election Polls” by Shirani-Mehr, Rothschild, Goel, and Gelman. In election polls from 1998 to 2014, they found that total survey error (measured as root mean squared error) is about twice the error implied by reported margins of error:
RMSE_true = 2 * RMSE_reported
Raphael Nishimura helpfully commented that this doesn’t generalize:
smaller (and therefore larger MoE) more carefully designed probability sample surveys tend to have smaller total survey error than very large (and therefore smaller MoE) nonprobability sample surveys.

Yes! Meng’s 2018 “Statistical Paradises and Paradoxes” (in this blog series here and here) contrasts small probability samples with large nonprobability samples. Meng expresses the MSE as a product of three terms: low data quality, from correlation between response and outcome (D_I); low data quantity (D_O); and problem difficulty (D_U):
MSE = D_I * D_O * D_U, where D_O = (1-f)/f and D_U = sigma_G^2
Using this with some algebra (replace the sampling fraction f with n/N), we get RMSE_true = sqrt(D_I) sqrt(N-n) sigma_G/sqrt(n).
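The algebra can be spelled out in a few lines. This is a sketch that takes Meng's definitions D_O = (1-f)/f and D_U = sigma_G^2 as given and leaves D_I unexpanded:

```latex
\begin{align*}
\mathrm{MSE} &= D_I \times D_O \times D_U
  = D_I \cdot \frac{1-f}{f} \cdot \sigma_G^2, \\
\frac{1-f}{f} &= \frac{1 - n/N}{n/N} = \frac{N-n}{n}, \\
\mathrm{RMSE}_{\text{true}} &= \sqrt{\mathrm{MSE}}
  = \sqrt{D_I}\,\sqrt{N-n}\,\frac{\sigma_G}{\sqrt{n}}.
\end{align*}
```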
RMSE_reported is usually some multiple of sigma_G/sqrt(n). (This ignores weights and sampling design, so it doesn’t quite calculate sampling error, let alone nonsampling error, as Shirani-Mehr et al. point out in section 1.1 Background.)
So we are off by a factor of sqrt(D_I) sqrt(N-n).
Meng’s Theorem 1 says that probability samples have D_I of order 1/N, so sqrt(D_I) sqrt(N-n) is of order 1 and this factor drops out. However, for nonprobability samples, D_I does not shrink with N, so this factor can be huge.
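To get a feel for the sizes involved, here is a rough numerical sketch. All inputs are hypothetical: the population size, the two sample sizes, and the data defect correlation rho = 0.005 are made up for illustration. It also assumes D_I = 1/(N-1) for a simple random sample (consistent with "order 1/N") and treats D_I as rho squared for the nonprobability sample.

```python
import math


def error_inflation(d_i: float, pop_size: int, n: int) -> float:
    """Return sqrt(D_I) * sqrt(N - n), the factor multiplying sigma_G / sqrt(n)."""
    return math.sqrt(d_i) * math.sqrt(pop_size - n)


# Hypothetical inputs, chosen only for illustration.
POP_SIZE = 230_000_000   # population size N
N_SMALL = 1_000          # small probability sample
N_LARGE = 2_000_000      # large nonprobability sample
RHO = 0.005              # assumed data defect correlation

# Simple random sample: assume D_I = 1 / (N - 1).
srs_factor = error_inflation(1 / (POP_SIZE - 1), POP_SIZE, N_SMALL)

# Nonprobability sample: assume D_I = rho^2, which does not shrink with N.
nonprob_factor = error_inflation(RHO**2, POP_SIZE, N_LARGE)

print(f"SRS factor:            {srs_factor:.4f}")
print(f"Nonprobability factor: {nonprob_factor:.1f}")
```

Under these assumptions the simple random sample factor is essentially 1, while the nonprobability factor is about 75, even with a correlation that looks negligible. The observed factor of 2 sits between those extremes.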
Shirani-Mehr et al. found a factor of 2. The polls they analyze are neither probability samples nor the worst-case scenarios Meng describes.
Last week, Andrew reminded us to consider total survey error in the Cooperative Election Study (CES):
The sample size is large enough that I think the usual uncertainty intervals would be very narrow. On the other hand, yeah, if you consider possible nonresponse, there’s tons of uncertainty….
I appreciate it’s a different metric, but this vaguely reminded me of Meng’s heuristic to “double your variance”: https://nejsds.nestat.org/journal/NEJSDS/article/9/info
OliP:
This is related to the equipartition heuristic that I used to interpret an unexpected poll in the 2024 election campaign.
Thanks, OliP and Andrew!
OliP, I haven’t had a chance to read that paper by Meng, thanks for pointing me to it!
Andrew, in that post you suggested “1 percentage point sampling error, 1 percentage point nonsampling error”. In Shirani-Mehr et al. I think you found that true error was double the SRS error. But SRS error isn’t necessarily the sampling error, as the paper points out, because it doesn’t account for the design effect. Do I have that right?