Problem of the between-state correlations in the Fivethirtyeight election forecast

Elliott writes:

I think we’re onto something with the low between-state correlations [see item 1 of our earlier post]. Someone sent me this collage of maps from Nate’s model that show:

– Biden winning every state except NJ
– Biden winning LA and MS but not MI and WI
– Biden losing OR but winning WI, PA

And someone says that in the 538 simulations where Trump wins CA, he only has a 60% chance of winning the election overall.

Seems like the arrows are pointing to a very weird covariance structure.

I agree that these maps look really implausible for 2020. How’s Biden gonna win Idaho, Wyoming, Alabama, etc. . . . but not New Jersey?

But this does all seem consistent with correlations of uncertainties between states that are too low.

Perhaps this is a byproduct of Fivethirtyeight relying too strongly on state polls and not fully making use of the information from national polls and from the relative positions of the states in previous elections.

If you think of the goal as forecasting the election outcome (by way of vote intentions; see item 4 in the above-linked post), then state polls are just one of many sources of information. But if you start by aggregating state polls, and then try to hack your way into a national election forecast, then you can run into all sorts of problems. The issue here is that the between-state correlation is mostly not coming from the polling process at all; it’s coming from uncertainty in public opinion changes among states. So you need some underlying statistical model of opinion swings in the 50 states, or else you need to hack in a correlation just right. I don’t think we did this perfectly either! But I can see how the Fivethirtyeight team could’ve not even realized the difficulty of this problem, if they were too focused on creating simulations based on state polls without thinking about the larger forecasting problem.

There’s a Bayesian point here, which is that correlation in the prior induces correlation in the posterior, even if there’s no correlation in the likelihood.
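
To see this in a toy example (a sketch in R; all numbers invented for illustration): give two state-level quantities a correlated normal prior, observe each with an independent poll, and the posterior correlation remains substantial even though the likelihood is uncorrelated.

# prior: (theta_1, theta_2) ~ N(0, Sigma), with prior correlation 0.8
Sigma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)
# likelihood: independent measurements y_i ~ N(theta_i, 1),
# so the likelihood contributes a purely diagonal precision
post_cov <- solve(solve(Sigma) + diag(2))
cov2cor(post_cov)[1, 2]   # ~0.59: the prior correlation survives into the posterior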

And, as we discussed earlier, if your between-state correlations are too low, and at the same time you’re aiming for a realistic uncertainty in the national level, then you’re gonna end up with too much uncertainty for each individual state.
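
Here's the arithmetic behind that (a sketch with made-up numbers, treating the national vote as an average of 50 equally weighted, equicorrelated states, so that sd_national = sigma * sqrt(rho + (1 - rho)/50)):

sd_national <- 2   # target national-level uncertainty, in percentage points
for (rho in c(0.9, 0.5, 0.1)) {
  sigma <- sd_national / sqrt(rho + (1 - rho)/50)
  cat("rho =", rho, "-> required state sd =", round(sigma, 2), "\n")
}
# lower between-state correlation forces much wider state-level intervals
# to hit the same national-level uncertainty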

At some level, the Fivethirtyeight team must realize this—earlier this year, Nate Silver wrote that correlated errors are “where often *most* of the work is in modeling if you want your models to remotely resemble real-world conditions”—but recognizing the general principle is not the same thing as doing something reasonable in a live application.

These things happen

Again, assuming the above maps actually reflect the Fivethirtyeight forecast and they’re not just some sort of computer glitch, this does not mean that what they’re doing at that website is useless, nor does it mean that we’re “right” and they’re “wrong” in whatever other disagreements we might have (although I’m standing fast on the Carmelo Anthony thing). Everybody makes mistakes! We made mistakes in our forecast too (see item 3 in our earlier post)! Multivariate forecasting is harder than it looks. In our case, it helped that we had a team of 3 people staring at our model, but of course that didn’t stop us from making our mistakes the first time.

At the very least, maybe this will remind us all that knowing that a forecast is based on 40,000 simulations or 40,000,000 simulations or 40,000,000,000 simulations doesn’t really tell us anything until we know how the simulations are produced.

P.S. Again, the point here is not about the silly scenario in which Trump wins New Jersey while losing the other 49 states; rather, we can use problems in the predictive distribution to try to understand what went wrong with the forecasting procedure. Just as Kos did for me several years ago (go here and search on “Looking at specifics has worked”). When your procedure messes up, that’s good news in that it represents a learning opportunity.

P.P.S. A commenter informs us that Nate wrote something, not specifically addressing the maps shown here, but saying that these extreme results arose from a long-tailed error term in his simulation procedure. There’s some further discussion in comments. One relevant point here is that if you add independent state errors with a long-tailed distribution, this will induce lower correlation in the final distribution. See discussion in comments here and here.
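
A quick simulation of that last point (a sketch; the scales and degrees of freedom are invented): adding independent long-tailed state errors on top of a shared national swing drags the realized between-state correlation down.

set.seed(1)
n_sims <- 40000
national <- rnorm(n_sims, 0, 2)           # shared national swing
a <- national + rnorm(n_sims, 0, 1)       # normal independent state errors
b <- national + rnorm(n_sims, 0, 1)
cor(a, b)                                 # ~0.8
a_t <- national + rt(n_sims, df = 1.5)    # long-tailed independent state errors
b_t <- national + rt(n_sims, df = 1.5)
cor(a_t, b_t)                             # noticeably lower: tail draws swamp the shared swing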

104 thoughts on “Problem of the between-state correlations in the Fivethirtyeight election forecast”

  1. I think the model is trying to say “There is a chance that Oregon or New Jersey will begin a strong realignment towards Republicans, and we don’t have enough evidence to exclude that hypothesis entirely.” West Virginia shifted very fast towards Republicans.

    • Aok:

All things are possible. But it seems more likely to me that they just didn’t have strong enough correlations and they’re treating the individual states as too close to independent. In 2016, New Jersey was (if I’m reading the table on Wikipedia correctly) the 8th most Democratic-voting state in the country. So, yes, I know that realignments happen, but I can’t see New Jersey overtaking all the other 49 states when it comes to supporting Trump in 2020. Even in 40,000 simulations this should not be happening. Or, again, maybe there’s a glitch in the program that Fivethirtyeight uses to make those maps.

      • Andrew and Aok:

        To add another point, while we should have a reasonably inclusive prior that factors in the possibility that Oregon and New Jersey shift dramatically to the right from 2016 to 2020 before seeing data, we can be reasonably sure now based on the polling we’ve seen and time to the election that it’s implausible for them to shift that far to the right (especially relative to the nation as a whole).

        Since we had to tackle this problem earlier this year, it seems to me that Nate has focused too much on the particular groups that might cause polling error or within-cycle drift, and not enough on how all those factors should be constrained by some overall state-level intercept or some such (i.e., minimum or default correlations between states).

        • > So you need some underlying statistical model of opinion swings in the 50 states, or else you need to hack in a correlation just right.

          I may have missed this, but is your model doing the first or the second?

        • “To add another point, while we should have a reasonably inclusive prior that factors in the possibility that Oregon and New Jersey shift dramatically to the right from 2016 to 2020 before seeing data, we can be reasonably sure now based on the polling we’ve seen and time to the election that it’s implausible for them to shift that far to the right (especially relative to the nation as a whole).”

          There has been literally zero polling in Oregon.

Georgia was the 2nd most Democratic state in 1960 and the 5th most Republican state in 1964. I think the model is just looking at these historical instances when it’s figuring out what’s possible.

        I think the reason we say stuff like this is impossible is because we have an intuitive understanding of how coalitions are/aren’t changing, and can reject the hypothesis that crazy stuff is happening. 538 does have a polarization variable that takes into account the difference between the parties — perhaps it’s not doing enough for them?

I think Elliott’s point here is that some outcomes are incredibly unlikely. Easy example: you wouldn’t expect Oregon to be red when Idaho and Washington were blue. We have enough polling + understanding of the historical correlations between those three states to say that.

The 1960-1964 Georgia example is consistent with this. The polling showed Goldwater ahead, Georgia voted like its neighbor states, and there was a specific issue that motivated that change. So while the example is super interesting, Georgia in ’64 isn’t the same as someone saying “NJ will be red even when all other 49 states are blue, notwithstanding that polls indicate that NJ is going to be one of Biden’s best 10-15 states and that there are states with very similar people culturally to NJ.”

          Overly long navel gazing tangent: I bet there are worlds where AK, HI, or UT does something really weird and inconsistent with what people would expect because there’s frequently little polling in them and each of them have enough voters who aren’t present in high numbers in other states that if those voters changed their preferences, you wouldn’t necessarily pick it up in other states. But even those states aren’t *total* unicorns (I haven’t studied this, but I’d bet a round of drinks that Alaska correlates pretty well with Montana and somewhat well with the rest of the Mountain West).

        • Eastern Oregon and Eastern Washington ARE Idaho. I live in Central Washington and you could map the political orientation of the PNW against average annual precipitation and come up with a pretty accurate map. Just a matter of tuning turnout and population.

        • Well, Eastern and Southern Oregon aren’t going to gain enough in population between now and November 3rd to swing the state. Plus … Bend.

    • When I looked through previous election data, I found that if you looked at each state’s lean in 2008 and then 2012 and just took the (signed) change in lean, the standard deviation was 4.5 points; same with 2000 to 2004. In years without an incumbent on the ballot it was closer to 7 points. Now, having New Jersey go Republican in a tied national environment may happen every so often. That’s about a 3-sigma event in New Jersey specifically. But having New Jersey go Republican in a Democratic landslide is something else. If Biden wins places like WY and WV, that’s something like a 30+ point win nationally. Even with fairly large uncertainty at the state level, that’s still more than a 6-sigma event in New Jersey.
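
      Under a normal approximation (a simplifying assumption; long-tailed errors would make these tail probabilities larger), those sigma levels work out to:

      pnorm(-3)   # ~1.3e-3: a 3-sigma event will show up here and there in 40,000 simulations
      pnorm(-6)   # ~1e-9: a 6-sigma event should essentially never appear in 40,000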

    • The problem here is correlation. There are plenty of people in central Jersey who commute to NYC for work. There are a ton of people in north Jersey who are basically New Yorkers. And the same goes for central Jersey/south Jersey and Philly. These are the people who vote D and the people Trump would have to flip.

      NJ is highly educated and diverse, so it’s hard to find a reason for Republicans to win there. But it’s even harder to find a reason for Republicans to win NJ without NY/PA coming along for the ride.

  2. That happened in the context of the South shifting to the right. If Trump were to win either Oregon or New Jersey, that would mean he had massively over-performed his polls among college-educated voters, in which case he would have won a lot more states than just those. There is no way Trump would win Oregon and New Jersey while at the same time losing GOP strongholds like KS, UT, etc. Similarly, if Biden were to win LA and MS as in the map, he would have massively overperformed in the South and carried NC and GA. The 538 maps are just completely implausible.

  3. A bit off topic, but Nate doesn’t use approval-rating data either, right? And your model does? That, and Nate is letting the fundamentals drive his model a bit more at this point.

    It seems to me like there’s a pretty tight correlation between the approval-rating trend line on 538 and Biden’s lead in the national polls. To the extent this election is a referendum on Trump, and so we expect that correlation might hold, it’s not unreasonable to hypothesize that there’s a ceiling to Trump’s support. 46% is his 3+ year peak in approval. It was at about that in April, when Biden’s lead was around 4-5%. Assuming the correlation holds and is causative, what would his approval rating need to be to have a reasonable (50-50) shot at winning, and is it possible he can get there given that he hasn’t demonstrated he can move the needle more than 4 points off his long-term average of about 42% approval?

  4. Another way to assess the assumed correlations between states is to look at the distribution of “tipping point probabilities”, which 538 reports. A lower assumed correlation should lead to higher probability for the less probable states. (A perfect correlation should lead to a single huge peak on PA.)
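
    One way to compute that diagnostic from raw simulation output (a sketch; sims is a hypothetical matrix of simulated Democratic two-party vote shares with one column per state, ev a named vector of electoral votes, and I’m ignoring ties and the ME/NE district splits):

    tipping_point <- function(sims, ev) {
      # for each simulated map, order states from the winner's safest to shakiest
      # and find the state that provides the winner's 270th electoral vote
      apply(sims, 1, function(share) {
        dem_wins <- sum(ev[share > 0.5]) >= 270
        ord <- order(share, decreasing = dem_wins)   # winner's safest states first
        names(ev)[ord][which(cumsum(ev[ord]) >= 270)[1]]
      })
    }
    # table(tipping_point(sims, ev)) / nrow(sims): a sharp peak (say, on PA) means
    # high assumed between-state correlation; a flat spread means low correlation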

  5. One of the more interesting oddities of the 538 model is the odds of Biden winning the electoral college given that Biden loses a specific state, which their model output conveniently lists in the state toplines CSV.

    For example, out of the simulations where Biden loses Virginia (adding up all 3.7M simulations from the past 3 months), he lands a 3% chance of winning the electoral college. This seems a reasonable-ish result – losing the Biden+10 Virginia would be quite a blow to the Biden campaign. However, if Biden instead loses neighboring Maryland, a Biden+25 state, his odds of winning the electoral college suddenly rise to 22%. Losing the Democratic bastion of California (Biden+30) bumps that up even further to 23%.

    The only justification I can see for this is that the rare high-variance tails of the model completely screw up the covariance between states, but this extends as far as a full 1-in-40 shot that Biden simultaneously loses Oregon and wins the electoral college (a 26% chance that he wins the EC if he loses OR, times a 10% chance of losing OR), which is a fairly high probability.

    In comparison, of the 9,000 simulations your model for The Economist currently has available, 23 (0.3%) have a Biden loss in Oregon, and all 23 of those end in an electoral college defeat for Biden. Maybe Nate is right that your model is too confident, and their 10% odds of Trump winning Oregon are more accurate than your 0.3%, but Biden simultaneously losing Oregon and winning the electoral college? That shouldn’t be anywhere near 1 in 40.
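
    Those conditional probabilities are straightforward to reproduce from any model’s simulation draws (a sketch; sims and dem_ev are hypothetical objects holding simulated state vote shares and Democratic electoral-vote totals):

    p_win_given_loss <- function(sims, dem_ev, state) {
      loses <- sims[, state] < 0.5     # simulations where Biden loses the state
      mean(dem_ev[loses] >= 270)       # Pr(Biden wins the EC | Biden loses the state)
    }
    # e.g., p_win_given_loss(sims, dem_ev, "OR"); with sensible between-state
    # correlations this should sit far below 1-in-40 for a state like Oregon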

  6. Andrew — An alternative explanation is that 538 could be using much heavier tails than you imagine.

    For example, you’ve noted that 538’s 50% predictive interval is too wide. However, when making that point you used a normality assumption to estimate the 50% interval from 538’s published 80% interval. I think the normality assumption could be meaningfully invalid, and may reflect underestimation of how heavy the tails could get in 538’s model.

    In particular, suppose that 538 chose a scaled t distribution for heavy-tailed distributions. With a degrees-of-freedom parameter of about 1.4, a scaled t distribution can have both an 80% interval that matches the Florida 80% predictive interval published by 538 AND a 50% interval with the same width as your Economist model’s 50% predictive interval (i.e., a width of 3.8 percentage points).

    A scaled t distribution with low degrees of freedom throws a lot of mass to the tails!! With so much mass in the tails, in 10000s of simulations perhaps the 538 model would do things like produce a map where Biden wins all states but Jersey and other bizarre scenarios.

    So, maybe that’s an alternative explanation. Or, if not an alternative, maybe it’s another part of the puzzle in addition to the interstate correlation issue you raise.
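
    The interval arithmetic is easy to check in R (a sketch; the 3.8-point 50% width and df = 1.4 are the values from this comment, and the implied 80% width follows from the t quantiles):

    df <- 1.4
    s <- 3.8 / (2 * qt(0.75, df))   # scale giving a 50% interval 3.8 points wide
    2 * s * qt(0.90, df)            # implied width of the matching 80% interval
    # as df drops toward 1, the ratio of the 80% width to the 50% width grows,
    # i.e., the same central interval gets paired with much heavier tails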

      • More:

        Sure, but the point is that if you have these independent state errors with heavy tails, that induces a lower correlation between states. I clicked through and looked at Nate’s thread, but none of it explains the map where Trump wins New Jersey but loses the other 49 states, or those maps where Trump wins California but loses the election. Mathematically, I can see that you can get this by having state errors with long tails and the resulting lower correlation between states—but, to me, that’s an indication that there’s a problem with the forecasting method.

        To put it another way: when a model gives predictions that don’t seem to make sense, then I think it’s a good idea to at least consider the possibility that your model might have problems.

        Again, these models don’t come down from the heavens. They’re human-constructed and can have human error.

        • P.S. It’s fine to have long tails, but (a) there’s still the question of how long the tails should be (do we really think that the 95% interval should go all the way down to 0.42 for Biden’s two-party vote share in Florida?), and (b) it’s possible to add long tails without screwing up the correlations.

        • Andrew — Thanks for clarifying! From your comments, I still think you’re underestimating how heavy tails can get, but prompted by your comments I might have spotted a problem with 538’s model.

          Tails get heavy:

          I ran some quick simulations. It looks like the probability assigned to “Biden wins everywhere except New Jersey” can be increased more than 100,000 fold if a model has heavy enough tails, even if the heavy-tailed model and a comparable non-heavy-tailed model have the same between-state correlation, point estimate for the Biden electoral college vote share percentage (64%, chosen from your model), and 50% predictive interval (56-71%, also chosen from your model).

          I can discuss that more if you’re interested, but I’m not sure if you find a 100,000-fold increase surprising or not.

          Possible 538 problem:

          If the 538 model outputs a heavy-tailed distribution, they are presumably using an inverse logit or another transform to limit predictions to 0-100% (or 0-538, etc.). However, heavy-tailed distributions can easily “outrun” the transform at the tails, resulting in unintended peaks in the prediction density near 0% and 100%.

          Here’s an example:

          invlogit = function(x) 1/(1 + exp(-x))
          # a million draws from a heavy-tailed t (df = 1.5, scaled by 1/8) pushed
          # through the inverse logit: note the spikes piling up near 0 and 1
          hist(invlogit(rt(10^6, df = 1.5)/8), breaks = 10^5, xlim = c(-0.02, 1.02))

          It’s possible that some of the most bizarre 538 maps are from unintended peaks near 0% and 100%.

        • Hm… actually, disregard both my points.

          There are problems with both:

          For the “100,000-fold increase”, I thought I’d made an assumption that ensured the same between-state correlations for the heavy-tailed and non-heavy-tailed models. However, all I assumed was: Under the condition that Biden wins X electoral college votes, the heavy-tailed model assigns the same conditional probabilities as the non-heavy-tailed model to all possible electoral college maps. That assumption doesn’t seem to imply between-state correlations are the same for the models.

          For the “possible 538 problem”, I now see peaks near 0% and 100% can be intentional floor and ceiling effects.

  7. I think it would be worthwhile to get in touch with Nate and see if he has any thoughts, constructive criticism and all. I think this is quite a useful critique to examine more closely.

  8. The more you describe your approach to modelling here, the more I doubt that the model outputs convey useful information. What really hits me with a wave of doubt are statements like this (quoted from the last two posts about Florida predictions):

    > As I wrote, it’s happened to me many times that I’ve fit a model that seemed reasonable, but then some of its predictions didn’t quite make sense, and I used this disconnect to motivate a careful look at the model, followed by a retooling.

    > This is central to our workflow, to make all sorts of inferences and predictions from a fitted model and focus in on what looks wrong. This was the sort of reasoning that motivated us to fix our model last month.

    These suggest that you adjust the model until it makes predictions that seem reasonable [make sense, don’t look wrong] to you. Doesn’t this mean you are fitting the model to your own intuitions on what is likely and unlikely to happen? Well, why not skip the middleman and just give us your own intuitive judgement on what is likely to happen?

    At least from a research perspective, it would seem better to ‘preregister’ the model, not fix it when its predictions look intuitively wrong.

    I assume the same points apply to most models trying to predict election results: at least you are letting us see how the sausage is made, which is really useful.

    • Fin:

      What happens is that we have lots of prior information that is not used when we build models. Just consider all the models in statistics textbooks. Real life is not logistic curves, additivity of inputs, normal distributions, etc etc. The models we fit are crude approximations.

      Consider two alternatives:

      1. We fit models that we completely understand, models that are so simple that they will have no bugs. The problem is that these models are so simple that they won’t capture some important features of reality. For example, in election forecasting we could set between-state correlations to zero, but that will give bad forecasts.

      2. We build complicated models to capture some of the complexities in the world that we care about. But these models are at the bleeding edge of research and can have problems: bugs in the code, conceptual errors in the model formulation, problems that we do not realize at first. If we commit to using such a model and don’t let ourselves fix its problems, that will give us bad forecasts too.

      You’d like to have a model that’s good the first time and never needs to get fixed, but realistically that doesn’t always happen, at least with my colleagues and me. So I think your plan will only work if you hire a better group of analysts, some group that’s better than us and better than the Fivethirtyeight team and gets everything right the first time. In the meantime, I’ll keep trying to muddle through. The advantage of muddling through is that I can learn some things, write them up, and maybe future researchers won’t make the same mistakes. Instead they’ll be able to do better the first time and make more sophisticated mistakes as they go further.

      You write, “Doesn’t this mean you are fitting the model to your own intuitions on what is likely and unlikely to happen? Well, why not skip the middleman and just give us your own intuitive judgement on what is likely to happen?” No, our inference is not just coming from our priors. The inference comes from priors and new data. We can’t “skip the middleman” because we haven’t seen the new data yet!

      Regarding your other point: All our models are on github, so they are preregistered. If you want you can go back to our earlier code and find the bugs in it yourself.

      • Ok, but where is the line between tuning the model to fix the “implausible” predictions (quoting from a previous blog post on this topic) and tuning the model to simply fit prior expectations? Who’s defining “implausible”? Maybe 538 has a different definition of implausible. Are there more judgment calls in modeling for election forecasts than maybe for other research you might do?
        Serious questions. I like (and use) these methods.

        • Jd:

          There’s no sharp line. We spend a lot of time at Cantor’s corner. I would not say that election modeling involves more judgment calls than my other applied research. We have to do the same sorts of things in toxicology, for example. And even something as simple as logistic regression modeling of behavior—a staple of quantitative social science—can involve a lot of tinkering, once you realize that probabilities never really go down to 0 or all the way up to 1.

          Regarding Fivethirtyeight: I can well believe they have a different definition of implausible. I just want to hear it from them! If they think it really could happen that Trump wins NJ and loses all the other 49 states, or if they really think that, if Trump wins California, he has a big chance of losing the national election, I want them to commit to that. As it is, I think their model makes a lot of predictions that they don’t really believe, and they haven’t wrestled with what that means for their model.

      • Andrew: thanks for responding.

        I guess this issue turns on the distinction between the model and the code. In my understanding, the model is a particular (complex!) statistical description of what you think is going on. The code is an implementation of that model. Fixing bugs in the code (where there is some error where the code doesn’t implement the model as intended) is great, and doesn’t lead me to doubt predictions (though if there were a lot of bugs, that could be a worry). Fixing the model when it doesn’t make (what you think is) a reasonable prediction seems like a mistake, at least to me. Maybe the model is right! and the prediction you think is unreasonable will, in fact, hold up. In general, unreasonable or surprising predictions from a model are the most interesting part of that model: if the model doesn’t make predictions that conflict with intuition, how much extra value does it bring?

        Maybe the difficulty here is that I’m thinking of the model as a hypothesis; and “fixing” your hypothesis when it doesn’t make the predictions that you want is a no-no.

        • Through the muddling process that Andrew describes, he’s more likely to identify and fix a conceptual error if it leads to a model outcome that is implausible. The number of conceptual errors in the model decreases but, at the same time, the model converges toward the prior.

          On the other hand, Andrew has also demonstrated a different process, comparing different models constructed by groups with different priors and understanding the differences. Such a process does not guarantee convergence toward one’s own prior, so that’s good, and if done properly only biases subsequent estimates of model structural uncertainty.

        • John:

          It’s a mistake to say “the model converges toward the prior.” This is a mistake for two reasons. First, it supposes there is this fixed thing called “the prior” which exists independent of all else. Actually, one thing that happens during the model evaluation procedure is that we become aware of more information that we have. The second mistake in your formulation is when you say “toward the prior.” What the model converges to is a synthesis of all the information that goes into it, including poll data, previous election results, models for public opinion changes and polling errors, and so forth. The model converges toward our best estimate, which makes use of all that information.

        • Fair enough. Would it be accurate, though, to say that the model output tends to move in the direction of the prior under such an iterative improvement process?

        • John:

          I would not say “move in the direction of the prior.” Rather I would say “move in the direction of including more information.”

          That said, sure, it’s possible to cheat and get any result you want. But then that’s not “the prior.” It’s “the result you want to get.”

        • I think this gets at the question of what Bayesian software should do. Does a model “express a hypothesis,” or “encode prior knowledge”? You seem to have some notion of the “model” as crunching the data and giving insight that people might find surprising. We might also see it, though, as automating and more carefully performing reasoning that a human *could* do; whatever answers it gives, it should be able to tell you the pattern of reasoning that led it there, and that pattern of reasoning should make sense to you, given the assumptions you encoded. If you can see the pattern of reasoning that led it to a surprising result, and you can see the flaw in the reasoning (the place where you disagree with it), then fixing the prior or likelihood to help the model avoid that flaw is totally kosher. This doesn’t mean the model can’t surprise you: but it shouldn’t surprise you just with a prediction — you should be able to look at the entire pattern of reasoning, see that it holds up, and realize that your assumptions, combined with the data, really do imply the surprising result. If instead your takeaway is, “Oh, right, of course the model said this surprising thing, because it doesn’t account for Factor X,” then the ‘surprising result’ isn’t very informative.

          One analogy I like is to think of Bayesian inference software as similar to a smart consultant you can hire, to outsource the “hard thinking” about some problem. Then, we are in a scenario where

          • you give your consultant a briefing about how elections in the US work
          • the consultant goes off and thinks for a while, interpreting data (from past elections and from polls) in light of the info you gave them
          • the consultant comes to you with some predictions (and explanations of them).

          You might be surprised by a prediction the consultant made; if, when they explain the prediction to you, you immediately see a flaw, it’s not “bad practice” to explain the flaw to them (by way of specifying more of what you know about elections), and asking them to go think again.

          (I suppose that a failure mode of this kind of debugging is when the model overlooks some fact about elections but, by overlooking those facts, comes up with ‘unsurprising’ predictions. If you debug based on surprising predictions, then on average, your model will have more conceptual bugs that lead to (false) unsurprising predictions than bugs that lead to (false) surprising predictions.)

  9. I’m not sure I follow — is the claim that the New Jersey scenario is an event with literally zero probability such that it could never occur in any number of simulations? Otherwise, what does picking the most absurd result out of all the simulation runs prove?

    I don’t know exactly what they’re doing, but if memory serves, 538 credited its relatively strong performance in 2016 to the fact that, unlike most forecasters, they were factoring in interstate covariances and Nate has said that those covariances in the current model are very high.

    The CA result seems more problematic, but how many simulations do they have with Trump winning CA? Could just be a law of small numbers thing. In any case, it’s not clear to me that the conditional probability in these sorts of scenarios should be extremely high. There’s always some chance of a state-specific shock.

    Not deeply familiar with either model, but isn’t the difference between yours and theirs just about the “fundamentals” component?

    • Joe:

      Nothing’s literally zero probability. But it’s low enough probability that we should not be seeing it in 40,000 simulations.

      I do believe that the Fivethirtyeight model has between-state correlations. But I think these correlations are too low. If you start with a reasonable correlation matrix but then start adding independent long-tailed errors, then you’ll end up with correlations that are too low.

      You can run our model with other settings for the fundamentals but you’re not gonna see any simulations where Trump wins New Jersey and loses the other 49 states.

      • “But it’s low enough probability that we should not be seeing it in 40,000 simulations.”

        Sorry, I’ve got to be missing something, but this seems like the mother of all multiple-comparison issues, no? It’s hard for me to put a probability on something like the NJ scenario without a model, but the probability of the NJ scenario or something equivalently ridiculous could easily be 1 in 10,000.

        “If you start with a reasonable correlation matrix but then start adding independent long-tailed errors, then you’ll end up with correlations that are too low.”

        Okay, I get what you’re saying there. It’s not clear to me why it’s unreasonable to assume the possibility of state-specific shocks, but it’s not really my area. Still seems like a natural disaster or something in one particular state could pull it far off from the national trends.

      • I feel a little silly talking to you about basic probability, but bear in mind that for an extreme 1-in-a-million event, the chance that it shows up in 40,000 simulations is ~4%. And considering there are many, many 1-in-a-million events, the fact that one or several of them show up in 40,000 simulations should be hardly surprising.
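
        A quick check of that arithmetic (assuming independent simulation draws):

        1 - (1 - 1e-6)^40000   # ~0.039: a one-in-a-million event appears at least once about 4% of the time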

        • Maybe when you wrote “we should not be seeing it in 40,000 simulations” you meant “we should not be seeing it in 100 simulations”?

          (The multiple-comparison aspect mentioned in another comment remains: each time they regenerate those 100 points there is some chance that one of many “low probability” scenarios will pop up.)

      • Given that Biden is in front in most states, increasing the between-state correlations should mean that you decrease his win probability.

        I agree that too low correlations are probably a problem with the 538 model. But if you fixed it, I suspect it would increase the difference between their model and yours.

        • Brian:

          I think the correlations and the national uncertainties go together. If Fivethirtyeight were to increase their between-state correlations, then they’d be able to decrease the uncertainty in each state and still get their target result of Pr(Biden wins) of approx 70%.

  10. Andrew, could you ask The Economist data team to make the time series of the state poll averages available for download? As it is, only the “topline” (current day) is available.
    Thanks!

  11. One thing I haven’t seen mentioned that I’m curious about is how much of the low between-state correlations may be attributable to the 538 model’s additional uncertainty based on COVID-19. Specifically, their methodology page states that:

    *“We attempt to account for additional uncertainty in Election Day results because turnout will potentially be less predictable given the pandemic.

    *We allow COVID-19 to be a factor in determining covariance. That is to say, states that have had high rates of COVID deaths and cases (such as Arizona and New York, which otherwise don’t have that much in common politically) could have correlated outcomes. Likewise, we also consider covariance based on a state’s anticipated rate of mail voting.”

    I highlight these two because they explicitly seem like they would reduce the kind of between-state correlation based on partisanship (among other things) that you might expect to see in a “normal” election year. It looks to me like 538 is modeling the possibility that turnout and mail voting could lead to some independent state-wise upsets. Perhaps combining that with the other long-tailed errors is producing extreme events like this that otherwise look particularly improbable. I’m not sure that really helps explain the California EC result, though, because I think the voting system would have to be FUBAR pretty badly to turn CA red (and keep the other states at ~current partisanship). Maybe it makes the FL CI seem more plausible.

    How big the effect is and how reasonable that is is another question, of course.

    • I can see COVID having some very weird effects, given that older people are generally more likely to vote and more vulnerable to COVID. So if due to COVID fears older people are *less* likely to vote this year than younger people…

      And this might not be predictable even 2 months ahead, because 2 months is enough time for COVID fears to decrease a lot (if numbers continue to go down and/or treatments improve) or increase (if numbers rebound).

      • But if the primary effect of COVID is not changing *how* people will vote, but *who* will vote, would that be captured in the polls? Especially as people who wouldn’t feel comfortable voting in-person now might feel comfortable 2 months from now if numbers keep going down.

        • > Especially as people who wouldn’t feel comfortable voting in-person now might feel comfortable 2 months from now if numbers keep going down.

          Or the other way around if numbers keep going up (the situation is not the same everywhere and the “second wave” was supposed to arrive in the fall).

        • True, but perception lags behind reality on this. (It takes time for infections to translate into *reported cases*, and much longer to translate into reported deaths.) There isn’t really time for a fall wave to become ‘publicly visible’ before the beginning of November. (And COVID doesn’t really seem to have that sort of seasonality, at least not very strongly.)

          Sure, the situation isn’t the same everywhere…but the states that look problematic now (places like Iowa, the Dakotas, and West Virginia) are very low-population (not likely to have a noticeable impact on national numbers, therefore shouldn’t affect voting behavior elsewhere) and not swing states (so voting behavior there isn’t terribly relevant).

        • confused –

          OK. You’re on record. Hopefully your prediction here will turn out more accurate than what you were predicting for Texas, Arizona, etc.

        • Total cases pretty flat for a couple of weeks. The drop in case rates in Florida, Texas, and Cali may continue, and they comprised a lot of the increase; but since there was a drop, maybe behaviors will reverse towards normal and the rate may flatten. Kids going back to school. Colleges getting under way. Parties. More businesses reopening. More people believing more strongly that social distancing is only for pinko wussies. More time being spent indoors.

        • To be clear, I’m not ruling out a fall wave (I don’t expect one, but it could happen, we don’t know enough); what I’m actually predicting is that there won’t be an increase in reported deaths before Election Day (i.e., the 7-day average of deaths then will be lower than today’s).

          Given the degree of time-lag in the system (both infection to death and death to reporting of death) and the current downward trend, I think it would be very hard for that *not* to be true.

        • Also – for deaths at least, it’s driven by infections among the oldest age groups to a very high degree.

          If people giving up on social distancing isn’t “age-neutral”, infections could rise and deaths drop (10,000 infections among 20-year-olds would be expected to result in fewer deaths than 100 infections among 80-year-olds). There will probably be a lot of infections at colleges… but depending on who they interact with this may not be noticeable in deaths.

          Certainly when I was in college the “hands on” stuff like labs were mostly taught by TAs rather than the (older) professors…

        • > To be clear, I’m not ruling out a fall wave (I don’t expect one, but it could happen, we don’t know enough);

          It seems it’s happening: new daily cases (7-day average) are now over 85k, more than doubling from September 7 (all figures according to the NYT).

          > what I’m actually predicting is that there won’t be an increase of reported deaths before Election Day (IE, 7-day average of deaths then will be lower than today).

          The 7-day average of deaths on September 7 was 802, currently it stands at 825.

          > Given the degree of time-lag in the system (both infection to death and death to reporting of death) and the current downward trend, I think it would be very hard for that *not* to be true.

          The downward trend ran out of steam quickly: it went down to 702 by September 10 and spent most of the following six weeks in the low 700s (the lowest mark was 697, on October 6 and 16).

        • Yeah, it’s happening surprisingly early. This does not really look like ‘classic’ seasonality IMO.

          I could quibble that what we’re seeing is probably not entirely a “fall wave” in the sense of seasonality-driven resurgence in previously hit areas – some of the rural Great Plains areas were never hit hard in the first place – but not much point… I was definitely wrong about the overall death numbers.

          > I could quibble that what we’re seeing is probably not entirely a “fall wave” in the sense of seasonality-driven resurgence in previously hit areas

          I don’t know about “seasonality-driven resurgence” but it’s a “wave” (for some definition of wave) happening in “fall” (for any definition of fall) mostly everywhere in the US.

          There are only a handful of states where both new cases and deaths are down.

          In most states, deaths are up by over 50% since September 7.

          In most states, new cases have more than doubled since September 7.

          In a previously hit area like New York, cases and deaths have tripled since September 7.

        • >>I don’t know about “seasonality-driven resurgence” but it’s a “wave” (for some definition of wave) happening in “fall” (for any definition of fall) mostly everywhere in the US.

          Yeah, I agree in terms of describing what’s happening now.

          The difference is only relevant in terms what that means for what will happen in the next 4 months or so.

          If seasonality is a primary driver, it will probably continue to get worse for a while (flu season deaths often peak in January).

          If the current Great Plains/western Midwest spike follows the pattern of the NY/NJ March-April one or the TX/AZ summer one, it would peak significantly sooner.

          And both could be true, I mean at the current rates most people in the Dakotas might have COVID before January, but elsewhere measures might slow it down…

        • > There isn’t really time for a fall wave to become ‘publicly visible’ before the beginning of November.

          The election is more than eight weeks away. That gives plenty of time for things to become visible. The situation changed noticeably in some states between mid-May and mid-July.

          > the states that look problematic now (places like Iowa, the Dakotas, and West Virginia) are very low-population (…) and not swing states (..)

          Fauci listed the other day 7 problematic states: North Dakota, South Dakota, Iowa, Arkansas, Missouri, Indiana, Illinois. Two of them are in the top 10 by population. Two of them are among the 12 states classified as “competitive” in the forecast from The Economist.

          And now for something completely different: Let’s consider spherical students….

          “In just under two weeks of classes, there have been more than 700 positive COVID-19 cases on campus, according to the university. The school’s researchers had anticipated about 700 positive cases for the entire fall semester, but if current rates continue, the school of about 50,000 students could see as many as 8,000 positive cases by the end of the term, according to a statement from the university.

          “At a press conference held over Zoom, Nigel Goldenfeld, a physics professor who contributed to the school’s reopening plan, said the campus’s models had already anticipated parties and people not wearing masks — but they did not take into account that students would fail to isolate, that they would not respond to local health officials’ attempts to contact them or that students who had tested positive would nonetheless attend and host parties.”

          https://www.npr.org/sections/coronavirus-live-updates/2020/09/03/909137658/university-with-model-testing-regime-doubles-down-on-discipline-amid-case-spike

        • By the way, for what it’s worth the IHME forecast was updated recently and their base scenario is that deaths in the US will remain around 1k per day in September and then increase to 2k per day by early November.

          https://covid19.healthdata.org/united-states-of-america?view=daily-deaths&tab=trend

          Joshua, that xkcd is brilliant. Of course, the story brought to my mind previous ones making fun of physicists (“why does your field need a whole journal, anyway?”).

          > Fauci listed the other day 7 problematic states: North Dakota, South Dakota, Iowa, Arkansas, Missouri, Indiana, Illinois. Two of them are in the top 10 by population. Two of them are among the 12 states classified as “competitive” in the forecast from The Economist.

          I was wrong there. I don’t know why I had the impression North Carolina was on the list, and I didn’t check when I copied it. Both counts drop to one: Illinois has a large population, Iowa is a swing state.

        • Sure, 8 weeks until the election… but hospitalizations have been dropping, so the next several weeks’ death decline is already “baked in”. (Probably more than that, given the time lag between “occurrence of death” and “reporting of death”.)

          I.e., if infections start going up tomorrow, I don’t think we’d really see the effect in *reported* deaths for 5-6 weeks (3-4 weeks from infection to death + 2 weeks or so for reporting delay).

          So a fall wave would have to start before the equinox to reverse the trend in deaths before Election Day.

          Oh, I wouldn’t be surprised to see very high case rates among college students… but that’s a very low-median-age environment (most of the people on a campus are students) and so a high number of infections will not translate to many deaths.

        • And as for IHME… current 7-day average deaths for the US is 840 and dropping, and it should continue to drop for a few weeks from the other (more leading) indicators like hospitalizations and % positive … so 1k average for September is implausibly pessimistic. And 2k in November also is… death rates will be lower than seen in March/April due to better treatments so where are they getting THAT many infections from?

          A fall wave is believable (I doubt it, but no one really knows enough about COVID seasonality to say one way or another) but not one that large.

        • confused –

          Where do you get hospitalizations and (national) positivity rate from?

          Cases dropped by a considerably higher percentage than deaths from 1.5 months ago. I think you’re not considering changes in the number of tests.

          And this is for you:

          https://ourworldindata.org/covid-health-economy

          Economic harm may well be associated with the rate of death more than (or somewhat independently from) the degree of shelter-in-place orders. And the direction of causality remains complicated to assess.

        • Regarding direction of causality…

          I think that a lot of people often underestimate the importance of longitudinal data…not only because cross-sectional data is inadequate for assessing causality, but also because longitudinal data often helps to shed light on direction of causality because it allows for sequences in the chains of mechanism to become more obvious.

          At any rate, as for causality with covid, I still think we’re in the bottom of the second inning or the end of the fourth quarter.

        • >>Where do you get hospitalizations and (national) positivity rate from?

          covidtracking.com and their twitter account

          >>Cases dropped by a considerably higher percentage than deaths from 1.5 months ago.

          Well, yes, because of lag. Currently-reported deaths mostly occurred a couple of weeks ago, from infections maybe a month and a half ago.

          I understand IHME is assuming seasonality (IMO an implausibly large amount of it).

          But what I meant by “where are they getting all the infections from” was location and age.

          If schools/colleges are a major factor in fall we would expect the IFR to be much lower than already seen (since median age of the infected population is much lower).

          Extrapolating current trends to Jan 1 (i.e., if there is no seasonality), we would expect below 250,000 deaths by Jan 1 (covid19-projections.com has 220,000 by Nov 1, and a death rate of 404/day and dropping then).

          A fall wave would have to heavily infect older populations — the opposite of what a school/college driven one would do — to account for 160,000 extra deaths. IFRs in younger populations are extremely low.

        • confused –

          > Well, yes, because of lag. Currently-reported deaths mostly occurred a couple of weeks ago, from infections maybe a month and a half ago.

          I think you missed my point. I believe the 1.5 months takes care of the lag by adding the delay between infection and death to the delay between death and reporting of death. Regardless, you’re assuming the # of tests is constant, and it isn’t. There has been a drop in the number of tests, and so the drop in the number of cases is to some degree a function of the drop in the number of tests, not a change in positivity. There isn’t a constant ratio between the number of identified cases and the number of deaths because of (1) changes in fatality rates due to improvements in treatment and (2) changes in the number of tests.

          If you want to predict future deaths, it would seem to me that hospitalizations (and ICU admissions) are the way to go.

        • confused –

          > I understand IHME is assuming seasonality (IMO an implausibly large amount of it).

          How did you know that? The thread I linked to went through a detailed analysis to make that determination.

          I assume that they have a technical process for estimating the effect of seasonality, and you don’t know what that process is. So it seems to me that they are calculating seasonality whereas you are merely guessing without actual study of the indicators of seasonality.

          Or maybe I’m wrong and you have a technical argument about why their method for estimating seasonality is wrong? Maybe you know of a particular assumption they’re making that is wrong?

          You could be right but it seems to me you’re on shaky ground from a structural standpoint. Calculations can be wrong but in general they’re probably more plausible than mere guesses.

        • confused –

          An article from the end of June:

          > In Arizona, the time between diagnosis and death from Covid-19 now is about 14 or 15 days, up from four or five days early in the pandemic. Then the state health department must verify the death, so there can be a three-plus week lag between a new case and a fatality being reported, Gerald said.

          An article from early July:

          > “Some people do get infected and die quickly, but the majority of people who die, it takes a while,” [epidemiologist from Boston University] Murray continued. “It’s not a matter of a one-week lag between cases and deaths. We expect something more on the order of a four-, five-, six-week lag.”

        • Sorry – left this out:

          > Eleanor Murray, an epidemiologist at Boston University, told me. “Today’s cases represent infections that probably happened a week or two ago. Today’s deaths represent cases that were diagnosed possibly up to a month ago, so infections that were up to six weeks ago or more.”

        • >>Regardless, you’re assuming the # of tests is constant and it isn’t.

          No I’m not – that’s why I referenced hospitalizations and % positive, not the number of cases (which I consider near-useless/uninformative under current US conditions).

          >>If you want predict future deaths it would seem to me that hospitalizations (and ICU admissions) is the way to go.

          Oh, yeah, hospitalizations are what I’m primarily looking at. But I was thinking that if infections rose % positivity would probably go up before hospitalizations.

          >>How did you know that? They thread I linked to when through a detailed analysis to make that determination

          Because I saw that thread on my daily “looking for COVID news” check yesterday evening before I came to this site and saw that you linked to it ;)

          >>Or maybe I’m wrong and you have a technical argument about why their method for estimating seasonality is wrong? Maybe you know of a particular assumption they’re making that is wrong?
          >>You could be right but it seems to me you’re on shaky ground from a structural standpoint. Calculations can be wrong but in general they’re probably more plausible than mere guesses.

          I don’t know the details of their method, no. It’s more that the result seems utterly implausible; IFR in fall should be expected to be lower due to schools/colleges meaning a decreased average age of infection, and possibly further improvements in treatments, so a death peak higher than March/April just isn’t going to happen.

          Unless the herd immunity threshold in US cities really is in the 60%-70% range, most would probably hit herd immunity first.

          Plus I have zero trust in the IHME model in general as it has performed very badly so far.

        • confused –

          > IFR in fall should be expected to be lower due to schools/colleges meaning a decreased average age of infection

          You also seem to be operating under the belief that a lot of younger people getting infected doesn’t lead to a lot of older people getting infected. That’s a dubious line of reasoning.

          BTW – please note that there’s evidence to suggest that increased spread of infections impacts the economy more than SIPs do:

          https://t.co/4e2sjjCqdN?amp=1

        • >>You also seem to be operating under the belief that a lot of younger people getting infected doesn’t lead to a lot of older people getting infected. That’s a dubious line of reasoning.

          Not exactly. If there is a fall wave at all, sure it wouldn’t be limited to just students.

          But it doesn’t take all *that* much of a shift in the age distribution to have a huge effect on the IFR, and e.g. college towns are inherently going to have a lower median age than the national average.

          But a hugely disproportionate part of the deaths so far (and especially in early months in the US) come from long-term care facilities. These residents are not really part of the generally-mixing population so school/college-driven outbreaks may not be relevant to this subpopulation.

          And it seems very likely that independently-living elderly people will on average be more cautious than younger people.

          All these things should drive down the age distribution of infections and thus the IFR.

          As for the economic aspect – could very well be! However, the “second-order” effects I’m more concerned about (including social effects as well as strictly economic ones) wouldn’t be measurable yet. (Things like decreased quality of education, shift toward large potentially-monopolistic sellers like Amazon and Walmart that provide delivery services, etc etc.)

          Also, we are probably too early in the game to judge duration of disruption. If it is genuinely “over” in Sweden but vaccines don’t get rolled out to the population as a whole until spring, this fall and winter might have a very different picture (IE Sweden’s strategy might look better).

        • But the age distribution stuff is really secondary. My primary problems with it are…

          – COVID doesn’t seem to be all that seasonal. I’ve seen arguments that the seasonality is different (summer-oriented) in the Southwest US and Latin America… but then what about the Dakotas, Idaho, etc.?

          – If COVID did exhibit traditional flu-style seasonality, deaths wouldn’t rise that much that early. September isn’t a big flu month.

          – I think the herd immunity threshold in the US will be/is significantly lower than the often quoted 60%-70%, maybe not high enough to allow for that many deaths, especially as some level of social distancing should still be operative this fall and lower the threshold further.

        • confused –

          > Not exactly. If there is a fall wave at all, sure it wouldn’t be limited to just students.

          >> But it doesn’t take all *that* much of a shift in the age distribution to have a huge effect on the IFR, and e.g. college towns are inherently going to have a lower median age than the national average.

          That doesn’t really add up. Yes, a higher % of infections among younger people would lower the overall IFR, but more infections among younger people doesn’t affect the IFR among older people. It’s clear that treatments have gotten better, lowering the IFR, but more younger people getting infected means more older people getting infected, which means more people dying.
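
          The arithmetic behind this is worth making explicit: the overall IFR is an infection-weighted average of age-specific IFRs, so a younger infection mix lowers the aggregate IFR even while total deaths rise, if infections in older groups also grow. A toy sketch; every number below is made up purely for illustration:

          ```python
          # Toy illustration: aggregate IFR vs. total deaths in two scenarios.
          # Age-specific IFRs and infection counts are invented for this sketch.
          ifr = {"young": 0.0005, "old": 0.05}

          def summarize(infections):
              deaths = sum(infections[g] * ifr[g] for g in ifr)
              total = sum(infections.values())
              return deaths, deaths / total  # (total deaths, aggregate IFR)

          # Scenario A: current mix. Scenario B: many more young infections,
          # with some spillover into old infections.
          a = summarize({"young": 100_000, "old": 20_000})
          b = summarize({"young": 400_000, "old": 40_000})
          print(f"A: deaths={a[0]:.0f}, aggregate IFR={a[1]:.3%}")  # 1050, 0.875%
          print(f"B: deaths={b[0]:.0f}, aggregate IFR={b[1]:.3%}")  # 2200, 0.500%
          # B has a *lower* aggregate IFR but roughly double the deaths.
          ```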

          > But a hugely disproportionate part of the deaths so far (and especially in early months in the US) comes from long-term care facilities.

          More reason, actually, to expect a spike: there are likely to be more older people getting infected who have not previously been infected, because they don’t live in congregate settings but do have interaction with young people. Like grandparents who are primary caregivers for their grandchildren who are now going to school. Or seniors who aren’t in LTCFs because they live in multi-generational homes with their children who help with their care.

          > These residents are not really part of the generally-mixing population so school/college-driven outbreaks may not be relevant to this subpopulation.

          Again – that’s the point. The seniors who would be exposed now ARE the ones who are mixing with younger people.

          Now of course, there’s also the consideration that there are some things specific to LTCFs, in addition to age, which contribute to the high level of mortality among their residents – namely long hours of close contact and inadequate air filtration. But those same considerations could well apply to an environment where a senior is living in a house with kids going to school or young adults going to parties at her college.

          > As for the economic aspect – could very well be! However, the “second-order” effects I’m more concerned about (including social effects as well as strictly economic ones) wouldn’t be measurable yet. (Things like decreased quality of education, shift toward large potentially-monopolistic sellers like Amazon and Walmart that provide delivery services, etc etc.)

          But again, you are assuming a differential impact along those lines from SIPs relative to the impact from more widespread death and disease. It could be the case, but it may not be, and we don’t really have the evidence. A lot of people have been arguing that it has been the SIPs that have caused the economic damage; if I’m not mistaken, you’ve been making that argument. The data that we have so far don’t seem to back that up.

          > Also, we are probably too early in the game to judge duration of disruption. If it is genuinely “over” in Sweden but vaccines don’t get rolled out to the population as a whole until spring, this fall and winter might have a very different picture (IE Sweden’s strategy might look better).

          With that I agree. As you and I have discussed, the entire picture over time could well change dramatically depending on whether or not a vaccine is developed and distributed.

          >> – COVID doesn’t seem to be all that seasonal. I’ve seen arguments that the seasonality is different (summer-oriented) in the Southwest US and Latin America… but then what about the Dakotas, Idaho, etc.?

          That’s my issue with your critique of the estimate. Presumably, they aren’t just guessing. There’s a lot of evidence that coronaviruses spread much more widely during the winter, and hence a lot of evidence that there would be seasonality with this particular coronavirus. What evidence do you have that this particular coronavirus should be anomalous?

        • confused –

          The weirdness in the logistics of posting at this blog continues. I posted a response; it showed up with no indication of being in moderation, but there’s no listing in the recent comments list of the comment having been posted.

        • confused –

          Weird. It showed up after a lag.

          Just to be clear – my basic reaction is that 410k is prolly too high. But I also know that’s just a gut reaction and gut reactions don’t stand up terribly well against careful analysis, in general.

          Anyway, check this out – on the impact of the Sturgis super-spreader event. Can’t we expect more super-spreading as we head into November?

          -snip-
          If we conservatively assume that all of these cases were non-fatal, then these cases represent a cost of over $12.2 billion, based on the statistical cost of a COVID-19 case of $46,000 estimated by Kniesner and Sullivan (2020). This is enough to have paid each of the estimated 462,182 rally attendees $26,553.64 not to attend. This is by no means an accurate accounting of the true externality cost of the event, as it counts those who attended and were infected as part of the externality when their costs are likely internalized. However, this calculation is nonetheless useful as it provides a ballpark estimate as to how large of an externality a single superspreading event can impose, and a sense of how valuable restrictions on mass gatherings can be in this context. Even if half of the new cases were attendees, the implied externality is still quite large. Finally, our descriptive evidence suggests that stricter mitigation policies in other locations may contribute to limiting externality exposure due to the behavior of non-compliant events and those who travel to them.
          -snip-

          http://ftp.iza.org/dp13670.pdf
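
          As a quick arithmetic check on the quoted figures (the implied case count here is backed out from the quoted totals rather than taken from the paper directly):

          ```python
          # Back-of-the-envelope check of the quoted Sturgis externality figures.
          cost_per_case = 46_000   # Kniesner and Sullivan (2020), per the excerpt
          total_cost = 12.2e9      # quoted aggregate cost ("over $12.2 billion")
          attendees = 462_182      # quoted attendance estimate

          print(f"implied cases: {total_cost / cost_per_case:,.0f}")   # ~265,217
          print(f"cost per attendee: ${total_cost / attendees:,.2f}")  # ~$26,396
          # Close to the quoted $26,553.64; the small gap is rounding in the
          # "over $12.2 billion" aggregate figure.
          ```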

        • >>Like grandparents who are primary caregivers for their grandchildren who are now going to school. Or seniors who aren’t in LTCFs because they live in multi-generational homes with their children who help with their care.

          Sure, this will happen to some degree.

          But the IFR will be far lower in these groups. Grandparents who are primary caregivers of K-12 students are on average much younger (many of them are not even really “elderly” or “seniors”) and significantly healthier than LTCF residents.

          >>Now of course, there’s also the consideration that there are some things specific to LTCFs, in addition to age, which contribute to the high level of mortality among their residents

          Not just environment; it’s also that LTCF residents are a subpopulation which ‘selects for’ poor health.

          >>A lot of people have been arguing that it has been the SIPs that have caused the economic damage; if I’m not mistaken, you’ve been making that argument.

          Not quite. I’ve been blaming it on fear driven by media, business, WHO, *and* government responses (not just SIPs) as well as by the actual risk of the disease.

          All these things feed off each other, especially in the modern highly-interconnected, instant-media world.

          The “counterfactual” I’m thinking of is not just “no SIPs” – by late March we were already pretty strongly on this path. It’s more like “if the WHO said in late February that containment wouldn’t work, contact tracing would be unfeasible in most Western nations, just treat this as a flu pandemic and use your flu pandemic plans”.

          >> Presumably, they aren’t just guessing.

          The IHME model performance has been quite bad; I’m sure they are basing their estimates on *something*, but I see no reason whatever to trust them.

          >>What evidence do you have that this particular coronavirus should be anomalous?

          Oh, I agree a priori we *should* expect significant seasonality. But it hasn’t really shown up this summer – we’ve seen summer surges on both sides of the Equator and at quite different latitudes (e.g. Chile is not particularly low-latitude).

          And anyway, the IHME death surge is too early for traditional cold-and-flu style seasonality.

          >>Just to be clear – my basic reaction is that 410k is prolly too high. But I also know that’s just a gut reaction and gut reactions don’t stand up terribly well against careful analysis, in general.

          Yeah but the IHME model has done really badly. To the point that it’s almost evidence *against* whatever they model ;)

        • Hmm, but separate from the broader issues here re: COVID response… even if seasonality *is* strong and fall infections reach older populations, so that a fall wave indeed translates to increased deaths, that still wouldn’t give you the results IHME shows. The rise in deaths happens too soon for traditional cold/flu seasonality.

          If they predicted say 450,000 deaths by February 15th… I would consider that very pessimistic, but not ridiculous, as Feb 15th is late enough to capture the majority of a winter peak in respiratory mortality. Jan 1 isn’t.

          I think IHME is making basically the same mistake I was making in May: expecting a rise in deaths to come too soon after the initiating event that increased infections. (I expected to see deaths rise by late May from TX/GA reopenings, so when that hadn’t happened by the end of May, I decided it was either seasonality or population density and didn’t expect it to ever happen.)
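
          The timing argument comes down to the lag from infection to death. A minimal sketch, with an assumed ~3-week mean lag (an illustrative round number, not a fitted estimate):

          ```python
          # Deaths trail infections, so you can date the infection surge implied
          # by a death surge, and vice versa. The 21-day lag is an assumption.
          import datetime as dt

          assumed_lag = dt.timedelta(days=21)

          # A mid-September rise in deaths implies infections surging in August,
          # which is not flu season:
          print(dt.date(2020, 9, 15) - assumed_lag)  # 2020-08-25

          # A flu-style November infection rise puts the death rise in late
          # November, with the bulk of the winter peak's deaths landing in
          # December-February, i.e. mostly after a January 1 cutoff:
          print(dt.date(2020, 11, 1) + assumed_lag)  # 2020-11-22
          ```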

        • confused –

          > But the IFR will be far lower in these groups. Grandparents who are primary caregivers of K-12 students are on average much younger (many of them are not even really “elderly” or “seniors”) and significantly healthier than LTCF residents.

          Grandparents who are primary caregivers are much more likely to be minorities and to have relatively low SES. I would guess they’re also more likely to be living in urban environments, have poorer baseline health (more co-morbidities), poorer access to health care, etc. So are elderly people who live in multi-generational households (not least because they’re likely to be there so someone can take care of them).

          > Not just environment; it’s also that LTCF residents are a subpopulation which ‘selects for’ poor health.

          So are grandparent caregivers, and likely elderly people who live in multi-generational households!

          > Not quite. I’ve been blaming it on fear driven by media, business, WHO, *and* government responses (not just SIPs) as well as by the actual risk of the disease.

          OK.

          > Yeah but the IHME model has done really badly. To the point that it’s almost evidence *against* whatever they model ;)

          Past performance is no guarantee of current performance… presumably they have learned from past errors and made corrections that should improve current performance.

        • >>Grandparents who are primary caregivers are much more likely to be minorities and to have relatively low SES. I would guess they’re also more likely to be living in urban environments,

          Don’t these affect infection rate, not IFR, at least primarily?

          >>So are grandparent caregivers

          Wait, how?

          I’m talking about LTCFs selecting for poorer health *among people of the same age*. LTCF residents are basically by definition people who aren’t able to live independently, much less take care of others.

          Also, the average age of grandparents who are primary caregivers of K-12 students is going to be far lower than that of LTCF residents (the latter is extremely high, about 85 in the US apparently). I don’t think the IFRs will be even remotely comparable.
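
          For a sense of scale on the age gradient: one published meta-regression (Levin et al. 2020) fits log10 of the IFR in percent as roughly −3.27 + 0.0524 × age. Taking that outside estimate at face value (and ignoring the extra LTCF-specific frailty, which would widen the gap further):

          ```python
          # Age gradient of IFR from the Levin et al. (2020) meta-regression:
          # log10(IFR in %) ~= -3.27 + 0.0524 * age. Used here only to show how
          # steep the gradient is; the fit itself is an outside assumption.
          def ifr_percent(age):
              return 10 ** (-3.27 + 0.0524 * age)

          for age in (65, 70, 85):
              print(f"age {age}: IFR ~ {ifr_percent(age):.1f}%")
          # age 65: ~1.4%, age 70: ~2.5%, age 85: ~15.3% -- an order-of-magnitude
          # spread between younger grandparents and the average LTCF resident.
          ```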

        • confused –

          > Don’t these affect infection rate, not IFR, at least primarily?

          Of course race/ethnicity and SES correlate with fatality rate.

          > Wait, how?

          I’m using “select for” figuratively; IOW, when you select that group. In addition to those correlates, grandparents who care for their grandchildren tend to neglect their own health, and they tend to have poorer health outcomes.

          > I’m talking about LTCFs selecting for poorer health *among people of the same age*.

          Yah. I am too. I would also imagine that older people who live in multi-generational households are generally in poorer health than their same-age peers who live independently.

          > LTCF residents are basically by definition people who aren’t able to live independently, much less take care of others.

          Right. For the same reason, so would be those in multi-generational households compared to their same-age peers.

          > Also, the average age of grandparents who are primary caregivers of K-12 students is going to be far lower than that of LTCF residents (the latter is extremely high, about 85 in the US apparently). I don’t think the IFRs will be even remotely comparable.

          They’re still likely to be at risk by being in their mid-to-late sixties or seventies (lower SES, minorities, etc.). IOW, there are quite a few people at higher risk who will be exposed more than they are now. I don’t know that it would lead to the numbers in the prediction, but children being a vector could increase infections among at-risk people relative to what we’ve had over the past few months. And I’m just saying that it works against your argument that more kids getting infected is no big deal because they aren’t going to be having contact with seniors in LTCFs.

        • >>Of course race/ethnicity and SES correlate with fatality rate.

          They correlate with total deaths… but do they correlate with death rate *contingent on being infected*?

          As opposed to the correlation being driven by higher rates of infection (more likely to have high-contact jobs vs. work from home, more likely to live in multigenerational households, etc.).

          Re multigenerational households: I don’t think most of these are elderly people living with their younger relatives because they aren’t capable of living independently any longer; it’s largely economic factors and cultural ones (e.g. immigration from nations where this is the normal living pattern) https://www.pewresearch.org/fact-tank/2018/04/05/a-record-64-million-americans-live-in-multigenerational-households/

          So I wouldn’t expect that elderly people in multigenerational households are notably less healthy than the average of their age.

          However, this may vary a lot locally. There is definitely a cultural/national origin aspect and here in TX (strong Latin American influence) I may see more of this than in the Northwest or Southeast.

        • confused –

          > They correlate with total deaths… but do they correlate with death rate *contingent on being infected*?

          Now I’m the one who’s confused. Do you doubt that fatality (or risk of severe disease) is correlated with factors such as baseline health, # of co-morbidities, health habits, access to healthcare, etc.? Those are all measures that correlate with race/ethnicity and SES. There’s no particular reason that I can think of to believe that those factors are associated only with risk of infection but not with severity of disease/mortality.

          > Re multigenerational households: I don’t think most of these are elderly people living with their younger relatives because they aren’t capable of living independently any longer,

          Well, I’m not sure about “most” either. The relevant question seems to me to be how they would compare to their independent-living peers (and their peers who live in LTCFs). Are they, on average, at greater risk, and what are the factors that might explain whether they’d be at greater risk? Older people who live in multi-generational households are somewhat more likely to have lower income and more likely to be minorities. SES and race/ethnicity are associated with health outcomes across a wide variety of conditions – such as diabetes or cardiovascular disease – that have nothing to do with susceptibility to infection. Then we add in the associations with susceptibility to infection on top of that. It seems that we keep going ’round on this and I’m not sure why.

          > it’s largely economic factors and cultural ones (e.g. immigration from nations where this is the normal living pattern) https://www.pewresearch.org/fact-tank/2018/04/05/a-record-64-million-americans-live-in-multigenerational-households/

          Which would be associated both with higher rates of infection AND risk for poor outcomes.

          > So I wouldn’t expect that elderly people in multigenerational households are notably less healthy than the average of their age.

          Living in multi-generational households is associated with lower SES and race/ethnicity, which in turn are associated with poorer health outcomes across a wide variety of diseases – so I don’t know why you wouldn’t expect that.

          > However, this may vary a lot locally. There is definitely a cultural/national origin aspect and here in TX (strong Latin American influence) I may see more of this than in the Northwest or Southeast.

          Sure.

        • There are no good numbers I’ve seen yet that would allow assessment of the actual “Infection Fatality Rate” and how it varies by sociodemographic factors. You guys can make up numbers that illustrate whatever your pet theory is about how various scenarios play out but don’t confuse any of that with actual modeling.

          There is data available for mortality conditional on ending up in hospital. As far as I’ve seen, that’s about as far as the reliable data goes. The rest is all conditional on having been tested with one or more of a variety of tests having substantial false positive and false negative rates. That does NOT yield a real IFR no matter how you contort your assumptions.

          The whole discussion you’re having is like dueling versions of the Drake Equation.
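
          The test-error point can be made concrete with the standard Rogan–Gladen correction, which backs true prevalence out of the apparent positive rate given a test’s sensitivity and specificity. The values below are illustrative assumptions, not real test characteristics:

          ```python
          # Rogan-Gladen: true prevalence from the apparent positive rate, given
          # test sensitivity and specificity. Illustrative values only.
          def true_prevalence(apparent, sens, spec):
              return (apparent + spec - 1) / (sens + spec - 1)

          apparent = 0.04  # 4% of tests positive
          for spec in (0.995, 0.98, 0.96):
              p = true_prevalence(apparent, sens=0.90, spec=spec)
              print(f"spec={spec}: implied true prevalence {p:.2%}")
          # spec=0.995: 3.91%   spec=0.98: 2.27%   spec=0.96: 0.00%
          # Small changes in assumed specificity swing the infection denominator
          # (and hence any IFR built on it) enormously at low prevalence.
          ```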

        • Brent –

          > You guys can make up numbers that illustrate whatever your pet theory is about how various scenarios play out but don’t confuse any of that with actual modeling.

          So let me see if I’ve got this right. Your theory is that, unlike virtually any other health risk, risk for COVID is not associated with SES and race/ethnicity?

        • Risk is obviously associated with race and ethnicity. But we cannot quantify the association between Infection Fatality Ratio and race/ethnicity because we have insufficient data on infections.

          We know how many people die (to a decent approximation). We know the race and ethnicity of the vast majority of those people. We do not know, to any useful approximation, the rate of INFECTION overall. Much less by race and ethnicity.

          So when these speculations start positing various future scenarios and trying to ballpark either actual or relative IFRs among race/ethnic groups, those discussions are specious. Much more so by the time we get to discussing whether IFRs might be higher or lower for grandparents of college students or the like.

        • Joshua –

          I’m more and more thinking I might be influenced by relatively ‘local’ conditions that aren’t typical for the US.

          Obesity prevalence (which I think is a big factor in COVID outcomes and is correlated with other preexisting conditions relevant to COVID, like hypertension) varies a lot across the US, and I don’t think it has the same relationship to income/poverty/SES everywhere.

          However,…

          >>confused –

          > Now I’m the one who’s confused. Do you doubt that fatality (or risk of severe disease) is correlated with factors such as baseline health, # of co-morbidities, health habits, access to healthcare, etc.? Those are all measures that correlate with race/ethnicity and SES.

          No, I’ll agree with that.

          75-year-olds in these living situations are probably higher risk than independently-living 75-year-olds, though lower risk than 75-year-olds in LTCFs (since the latter group is strongly selected for poor health).

          *However*, age structure of the population is *also* different across different cultural groups.

          So if we’re looking at the group of “grandparents who are primary caregivers” vs “all grandparents in the US”, for example, I think the correlation would run the other way (as the “primary caregivers” group will average younger, and healthier, by excluding the least healthy individuals, e.g. those in LTCFs).

        • @Carlos Ungil

          >>Neither does flu!

          Isn’t the lack of a 2020 southern-winter flu season due to precautions/distancing measures taken for COVID? I think the seasonality of flu in general is pretty well established.

          Though I don’t know how strong the seasonality effect is in places like Australia, which is partially tropical. I think flu is less seasonal or not seasonal near the equator.

        • confused –

          I think we’ve killed another horse!

          But along with that, here’s some interesting info re risk in nursing homes relative to other settings:

          -snip-

          It also seems to have a predilection for nursing homes. Influenza also kills older people, but fewer than 10% of influenza deaths are in nursing homes. Here, it’s more than 40%.

          -snip-

          https://www.medscape.com/viewarticle/936937

  12. A recent tweet from Nate makes it clear:

    > I’m sure their authors would disagree, but we think some other models are closer to *conditional* predictions. IF certain assumptions about partisanship, how voters respond to economic conditions, etc., hold, then maybe Biden has a 90% chance of winning instead of our ~70%.

    I don’t really understand this. Is Nate saying the predictions of their model are somehow not conditional on the model assumptions?

    That said, it does seem to me that the underlying point is right: If there are different modelling decisions to be made and no one knows what the right ones are, wouldn’t it make sense to build many models and average their conditional probability predictions, according to the modeller’s belief that each set of assumptions is right/the right approximation?

    Why isn’t this done more often, other than because it involves a huge amount of work?

    • Anon:

      I don’t know what Nate is talking about here. Our model sets the standard deviation for the polls over the entire course of the election at around 10 points. We get that number by running a regression that predicts the delta between November and January polls, for every election since 1948, from the share of swing voters in the electorate (a sketch of this sort of regression appears below). In that naive sense it is “conditional” on the assumption, but that’s misleading, because we also include uncertainty about the variance!

      So really, he’s just wrong here.
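
      To make the shape of that calculation concrete, here is a minimal sketch of the regression described above; the data are fabricated placeholders, not the model’s actual inputs:

      ```python
      # Sketch of the regression described above: predict the November-vs-
      # January poll delta from the share of swing voters, one point per
      # election since 1948. Data are fabricated; only the shape matters.
      import numpy as np

      rng = np.random.default_rng(1)
      n_elections = 18  # 1948-2016
      swing_share = rng.uniform(0.05, 0.25, size=n_elections)
      poll_delta = 5 + 50 * swing_share + rng.normal(0, 2, size=n_elections)

      X = np.column_stack([np.ones(n_elections), swing_share])
      beta, *_ = np.linalg.lstsq(X, poll_delta, rcond=None)
      intercept, slope = beta

      # The fitted value for the current cycle's swing share plays the role
      # of the ~10-point poll-variability scale; the model then treats that
      # scale as uncertain rather than plugging in the point estimate.
      print(f"fitted scale at 10% swing share: {intercept + slope * 0.10:.1f} points")
      ```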

      • Elliott:

        I don’t think Nate’s so wrong, exactly. I think Nate’s point is that our forecast is conditional on its assumptions. Which is true. As Anon points out, that’s true of Nate’s forecast. It’s true of any forecast. There’s a sense in which his forecast, with its wider intervals, is making weaker assumptions than our forecast, but this just brings us to the question of picking a model.

        Anon:

        You ask, “wouldn’t it make sense to build many models and average their conditional probability predictions”? What we recommend here is stacking (sketched at the end of this comment). But in practice I think that most of the gains come not from averaging different models but from getting insight from many sources and using that to improve the model you’re using.

        For example, one thing the Fivethirtyeight forecast does that ours doesn’t is to account for possible swings based on demographic groups. Maybe it would make sense for us to add something like this to our model. I doubt it would make a big difference, but you never know. Really, though, I think the biggest potential for gains comes from using raw survey data rather than just poll summaries.
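
        For readers unfamiliar with stacking, here is a minimal sketch: choose nonnegative weights, summing to one, that maximize the summed log of the weighted held-out predictive densities. The models and densities below are stand-ins, not either election forecast:

        ```python
        # Minimal stacking sketch: weight models by maximizing the summed log
        # of the weighted held-out predictive densities. Stand-in numbers.
        import numpy as np
        from scipy.optimize import minimize

        # p[k, i] = model k's held-out predictive density for observation i
        p = np.array([
            [0.8, 0.3, 0.6, 0.7],
            [0.4, 0.9, 0.5, 0.2],
        ])

        def neg_log_score(w_raw):
            w = np.exp(w_raw) / np.exp(w_raw).sum()  # softmax: nonneg, sums to 1
            return -np.sum(np.log(w @ p))

        res = minimize(neg_log_score, x0=np.zeros(p.shape[0]))
        weights = np.exp(res.x) / np.exp(res.x).sum()
        print("stacking weights:", weights.round(2))
        # The stacked forecast is the weighted mixture of the models'
        # predictive distributions, rather than a single chosen model.
        ```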

  13. Are you going to forecast the outcome for the Senate? At this point that’s the real question. If the Dems sweep, the consequences could be profound. If the Repubs hold the Senate, the Biden presidency will be Obama 3.
