
PollWatch 2024, Part VIII: Choices Must Be Made

We had an item on this already, but we wanted to go back and take a closer look. In short, as we have written a million times, each pollster has their own "secret sauce"—their model of the electorate that helps them decide how to adjust the raw data that they collect. And small differences in opinion (or big differences, or an aggregation of a whole bunch of differences) can have a huge impact on the results.

To try to illustrate this in a useful way, we looked at every pollster who has produced at least five national opinion polls testing Kamala Harris vs. Donald Trump, and we averaged their results. Doing it that way should serve to even out most variances caused by local wonkiness (which might affect state polls) and/or by a wonky poll sample (it happens to every pollster once in a while, even Ann Selzer). Here are the results, ordered based on the average net difference between the two major party candidates:

Pollster                       Polls  Harris Avg.  Trump Avg.  Undec. Avg.  Net
BigVillage                         5        50.4%       45.0%         4.6%  Harris +5.4%
ABC News                           6        50.5%       46.5%         3.0%  Harris +4.0%
MorningConsult                    16        48.9%       45.5%         5.6%  Harris +3.4%
RMG Research                       6        50.2%       47.8%         2.0%  Harris +2.3%
New York Times/Siena College       5        48.1%       45.8%         6.0%  Harris +2.3%
Reuters/Ipsos                      7        45.4%       43.1%        11.4%  Harris +2.3%
TIPP                               8        49.0%       47.0%         3.0%  Harris +2.0%
YouGov/The Economist               5        48.3%       45.9%         5.6%  Harris +1.3%
ActiVote                          10        50.6%       49.5%         0.0%  Harris +1.1%
CBS News                           5        50.0%       49.0%         1.4%  Harris +1.0%
Emerson College                    6        48.3%       47.8%         4.2%  Harris +0.5%
Rasmussen Reports                 15        46.0%       48.5%         4.2%  Trump +2.5%
Forbes/HarrisX                     5        47.2%       49.8%         5.7%  Trump +2.6%
Average                                     48.7%       47.0%         4.4%  Harris +2.4%

If someone who had only a passing familiarity with American presidential politics were to take a look at this, it might not seem like much, but the gap between the extremes is huge, running from Harris +5.4% to Trump +2.6%, with the various pollsters often telling very different tales of the election. And this is surely being replicated at the state level. In fact, states are harder to get right because their electorates are harder to predict. So, the spread could well be wider there.
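For readers who want to try this sort of averaging themselves, here is a minimal sketch in Python of the procedure described above: group polls by pollster, drop anyone with fewer than five polls, average each candidate's numbers, and sort by net margin. The pollster names and poll numbers are hypothetical placeholders, not the data behind our table, and the function name `pollster_averages` is our own invention for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical poll records: (pollster, Harris %, Trump %). NOT real data.
polls = [
    ("Pollster A", 50, 45), ("Pollster A", 51, 46), ("Pollster A", 49, 45),
    ("Pollster A", 50, 44), ("Pollster A", 52, 45),
    ("Pollster B", 46, 48), ("Pollster B", 47, 49), ("Pollster B", 45, 48),
    ("Pollster B", 46, 49), ("Pollster B", 46, 48),
    ("Pollster C", 49, 47), ("Pollster C", 48, 47),  # only 2 polls: excluded
]

MIN_POLLS = 5  # the "at least five national polls" cutoff


def pollster_averages(polls, min_polls=MIN_POLLS):
    """Average each pollster's results and sort by net (Harris minus Trump)."""
    by_pollster = defaultdict(list)
    for name, harris, trump in polls:
        by_pollster[name].append((harris, trump))

    rows = []
    for name, results in by_pollster.items():
        if len(results) < min_polls:
            continue  # too few polls for the average to smooth out a bad sample
        harris_avg = mean(h for h, _ in results)
        trump_avg = mean(t for _, t in results)
        rows.append((name, len(results), harris_avg, trump_avg,
                     harris_avg - trump_avg))

    # Most Harris-friendly pollster first, most Trump-friendly last.
    rows.sort(key=lambda row: row[4], reverse=True)
    return rows


rows = pollster_averages(polls)
for name, n, harris_avg, trump_avg, net in rows:
    leader = "Harris" if net >= 0 else "Trump"
    print(f"{name}: {n} polls, Harris {harris_avg:.1f}%, "
          f"Trump {trump_avg:.1f}%, {leader} +{abs(net):.1f}")

# Gap between the most Harris-friendly and most Trump-friendly pollster.
spread = rows[0][4] - rows[-1][4]
print(f"Spread between extremes: {spread:.1f} points")
```

The spread on the last line is the same "gap between the extremes" idea discussed above, just computed on the made-up numbers.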

Anyhow, let us imagine that polling aggregations (including ours) basically reflect the current consensus. In this case, that means Kamala Harris up by about 2.5 points. Now, imagine that one of the Harris-friendly pollsters is actually the correct one, and she's being underestimated by, say, 2 points. If so, as you can see if you click on the link above, that would shift around 100 EVs in Harris' direction. On the other hand, imagine that one of the Trump-friendly pollsters is actually the correct one, and HE is being underestimated by, say, 4 points. That would shift about 115 EVs in HIS direction. So, this admittedly crude math suggests the pollsters have a variance of more than 200 EVs, nearly enough to win the presidency all by itself.
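The crude math, spelled out as a short Python sketch. The EV shifts and point shifts are the rough figures quoted in the paragraph above; the 538 total and the 270 needed to win are the standard Electoral College numbers. The variable names are ours.

```python
TOTAL_EVS = 538    # total electoral votes
EVS_TO_WIN = 270   # majority needed to win the presidency

# Rough figures from the text, not precise estimates.
harris_points, harris_ev_shift = 2, 100   # Harris underestimated by ~2 points
trump_points, trump_ev_shift = 4, 115     # Trump underestimated by ~4 points

# Total range between the two "one camp of pollsters is right" scenarios.
point_range = harris_points + trump_points
ev_range = harris_ev_shift + trump_ev_shift

print(f"Polling range: about {point_range} points")
print(f"EV range: about {ev_range} of {TOTAL_EVS} EVs")
print(f"Share of the {EVS_TO_WIN} needed to win: {ev_range / EVS_TO_WIN:.0%}")
```

In other words, a 6-point range in the national numbers translates, under these rough assumptions, into a range of about 215 EVs.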

If you would like to see a rundown of the various factors that can affect a pollster's "secret sauce" in a general sense, Good Authority's Josh Clinton has a nice primer. In short, there are four issues: (1) whether the sample matches the demographics of the electorate (errors in this area were the primary reason for mistakes in 2016); (2) whether the sample matches the political breakdown of the electorate, in terms of Democrats vs. Republicans vs. independents (errors in this area were the primary reason for mistakes in 2020); (3) whether or not a respondent will actually vote; and (4) whether the data collection process was truly random (e.g., was there some "type" of voter that was particularly likely to be undersampled or oversampled, such that, for example, the Chinese lesbian Republican women you got on the phone were much less Trumpy than Chinese lesbian Republican women as a whole).

Note also that Clinton's piece uses the current election as a case study, and shows the impact of various reasonable choices a pollster might make. Whereas our little number-crunching exercise above suggests a possible swing of something like 6 points is within the realm of reason, he puts the number at 8 points. If you're interested in playing around with these sorts of questions, remember that FiveThirtyEight has a tool that lets you see how the election outcome changes if you, for example, increase the women's vote by 2%, or decrease the number of college-age voters by 5%.

So, this year's polls are subject to a bunch of methodological challenges, as in any year. However, there's also one additional X-factor this year that should be taken under advisement, and that is weighting the responses based on recalled vote. The New York Times' Nate Cohn has a detailed breakdown, with lots of nice charts, so we'll just cover the basics. Weighting by recalled vote is exactly what it sounds like. The pollster asks a person who they voted for in the 2020 election, and then uses that to make sure their sample is accurate. Most obviously, since 47% of the electorate voted for Donald Trump in 2020, a pollster wants to make sure that they weight their sample to give 47% of the weight to people who say they voted for Trump in 2020. This is a clear effort to compensate for past underestimates of Trump's support.
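To make the mechanics concrete, here is a minimal Python sketch of weighting on a single variable, recalled 2020 vote. Everything in it is an assumption for illustration: the sample counts are made up, the 47% Trump target is the figure from the paragraph above, and the 51% Biden and 2% other targets are our rounding of the 2020 result. Real pollsters weight on many variables at once, so this is a simplification, not anyone's actual method.

```python
from collections import defaultdict

# Hypothetical raw sample: (recalled 2020 vote, 2024 preference, respondents).
sample = [
    ("Biden", "Harris", 490), ("Biden", "Trump", 35),
    ("Trump", "Harris", 20),  ("Trump", "Trump", 435),
    ("Other", "Harris", 10),  ("Other", "Trump", 10),
]

# Assumed targets for the recalled-vote mix (47% Trump is from the text;
# the Biden and Other shares are rounded assumptions).
targets = {"Biden": 0.51, "Trump": 0.47, "Other": 0.02}


def topline(sample, weights=None):
    """Return each candidate's share (in %) of the (optionally weighted) sample."""
    totals = defaultdict(float)
    for recalled, preference, count in sample:
        weight = 1.0 if weights is None else weights[recalled]
        totals[preference] += count * weight
    grand_total = sum(totals.values())
    return {cand: 100 * value / grand_total for cand, value in totals.items()}


def recalled_vote_weights(sample, targets):
    """Weight for each group = target share / share observed in the sample."""
    total = sum(count for _, _, count in sample)
    observed = defaultdict(float)
    for recalled, _, count in sample:
        observed[recalled] += count / total
    return {group: targets[group] / observed[group] for group in observed}


weights = recalled_vote_weights(sample, targets)
raw = topline(sample)
weighted = topline(sample, weights)

raw_net = raw["Harris"] - raw["Trump"]
weighted_net = weighted["Harris"] - weighted["Trump"]
print(f"Raw:      Harris {raw['Harris']:.1f}%, Trump {raw['Trump']:.1f}%")
print(f"Weighted: Harris {weighted['Harris']:.1f}%, Trump {weighted['Trump']:.1f}%")
print(f"Net shift toward Trump: {raw_net - weighted_net:.1f} points")
```

In this made-up sample, people who recall voting for Trump are underrepresented (45.5% versus the 47% target), so their responses get weighted up and the topline moves in Trump's direction. Note that the whole adjustment depends on respondents reporting their 2020 vote accurately.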

As readers will recall, pollsters did not make a point of weighting for education level in 2016, and so missed on many states (and nationally) because that turned out to be an important marker of "Clinton voter" vs. "Trump voter." So, in 2020, that was added to pollsters' questionnaires. Weighting by recalled vote might seem like a similar sort of adjustment, but it's actually not. It is not the case that it did not occur to pollsters to weight by past vote; it's that weighting by past vote was generally considered bad practice, at least until this year.

Why is that? Well, for most questions pollsters might ask, respondents can reasonably be expected to give an accurate response. Everyone knows what their age is, or what state they live in, or whether they went to college or not. On the other hand, asking people how they voted 4 years ago is subject to all kinds of issues that can introduce inaccuracy. People can, and do, forget. Or, they might not want to admit they did not vote, so they could lie. And among those who DO remember, there is a strong bias in favor of the winning candidate. Donald Trump did not win in 2020, but he did in 2016, and so that might well carry over.

Weighting the numbers by recalled vote most certainly makes the 2024 numbers a couple of points more Trumpy, on the whole. Comparing polls that do such weighting (around 2/3 of them, right now) with polls that do not makes that clear. Whether it makes them more accurate is a very different question. It certainly could, if only to make up for the Trump undercounting of the last two cycles. On the other hand, for what it is worth, if the technique had been used in any of the presidential cycles since 2000, including the two with Trump as a candidate, it actually would have made the polls more wrong, on the whole (not all of them, but more often than not). The average amount of additional wrongness would have been smaller in the two Trump elections (around 1.5 points in 2016, around a point in 2020), but it still would have been there.

Here's the upshot: Because Trump is sui generis, and because pollsters struggled to find a rigorous method for correcting for that, they are now using a non-rigorous method with which they have little experience. It's not throwing darts at a dartboard, but it might be in the same ZIP Code. Beyond that, because the specific goal here is to correct for undervaluing Trump in 2020 and 2016, it is improbable that the same undervaluing is happening again. They might be valuing him correctly, or they might be overvaluing him, but they probably aren't underestimating his support. Oh, and because this kind of weighting is based on the 2020 electorate, it would be thrown further askew if this year's electorate were to be markedly different, particularly if Democratic turnout were higher than it was 4 years ago.

As you can imagine, we get asked a lot who is going to win the election. We'll say a bit more next week, but because of the issues outlined above (and others), (Z)'s current answer to people who ask that is this: "33% a close Trump win, 33% a close Harris win, 33% a not-that-close Harris win." Another way you could put that is this: "If the polls are basically accurate, it's a coin flip. If not, they are probably underestimating Harris." (Z)



This item appeared on www.electoral-vote.com. Read it Monday through Friday for political and election news, Saturday for answers to readers' questions, and Sunday for letters from readers.
