After every election, there are a lot of people screaming: "The polls were all wrong—again!" So, let's take a look. The #1 ranked pollster at FiveThirtyEight is Siena College. Its polls are done in coordination with The New York Times, specifically with Nate Cohn and Ruth Igielnik, two polling experts. A poll gives a range, the predicted value plus or minus the margin of error (MoE), separately for each candidate. Here are the predicted ranges, the actual vote percentage, and the actual value minus the predicted value for Kamala Harris and Donald Trump, for Siena's final polls of the swing states:
| State | Harris Low | Harris Mean | Harris High | Harris Actual | Harris Diff | Error? | Trump Low | Trump Mean | Trump High | Trump Actual | Trump Diff | Error? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Arizona | 41.5% | 45.0% | 48.5% | 46.8% | 1.8% | No | 45.5% | 49.0% | 52.5% | 52.3% | 3.3% | No |
| Georgia | 44.5% | 48.0% | 51.5% | 48.5% | 0.5% | No | 43.5% | 47.0% | 50.5% | 50.7% | 3.7% | Error |
| Michigan | 43.5% | 47.0% | 50.5% | 48.3% | 1.3% | No | 43.5% | 47.0% | 50.5% | 49.8% | 2.8% | No |
| Nevada | 45.5% | 49.0% | 52.5% | 47.2% | -1.8% | No | 42.5% | 46.0% | 49.5% | 50.9% | 4.9% | Error |
| North Carolina | 44.5% | 48.0% | 51.5% | 48.5% | 0.5% | No | 42.5% | 46.0% | 49.5% | 51.0% | 5.0% | Error |
| Pennsylvania | 44.5% | 48.0% | 51.5% | 48.5% | 0.5% | No | 44.5% | 48.0% | 51.5% | 50.4% | 2.4% | No |
| Wisconsin | 45.5% | 49.0% | 52.5% | 48.8% | -0.2% | No | 43.5% | 47.0% | 50.5% | 49.6% | 2.6% | No |
| Average | | | | | 0.4% | | | | | | 3.5% | |
For those readers who were not math majors, sorry for all the numbers, but when we are discussing how well the polls did, well, we need a lot of numbers. That is what polling is about. As an example, consider Wisconsin. The poll said 49% Harris, 47% Trump; in the Wisconsin row above, the "Mean" column shows 49.0% for Harris. "Mean" does not mean that Harris is a mean person. It is what mathematicians mean when they add up all the numbers and divide by how many there are. It means "average," as opposed to "median" (the number in the middle of the sample, with half above and half below) or "mode" (the most common number in the sample).
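For readers who prefer to see it concretely, here is a quick sketch of the three terms using Python's standard `statistics` module. The sample is made up for illustration; it is not from any poll:

```python
from statistics import mean, median, mode

# A small, invented sample of seven poll responses (percent support),
# purely to illustrate the three kinds of "average":
sample = [45, 47, 47, 49, 50, 52, 53]

print(mean(sample))    # 49 -- sum of all values divided by the count
print(median(sample))  # 49 -- the middle value: half above, half below
print(mode(sample))    # 47 -- the most common value in the sample
```

Note that the three can coincide or differ; here the mean and median happen to agree while the mode does not.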
The margin of error for this poll was 3.5% (that is, the standard error of the estimate was 1.75%); by convention the margin of error is two standard errors. The mean for the Wisconsin survey was 49.0% and the MoE was 3.5%, so what the pollster is saying is that the probability of Harris' vote share falling between the lower bound (45.5%) and the upper bound (52.5%) is 95%. When Harris got 48.8% in Wisconsin (the number in the "Actual" column), the pollster would say: "48.8% indeed falls in the range 45.5% to 52.5%, so we nailed it. Our error was only 0.2% (the "Diff" column), well within the ±3.5% MoE." Similarly, Trump's score for Wisconsin was predicted to be in the range 43.5% to 50.5% and, sure enough, his actual score was 49.6%, so the pollster would say they nailed both of them and the poll got it right.
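The range check is just arithmetic. A minimal sketch, using the Wisconsin numbers from the table above:

```python
MOE = 3.5  # margin of error, i.e. two standard errors of 1.75% each

def within_moe(poll_mean: float, actual: float, moe: float = MOE) -> bool:
    # True if the actual result lies in [mean - MoE, mean + MoE]
    return poll_mean - moe <= actual <= poll_mean + moe

print(within_moe(49.0, 48.8))  # True: Harris, 48.8% is inside 45.5%..52.5%
print(within_moe(47.0, 49.6))  # True: Trump, 49.6% is inside 43.5%..50.5%
print(within_moe(46.0, 50.9))  # False: Nevada Trump, 50.9% is above 49.5%
```

By this definition the Wisconsin poll "got it right" for both candidates, even though it showed Harris ahead and Trump won.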
Note that the pollster did not predict that Harris would win Wisconsin, only that each candidate would end up at the predicted value ±3.5%, and that happened. So someone who doesn't understand polling might say: "Trump won, so you got it wrong," but the pollster would say: "We didn't try to predict who would win; we just predicted what ranges the two scores would fall in with 95% probability, and we did that." Note also that this is not necessarily a zero-sum game: because the size of the third-party vote varies, Trump going up does not automatically mean Harris going down.
The same discussion above applies to every state independently. Note that for Harris, all the values in the Diff column are under 3.5%, so the pollster got all of them right. The average error is only 0.4%, so the polls predicted Harris' score in the swing states extremely well. The column "Error" indicates whether the pollster blew it—that is, the poll was off by more than the MoE. None of the Harris scores were off by more than the MoE, so we put "No" there.
Now look at the Trump scores. The differences between the predicted value (the mean) and the actual value are outside the margin of error in three states (Georgia, Nevada, and North Carolina) and are positive in all states, meaning that Trump overperformed the polls. The average overperformance is 3.5%, just at the limit of the MoE.
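The Diff and Error? columns, and the averages, can be recomputed directly from the Mean and Actual columns. A small sketch, with the table's numbers transcribed by hand:

```python
# Recompute Diff (actual minus predicted) and the MoE misses for the
# swing-state presidential polls. Tuples are (state, poll mean, actual).
MOE = 3.5

harris = [("Arizona", 45.0, 46.8), ("Georgia", 48.0, 48.5),
          ("Michigan", 47.0, 48.3), ("Nevada", 49.0, 47.2),
          ("North Carolina", 48.0, 48.5), ("Pennsylvania", 48.0, 48.5),
          ("Wisconsin", 49.0, 48.8)]
trump = [("Arizona", 49.0, 52.3), ("Georgia", 47.0, 50.7),
         ("Michigan", 47.0, 49.8), ("Nevada", 46.0, 50.9),
         ("North Carolina", 46.0, 51.0), ("Pennsylvania", 48.0, 50.4),
         ("Wisconsin", 47.0, 49.6)]

def report(rows):
    # Average signed error, plus the states where |Diff| exceeds the MoE.
    diffs = [actual - poll_mean for _, poll_mean, actual in rows]
    misses = [state for (state, _, _), d in zip(rows, diffs) if abs(d) > MOE]
    return round(sum(diffs) / len(diffs), 1), misses

print(report(harris))  # average diff 0.4, no states outside the MoE
print(report(trump))   # average diff 3.5, misses in GA, NV, and NC
```

Every Trump diff comes out positive, which is the systematic underestimate the text describes; random polling error would scatter on both sides of zero.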
Now back to English. The pollsters predicted Harris' score extremely accurately but greatly underestimated Trump's score. Why? This is not a math question. It has to do with who was polled and who was counted as a likely voter. Trump's campaign put a huge emphasis on getting (young) men who normally don't vote to show up for him. The campaign went to sporting events, certain music festivals, bars, and other places where low-propensity male voters tend to show up, and pitched its candidate. These people probably didn't answer when asked to take part in a poll, so they were probably missed. And those who did answer would be asked screening questions like "Did you vote in 2020?", "Do you think voting is everyone's duty?", and "Do you think your vote will count?" A "no" on everything would get you marked as an unlikely voter.
It is also possible that the "undecided voters" really were undecided and then broke for Trump nationally in the last couple of days. A huge question is whether this effect holds when Trump is not on the ballot. It is noteworthy that while Trump won North Carolina by 3.3%, downballot Republicans didn't do so well. People in the Tar Heel State elected Democrats as governor, lieutenant governor, attorney general, secretary of state, and superintendent of public instruction. They also broke the Republicans' supermajority in the state legislature, so when the legislature passes a bill and Governor-elect Josh Stein vetoes it, Republicans can't override his veto on their own. That means that despite Trump's top-of-the-ticket win, downballot, the Democrats did well in some places.
Now let's look at the Siena polls for the Senate races:
| State | Dem Low | Dem Mean | Dem High | Dem Actual | Dem Diff | Error? | Rep Low | Rep Mean | Rep High | Rep Actual | Rep Diff | Error? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Arizona | 46.5% | 50.0% | 53.5% | 50.1% | 0.1% | No | 41.5% | 45.0% | 48.5% | 47.9% | 2.9% | No |
| Michigan | 44.5% | 48.0% | 51.5% | 48.7% | 0.7% | No | 42.5% | 46.0% | 49.5% | 48.3% | 2.3% | No |
| Nevada | 48.5% | 52.0% | 55.5% | 47.6% | -4.4% | Error | 39.5% | 43.0% | 46.5% | 46.7% | 3.7% | Error |
| Pennsylvania | 46.5% | 50.0% | 53.5% | 48.5% | -1.5% | No | 41.5% | 45.0% | 48.5% | 48.9% | 3.9% | Error |
| Wisconsin | 46.5% | 50.0% | 53.5% | 49.4% | -0.6% | No | 42.5% | 46.0% | 49.5% | 48.5% | 2.5% | No |
| Average | | | | | -1.1% | | | | | | 3.1% | |
The effect is similar, but a bit smaller. The polls got the Democratic Senate candidates' vote share fairly well, but again underestimated the Republicans' vote share. The low-visibility, low-propensity voters were missed by the polls, but showed up and tended to vote for Donald Trump's party.
One pollster who has a whole chicken farm's worth of egg on her face is Ann Selzer. She released a poll just before Election Day showing Harris +3 in Iowa. The media went wild thinking this could presage a landslide. It sort of did—but the other way. Trump won Iowa by 13 points, meaning Selzer was off by 16 points. The reason is that unlike every other pollster, Selzer uses random digit dialing and does not apply any corrections to the raw data. Her sample this time had far too many college-educated voters and she didn't correct for that. (V)