The "shy Trump voter" bedeviled the pollsters in 2016. How do you poll people who systematically refuse to be polled because they don't trust the pollsters—or worse yet, actively want to sabotage them? Random-sample polling is based on the idea that you sample the population and all kinds of voters are equally likely to take the survey. Even if 9 out of 10 people called refuse to take the survey, that doesn't matter (except for increasing the cost of the poll) as long as the survey-takers are truly representative of the overall electorate. All pollsters are keenly aware of the problem and each one presumably has some (secret) way of trying to compensate for it. A crude way might be to note that the 2016 or 2020 poll in some state underestimated Trump's share of the vote by X% and then just add X% to his measured vote this time. Maybe they have something better, but they are not talking.
That said, one peculiar feature of this year's primaries is that the polls have overestimated Trump's share of the vote three times. They have also overestimated his winning margin over Nikki Haley four times. Have pollsters overcorrected for the "shy Trump voter" effect or is this just statistical noise? The effect isn't big, but the other Nate (Cohn), of The New York Times, has noticed it and written it up (actually, he wrote it up before the Michigan results were in, but the same thing happened there).
Here are the data for Trump. For example, the polls predicted that Trump would get 79% of the vote in Michigan and beat Haley by 57 points, the difference (delta) between them. He actually got 68% of the vote and beat her only by 42 points. So Trump underperformed the polling average by 11 points and underperformed the expected margin by 15 points (meaning not only did he underperform the polls but she also overperformed them). Note that in Iowa, Haley came in third, just behind Gov. Ron DeSantis (R-FL) who got 21% to Haley's 19%. Nevada is not included because Haley was in the primary and Trump was in the caucus. There was no polling for the U.S. Virgin Islands.
State | Final polls: Share | Final polls: Delta | Results: Share | Results: Delta | Error: Share | Error: Delta
Iowa | 53% | 34% | 51% | 32% | -2% | -2%
New Hampshire | 54% | 18% | 54% | 11% | 0% | -7%
South Carolina | 62% | 28% | 60% | 20% | -2% | -8%
Michigan | 79% | 57% | 68% | 42% | -11% | -15%
The misses aren't large, but the polls overpredicted Trump's margin all four times, by an average of 8 points. An 8-point miss is easily the difference between winning and losing in every swing state.
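For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the error columns and the 8-point average from the table above. The numbers are copied from the table; the variable and function names are our own, and an error is taken to mean result minus final poll, in percentage points.

```python
# Final poll averages and actual results, copied from the table above.
# Each entry: state -> (poll_share, poll_delta, result_share, result_delta)
primaries = {
    "Iowa": (53, 34, 51, 32),
    "New Hampshire": (54, 18, 54, 11),
    "South Carolina": (62, 28, 60, 20),
    "Michigan": (79, 57, 68, 42),
}


def polling_errors(data):
    """Return {state: (share_error, delta_error)} as result minus poll, in points."""
    return {
        state: (res_share - poll_share, res_delta - poll_delta)
        for state, (poll_share, poll_delta, res_share, res_delta) in data.items()
    }


errors = polling_errors(primaries)
mean_share_error = sum(share for share, _ in errors.values()) / len(errors)
mean_delta_error = sum(delta for _, delta in errors.values()) / len(errors)

for state, (share_err, delta_err) in errors.items():
    print(f"{state}: share {share_err:+d}, margin {delta_err:+d}")
print(f"Mean share error: {mean_share_error:+.2f}")
print(f"Mean margin error: {mean_delta_error:+.2f}")
```

A negative number means Trump did worse than the polls said. The mean margin error comes out to -8 points, matching the figure above, while the mean error in his share alone is smaller, at -3.75 points.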
Four data points are not a lot, and the errors in Trump's share weren't enormous. The errors in the margins were quite large, though, which means Haley did much better than expected rather than Trump doing much worse.
What is going on here? Cohn has three theories. First, the undecideds ultimately broke strongly for Haley. Second, the pollsters got the electorate wrong, possibly by not including enough ratf*cking Democrats. Third, there were some "shy anti-Trump Republicans" who chose not to be polled. We would add a fourth possibility, namely, that this is just statistical noise and means nothing. After all, if you flip a coin four times and get four heads, that doesn't mean the coin is biased. If you run the four-flips test 1000 times, you are going to get four heads about 62 or 63 times. This is just the nature of statistics.
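The coin-flip figure is easy to verify. Below is a small Python sketch, assuming a fair coin and independent flips; the function name `prob_heads` is ours, purely for illustration.

```python
from fractions import Fraction
from math import comb


def prob_heads(k, n, p=Fraction(1, 2)):
    """Binomial probability of exactly k heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Chance that four flips of a fair coin all come up heads.
p_four_heads = prob_heads(4, 4)

# Expected number of all-heads outcomes if the four-flip test is run 1000 times.
expected_in_1000 = 1000 * p_four_heads

print(p_four_heads, float(expected_in_1000))
```

The probability is 1/16, so 1000 runs yield 62.5 all-heads outcomes on average, hence "62 or 63." The analogy is a loose one, of course: it treats each primary as an independent 50-50 chance that the polls overshoot the margin, which is an assumption, not something the data can confirm.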
An extremely tentative conclusion, if it's not just noise, is that the polls this time may not be underestimating Trump, because either (1) Trump voters aren't shy at all anymore; they are proud of their support for him now and want to tell the world about it, or (2) pollsters have figured out how to compensate for the shyness. Or something else. We don't know, but another 50 or so primaries and caucuses are left, so by summer we will have a lot of data. (V)