EVP's Track Record 2006
Before looking at the numbers, it is worth mentioning that pollsters and statisticians look at polls in a different way from the general public. If a poll predicts that Smith will win some state 51% to 49% with a margin of error of 3%, and Jones wins that state 51% to 49%, the pollster would say he got it right, because Smith's score was in the range 48% to 54%, just as predicted, and Jones' was in the range 46% to 52%, just as predicted. On the other hand, if Smith won with 56%, the pollster would admit that he got it wrong. Many people do not understand this.
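The pollster's definition of "right" can be written down in a few lines. This is only a sketch: the function name within_margin is made up for illustration, and the numbers are the Smith and Jones example from the paragraph above.

```python
def within_margin(predicted: float, actual: float, margin: float) -> bool:
    """Return True if the actual share lies within predicted +/- margin."""
    return abs(actual - predicted) <= margin


# The poll said Smith 51%, Jones 49%, margin of error 3%.
# Jones actually wins 51% to 49%.
print(within_margin(51, 49, 3))  # Smith: 49 is inside 48..54 -> True
print(within_margin(49, 51, 3))  # Jones: 51 is inside 46..52 -> True

# If Smith had won with 56%, the poll would have missed.
print(within_margin(51, 56, 3))  # 56 is outside 48..54 -> False
```

Note that the check says nothing about who won. By this definition a poll can name the wrong winner and still be counted as correct.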
In 2006, there were 33 Senate races, 31 of which had polls (Hawaii and Indiana weren't worth the expense). EVP predicted the winner correctly in all 31 of them.
Now let us look at how close the polls were. In 2006, we used a new algorithm for averaging the polls. The most recent poll was always used. If other polls had middle dates within a week of the most recent one, all of them were averaged, weighted equally. The results are given below.
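For readers who want to see the averaging rule spelled out, here is a minimal sketch. The Poll type, the function name, and the sample numbers are invented for illustration. It also assumes that "within a week" means a middle date no more than 7 days before that of the most recent poll, which the description above does not pin down exactly.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Poll(NamedTuple):
    middle_date: date  # midpoint of the polling period
    dem: float         # Democrat's share, in percent
    rep: float         # Republican's share, in percent


def average_recent(polls, window_days=7):
    """Average the most recent poll with all polls whose middle date
    falls within window_days of it, weighting each poll equally."""
    latest = max(p.middle_date for p in polls)
    cutoff = latest - timedelta(days=window_days)
    recent = [p for p in polls if p.middle_date >= cutoff]
    n = len(recent)
    return (sum(p.dem for p in recent) / n, sum(p.rep for p in recent) / n)


# Hypothetical polls for one state, not real data.
polls = [
    Poll(date(2006, 11, 5), 50.0, 44.0),
    Poll(date(2006, 11, 1), 48.0, 46.0),
    Poll(date(2006, 10, 20), 40.0, 50.0),  # too old, ignored
]
print(average_recent(polls))  # (49.0, 45.0)
```

Because the most recent poll always falls inside its own window, a state with a single poll simply reports that poll.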
The margin of error on state polls is typically around 4%, so if the number in column 4 or column 7 is 5 or more, the algorithm missed it. There were errors for at least one candidate in 10 states. However, in 6 of these states, the most recent poll was in October or earlier. None of these were competitive races with heavy polling.
The polls were outside the margin of error in four states with recent polls: California, Maryland, Pennsylvania, and Wisconsin. In Maryland and Pennsylvania, the Democrat did 5% better than the poll predicted, so these results are 1% outside the margin of error. In California, the Republican did 5% better than expected, also 1% outside the margin of error. The worst prediction was in Wisconsin, where Herb Kohl got 67% of the vote vs. an expected value of 58%. The poll for his opponent was only off by 1% though.
Is this a good performance? Remember that the margin of error gives a 95% confidence interval. Thus it is to be expected that 5% of the numbers are outside the margin of error. We have 62 numbers above, so we would expect three of them to be outside the margin of error. For the states with a November poll, four of them were outside the margin of error. While not perfect, all in all it is not bad.
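The expected number of misses is easy to check. A tiny sketch, with a function name chosen only for illustration:

```python
def expected_outside(n_numbers: int, confidence: float = 0.95) -> float:
    """Expected count of results falling outside the margin of error
    when each result has a `confidence` chance of falling inside it."""
    return n_numbers * (1.0 - confidence)


# 31 polled races x 2 candidates = 62 numbers.
print(round(expected_outside(62), 1))  # 3.1
```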
EVP's Track Record 2004
How well did EVP do in the 2004 election? Let us examine how EVP's predictions went in each of the 51 contests tracked (50 states + DC). Three different algorithms (formulas) were used, each resulting in a different map. Initially, only algorithm 1 was used, but I saw early on how unstable it was, so I switched to algorithm 2, which resulted in a huge amount of mail demanding that algorithm 1 be reinstated. Bowing to popular demand, I did so. Later, algorithm 3 was invented as a more sophisticated model. Maps for all three of them were produced daily toward the end of the campaign. Thus EVP made 3 x 51 = 153 predictions. The electoral college score was really just a byproduct of the 51 state results. It is important to note that all the results were produced by software doing computations on the polls. There was no human judgement involved. Anybody running the same algorithms on the same data would have come to exactly the same conclusions.
Here are the three algorithms.
The first algorithm is the simplest: just use the most recent poll in every state, regardless of who took it. The trouble with this one is that when there were many polls, as in Ohio and Florida, there were wild but meaningless fluctuations from day to day. In retrospect, this was not a good choice.
The second algorithm made two changes. First, it took the most recent poll and any others within three days of it and averaged them. This damped the oscillations appreciably. Second, partisan pollsters were excluded. The need for this may not be obvious to everyone initially. Basically, there are two kinds of pollsters: those who sell their polls to newspapers and TV stations and those who work for candidates (usually of only one party). The former try to tell the truth and measure success by coming close to the final result. The latter want their horse to win and don't give a hoot about the truth. Most of the famous pollsters, like Gallup, Mason-Dixon, SurveyUSA, Zogby, Rasmussen, universities, etc. are in the first category. Algorithm 2 omitted the category 2 pollsters like Strategic Vision (R) and Garin-Hart-Yang (D).
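To make the difference between the first two algorithms concrete, here is a rough sketch of both. It is not the software that produced the maps. The Poll type, the field names, and the sample polls are all invented, and two details are assumptions on my part: that polls are ordered by their end date, and that partisan polls are dropped before the most recent poll is chosen.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Poll(NamedTuple):
    pollster: str
    end_date: date   # assumption: polls are ordered by end date
    kerry: float
    bush: float
    partisan: bool   # True for pollsters working for a party or candidate


def algorithm_1(polls):
    """Use the single most recent poll, regardless of who took it."""
    latest = max(polls, key=lambda p: p.end_date)
    return (latest.kerry, latest.bush)


def algorithm_2(polls, window_days=3):
    """Drop partisan pollsters, then average the most recent poll with
    any others taken within window_days of it."""
    usable = [p for p in polls if not p.partisan]
    if not usable:
        raise ValueError("no nonpartisan polls available")
    newest = max(p.end_date for p in usable)
    cutoff = newest - timedelta(days=window_days)
    recent = [p for p in usable if p.end_date >= cutoff]
    n = len(recent)
    return (sum(p.kerry for p in recent) / n, sum(p.bush for p in recent) / n)


# Hypothetical polls for one state, not real data.
polls = [
    Poll("Pollster A", date(2004, 10, 30), 47.0, 49.0, False),
    Poll("Pollster B", date(2004, 10, 31), 50.0, 45.0, True),
    Poll("Pollster C", date(2004, 10, 28), 49.0, 47.0, False),
    Poll("Pollster D", date(2004, 10, 20), 44.0, 50.0, False),
]
print(algorithm_1(polls))  # (50.0, 45.0): the partisan poll is the newest
print(algorithm_2(polls))  # (48.0, 48.0): average of pollsters A and C
```

With these made-up numbers, one late partisan poll is enough to flip the state under algorithm 1, while algorithm 2 reports a tie.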
Algorithm 3 got into mathematical modeling and tried to predict how the minor candidates would do and how the undecideds would vote. Historically, many people are willing to tell pollsters that they will vote for some ideologically driven minor candidate like Ralph Nader or Pat Buchanan but don't actually do so, out of fear of helping the major party candidate they hate win. Furthermore, historical data shows clearly that the undecideds tend to break for the challenger, usually about 2:1. This algorithm assumed Nader would get 1%, Badnarik would get 1%, and the undecideds would break 2:1 for Kerry.
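One way such a model could be put together is sketched below. The exact formula is not given above, so treat this as an assumption: whatever is left after the poll's Kerry and Bush numbers and the fixed 1% each for Nader and Badnarik is counted as undecided and split 2:1 for Kerry. The function name and the sample poll are made up.

```python
def allocate_undecideds(kerry, bush, nader=1.0, badnarik=1.0,
                        challenger_parts=2, incumbent_parts=1):
    """Project final shares from a poll by fixing the minor candidates
    and splitting the remainder between challenger and incumbent."""
    undecided = max(100.0 - kerry - bush - nader - badnarik, 0.0)
    total_parts = challenger_parts + incumbent_parts
    kerry_final = kerry + undecided * challenger_parts / total_parts
    bush_final = bush + undecided * incumbent_parts / total_parts
    return (kerry_final, bush_final)


# Hypothetical poll: Kerry 46, Bush 46, which leaves 6 points undecided.
print(allocate_undecideds(46.0, 46.0))  # (50.0, 48.0)
```

In this sketch a tied poll becomes a two-point Kerry lead, which shows how strongly the 2:1 assumption drives the result.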
Here are the state-by-state results, where we have used the layman's definition of correct, that is, picking the winner without regard to the margin of error. Checking the results against the margin of error is left as an exercise for the reader.
Thus the predictive accuracy ranged from 92% to 98%, with the best algorithm being the one that averaged polls over the last three days and omitted the partisan pollsters. This year we are using a modification of this one: average polls over 7 days and omit the partisan pollsters.
Where did the polls go wrong? The problem states were Florida, Iowa, New Mexico, and Wisconsin, fairly consistently. In Iowa, the two candidates differed by 0.9%, in New Mexico by 1.1%, and in Wisconsin by 0.4%. Given a standard margin of error of ±3% or sometimes ±4%, these were all well inside the margin of error. The only one that was at the outer edge was Florida. The final result here was Kerry 47.1%, Bush 52.1%. The three algorithms predicted splits of 49-44, 47-48, and 52-46. The first and third were way off. The middle one (poll averaging) picked the right winner and got Kerry right on the dot, but underestimated Bush by 4%.
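The Florida comparison can be checked with a short script. The numbers are the ones quoted above, with each split read as Kerry first and Bush second. The function and variable names are only for illustration.

```python
ACTUAL = {"Kerry": 47.1, "Bush": 52.1}

# Predicted Kerry-Bush splits from the three algorithms.
PREDICTED = {1: (49, 44), 2: (47, 48), 3: (52, 46)}


def florida_report():
    """For each algorithm, return the predicted winner and the error
    (prediction minus actual result) for each candidate."""
    report = {}
    for algorithm, (kerry, bush) in PREDICTED.items():
        report[algorithm] = {
            "winner": "Kerry" if kerry > bush else "Bush",
            "kerry_error": round(kerry - ACTUAL["Kerry"], 1),
            "bush_error": round(bush - ACTUAL["Bush"], 1),
        }
    return report


for algorithm, row in florida_report().items():
    print(algorithm, row)
```

A negative error means the algorithm underestimated that candidate.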