How Accurate are the Polls?
During any election season, the question of how accurate the polls are comes up all the time. To start with, some pollsters work for one of the parties and have a vested interest in making their candidates look good. Clearly, their results can't be trusted. But even for pollsters who are completely honest, there are genuine pitfalls. Newsday columnist Jimmy Breslin wrote a column in which he said the polls were worthless because they missed the 169 million cell phones in America. This column set off a discussion on the accuracy of polling. Breslin's point and several others are discussed below.
The main problem is how to get a random sample of voters. The simplest way to get a random sample is to pick the area code and exchange to be sampled and have the computer dial the last four digits at random. In the trade this is called RDD (Random Digit Dialing). It is very easy to do but unfortunately has some problems. Many of the numbers reached will belong to businesses, hospitals, police stations, teenagers, and noncitizens rather than to voters. Also, with number portability, the pollster may be getting Manhattan, KS instead of Manhattan, NY. Consequently, RDD is not used much any more.
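For readers who want to see how little is involved in RDD, here is a minimal sketch in Python. The function name, the area code, and the exchange are made up for illustration; no pollster's actual dialing software is being described.

```python
import random


def random_digit_dial(area_code, exchange, n, seed=None):
    """Return n distinct phone numbers in one area code and exchange.

    Only the last four digits are random, as in RDD. Nothing here knows
    whether a number belongs to a household, a business, or nobody at all.
    """
    rng = random.Random(seed)
    # Draw n different values from 0000-9999 without repeats.
    last_fours = rng.sample(range(10_000), n)
    return [f"({area_code}) {exchange}-{digits:04d}" for digits in last_fours]


# Hypothetical area code and exchange, chosen only for the example.
sample = random_digit_dial("212", "555", 5, seed=1)
for number in sample:
    print(number)
```

The sketch also shows the weakness: the generator has no idea who, if anyone, is on the other end of the line.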
Instead, pollsters buy lists of phone numbers. In some states, the government sells voter lists. These are the best. In other states, lists of residential customers can be purchased from telephone companies and other sources. These lists do not contain cell phone numbers because it is illegal for pollsters and telemarketers to call cell phones. Thus people who have only a cell phone and no land line are systematically excluded from the polls. But it is estimated that only about 5% of the population is cell only, and not all of these live in battleground states. It is widely believed that cell-only customers are largely young people, who as a whole tend to be more Democratic than the population at large. Thus there is some bias introduced here by systematically omitting Democrats, but it is difficult to estimate how much. Here is pollster John Zogby's response to Breslin's column. Robert Landauer also wrote a good piece on the accuracy of polling in his Aug. 31 column.
Another block of voters who are missed in telephone polls are the people using Internet telephone companies such as Vonage. Early adopters of new technologies like VoIP (Voice over IP) are typically highly skilled urban professionals with college degrees working in modern industries. This is prime Kerry territory. Their numbers are still small, but not zero. VoIP raises another issue: area code. Some of the VoIP carriers allow the customer to pick the area code. For example, a Vonage customer living in Texas but formerly from Michigan can choose a Michigan area code so his family can call him without incurring long-distance charges. Consequently, even if there were a way for pollsters to poll him (which there isn't), he would be polled as a Michigan voter instead of a Texas voter.
But using land lines is no bed of roses either. Many people use caller ID or their answering machine to do call screening. Busy young professionals are rarely home from work before 11 p.m. whereas lonely old people are only too happy to talk to the nice young girl who seems to care about what they think. Calls made at 2 p.m. are going to oversample housewives, and so on. All these effects lead to biases. To correct for them, pollsters conduct exit polls of voters leaving the polling place on election day to get a good idea of the statistical makeup of the electorate for next time. These data are used to correct the polls. For example, if the exit polls show that 10% of the voters in some state are African Americans and in a state poll of 600 people by accident only 30 are African Americans (5%), the pollster can just count each African American twice. This process is called normalization.
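The arithmetic behind normalization can be sketched in a few lines of Python. This is only an illustration using the numbers from the example above; the function name and the two-group breakdown are assumptions, and real pollsters weight on many variables at once using methods that are not described here.

```python
def normalization_weights(sample_counts, target_shares):
    """Return one weight per group: target share divided by sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        weights[group] = target_shares[group] / sample_share
    return weights


# Figures from the example: 600 respondents, 30 of them African American,
# while exit polls put that group at 10% of the electorate.
sample_counts = {"african_american": 30, "other": 570}
target_shares = {"african_american": 0.10, "other": 0.90}

weights = normalization_weights(sample_counts, target_shares)

# Weighted head count per group; the total stays at 600.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}
print(weights)
print(weighted_counts)
```

Each African American respondent gets a weight of 2, which is the "count twice" rule. To keep the total at 600, everyone else is counted slightly less than once (a weight of about 0.95).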
Exactly what to normalize for is a controversial issue. Should the pollster make sure his poll has the statistically correct number of Catholics, gun owners, retirees, veterans, immigrants, union members, millionaires, welfare mothers, fat people, lesbians, and [fill in your favorite category]? Where do you draw the line? More specifically, should the pollster normalize to make sure the effective number of Democrats and Republicans is correct? Some pollsters do and some do not. And how many is correct? Gallup is currently normalizing to 40% Republicans and 33% Democrats, which some pollsters think is highly unrealistic and which explains why Gallup's polls show Bush doing so well.
As we reported earlier, in an extraordinary step, both John Zogby and Scott Rasmussen criticized Time and Newsweek for also having too many Republicans in their samples. Zogby said: "If we look at the three last Presidential elections, the spread was 34% Democrats, 34% Republicans and 33% Independents (in 1992 with Ross Perot in the race); 39% Democrats, 34% Republicans, and 27% Independents in 1996; and 39% Democrats, 35% Republicans and 26% Independents in 2000." Thus a score of Bush 52%, Kerry 40% doesn't necessarily mean that, of the 1000 pollees, 520 will vote for Bush, 400 will vote for Kerry, and 80 will vote for someone else or are undecided. It might instead be the result of weighting the Republican respondents so that they represent 40% of the sample.
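To see how much the choice of party mix can matter, consider the following Python sketch. The within-party support figures are invented purely for illustration and do not come from any real poll. The 27% share for Independents in the first mix is also an assumption (it is simply what is left over after 40% Republicans and 33% Democrats). The second mix is the 2000 turnout quoted by Zogby.

```python
def weighted_topline(support_by_party, party_shares):
    """Combine within-party support into an overall result for a party mix."""
    candidates = next(iter(support_by_party.values())).keys()
    return {
        candidate: sum(
            party_shares[party] * support_by_party[party][candidate]
            for party in party_shares
        )
        for candidate in candidates
    }


# HYPOTHETICAL answers by party identification (fractions of each group).
support_by_party = {
    "Republican": {"Bush": 0.90, "Kerry": 0.07},
    "Democrat": {"Bush": 0.09, "Kerry": 0.88},
    "Independent": {"Bush": 0.45, "Kerry": 0.45},
}

# Mix 1: 40% R and 33% D; the 27% Independent share is assumed.
mix_40_33 = {"Republican": 0.40, "Democrat": 0.33, "Independent": 0.27}
# Mix 2: the 2000 turnout quoted by Zogby.
mix_2000 = {"Republican": 0.35, "Democrat": 0.39, "Independent": 0.26}

result_40_33 = weighted_topline(support_by_party, mix_40_33)
result_2000 = weighted_topline(support_by_party, mix_2000)

for label, result in [("40R/33D/27I", result_40_33), ("35R/39D/26I", result_2000)]:
    print(label, {name: f"{share:.1%}" for name, share in result.items()})
```

With these made-up numbers, the very same interviews put Bush ahead under the first mix and Kerry ahead under the second. Nothing changed except the assumption about how many Republicans and Democrats will show up.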
Another point that was brought up is that if a pollster gets a list of people who voted last time, first-time voters will be omitted. A correction can be applied for this effect, but it will have all the uncertainties discussed yesterday.
Another category of voters that is always missed is the overseas voter. Seven million Americans live abroad, and all those over 18 are entitled to vote in the state they last lived in. They are never polled and represent millions of voters whose preferences do not count in the polls.
All of these points refer to how pollsters determine which numbers to call. But making the call is only step 1. The next step is to determine whether the person who answers is going to vote. Usually question 1 is "Are you registered to vote?" If not, the interview is ended. But not all registered voters actually cast their ballot. Most pollsters have a screen designed to guess which registered voters are likely to vote and which will sit it out. The formulas used for screening vary from pollster to pollster and are usually secret. The issue of whether to report registered voters (RVs) or likely voters (LVs) is controversial. Most pollsters think they are smart enough to figure out who is a likely voter, but history shows otherwise. Albert R. Hunt writes in his article in the Wall Street Journal Sept. 17, "What if the Polls are Wrong," the following: "In 2000, Gallup's election eve survey showed George Bush ahead by two points among likely voters; he trailed Al Gore by a point among registered voters, very close to the final outcome." Thus another source of inaccuracy in the polls is the use of RVs versus LVs.
Although it is 7 years old, Gallup's FAQ on polling is still useful reading.