Downloadable Polling Data
Data Format 1
The polling data for the presidency, Senate, and House are each available in two formats. In the first format, the file pres_polls.csv lists every available presidential poll, sorted first by state and, within each state, chronologically by the middle day of the poll, with Jan. 1 = 1.0, Jan. 2 = 2.0, etc., so a poll that started Jan. 10 and ended Jan. 11 would be day 10.5. Similarly, senate_polls.csv and house_poll.csv list the Senate and House polls the same way. In this format, every poll occupies one line of ASCII text, with the following fields separated by commas:
Day, poll length, state (or CD), (empty), Dem %, GOP %, Other %, Ending date, (7 empty fields), Pollster-poll length
In this format, all the data is contained in a single file.
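As a rough illustration of Format 1, the Python sketch below computes the middle day of a poll and splits one line into named fields by position. The function names, the sample line, and the guess that percentages are plain numbers are assumptions made for illustration only; the ending date and the last field are kept as raw strings because their exact layout is not specified here.

```python
import csv
import io
from datetime import date


def poll_day(start: date, end: date) -> float:
    """Middle day of a poll, counting Jan. 1 as day 1.0.

    Assumes the poll starts and ends in the same calendar year.
    """
    return (start.timetuple().tm_yday + end.timetuple().tm_yday) / 2


def parse_poll_line(line: str) -> dict:
    """Split one Format 1 line into named fields (positions are 0-based)."""
    row = next(csv.reader(io.StringIO(line)))
    return {
        "day": float(row[0]),
        "poll_length": float(row[1]),
        "state": row[2],            # state, or congressional district
        # row[3] is empty in this format
        "dem": float(row[4]),       # assumed to be a bare number
        "gop": float(row[5]),
        "other": float(row[6]),
        "ending_date": row[7],      # left as text; date layout not assumed
        # row[8] through row[14] are the seven empty fields
        "pollster": row[15],        # last field, kept verbatim
    }


# Hypothetical sample line, built from a list so the field count is explicit.
sample_line = ",".join(
    ["10.5", "2", "Ohio", "", "48", "42", "10", "Jan 11"]
    + [""] * 7
    + ["Example Poll-2"]
)

sample = parse_poll_line(sample_line)
```

With the real file, the same function could be applied to each line of pres_polls.csv in turn; any line that does not match these assumptions would need its own handling.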
Data Format 2
The second format has one .csv file for each day, listing the best estimate for each state (or CD). If there were multiple polls for a state (or CD) within a week, they are averaged according to this averaging algorithm. Thus a Senate file for, say, Aug. 24 (file: Aug24.csv) contains the best estimate for each of the 33 Senate races as of that day. Some states may have recent polls, but for other states the most recent estimate may be quite old, or may even be the 2002 election result if that state has not been polled at all in 2008. If new polls came in on Aug. 24, then these will be reflected in the Aug25.csv file, and so on.
The files for the President, Senate, and House have 51, 33, and 435 data lines per file, respectively, corresponding to the number of races. Since there is a separate data file for every day, they are packaged as pres.zip, senate.zip, and house.zip, respectively. The format of the lines is the same as above, except that field 4 is the electoral votes (for the presidential race), and fields 9-15 correspond to strong Dem, weak Dem, barely Dem, tie, barely GOP, weak GOP, and strong GOP, respectively. In this case, barely means a lead of less than 5%, weak means 5-9%, and strong means 10% or more. For example, a Democratic lead of 48% to 42% would be considered weak Dem, and thus there would be an entry in field 10 with the other fields in the 9-15 range empty. Similarly, a GOP lead of 48% to 47% would be barely GOP, and thus there would be an entry in field 13. A 48% to 48% tie would have an entry in field 12 and the others empty. The information in fields 9-15 is redundant, since it can be computed from fields 5 and 6. It is provided for convenience, to allow Excel to sum columns 9-15.
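The rule for fields 9-15 can be sketched in a few lines of Python. This is only an illustration of how the category could be recomputed from fields 5 and 6: the function name is hypothetical, and treating a fractional lead between 9% and 10% as weak is an assumption, since the text gives the ranges only in whole percentage points.

```python
def category_field(dem: float, gop: float) -> int:
    """Return which of fields 9-15 would hold the entry for a race.

    9 = strong Dem, 10 = weak Dem, 11 = barely Dem, 12 = tie,
    13 = barely GOP, 14 = weak GOP, 15 = strong GOP.
    """
    margin = dem - gop
    lead = abs(margin)
    if lead == 0:
        return 12                      # exact tie
    if lead < 5:
        offset = 1                     # barely: lead under 5 points
    elif lead < 10:
        offset = 2                     # weak: 5-9 points (assumed to run up to 10)
    else:
        offset = 3                     # strong: 10 points or more
    # Dem categories sit below field 12, GOP categories above it.
    return 12 - offset if margin > 0 else 12 + offset


weak_dem = category_field(48, 42)      # Democratic lead of 48% to 42%
barely_gop = category_field(47, 48)    # GOP lead of 48% to 47%
tie = category_field(48, 48)           # 48% to 48% tie
```

The three calls reproduce the worked examples in the text: field 10, field 13, and field 12, respectively.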
The Downloadable Files
For people who want a prettified spreadsheet for President, a method is available to make one yourself quite easily. Basically, there is an Excel-97 template file available and the current daily spreadsheet extracted from pres.zip. By combining the two, you can get the current data in a nice colorful form. To make the colorful spreadsheet, proceed as follows.