EVP's Track Record 2006

Before looking at the numbers, it is worth mentioning that pollsters and statisticians look at polls differently from the general public. If a poll predicts that Smith will win some state 51% to 49% with a margin of error of 3%, and Jones then wins that state 51% to 49%, the pollster would say he got it right: Smith's score was in the range 48% to 54%, just as predicted, and Jones' was in the range 46% to 52%, just as predicted. On the other hand, if Smith won with 56%, the pollster would admit that he got it wrong. Many people do not understand this.
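The pollster's definition of "right" can be written down in a few lines. This is only a minimal sketch; the function name within_margin is invented for illustration, and the numbers are the Smith and Jones figures from the paragraph above.

```python
def within_margin(predicted: float, actual: float, margin: float = 3.0) -> bool:
    """True if the actual vote share lies within predicted +/- margin (in points)."""
    return abs(actual - predicted) <= margin


# Poll: Smith 51, Jones 49. Result: Jones 51, Smith 49.
smith_ok = within_margin(predicted=51, actual=49)    # 49 is inside 48..54
jones_ok = within_margin(predicted=49, actual=51)    # 51 is inside 46..52

# Same poll, but Smith ends up with 56: outside 48..54.
smith_miss = within_margin(predicted=51, actual=56)

print(smith_ok, jones_ok, smith_miss)
```

By this test the pollster "got it right" in the first case even though the poll named the wrong winner.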

In 2006, there were 33 Senate races, 31 of which had polls (Hawaii and Indiana weren't worth the expense). EVP predicted the winner correctly in all 31 of them.

Now let us look at how close the polls were. In 2006, we used a new algorithm for averaging the polls. The most recent poll was always used. If other polls had middle dates within a week of the most recent one, all of them were averaged, weighted equally. The results are given below.
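The averaging rule just described might look roughly like the sketch below. The Poll class, the function name, and the sample polls are all invented for illustration, and treating "within a week" as "at most 7 days older than the most recent middle date" is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Poll:
    middle_date: date  # midpoint of the polling period
    dem: float
    gop: float


def average_recent(polls, window_days=7):
    """Average the most recent poll with any others whose middle date
    falls within window_days of it, all weighted equally."""
    latest = max(polls, key=lambda p: p.middle_date)
    cutoff = latest.middle_date - timedelta(days=window_days)
    recent = [p for p in polls if p.middle_date >= cutoff]
    n = len(recent)
    return sum(p.dem for p in recent) / n, sum(p.gop for p in recent) / n


# Made-up polls for one state, not real 2006 data.
example_polls = [
    Poll(date(2006, 11, 5), dem=50, gop=46),
    Poll(date(2006, 11, 1), dem=48, gop=48),
    Poll(date(2006, 10, 20), dem=40, gop=55),  # too old, ignored
]
dem_avg, gop_avg = average_recent(example_polls)
print(dem_avg, gop_avg)
```

The most recent poll is always inside its own window, so it is always used, as the rule requires.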

State	Dem vote	Dem poll	Diff	GOP vote	GOP poll	Diff	Last poll
Arizona	44	42	2	53	50	3	Nov 02
California	60	61	-1	35	30	5	Nov 04
Connecticut	40	40	0	10	8	2	Nov 04
Delaware	70	60	10	29	26	3	Oct 24
Florida	60	59	1	38	34	4	Nov 04
Maine	21	19	2	74	72	2	Oct 23
Maryland	54	49	5	44	45	-1	Nov 06
Massachusetts	69	65	4	31	30	1	Nov 05
Michigan	57	53	4	41	38	3	Nov 04
Minnesota	58	55	3	38	38	0	Nov 05
Mississippi	35	29	6	64	66	-2	Feb 13
Missouri	50	50	0	47	45	2	Nov 05
Montana	49	49	0	48	45	3	Nov 04
Nebraska	64	54	10	36	34	2	Oct 16
Nevada	41	40	1	55	55	0	Nov 05
New Jersey	53	49	4	45	42	3	Nov 04
New Mexico	70	58	12	30	38	-8	Oct 27
New York	67	63	4	31	31	0	Nov 01
North Dakota	69	57	12	29	35	-6	Jan 25
Ohio	56	56	0	44	42	2	Nov 05
Pennsylvania	58	53	5	41	41	0	Nov 02
Rhode Island	53	49	4	47	44	3	Nov 04
Tennessee	48	45	3	51	51	0	Nov 04
Texas	36	33	3	62	60	2	Nov 05
Utah	31	31	0	62	63	-1	Nov 03
Vermont	65	57	8	32	36	-4	Oct 26
Virginia	50	48	2	49	47	2	Nov 05
Washington	57	53	4	40	41	-1	Nov 02
West Virginia	64	67	-3	34	33	1	Nov 05
Wisconsin	67	58	9	30	31	-1	Nov 05
Wyoming	30	26	4	70	67	3	Oct 12

The margin of error on state polls is typically around 4%, so if the number in column 4 or column 7 is 5 or more, the algorithm missed it. There were errors for at least one candidate in 10 states. However, in 6 of these states, the most recent poll was in October or earlier. None of these were competitive races with heavy polling.

The polls were outside the margin of error in four states with recent polls: California, Maryland, Pennsylvania, and Wisconsin. In Maryland and Pennsylvania, the Democrat did 5% better than the poll predicted, so these results are 1% outside the margin of error. In California, the Republican did 5% better than expected, also 1% outside the margin of error. The worst prediction was in Wisconsin, where Herb Kohl got 67% of the vote vs. an expected value of 58%. The poll for his opponent was only off by 1% though.
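The counts above can be checked mechanically from the table. The sketch below copies the two Diff columns and the Last poll column from the table and applies the "5 or more" rule; the variable names are invented, and "recent" is taken to mean a last poll dated in November.

```python
# (state, Dem diff, GOP diff, last poll) copied from the 2006 table
ROWS = [
    ("Arizona", 2, 3, "Nov 02"), ("California", -1, 5, "Nov 04"),
    ("Connecticut", 0, 2, "Nov 04"), ("Delaware", 10, 3, "Oct 24"),
    ("Florida", 1, 4, "Nov 04"), ("Maine", 2, 2, "Oct 23"),
    ("Maryland", 5, -1, "Nov 06"), ("Massachusetts", 4, 1, "Nov 05"),
    ("Michigan", 4, 3, "Nov 04"), ("Minnesota", 3, 0, "Nov 05"),
    ("Mississippi", 6, -2, "Feb 13"), ("Missouri", 0, 2, "Nov 05"),
    ("Montana", 0, 3, "Nov 04"), ("Nebraska", 10, 2, "Oct 16"),
    ("Nevada", 1, 0, "Nov 05"), ("New Jersey", 4, 3, "Nov 04"),
    ("New Mexico", 12, -8, "Oct 27"), ("New York", 4, 0, "Nov 01"),
    ("North Dakota", 12, -6, "Jan 25"), ("Ohio", 0, 2, "Nov 05"),
    ("Pennsylvania", 5, 0, "Nov 02"), ("Rhode Island", 4, 3, "Nov 04"),
    ("Tennessee", 3, 0, "Nov 04"), ("Texas", 3, 2, "Nov 05"),
    ("Utah", 0, -1, "Nov 03"), ("Vermont", 8, -4, "Oct 26"),
    ("Virginia", 2, 2, "Nov 05"), ("Washington", 4, -1, "Nov 02"),
    ("West Virginia", -3, 1, "Nov 05"), ("Wisconsin", 9, -1, "Nov 05"),
    ("Wyoming", 4, 3, "Oct 12"),
]

THRESHOLD = 5  # a diff of 5 or more counts as a miss, given a margin of about 4%

# States where at least one candidate's number was off by 5 or more
missed = [r for r in ROWS if abs(r[1]) >= THRESHOLD or abs(r[2]) >= THRESHOLD]
missed_recent = [r[0] for r in missed if r[3].startswith("Nov")]
missed_stale = [r[0] for r in missed if not r[3].startswith("Nov")]

print(len(missed), missed_stale, missed_recent)
```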

Is this a good performance? Remember that the margin of error gives a 95% confidence interval. Thus it is to be expected that 5% of the numbers are outside the margin of error. We have 62 numbers above, so we would expect three of them to be outside the margin of error. For the states with a November poll, four of them were outside the margin of error. While not perfect, all in all it is not bad.
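The arithmetic behind "we would expect three" is just 5% of 62. The sketch below also shows how one could estimate the chance of seeing a given number of misses, under the added assumption (not made in the text) that the 62 numbers are independent and each has exactly a 5% chance of falling outside the margin.

```python
from math import comb

N_NUMBERS = 31 * 2   # two numbers (Dem and GOP) for each of the 31 polled races
MISS_RATE = 0.05     # a 95% confidence interval leaves 5% outside

expected_misses = N_NUMBERS * MISS_RATE  # about 3.1


def prob_at_least(k, n=N_NUMBERS, p=MISS_RATE):
    """Binomial probability of k or more misses among n independent numbers."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


print(round(expected_misses, 1))
print(prob_at_least(4))
```

Real poll errors within a state are not independent, so the binomial figure is only a rough guide.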

EVP's Track Record 2004

How well did EVP do in the 2004 election? Let us examine how EVP's predictions went in each of the 51 contests tracked (50 states + DC). Three different algorithms (formulas) were used, each resulting in a different map. Initially, only algorithm 1 was used, but I saw early on how unstable it was, so I switched to algorithm 2, which resulted in a huge amount of mail demanding that algorithm 1 be reinstated. Bowing to popular demand, I did so. Later, algorithm 3 was invented as a more sophisticated model. Maps for all three of them were produced daily toward the end of the campaign. Thus EVP made 3 x 51 = 153 predictions. The electoral college score was really just a byproduct of the 51 state results. It is important to note that all the results were produced by software doing computations on the polls. There was no human judgement involved. Anybody running the same algorithms on the same data would have come to exactly the same conclusions.

Here are the three algorithms.

1. Just use the most recent poll (original algorithm)
2. Average the past 3 days' worth of nonpartisan polls
3. A mathematical model of how undecided voters break

The first algorithm is the simplest. Just use the most recent poll in every state, regardless of who took it. The trouble with this one is that when there were many polls, as in Ohio and Florida, there were wild, but meaningless, fluctuations from day to day. In retrospect, this was not a good choice.

The second algorithm made two changes. First, it took the most recent poll and any others within three days of it and averaged them. This damped the oscillations appreciably. Second, partisan pollsters were excluded. The need for this may not be obvious to everyone initially. Basically, there are two kinds of pollsters: those who sell their polls to newspapers and TV stations and those who work for candidates (usually of only one party). The former try to tell the truth and measure success by coming close to the final result. The latter want their horse to win and don't give a hoot about the truth. Most of the famous pollsters, like Gallup, Mason-Dixon, SurveyUSA, Zogby, Rasmussen, universities, etc. are in the first category. Algorithm 2 omitted the category 2 pollsters like Strategic Vision (R) and Garin-Hart-Yang (D).
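The first two algorithms can be sketched side by side. The Poll class, the function names, and the sample polls (including the pollster names) are invented for illustration. Dropping the partisan polls before picking the most recent one is an assumption, since the text does not say which step came first.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Poll:
    pollster: str
    end_date: date
    kerry: float
    bush: float
    partisan: bool = False


def algorithm_1(polls):
    """Most recent poll, regardless of who took it."""
    return max(polls, key=lambda p: p.end_date)


def algorithm_2(polls, window_days=3):
    """Average the most recent nonpartisan poll with any other
    nonpartisan polls taken within window_days of it."""
    neutral = [p for p in polls if not p.partisan]  # assumption: filter first
    latest = max(neutral, key=lambda p: p.end_date)
    cutoff = latest.end_date - timedelta(days=window_days)
    recent = [p for p in neutral if p.end_date >= cutoff]
    n = len(recent)
    return sum(p.kerry for p in recent) / n, sum(p.bush for p in recent) / n


# Made-up polls for one state, not real 2004 data.
state_polls = [
    Poll("Partisan firm X", date(2004, 10, 31), 44, 50, partisan=True),
    Poll("Pollster A", date(2004, 10, 30), 48, 47),
    Poll("Pollster B", date(2004, 10, 28), 46, 49),
    Poll("Pollster C", date(2004, 10, 20), 50, 45),
]
latest = algorithm_1(state_polls)
kerry_avg, bush_avg = algorithm_2(state_polls)
print(latest.pollster, kerry_avg, bush_avg)
```

In this invented example, algorithm 1 follows whichever poll happened to come out last, while algorithm 2 ignores the partisan poll and averages the two neutral polls from the final days.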

Algorithm 3 got into mathematical modeling and tried to predict how the minor candidates would do and how the undecideds would vote. Historically, many people are willing to tell pollsters that they will vote for some ideologically driven minor candidate like Ralph Nader or Pat Buchanan but don't actually do so, out of fear that the major-party candidate they hate will win. Furthermore, historical data show clearly that the undecideds tend to break for the challenger, usually about 2:1. This algorithm assumed Nader would get 1%, Badnarik would get 1%, and the undecideds would break 2:1 for Kerry.
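The text gives the assumptions of algorithm 3 but not its formula, so the following is only one plausible reading, not the actual model. It fixes Nader and Badnarik at 1% each, treats everything not already pledged to Kerry or Bush as up for grabs, and splits that remainder 2:1 for Kerry. The function name and the sample poll are invented.

```python
def allocate_undecideds(kerry_poll, bush_poll, nader=1.0, badnarik=1.0):
    """Project final shares from a poll (all values in percent).

    Assumption: whatever is not pledged to Kerry or Bush, minus the fixed
    minor-candidate shares, breaks 2:1 for the challenger (Kerry).
    """
    remainder = 100.0 - kerry_poll - bush_poll - nader - badnarik
    kerry = kerry_poll + remainder * 2 / 3
    bush = bush_poll + remainder / 3
    return kerry, bush


# Made-up poll: Kerry 44, Bush 48, the other 8% minor candidates or undecided.
kerry_final, bush_final = allocate_undecideds(44, 48)
print(kerry_final, bush_final)
```

The four projected shares (Kerry, Bush, Nader, Badnarik) always sum to 100 under this reading.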

Here are the state-by-state results, where we have used the layman's definition of correct, that is, the algorithm picked the winner, without regard to the margin of error. Checking the results against the margin of error is left as an exercise for the reader.

Algorithm	Correct states	Incorrect states	Too close to call	% Correct	EVs: Kerry - Bush
Final results	51	0	0	100%	EVs: 251 - 286
Algorithm 1	46	4: FL IA NM WI	1	92%	EVs: 262 - 261
Algorithm 2	48	1: IA	2: NM WI	98%	EVs: 245 - 278
Algorithm 3	48	3: FL IA NM	0	94%	EVs: 281 - 257

Thus the predictive accuracy ranged from 92% to 98%, with the best algorithm being the one that averaged polls over the last three days and omitted the partisan pollsters. This year we are using a modification of this one: average polls over 7 days and omit the partisan pollsters.
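The percentages in the table can be reproduced from the state counts if a too-close-to-call state is not counted as wrong. That reading is inferred from the numbers, not stated in the text, and the names below are invented for the sketch.

```python
TOTAL_CONTESTS = 51  # 50 states + DC

# (correct, incorrect, too close to call) from the 2004 table
RESULTS = {
    "Algorithm 1": (46, 4, 1),
    "Algorithm 2": (48, 1, 2),
    "Algorithm 3": (48, 3, 0),
}


def percent_not_wrong(correct, incorrect, too_close):
    """Share of contests not called wrongly, rounded to a whole percent."""
    assert correct + incorrect + too_close == TOTAL_CONTESTS
    return round(100 * (TOTAL_CONTESTS - incorrect) / TOTAL_CONTESTS)


percentages = {name: percent_not_wrong(*row) for name, row in RESULTS.items()}
print(percentages)
```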

Where did the polls go wrong? The problem states were Florida, Iowa, New Mexico, and Wisconsin, fairly consistently. In Iowa, the two candidates differed by 0.9%, in New Mexico by 1.1%, and in Wisconsin by 0.4%. Given a standard margin of error of ±3% or sometimes ±4%, these were all well inside the margin of error. The only one that was at the outer edge was Florida. The final result here was Kerry 47.1%, Bush 52.1%. The three algorithms predicted splits of 49-44, 47-48, and 52-46. The first and third were way off. The middle one (poll averaging) picked the right winner and got Kerry right on the dot, but underestimated Bush by 4%.
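The Florida numbers can be laid out as prediction minus result for each algorithm. The figures come from the paragraph above, reading each split as Kerry-Bush, which is an assumption consistent with the table; the variable names are invented.

```python
FLORIDA_RESULT = (47.1, 52.1)  # (Kerry, Bush) final result

# Predicted (Kerry, Bush) splits, in algorithm order
FLORIDA_PREDICTIONS = {
    "Algorithm 1": (49, 44),
    "Algorithm 2": (47, 48),
    "Algorithm 3": (52, 46),
}

errors = {}        # prediction minus result, rounded to one decimal place
picked_bush = {}   # did the algorithm call Florida for the actual winner?
for name, (kerry, bush) in FLORIDA_PREDICTIONS.items():
    errors[name] = (
        round(kerry - FLORIDA_RESULT[0], 1),
        round(bush - FLORIDA_RESULT[1], 1),
    )
    picked_bush[name] = bush > kerry

for name in FLORIDA_PREDICTIONS:
    print(name, errors[name], picked_bush[name])
```

A negative number means the algorithm underestimated that candidate.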
