At long last, I have completed all of the research for my 2010 Senate (and, eventually, Governor) forecast methodology. I am extremely pleased with the results, which I believe show it to be the most accurate methodology produced anywhere. The methodology is explained below.
But first, here are today's numbers!
Senate Picture, August 3rd, with Rasmussen
Most likely outcome: Democrats 52 seats, Republicans 47 seats, Charlie Crist 1 seat
Of the 100 Senate seats, 86 are either not up for re-election, or have a polling average where one party has a 100% chance of victory (if the election were held today). Among those 86 seats, there are 48 Democrats, and 38 Republicans. Here is a chart featuring the other 14 campaigns:
Senate picture, competitive campaigns chart, August 3rd, with Rasmussen
The 48 currently safe Democrats, plus the 4.05 wins projected in these 14 campaigns, comes out to 52.05 Democrats, or 52 seats. Charlie Crist is also projected to win one seat.
Senate Picture, August 3rd, without Rasmussen
Most likely outcome: Democrats 54 seats, Republicans 45 seats, Charlie Crist 1 seat
Senate picture, competitive campaigns chart, August 3rd, without Rasmussen
The 48 currently safe Democrats, plus the 5.91 wins projected in these 14 campaigns, comes out to 53.91 Democrats, or 54 seats. Charlie Crist is also projected to win one seat.
Notes:
* = Has primary challenger, but heavy favorite
The "current Dem winning %" column projects the chance of Democratic victory if the election were held today. It is not meant to predict the chance of the Democratic candidate winning in November.
Every Senate seat not listed here currently has either a 0% or a 100% chance of a Democratic victory.
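For readers who want to check the seat arithmetic, here is a minimal sketch. It assumes the projected wins are simply the sum of the per-campaign "current Dem winning %" values, added to the safe seats and rounded. The function name and the three example probabilities are hypothetical; only the totals 48, 4.05 and 5.91 come from the figures above.

```python
def projected_seats(safe_seats, win_probabilities):
    """Safe seats plus the sum of per-campaign win probabilities."""
    return safe_seats + sum(win_probabilities)

# Hypothetical probabilities for three campaigns (NOT the chart's values).
example = projected_seats(48, [0.68, 0.20, 0.50])

# The aggregate projected wins quoted above for the 14 competitive campaigns.
with_rasmussen = projected_seats(48, [4.05])
without_rasmussen = projected_seats(48, [5.91])

print(round(with_rasmussen), round(without_rasmussen))
```

Rounding the totals to whole seats gives the "most likely outcome" figures for each version of the forecast.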
Senate Forecast Methodology
I strongly believe this to be the most accurate statewide electoral forecasting methodology published anywhere. Additionally, it is simple enough that almost anyone can reproduce it, no matter their level of background in statistics or polling. This simplicity also means transparency, as almost anyone can both understand the assumptions I am making and check my arithmetic for accuracy.
The methodology is extremely simple: just take the simple mean of almost all polls that had the majority of their interviews conducted during the final 25 days of a campaign (see the notes below for more info). That's it. And it works, too:
Error rates, final predicted margin to final vote margin, 52 closest Presidential, Senatorial and Gubernatorial general election campaigns, 2008-2010

| | Pollster | 538 | Simple 25-day mean |
| Mean error | 2.79 | 2.82 | 2.55 |
| Median error | 2.15 | 2.16 | 1.67 |
In the 52 closest Presidential, Senatorial and Gubernatorial campaigns where final margins were published by both 538 and Pollster.com, the simple 25-day mean resulted in significantly less error. The 25-day simple mean had 9-10% less error on the mean, and 22-23% less error on the median. Additionally, it was the most accurate of the three in 21 of those campaigns, and the second most accurate in 19.
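The percentage comparisons can be reproduced from the published error figures. This is a quick sketch; the helper name `pct_less` is made up for illustration, and the only inputs are the six numbers reported above.

```python
def pct_less(baseline, candidate):
    """Percent reduction in error relative to a baseline."""
    return 100 * (baseline - candidate) / baseline

# Mean error: simple 25-day mean (2.55) against Pollster (2.79) and 538 (2.82).
mean_vs_pollster = pct_less(2.79, 2.55)
mean_vs_538 = pct_less(2.82, 2.55)

# Median error: simple 25-day mean (1.67) against Pollster (2.15) and 538 (2.16).
median_vs_pollster = pct_less(2.15, 1.67)
median_vs_538 = pct_less(2.16, 1.67)

for label, value in [
    ("mean vs Pollster", mean_vs_pollster),
    ("mean vs 538", mean_vs_538),
    ("median vs Pollster", median_vs_pollster),
    ("median vs 538", median_vs_538),
]:
    print(f"{label}: {value:.1f}% less error")
```

Rounded to whole percentages, the four results fall in the 9-10% and 22-23% ranges quoted in the text.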
Further, the 2008-2010 performance of the 25-day simple mean was not a fluke. Since 2004, across the 145 closest Presidential, Senatorial and Gubernatorial general election campaigns, its mean error rate has been 2.57, and its median error rate has been 1.76. That is a consistently strong performance that will be difficult for any methodology to surpass, or even equal. As of this writing, I know of no methodology that has done so.
Now, in the extended entry, answering some likely questions / objections:
But you can't average polls that have different methodologies! I have heard this claim as long as I have been blogging. While it is nice deductive reasoning, it does not hold up to empirical research. NCPP has found that the average "candidate error" (defined as half of the mean total error figures I produced above) for nearly 600 individual polls taken during the final eight days of the 2004-2008 general election was about 1.8. By comparison, the average "candidate error" for the 25-day simple mean was about 1.3. So, poll averaging was more accurate than individual polls, and by a significant amount. Sure seems like you can average polls to me.
But you are only telling us what was more accurate in the past, not what will be more accurate in the future! True! Just because it was more accurate in the past does not mean it will be more accurate than other methods in the future. Further, I fully expect that other electoral forecasters have conducted their own research, and improved their methodologies. I guess we will see who did their homework the best, and who made the correct assumptions, after November 2nd.
What's up with the "current Dem win %" column? That is the odds of a Democratic victory in that Senate campaign, if the election were held today. I am not projecting into the future. It is calculated based on the error rates of the 144 campaigns looked at in the research I performed to produce this methodology. For example:
Dem Candidate A trails by 2.25%. Given that 58 of the 144 campaigns have had error rates greater than 2.25%, and that the error has an even chance of breaking in favor of Democrats or Republicans, divide 29 by 144 to arrive at a 20% chance of Democratic victory in that campaign, if the election were held today.
Dem Candidate B leads by 1.33%. Given that 91 of the 144 campaigns have had error rates greater than 1.33%, and that the error has an even chance of breaking in favor of Democrats or Republicans, divide 45.5 by 144 to arrive at a 32% chance of Republican victory in that campaign, and thus a 68% chance of Democratic victory, if the election were held today.
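The two worked examples follow the same recipe, which can be sketched in a few lines. The function name and its signature are hypothetical; the counts (58 and 91 out of 144) are the ones from the examples, and the sketch assumes, as the text does, that an error is equally likely to break toward either party.

```python
def dem_win_probability(dem_margin, n_errors_exceeding, n_campaigns=144):
    """Chance of a Democratic win if the election were held today.

    dem_margin: polling-average margin, positive if the Democrat leads.
    n_errors_exceeding: historical campaigns whose error exceeded |dem_margin|.
    """
    # Only half of the large-enough errors break against the current leader.
    upset_chance = (n_errors_exceeding / 2) / n_campaigns
    if dem_margin > 0:
        return 1 - upset_chance
    if dem_margin < 0:
        return upset_chance
    return 0.5  # an exact tie is treated as a coin flip (an assumption)

candidate_a = dem_win_probability(-2.25, 58)  # trails by 2.25
candidate_b = dem_win_probability(1.33, 91)   # leads by 1.33

print(f"A: {candidate_a:.0%}, B: {candidate_b:.0%}")
```

With a full list of historical errors in hand, `n_errors_exceeding` would simply be the count of errors larger than the absolute margin.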
But you don't weight polls by recent-ness! No, I don't. And yet, my results are still more accurate. In fact, I am pretty sure this is why they are more accurate, even if I can't prove that assumption. I am pretty sure this methodology works because voter preferences really don't change much in the final stages of a campaign. Without voter preference changing that much, adding more polls into the average and not weighting them produces more accurate results due to the central limit theorem.
Basically, this methodology is about finding the sweet spot between two variables. Adding more polls into the average improves accuracy, while including older polls reduces it. It appears that 25 days before an election is the "sweet spot" that results in the least total error when these two variables are combined.
But you don't weight polls by sample size! Nope. And yet my results are more accurate.
But you don't weight polls by past accuracy! Nope. And yet my results are more accurate.
But you don't weight polls by house effect! Nope. And yet my results are more accurate. However, I am open to weighting by house effect, if a test showed it was more accurate to weight by house effect. I just couldn't find a comprehensive list of the house effect for every pollster to conduct this test.
Why do you include partisan polls, internal campaign polls, and multiple polls from single polling firms? I do so because I tested the total error of the averages both with and without all three of those types of polls. The averages had less error when all three were included.
But you are only looking at campaigns within 18.50%! I am just not interested in forecasting blowout elections. Looking at campaigns within 18.5% still allows me to look at all targeted Senate and Governor campaigns, as well as the most expansive definition of "swing state" possible. Go any further out, and we are just not looking at competitive elections.
But you are focusing on the margin, rather than on the candidate raw totals! Elections are not academic exercises--this is about knowing who is winning and who is losing. If another forecaster can come closer to predicting the final raw number for each candidate, good for him or her. However, I want to know how close elections are, who is going to win, and who is going to lose. In the end, that is all that matters.
Why do you produce two forecasts, one with Rasmussen and one without? Showmanship, mainly. I imagine readers are interested in seeing a forecast without Rasmussen polling. The forecast with Rasmussen polls remains the "official" forecast.
That said, the scandal at Strategic Vision, along with Rasmussen's born-again Republican house effect and their new outside funding, does make me suspicious. So, I want to see what the numbers look like without Rasmussen, too.
That's it. As always, I am very interested to read your comments.
Methodological notes
The data used in these calculations can be found here. (zipped folder with six Excel files)
I tested the performance of the simple mean from 2004-2010 for several other cut-off dates (10-day mean, 15-day mean, 20-day mean, 21-day mean, 24-day mean and 30-day mean). The 25-day mean was chosen because it had the lowest median error rate, even though the 24-day simple mean had a very slightly lower mean error rate (2.5612 vs 2.5654).
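The cut-off comparison amounts to computing the mean and median error for each candidate window and picking the window with the lowest median. Here is a rough sketch of that selection step. The error lists are toy numbers invented for illustration, not the real data, and `pick_cutoff` is a hypothetical name.

```python
from statistics import mean, median

def pick_cutoff(errors_by_window):
    """Return the window with the lowest median error, plus a summary table."""
    summary = {w: (mean(errs), median(errs)) for w, errs in errors_by_window.items()}
    best = min(summary, key=lambda w: summary[w][1])
    return best, summary

# Toy per-campaign errors for three windows (hypothetical values).
toy_errors = {
    24: [0.5, 1.9, 2.0, 5.0],
    25: [0.6, 1.7, 1.9, 5.4],
    30: [1.0, 2.2, 2.6, 5.0],
}

best, summary = pick_cutoff(toy_errors)
for window, (avg, med) in sorted(summary.items()):
    print(f"{window}-day mean: mean error {avg:.2f}, median error {med:.2f}")
print("chosen window:", best)
```

The toy numbers are arranged so that one window wins on the median while another is slightly better on the mean, which is the same kind of trade-off described above.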
By "almost all polls," I mean that I exclude Zogby Internet polls (because they are so inaccurate), Strategic Vision polls, and Columbus Dispatch polls (because they are conducted by mail). Other than that, I include partisan polls, internal campaign polls, and multiple polls from a single polling firm. Tracking polls are separated into individual polls depending on the number of days the poll was in the field. For example, a three-day tracking poll will be broken into multiple three-day polls, a four-day tracking poll into multiple four-day polls, etc.
By "majority of their interviews conducted during the final 25 days of the campaign," I mean every poll where at least 50% of the days the poll was in the field fell during the final 25 days of a campaign. This resulted in the following cut-off dates for the 50% threshold:
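One way to read the tracking-poll rule is that a rolling tracker's daily releases are thinned out so the kept releases cover non-overlapping field periods. The sketch below works under that assumption. It also assumes the thinning is anchored on the most recent release, which the text does not specify. The function name, dates and margins are all hypothetical.

```python
from datetime import date, timedelta

def split_tracker(releases, window_days):
    """Keep releases whose field periods do not overlap.

    releases: list of (last_field_day, dem_margin) for a rolling tracker.
    window_days: number of days each release was in the field.
    """
    kept = []
    next_allowed_end = None
    # Walk backward from the newest release (an assumption, see above).
    for end_day, margin in sorted(releases, reverse=True):
        if next_allowed_end is None or end_day <= next_allowed_end:
            kept.append((end_day, margin))
            next_allowed_end = end_day - timedelta(days=window_days)
    return sorted(kept)

# Seven daily releases of a hypothetical three-day tracker.
daily = [(date(2008, 10, 20 + i), 2.0 + 0.5 * i) for i in range(7)]
separate_polls = split_tracker(daily, window_days=3)
print([d.isoformat() for d, _ in separate_polls])
```

Each kept release is then treated as its own poll in the average.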
2004: October 8th
2005: October 15th
2006: October 14th
2008: October 10th
2009: October 9th
2010: December 25th, 2009 (Massachusetts special election)
At least two polls are used for every campaign, even if fewer than two polls were conducted during the final 25 days of a campaign.
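Putting the 50% rule and the two-poll minimum together, the average for one campaign could be computed along these lines. This is a sketch under stated assumptions, not the exact spreadsheet logic. It assumes the window starts 25 days before election day (which matches the 2008 cut-off of October 10th for a November 4th election), and that when fewer than two polls qualify, the two most recent polls are used. The function names and the example polls are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean

def in_window(start, end, election_day, window_days=25):
    """True if at least half of the poll's field days fall in the window."""
    window_start = election_day - timedelta(days=window_days)
    n_days = (end - start).days + 1
    field_days = [start + timedelta(days=i) for i in range(n_days)]
    inside = sum(1 for d in field_days if d >= window_start)
    return inside * 2 >= n_days

def simple_window_mean(polls, election_day, window_days=25):
    """polls: list of (first_field_day, last_field_day, dem_margin)."""
    qualifying = [p for p in polls if in_window(p[0], p[1], election_day, window_days)]
    if len(qualifying) < 2:
        # Assumed fallback: take the two most recent polls instead.
        qualifying = sorted(polls, key=lambda p: p[1], reverse=True)[:2]
    return mean(p[2] for p in qualifying)

election = date(2008, 11, 4)
polls = [
    (date(2008, 10, 1), date(2008, 10, 3), 6.0),   # too early, excluded
    (date(2008, 10, 9), date(2008, 10, 10), 4.0),  # 1 of 2 days inside, included
    (date(2008, 10, 20), date(2008, 10, 22), 2.0),
    (date(2008, 10, 30), date(2008, 11, 1), 3.0),
]
print(simple_window_mean(polls, election))
```

No weighting is applied: every qualifying poll counts equally, regardless of sample size, pollster, or how close to election day it was taken.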
Finally, I only looked at campaigns from 2004-2010 where the final predicted polling margin was less than 18.50. This is because I could not find a comprehensive list of polls from 2002 or earlier, and because I am simply not interested in forecasting massive electoral blowouts.
For 2004 and 2005, the polls used in the calculations were taken from Real Clear Politics. For 2006-2010, the polls used in the calculations were taken from Pollster.com.