(Just to be clear, in case I turn out to be way off, this forecast is mine, and does not necessarily reflect the opinion of anyone else at Open Left. - promoted by Chris Bowers)
The new consensus among election forecasters is that the Massachusetts special election is a "toss-up." Stuart Rothenberg, Charlie Cook and Nate Silver have all described the election as such. Pollster.com shows Coakley ahead by only 1.9%, which probably qualifies as a toss-up in their book.
This is a case where I am going to disagree with the consensus of election forecasters, and instead argue that the Massachusetts Senate campaign still shows a distinct lean toward Democratic nominee Martha Coakley. I do this not based upon a desire to be contradictory, but instead upon a different reading of the empirical evidence available on the campaign.
As the only empirical data on voter preference available, polling data remains at the heart of any election forecast. My research into the accuracy of various methods of reading polling data shows that the most accurate method still gives Martha Coakley a decided edge in the campaign.
Here are the polls I am looking at:
Massachusetts special election polling, 2010
| Pollster | Poll Mid-date | Coakley | Brown |
|---|---|---|---|
| Black Rock | Jan 14 | 39 | 54 |
| Research 2000 | Jan 13 | 49 | 41 |
| Suffolk | Jan 12 | 46 | 50 |
| Rasmussen | Jan 11 | 49 | 47 |
| Mellman | Jan 09 | 50 | 36 |
| PPP | Jan 08 | 47 | 48 |
| UNH | Jan 04 | 53 | 36 |
| Rasmussen | Jan 04 | 49 | 41 |
| Mean | Jan 15 | 47.88 | 44.13 |
(While it is quite possible the Black Rock poll is juiced, until proof of that emerges, it will remain in the average. Also, there are rumors of internal polling, but until those polls are released to the public, I am not including them.)
The simple mean of the eight polls conducted, and released to the public, on the campaign in 2010 still shows Martha Coakley ahead by 3.75%. That may not sound like a lot, but from 2004-2009, only 34 of the closest 143 campaigns for President (both state-level and national), Senate and Governor saw a swing of greater than 3.75% from the final, 15-day, simple polling mean to the final result. Further, my research shows that the final, 15-day, simple polling mean was more accurate than any election website in predicting the results of these elections. As such, I still see Martha Coakley as the strong favorite in this campaign. Even with the Black Rock poll included, I still give her an 88% chance to win.
It is pretty bold to claim that I have a more accurate means of predicting elections than websites like Pollster.com and fivethirtyeight, especially given that both websites took me to school in 2008. In the extended entry, I explain the basis for this claim, and in so doing the basis for arguing that Martha Coakley is still the favorite in Massachusetts.
In order to justify still calling Martha Coakley the favorite in the Massachusetts special election, I have to explain my new methodology for forecasting elections. I developed this new method over the last three months, basically as a hobby in my free time:
1. What do you mean by "the 143 closest" campaigns?
Specifically, I mean the 143 Gubernatorial, Senatorial, Presidential swing state and Presidential national popular vote campaigns from 2004 to 2009 where the final polling average predicted a margin of less than 18.5%. I only looked at those campaigns because, well, I don't think election forecasters are needed for campaigns decided by 18.5% or more.
I drew the line at 18.5% because I thought, ironically, that forecasting who would win Massachusetts in the 2008 Presidential election was pointless. Further, every swing state and Senate seat that was seriously contested by the two major parties in 2008 and earlier fit inside the 18.5% range. In short, 18.5% was the widest net I could throw and still argue I was looking at the meaningful races where an election forecaster might be useful.
From 2004-2009, there turned out to be 143 of these campaigns. I did not look at House races. I did not look at primaries--only general elections. I would very much have liked to look at campaigns for 2002 and earlier, but I couldn't find a complete online resource for public polling on those campaigns.
2. What did you find?
I found, rather surprisingly, that the simple mean of (almost) all polls with the majority of their interviews conducted in the last fifteen days of a campaign was more accurate at predicting the final election margin than the far more statistically informed methodologies of Pollster.com and fivethirtyeight.com. Further--and this was just as surprising--the final, 15-day average was more accurate than the final, 10-day average, which meant that including somewhat older polls actually made the forecast more accurate, not less. (The 20-day, 25-day and 30-day averages were only slightly less accurate than the 15-day average.)
Across the 52 campaigns that Pollster.com (final numbers here) and fivethirtyeight.com (final numbers here) both produced final averages for in 2008, here was the average error between the final predictions and the final result:
Error rates, final predicted margin to final vote margin, 2008
| | Pollster | 538 | Simple 15-day mean |
|---|---|---|---|
| Mean error | 2.76 | 2.88 | 2.56 |
| Median error | 2.14 | 2.16 | 1.68 |
This is not a dramatic improvement on the Pollster.com and fivethirtyeight.com predictions, but it is large enough to be noticeable and (I think) significant.
To make sure it wasn't a fluke, I went back to 2004 and 2005-2006, and measured the average error rates for the Simple 15-day mean method. The results were strikingly similar: a mean error of 2.46 in 2004, and a mean error of 2.56 in 2005-2006. Overall, across all 143 campaigns, the mean error was 2.54, and the median error was 1.76. The similarity of the numbers is striking--it performed just as well in the past as it performed in 2008.
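For anyone who wants to check figures like these against their own numbers, here is a minimal Python sketch of the calculation. It assumes "error" means the absolute difference between the predicted margin and the final margin for each campaign. The function name and the sample margins are hypothetical, made up purely for illustration; they are not taken from the spreadsheets.

```python
from statistics import mean, median


def margin_errors(predicted, actual):
    """Absolute gap between predicted and actual margin, one value per campaign."""
    if len(predicted) != len(actual):
        raise ValueError("need one actual margin for every predicted margin")
    return [abs(p - a) for p, a in zip(predicted, actual)]


# Hypothetical margins in points (Democrat minus Republican), NOT real data.
predicted = [3.75, -1.0, 6.5, 0.5]
actual = [1.0, -2.5, 9.0, -0.5]

errors = margin_errors(predicted, actual)
mean_error = mean(errors)      # average miss across the campaigns
median_error = median(errors)  # middle miss, less sensitive to one big outlier

print(f"mean error: {mean_error:.2f}, median error: {median_error:.2f}")
```

Reporting both numbers is useful because a handful of badly missed campaigns can pull the mean up while leaving the median almost untouched.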
Further, I checked to see if the simple polling mean would perform better if a different date range than 15-days was used. Here is what I found when looking at some other date ranges, across the same 143 campaigns:
Mean error rate, various date ranges
| | 30-day | 25-day | 20-day | 15-day | 10-day |
|---|---|---|---|---|---|
| Mean error | 2.63 | 2.60 | 2.56 | 2.54 | 2.59 |
The 15-day performed (very) slightly better, but really there is no significant difference. This is perhaps the most important finding of all: including older polls in the averages, including those up to one month old, does not significantly affect the overall accuracy of the averages.
3. What does this mean?
I drew a couple of conclusions from all of this:
- Special sauce has no effect. If Pollster.com and fivethirtyeight.com performed equally well in 2008, then the differences between their two methodologies are not significant. This means that the extra weights 538 puts into the mix--a demographic regression, weighting by past pollster accuracy, adjustments for pollster "house effects," and weighting by poll sample size--don't seem to have a positive impact on the overall accuracy of the forecast. Pollster.com has none of those weights, and performed (very) slightly better. Pollster.com came to the same conclusion in December 2008.
- Recentness doesn't matter, either. One "special sauce" adjustment Pollster.com's regression estimates do make is that more recent polls have more impact on the overall estimate. Fivethirtyeight adjusts for recentness, too. However, the simple mean estimate does not adjust for recentness, and seems to produce more accurate results. Further, the simple mean itself does not appear impacted by recentness, given there is virtually no difference in the overall accuracy of the 30-day, 25-day, 20-day, 15-day and 10-day polling averages.
Now, this is all very counter-intuitive, since one would think that polls taken closer to an election are more accurate than polls taken further out from an election. And, in fact, there is strong empirical evidence demonstrating this. However, it appears that including less recent polls in a polling average actually improves the accuracy of the overall average, even though the older polls are less accurate than more recent polls.
My explanation for this is that, according to the study that showed older polls are less accurate (see page three of this PDF), the older polls were not that much less accurate. At the same time, polling averages become more accurate when more polls are included in the average (I don't have specific numbers on this at this time, but a quick glance at my work suggests this). So, what is happening is that the inclusion of slightly less accurate polling is improving the accuracy of the overall polling average simply by adding more data. The increase in the amount of data in the system more than cancels out the inclusion of slightly less accurate data.
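A back-of-the-envelope calculation shows how this trade-off can work. The Python sketch below assumes each poll's error is independent and unbiased, and that older polls have a somewhat larger typical error (standard deviation) than recent ones. Those assumptions, the function name, and every number in it are mine for illustration only; none of them come from the study. Under those assumptions, the typical error of a simple mean of n polls is the square root of the summed variances, divided by n.

```python
from math import sqrt


def mean_error_sd(poll_sds):
    """Typical error (SD) of a simple mean of polls with independent errors.

    poll_sds holds one assumed standard deviation per poll, in points.
    """
    n = len(poll_sds)
    if n == 0:
        raise ValueError("need at least one poll")
    return sqrt(sum(sd ** 2 for sd in poll_sds)) / n


# Hypothetical: four recent polls, each typically off by 3.0 points.
recent_only = mean_error_sd([3.0] * 4)

# Hypothetical: the same four, plus four older polls typically off by 3.5.
with_older = mean_error_sd([3.0] * 4 + [3.5] * 4)

print(f"recent only: {recent_only:.2f}, with older polls: {with_older:.2f}")
```

In this toy setup the average that includes the older polls has the smaller typical error, even though each added poll is individually worse. Real polling errors are often correlated with one another, so the gain in practice would likely be smaller than this idealized calculation suggests.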
There appears to be a "sweet spot" for the accuracy of simple mean polling averages at around 15 days before an election. The difference is minor, but I am going to run with it until the numbers suggest a different date. Far out from an election (such as, say, nine months from the 2010 midterms), I will use a much wider range of dates for my polling averages (90 days for my Senate forecast, and 30 days for the National House ballot).
4. Can I see your data?
Sure. You can download the zip folder with all 21 spreadsheets here:
Election forecast study
Some notes on the data:
- For 2004-2005, all polls were taken from Real Clear Politics. For 2006-2009, all polls were taken from Pollster.com.
- All election results are taken from Dave Leip's atlas.
- Zogby Interactive polls and Columbus Dispatch polls were not included, due to their horrendous past performance and questionable methodologies.
- Strategic Vision polls were not included either, as it seems likely those were never real polls.
- Partisan and Campaign-funded polls are included. Since this methodology works because it equally weights as much scientific, relevant data as possible, the more polls, the better.
- For the same reason, if there is more than one poll from a single organization in the given date range for a campaign, I include all of those polls.
- Polls need to have 50% or more of their interviews conducted in the given date range to be included in the averages.
- At least two polls per campaign, even if that means including polls that are older than the date range in question. Otherwise, I'm not forecasting--I am just reporting on a single poll, or throwing my hands up in the air saying that forecasting is impossible in this case. Both are unacceptable.
- The date range for the polls included does not include election day. For example, the 10-day average for 2008 includes polls where a majority of their interviews were conducted on October 25th or later, given that Election Day in 2008 was on November 4th. The 15-day averages for 2008 included polls with a majority of their interviews conducted on October 20th or later. The 20-day averages included October 15th or later, etc.
And basically, that is it. It took me months to put all this together, but a surprisingly simple answer to accurate election forecasts emerged. Just take the simple mean of (almost) all the polls conducted over the last 15 days.
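The inclusion rules in the notes above can be sketched in Python roughly as follows. This is a simplified illustration, not the spreadsheets themselves: the `Poll` class, the function name, and the sample polls are hypothetical, and each poll's mid-date is used as a stand-in for "the date by which at least half of the interviews were conducted," which is an assumption on my part.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Pollsters left out of the averages, per the notes above.
EXCLUDED_POLLSTERS = {"Zogby Interactive", "Columbus Dispatch", "Strategic Vision"}


@dataclass
class Poll:
    pollster: str
    mid_date: date  # assumed proxy for when most interviews took place
    dem: float
    rep: float


def simple_mean_margin(polls, election_day, window_days=15, min_polls=2):
    """Unweighted mean margin (dem minus rep) of the polls inside the window."""
    eligible = [
        p for p in polls
        if p.pollster not in EXCLUDED_POLLSTERS and p.mid_date < election_day
    ]
    eligible.sort(key=lambda p: p.mid_date, reverse=True)  # newest first

    cutoff = election_day - timedelta(days=window_days)
    chosen = [p for p in eligible if p.mid_date >= cutoff]
    if len(chosen) < min_polls:
        # Too few polls in the window: reach back to older ones.
        chosen = eligible[:min_polls]
    if not chosen:
        raise ValueError("no usable polls for this campaign")

    dem = sum(p.dem for p in chosen) / len(chosen)
    rep = sum(p.rep for p in chosen) / len(chosen)
    return dem - rep


# Hypothetical polls for a November 4th, 2008 election (not real data).
polls = [
    Poll("Pollster A", date(2008, 10, 30), 50, 46),
    Poll("Pollster B", date(2008, 10, 22), 48, 47),
    Poll("Zogby Interactive", date(2008, 11, 1), 55, 40),  # excluded pollster
    Poll("Pollster C", date(2008, 10, 10), 44, 49),        # outside 15 days
]
margin = simple_mean_margin(polls, date(2008, 11, 4))
print(f"15-day simple mean margin: {margin:+.2f}")
```

Note that every poll from the same organization is kept, and no poll is weighted more heavily than any other. With `window_days=15` and a November 4th election, the cutoff lands on October 20th, matching the example in the notes.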
5. What does this mean for Massachusetts?
What this means for the Massachusetts special election is that I will include all of the polls that conducted the majority of their interviews on January 4th, or later, into the average. This produces the following result:
Massachusetts special election polling, 2010
| Pollster | Poll Mid-date | Coakley | Brown |
|---|---|---|---|
| Black Rock | Jan 14 | 39 | 54 |
| Research 2000 | Jan 13 | 49 | 41 |
| Suffolk | Jan 12 | 46 | 50 |
| Rasmussen | Jan 11 | 49 | 47 |
| Mellman | Jan 09 | 50 | 36 |
| PPP | Jan 08 | 47 | 48 |
| UNH | Jan 04 | 53 | 36 |
| Rasmussen | Jan 04 | 49 | 41 |
| Mean | Jan 15 | 47.88 | 44.13 |
As I noted above, in the 143 campaigns I looked at in this study, the final, 15-day simple poll mean differed from the final margin by more than 3.75 points on 34 occasions, or about 24% of the time. Given that the polls could just as easily be favoring Coakley as they could be favoring Brown, only about half of those misses would be in Brown's direction, which comes out to only a 12% chance that Scott Brown will win. As such, I still consider Martha Coakley to be the clear favorite, and this campaign far from being a toss-up.
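The arithmetic behind the 12% figure can be written out in a few lines of Python. The function name is just a label I made up, and the even split between misses in each direction is the assumption stated in the paragraph above.

```python
def underdog_win_probability(large_misses, campaigns):
    """Chance the trailing candidate wins, assuming misses go either way equally."""
    miss_rate = large_misses / campaigns  # how often the polls missed by enough
    return miss_rate / 2                  # only half of those favor the underdog


brown = underdog_win_probability(34, 143)
coakley = 1 - brown

print(f"Brown: {brown:.0%}, Coakley: {coakley:.0%}")
```

This is a simple frequency count, not a statistical model: it treats the 143 past campaigns as a fair guide to how far a final 15-day polling mean can be from the result.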
Now, it is possible that special election polling is more like primary polling, which makes turnout modeling much more difficult for pollsters. In fact, the wide range of results among the polls suggests that is actually likely. However, the truth is that I don't have numbers anywhere approaching the level of detail for primary and special elections that I have for general elections. I want to base my forecasts on thorough, empirical research, and I just don't have that for a special election. Truth is, there are so few special elections, there wouldn't be enough data points for a convincing study, anyway.
So, I am going to stick with the general election research I have conducted. That research suggests that Martha Coakley is still the clear favorite, and this campaign is not a toss-up. The 144th test of this theory takes place on Tuesday, January 19th.