With all of the upcoming primaries, I have been reading a little bit about polling data. Nate Silver of the NY Times discusses how often a candidate’s actual vote total falls within the margin of error implied by the poll data. Usually, 95% confidence intervals are reported, so you would expect a candidate’s numbers to fall outside the confidence interval about 5% of the time.
FiveThirtyEight has a database consisting of thousands of primary and caucus polls dating back to the 1970s. Each poll contains numbers for several candidates, so there are a total of about 17,000 observations. How often does a candidate’s actual vote total fall within the theoretical margin of error?
The answer is, not very often. In theory, a candidate’s actual vote total should fall outside the margin of error only 5 percent of the time. In reality, the candidate’s vote total was outside the margin of error 65 percent of the time! Part of this is because the database includes some polls conducted months before the actual voting took place. But even if you restrict the analysis to polls conducted within the final week of the campaign, about 40 percent of the vote totals fell outside the margin of error — eight times more often than is supposed to happen if you could take the margin of error at face value. [emphasis added]
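To get a feel for what taking the margin of error "at face value" means, here is a minimal Python sketch. It is not Silver's method. It assumes the textbook normal-approximation interval for a proportion, with random sampling as the only source of error. The 30% true vote share and the sample size of 600 are hypothetical numbers chosen for illustration.

```python
import math
import random


def margin_of_error(p_hat, n, z=1.96):
    """Half-width of the normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def simulate_miss_rate(true_share=0.30, n=600, n_polls=5000, seed=0):
    """Fraction of simulated polls whose interval misses the true share.

    Assumes each respondent is an independent draw and that sampling
    error is the only error (hypothetical, idealized polls).
    """
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_polls):
        # Count respondents who support the candidate in this simulated poll.
        supporters = sum(rng.random() < true_share for _ in range(n))
        p_hat = supporters / n
        # A "miss" means the true share lies outside the reported interval.
        if abs(p_hat - true_share) > margin_of_error(p_hat, n):
            misses += 1
    return misses / n_polls


# Margin of error for a candidate polling at 30% among 600 respondents.
print(round(margin_of_error(0.30, 600), 3))  # 0.037, i.e. about 3.7 points

# Share of idealized polls that miss; this should land near 0.05.
print(simulate_miss_rate())
```

Under these idealized assumptions the miss rate stays close to the nominal 5 percent, so the 40 to 65 percent figures quoted above point to errors beyond sampling noise, such as the timing of the poll.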
Silver argues that it is important to recalibrate the polling data based on the accuracy of past polls. To make predictions about election and primary results from polling data, he (a) adjusts the results based on how recent the polls are (more recent = more accurate), (b) accounts for undecided voters, and (c) accounts for “momentum.” Silver’s methodology can be found here and his prediction for the New Hampshire primary can be found here.