College of LAS « Illinois

How could so many be so wrong in predicting Trump vs. Clinton?

Political scientist weighs in on one of the nation's greatest political upsets

Many methods and assumptions of polling and forecasting will be open to examination in the wake of the presidential election, says University of Illinois political science professor Brian Gaines.
This year’s U.S. presidential election has been called one of the biggest political upsets in history. Just weeks ago, Hillary Clinton seemed headed for a big win. And though the race tightened in the lead-up to Nov. 8, almost every respected election forecast had her as a strong favorite over Donald Trump. Betting markets were predicting Clinton as well. And so was Illinois political scientist Brian Gaines, an expert on polling and public opinion. Gaines spoke with News Bureau social sciences editor Craig Chamberlain about how the polls largely failed to predict Trump’s victory and the challenges of getting it right.

So just how wrong were the forecasters, pollsters, bettors and experts?

Almost all of the short-term forecasters said Clinton’s probability of winning was high, in the range of 60-90 percent. A coin weighted to heads will come up tails now and then, but there is surely more going on here. Electoral College forecasts depend on accurate polls. Bettors can, in theory, exploit private information, but most bettors seem to rely very heavily on polls too.

Of 14 national public polls conducted entirely in November, 13 had Clinton ahead, and only one had Trump up. Those estimates all come with margins of error, which should never be ignored, but most of them were wrong even taking those into account.

State polls are more important to forecasts than national polls, and some of those look fine in hindsight. Only one of the four final Florida polls was statistically off, and it was a partisan poll, not commissioned by a media outlet. On the other hand, the November Michigan polls, excluding two partisan polls, all had Clinton ahead by 4 or 5 points in a state that she will probably lose by about 1 percentage point, once that vote is official.

Pollsters and political scientists will dive into these data over the next months, and a lively debate is guaranteed.

Given that, what should we understand about the ways these predictions can go astray?

Poll aggregators such as RealClearPolitics or FiveThirtyEight, whether they report simple averages or employ more complicated models, rely on errors cancelling out. Most people put too much faith in “the more, the better” logic, figuring that an average from 10 polls must be about 10 times as accurate as any one of them. But even when the polls have no bias, increasing sample size does not help that much: sampling error shrinks only with the square root of the sample size, so pooling 10 polls cuts the margin of error to roughly a third, not a tenth.
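
As a rough illustration of the point about averaging, the sketch below computes the textbook 95 percent margin of error for a proportion from a simple random sample. It assumes perfectly unbiased polls of 1,000 respondents each and a race near 50-50; the function name and the sample sizes are illustrative choices, not figures from the interview.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a share p estimated
    from a simple random sample of n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Assumed sizes: one poll of 1,000 people vs. ten such polls pooled.
one_poll = margin_of_error(1_000)
ten_polls = margin_of_error(10_000)
ratio = one_poll / ten_polls

print(f"one poll of 1,000: +/- {100 * one_poll:.1f} points")
print(f"ten polls pooled:  +/- {100 * ten_polls:.1f} points")
print(f"improvement factor: {ratio:.2f}")
```

Under these assumptions the pooled margin is smaller by a factor of the square root of 10, a little over 3, rather than by a factor of 10. This covers sampling error only; it says nothing about the coverage and nonresponse problems discussed next.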

More importantly, polls can err many ways beyond the luck-of-the-draw involved in making guesses about millions by talking to thousands. The key statistical assumption about this process is that each person in the target population – registered voters, say – is equally likely to be chosen in the sample. We know that this assumption is false for many reasons relating to both who can be reached (by landline, cell, email or Internet), and who is willing to answer questions about their voting plans.

As norms of phone usage have dramatically shifted, pollsters have clung to hopes that their old model of contact still works. Because landline users are so obviously not a random sample of the entire population, many pollsters now include cell-phone samples. But quite apart from caller ID, screening and young people’s greater willingness to hang up, cell phones are a challenge. I’ve lived in Illinois more than 20 years, but my personal phone has a Palo Alto (California) area code. My wife’s phone thinks she still lives in Washington state.

Where do online polls figure in?

Many polls are now conducted online, from panelists matched to genuinely random samples. The industry would love to believe that combining phone and online approaches provides perfect coverage of the target population since phones work best for older respondents and Internet polls are preferred by the young. But it might also be that each approach tends to produce distinct errors, which sometimes roughly balance but other times pull the result in the same direction, away from the truth. If, for instance, less-educated respondents are increasingly hard to reach by either route, mixing contact methods cannot solve the bias.
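
A small simulation can make the "same direction" problem concrete. The sketch below is a toy model, not a description of any real poll: the true vote share, the size of the shared bias and the number of polls are all invented for illustration. Each simulated poll samples respondents whose chance of naming the candidate is the true share plus a bias common to every poll.

```python
import random
import statistics

def average_of_polls(true_share, shared_bias, n_polls, n_respondents, rng):
    """Average several simulated polls that all carry the same bias.
    Hypothetical setup: every poll over- or under-states the
    candidate's share by shared_bias before sampling noise is added."""
    results = []
    for _ in range(n_polls):
        p = true_share + shared_bias
        hits = sum(rng.random() < p for _ in range(n_respondents))
        results.append(hits / n_respondents)
    return statistics.mean(results)

rng = random.Random(2016)  # fixed seed so the run is repeatable
TRUE_SHARE = 0.48          # illustrative value, not an actual result

unbiased = average_of_polls(TRUE_SHARE, 0.00, 10, 1000, rng)
biased = average_of_polls(TRUE_SHARE, 0.03, 10, 1000, rng)

print(f"average of 10 unbiased polls: {unbiased:.3f}")
print(f"average of 10 polls sharing a 3-point bias: {biased:.3f}")
```

In this toy setup the unbiased average lands near the true share, while the biased average settles near the true share plus three points. Adding more polls narrows the noise around the wrong number but does not move it toward the right one.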

Pollsters often weight their data to overcome these and other problems. What are the pitfalls there?

Weighting is as much an art as a science. If a sample of “likely voters” is too female, for instance, should it be weighted to match the sex-balance from the last election, or should the pollster conclude that a gender gap in turnout is widening? Polling firms use different weighting schemes, often jealously guarded. But, again, the prospect that mistakes disappear when different approaches are mixed is a hope, not a theorem.
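
The weighting dilemma can be sketched with a few lines of arithmetic. In the example below, every number is made up for illustration: a sample that is 60 percent female, candidate support of 54 percent among women and 42 percent among men, and two hypothetical turnout targets a pollster might choose between.

```python
def weighted_share(sample, targets):
    """Return (raw, weighted) candidate share.

    sample:  {group: (respondents, share_for_candidate)}
    targets: {group: assumed share of the electorate}, summing to 1
    """
    total = sum(n for n, _ in sample.values())
    raw = sum(n * share for n, share in sample.values()) / total
    weighted = sum(targets[g] * share for g, (_, share) in sample.items())
    return raw, weighted

# Invented sample of 1,000 "likely voters" that skews female.
sample = {"women": (600, 0.54), "men": (400, 0.42)}

# Two hypothetical targets: match the last election, or assume a wider gap.
last_election = {"women": 0.53, "men": 0.47}
wider_gap = {"women": 0.56, "men": 0.44}

raw, w_last = weighted_share(sample, last_election)
_, w_wide = weighted_share(sample, wider_gap)

print(f"unweighted: {100 * raw:.1f}%")
print(f"weighted to last election: {100 * w_last:.1f}%")
print(f"weighted to wider gap: {100 * w_wide:.1f}%")
```

With these invented inputs the same interviews yield roughly 49.2, 48.4 or 48.7 percent, depending only on which turnout assumption is applied. Real weighting schemes adjust on many variables at once, so the room for judgment is larger than this two-group sketch suggests.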

The hardest adjustment to make with weighting is probably “Who won’t talk to us?” Everyone worries about stealthy blocs, but there is no foolproof way to adjust for them. Primary polls did not systematically under-estimate Donald Trump’s support, and analysts were thus perhaps too complacent about the general election following suit. It could be that those Democrats and independents who decided to back Trump over Clinton were especially likely to be lurking out of pollsters’ sights or feigning indecision.

There has been a lot of talk about things being rigged during this election season. What about the polls?

The polls were not rigged. Conspiracy theories can be irresistible, but commercial polling operations live or die on their reputations for accuracy. Even if a firm consists entirely of Republicans or Democrats, producing deliberately wrong numbers in the belief that they might sway undecided voters would be folly. So media bias based on rigged polls is the least-plausible explanation for experts under-estimating Trump’s strength.

Still, the possibility that Trump voters were a bit harder to find in key states because of their unwillingness to interact with pollsters they saw as biased deserves more scrutiny, and may yet prove to be one of the ingredients in the crow pie we “experts” are now being served.

Craig Chamberlain, Illinois News Bureau
