When a poll shows Candidate A at 48% and Candidate B at 45% with ±3% MOE, headlines often say "within the margin of error, too close to call." This is partly correct — but it misses the key point. The MOE on the lead (the 3-point difference) is approximately ±6%, not ±3%. A ±3% MOE on each candidate's individual number means the margin itself has a much wider range of uncertainty than most readers realize.
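A quick way to check that claim is to compute the variance of the difference between two shares drawn from the same sample. Here is a minimal sketch, assuming a 1,000-person poll (roughly the size that produces a ±3% MOE) and the standard multinomial variance for the difference of two proportions:

```python
import math

n = 1000                 # roughly the sample size that yields a +/-3% MOE
p_a, p_b = 0.48, 0.45    # the two candidates' shares

# Variance of (share_A - share_B) in one multinomial sample: the two shares
# are negatively correlated, which widens uncertainty on the difference.
var_lead = (p_a * (1 - p_a) + p_b * (1 - p_b) + 2 * p_a * p_b) / n
moe_lead = 1.96 * math.sqrt(var_lead)

print(f"MOE on each share (worst case): +/-{1.96 * math.sqrt(0.25 / n):.1%}")  # ~3.1%
print(f"MOE on the 3-point lead:        +/-{moe_lead:.1%}")                    # ~6.0%
```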
What Margin of Error Is — and What It Is Not
The margin of error reported in polls is the sampling margin of error — the range of values that would be produced by random variation in who happened to be selected for the sample, if the poll were repeated many times with the same methodology. At the standard 95% confidence level, a poll of 1,000 adults with a result of 50% support has a MOE of approximately ±3.1%, meaning if you ran the same poll 100 times under identical conditions, roughly 95 of those runs would produce a result between 46.9% and 53.1%.
The formula for sampling MOE is approximately: MOE = 1.96 × √(p × (1-p) / n), where p is the proportion and n is the sample size. At p = 0.5 (maximum uncertainty) with n = 1,000: MOE ≈ 1.96 × √(0.25/1000) ≈ 1.96 × 0.0158 ≈ 3.1%.
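The formula translates directly into code. A minimal helper, using the worst-case p = 0.5 convention that pollsters typically report:

```python
import math

def sampling_moe(p: float, n: int, z: float = 1.96) -> float:
    """95% sampling margin of error for a proportion p and sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"{sampling_moe(0.5, 1000):.1%}")  # 3.1%, matching the worked example
print(f"{sampling_moe(0.3, 1000):.1%}")  # 2.8%; MOE shrinks as p moves from 0.5
```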
What sampling MOE does not cover:
Non-response bias: If certain types of voters systematically refuse to participate in polls — or are harder to reach — the sample is not truly random even with a large n. Weighting adjustments help but do not fully solve this.
Likely voter modeling: Registered voter polls and likely voter polls produce different results because pollsters must estimate who will actually vote. This modeling assumption is a major source of variation between polls that is not captured in the stated MOE.
Question wording and order effects: How a question is phrased, what questions precede it, and whether the poll includes a "don't know / undecided" option all affect results. Different pollsters asking about the same candidate on the same day can produce results 3-5 points apart based on methodology alone.
Herding: When polls cluster suspiciously close to each other near Election Day, it may indicate pollsters are adjusting their results to avoid being an outlier — a phenomenon called herding. This can make aggregates artificially precise.
Sample Size and Its Effect on Precision
The relationship between sample size and margin of error is not linear — it follows a square root curve. To halve the margin of error, you need to quadruple the sample size. This has important practical implications for poll quality.
A poll of 400 respondents has a MOE of approximately ±4.9%. A poll of 1,000 respondents has ±3.1%. A poll of 4,000 respondents has ±1.55%. The improvement from 1,000 to 4,000 respondents — a fourfold increase in cost — only reduces MOE by about 1.5 percentage points. This is why most commercial polls use samples of 600-1,200: it is the cost-precision sweet spot.
Crucially, what matters is the sample size within the group you are analyzing. A national poll of 1,000 has a MOE of ±3.1% for national numbers. But if you want to analyze results within a subgroup (say, Black voters in Pennsylvania, who might be 200 of those 1,000 respondents), the MOE for that subgroup alone is approximately ±6.9%. Crosstab analysis of small subgroups in standard polls carries very wide uncertainty.
| Sample Size | Margin of Error (95% CI) | Range for 50% Result | Typical Use |
|---|---|---|---|
| 200 | ±6.9% | 43.1% – 56.9% | Small online surveys, subgroup analysis |
| 400 | ±4.9% | 45.1% – 54.9% | Low-budget state polls |
| 600 | ±4.0% | 46.0% – 54.0% | Standard commercial polls |
| 1,000 | ±3.1% | 46.9% – 53.1% | Most published polls |
| 1,500 | ±2.5% | 47.5% – 52.5% | Higher-quality state polls |
| 4,000 | ±1.5% | 48.5% – 51.5% | Large academic/government surveys |
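The table values follow directly from the formula above; a short loop reproduces them and makes the square-root relationship visible:

```python
import math

# Reproduce the table above; each quadrupling of n roughly halves the MOE.
for n in (200, 400, 600, 1000, 1500, 4000):
    moe = 1.96 * math.sqrt(0.25 / n)   # worst case, p = 0.5
    print(f"n = {n:>5,}: +/-{moe:.1%}  "
          f"(range for a 50% result: {0.5 - moe:.1%} to {0.5 + moe:.1%})")
```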
House Effects: Why the Same Race Looks Different in Different Polls
A "house effect" is a systematic, consistent lean in a pollster's results relative to actual election outcomes. If a firm's polls consistently show Republicans 2 points higher than the final result across many elections, that firm has a Republican house effect of approximately +2. This is not fraud or incompetence — it reflects methodological choices that happen to produce a consistent directional lean.
Common sources of house effects include:
Likely voter screens: Different pollsters use different criteria to identify likely voters. Strict screens (based on past voting history) tend to favor Republicans; looser screens (self-reported likelihood to vote) tend to include more Democrats and younger voters.
Party ID weighting: Some pollsters weight their samples to a fixed party registration or self-identification ratio; others allow party ID to float. Given that partisan self-identification shifts in reaction to political events, this choice matters.
Mode effects: Phone polls (live caller) and online polls reach different populations. Phone polls increasingly reach only people who answer calls from unknown numbers, a population that skews older and, by some analyses, more Republican. Online panels have their own selection biases.
Known house effects for 2024 cycle (approximate): Rasmussen Reports historically showed a Republican lean of +3 to +5 compared to final results. Emerson College and Trafalgar were identified as Republican-leaning pollsters in 2020-2022 but came closer to the actual 2024 results. CNN/SSRS and New York Times/Siena showed slight Democratic leans in 2022. Aggregators maintain running house effect estimates that are updated after each election.
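Mechanically, an aggregator's adjustment step is simple: subtract each firm's estimated house effect from its reported margin before averaging. A minimal sketch, with invented house effect values rather than any aggregator's actual estimates:

```python
# Margins are candidate R minus candidate D, in percentage points.
# House effects are each firm's estimated lean in the same units (invented).
polls = [
    {"firm": "Firm A", "margin": +3.0},
    {"firm": "Firm B", "margin": -1.0},
    {"firm": "Firm C", "margin": +1.0},
]
house_effects = {"Firm A": +2.0, "Firm B": -1.0, "Firm C": 0.0}

adjusted = [p["margin"] - house_effects[p["firm"]] for p in polls]
print(f"raw average:      {sum(p['margin'] for p in polls) / len(polls):+.2f}")
print(f"adjusted average: {sum(adjusted) / len(adjusted):+.2f}")
```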
Margin of Error vs. Polling Error: An Important Distinction
The margin of error (or sampling error) is the expected variation from random chance. Polling error is the actual difference between what a poll (or poll average) predicted and the final election result. They measure different things — and polling error consistently exceeds sampling error.
FiveThirtyEight's analysis found that the average error in state-level presidential polling averages (not individual polls, but averages of many polls) has been roughly 5-6 percentage points in recent cycles. That is far larger than the ±1-2% that sampling theory alone would predict for a well-constructed average.
The difference between MOE and polling error comes from systematic biases that affect multiple pollsters simultaneously: social desirability effects (voters reluctant to admit a preference for an unpopular candidate), differential non-response (Trump voters more likely to refuse polling in some cycles), and herding (late polls clustering together to avoid being outliers, masking true uncertainty).
The practical implication: when a poll average shows a candidate leading by 3 points, that lead is well within the range of historical polling error even though it exceeds the theoretical margin of error of the average. A genuinely "safe" polling lead is somewhere in the 5 to 8 point range, depending on the race, pollster quality, and historical polling accuracy in that state.
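One way to make the gap concrete: independent error sources add in quadrature, so a shared systematic error comparable in size to the sampling error substantially inflates the effective uncertainty. The systematic-error figure below is an illustrative assumption, not a measured value:

```python
import math

sampling_moe  = 2.0   # optimistic MOE of a well-built polling average, in points
systematic_sd = 2.0   # assumed std dev of shared cycle-wide bias, in points

# Convert the MOE to a standard deviation, add variances, convert back.
sampling_sd = sampling_moe / 1.96
total_moe = 1.96 * math.sqrt(sampling_sd ** 2 + systematic_sd ** 2)

print(f"effective MOE including systematic error: +/-{total_moe:.1f} points")
```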
How to Read a Poll Aggregate vs. a Single Poll
A single poll, even from a high-quality pollster, has a meaningful probability of being several points off from the true value. The solution is to use averages of multiple polls — aggregates — which reduce random sampling variation by pooling many surveys.
What poll aggregates do well: Smoothing out random noise from individual polls. Identifying trends over time — whether a candidate is rising or falling. Incorporating polling from multiple firms with different methodologies, which catches some house effect biases.
What poll aggregates do not do well: Correcting for systematic errors that affect all pollsters simultaneously (like consistent undercounting of Trump voters in 2016 and 2020). Handling extremely sparse polling (some Senate primaries have only 1-2 polls). Accounting for late breaks — voters who decide in the final days often shift the result beyond what even late-breaking polls capture.
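Both halves of that list show up in a small simulation: averaging many polls shrinks poll-specific random noise toward zero, while a bias shared by every pollster passes through to the average untouched. The noise and bias magnitudes here are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
true_margin = 0.0    # assume a genuinely tied race, in points
shared_bias = 2.0    # systematic error hitting every pollster the same way
poll_noise  = 3.0    # std dev of poll-specific sampling/house noise

n_polls = 20
polls = true_margin + shared_bias + rng.normal(0, poll_noise, size=n_polls)

print(f"spread of individual polls (std dev): {polls.std():.1f} points")
print(f"poll average: {polls.mean():+.1f} points "
      "(random noise averaged away; the shared bias did not)")
```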
Reading a single poll correctly: Note the sample size, the date it was conducted, the pollster's track record (check FiveThirtyEight pollster ratings), whether it surveyed registered voters or likely voters, and whether it is a live caller, automated/IVR, or online panel poll. A single poll showing one candidate up 5 in a competitive race is potentially meaningful, or it could be sampling noise. Do not draw confident conclusions from a single data point.
2024 Example: Why Polls Within MOE Were Not Simply "Wrong"
The 2024 presidential polling showed a very close race in most swing states. Final averages had Vice President Harris leading or tied in Pennsylvania, Wisconsin, and Michigan. Trump won all three. This generated headlines about polling "failure" — but the reality is more nuanced.
In Pennsylvania, the final polling average was approximately Harris +0.1%. Trump won by +1.9% — a miss of about 2 points. In Wisconsin, the average was Harris +1.1%; Trump won by +0.8% — a miss of about 2 points. In Michigan, the average was Harris +0.5%; Trump won by +1.4% — a miss of about 1.9 points.
Most individual state polls in Pennsylvania had sample sizes around 800-1,000, giving them MOEs of ±3% to ±3.5%. The actual result (Harris at approximately 48.5% in PA, vs. the roughly 48.6% average) fell well within the stated margin of error for most individual polls. The individual polls were not technically wrong — the final result was within the range each poll said was plausible.
The better critique is systematic error across all swing states — every single competitive state missed in the same direction (Trump outperforming). This is not random sampling error; it reflects a structural issue with how current polls reach and model Trump voters. An accurate margin of error for a poll average in 2024 swing states, accounting for historical systematic error, was more like ±4-5 points — much larger than the statistical MOE would suggest.
| State | Final Poll Average | Actual Result | Miss (Polling Error) | Within Single-Poll MOE? |
|---|---|---|---|---|
| Pennsylvania | Harris +0.1 | Trump +1.9 | 2.0 pts (Trump direction) | Yes (within ±3%) |
| Wisconsin | Harris +1.1 | Trump +0.8 | 1.9 pts (Trump direction) | Yes (within ±3%) |
| Michigan | Harris +0.5 | Trump +1.4 | 1.9 pts (Trump direction) | Yes (within ±3%) |
| Arizona | Trump +1.5 | Trump +5.5 | 4.0 pts (Trump direction) | Borderline |
| Georgia | Trump +1.0 | Trump +2.2 | 1.2 pts (Trump direction) | Yes (within ±3%) |
| Nevada | Trump +0.5 | Trump +3.1 | 2.6 pts (Trump direction) | Yes (within ±3%) |
| North Carolina | Trump +1.3 | Trump +3.2 | 1.9 pts (Trump direction) | Yes (within ±3%) |
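The pattern in the table is easy to quantify: all seven misses share a sign, and the average signed miss is about 2.2 points.

```python
# Polling misses from the table above, in points, signed toward Trump.
misses = {
    "Pennsylvania": 2.0, "Wisconsin": 1.9, "Michigan": 1.9,
    "Arizona": 4.0, "Georgia": 1.2, "Nevada": 2.6, "North Carolina": 1.9,
}

mean_miss = sum(misses.values()) / len(misses)
print(f"states missing in the same direction: {len(misses)} of {len(misses)}")
print(f"mean signed miss: {mean_miss:.1f} points toward Trump")
# If the misses were independent random noise, the signs would be split;
# seven out of seven in one direction points to shared, systematic error.
```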
Frequently Asked Questions
If a poll has ±3% MOE and a candidate leads by 2 points, is it "too close to call"?
It is competitive but not necessarily a coin flip. The ±3% MOE applies to each candidate's individual number; the MOE on the difference between the two candidates is larger, roughly ±4.2% if the two numbers were independent and closer to ±6% in practice, because within a single poll the two shares are negatively correlated (a respondent who picks one candidate cannot also pick the other). A 2-point lead in a 1,000-person poll is genuinely uncertain: given sampling error alone, the leading candidate is ahead in roughly 70-75% of simulated outcomes. Add systematic polling error and the race becomes effectively indeterminate.
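A small Monte Carlo makes the sampling-error-only figure concrete. The sketch below assumes a true 2-point lead with 6% undecided and counts how often a 1,000-person poll shows the leader ahead; by the symmetry of the normal approximation, this also approximates the probability that an observed 2-point lead reflects a real one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                       # poll sample size
p_true = [0.48, 0.46, 0.06]    # assumed true support: A, B, undecided

# Simulate 100,000 polls of n respondents each from this electorate.
draws = rng.multinomial(n, p_true, size=100_000)
lead = (draws[:, 0] - draws[:, 1]) / n   # each poll's observed A-minus-B lead

print(f"std dev of the observed lead: {lead.std():.3f}")           # ~0.031
print(f"share of polls showing A ahead: {(lead > 0).mean():.1%}")  # ~74%
```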
How do poll aggregators like FiveThirtyEight weight polls differently?
Poll aggregators weight polls based on pollster quality (track record and methodology), recency (more recent polls get higher weight), and sample size. Polls by firms with documented Republican or Democratic house effects may be adjusted. Pollsters with poor track records receive lower weights. FiveThirtyEight also applies a time decay — a poll from three months ago counts for much less than one from last week in the aggregate.
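FiveThirtyEight's exact weights are proprietary, but the three ingredients named above are easy to sketch. Everything below, from the 14-day half-life to the quality scores, is an invented illustration rather than any aggregator's actual scheme:

```python
import math
from dataclasses import dataclass

@dataclass
class Poll:
    margin: float    # candidate A minus candidate B, in points
    n: int           # sample size
    days_old: int    # days since the field period ended
    quality: float   # pollster rating on a 0-1 scale (hypothetical)

def weight(poll: Poll, half_life: float = 14.0) -> float:
    """Illustrative weight: exponential recency decay x sqrt(n) x quality."""
    return 0.5 ** (poll.days_old / half_life) * math.sqrt(poll.n) * poll.quality

def aggregate(polls: list[Poll]) -> float:
    """Weighted average of poll margins."""
    total = sum(weight(p) for p in polls)
    return sum(weight(p) * p.margin for p in polls) / total

polls = [
    Poll(+2.0, 1000, 3, 0.9),
    Poll(-1.0, 600, 10, 0.7),
    Poll(+4.0, 800, 45, 0.8),  # 45 days old: decays to ~11% of a same-day weight
]
print(f"weighted average margin: {aggregate(polls):+.2f} points")
```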
Why do polls keep underestimating Trump support?
Several theories exist: some Trump supporters may be less willing to participate in polls or admit their preference to pollsters (social desirability bias), Trump voters may be harder to reach through standard polling modes, and the likely voter modeling used by many pollsters may systematically undercount the lower-propensity voters who turn out for Trump specifically. Pollsters made methodological adjustments after 2016 and 2020, but the 2024 results suggest the underlying issue persists.