- A sample of 1,000 produces ±3.1 points margin of error; doubling to 2,000 reduces it only to ±2.2 — accuracy improvements diminish rapidly with sample size increases.
- Response rates collapsed from 35% in 1990 to 2-6% by 2023, meaning pollsters contact roughly 17-50 people for every respondent, so some self-selection bias is unavoidable.
- Likely voter screens are the single most consequential methodological variable in midterm polling, explaining most of the variation between pollsters in the same race.
- Weighting corrects for self-selection but introduces its own error; a pollster's consistent directional bias (its house effect) averages 2-3 percentage points in either direction.
Sample Size: How Many People Do You Need?
The math of sampling is counterintuitive. A well-constructed random sample of 1,000 people can accurately represent 330 million Americans, provided the selection process is truly random. At 95% confidence, the margin of error for a proportion p is 1.96 × √(p(1−p)/n). It is widest at p = 0.5, where it simplifies to roughly 0.98/√n. At n=1,000, that yields ±3.1 percentage points. Crucially, doubling to n=2,000 only reduces the margin to ±2.2 points. Because the margin shrinks with the square root of n, halving it requires quadrupling the sample, a diminishing return that makes large samples expensive relative to the precision gained.
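The figures above can be reproduced in a few lines. This is a minimal sketch, assuming simple random sampling, the standard formula for a proportion, and the worst case p = 0.5; the function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion, in percentage points.

    Assumes simple random sampling; z=1.96 corresponds to 95% confidence.
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)

# Each doubling of the sample buys less precision than the one before.
for n in (500, 1000, 2000, 4000):
    print(f"n={n:>5}: ±{margin_of_error(n):.1f} points")
```

Real polls rarely meet the simple-random-sampling assumption, so the result is best read as a lower bound on a poll's uncertainty.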
State-level polls face a harder challenge. A state with 5 million registered voters still requires 500-700 respondents for a usable poll — a minimum that becomes expensive when pollsters are covering dozens of competitive races simultaneously. The result is that top Senate and governor races get polled frequently while marginal congressional districts may receive only two or three polls per cycle, leaving forecasters heavily reliant on partisan lean calculations rather than fresh data.
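The reason a state poll needs nearly as many respondents as a national one is that the required sample depends on the target margin, not on the population. A sketch under simple-random-sampling assumptions, with an optional finite population correction; `required_sample` is a hypothetical helper name, and the ±4-point target is chosen only because it falls inside the 500-700 range mentioned above.

```python
import math

def required_sample(moe_points, population=None, p=0.5, z=1.96):
    """Respondents needed for a given margin of error (in percentage points)."""
    e = moe_points / 100
    n0 = z ** 2 * p * (1 - p) / e ** 2
    if population is not None:
        # Finite population correction: negligible unless the sample is a
        # sizable fraction of the population.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(required_sample(4))                           # no population given
print(required_sample(4, population=5_000_000))     # one state
print(required_sample(4, population=330_000_000))   # the whole country
```

All three calls return the same count, which is why covering dozens of races multiplies cost almost linearly.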
Response rates are the hidden crisis of modern polling. In 1990, telephone pollsters could expect a 35% response rate. By 2023, that figure had collapsed to 2-6% for live-call phone surveys. Pollsters now contact roughly 17-50 people for every one who responds, and the people who respond are systematically different from those who don't. Heavy weighting is required to correct for this self-selection bias, and the weighting introduces error of its own.
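The contact arithmetic follows directly from the response rate. A small sketch; the function name and the 1,000-interview target are illustrative assumptions, and the rates are the ones quoted above.

```python
def contacts_per_complete(response_rate):
    """People who must be contacted, on average, to get one completed interview."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return 1 / response_rate

target_completes = 1000  # illustrative poll size
for label, rate in [("1990 (35%)", 0.35), ("2023 high end (6%)", 0.06), ("2023 low end (2%)", 0.02)]:
    per_complete = contacts_per_complete(rate)
    print(f"{label}: {per_complete:.1f} contacts per complete, "
          f"{per_complete * target_completes:,.0f} contacts for {target_completes} interviews")
```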
Likely Voter Screens: Who Actually Votes
The single biggest methodological choice a pollster makes is whether to measure all adults, registered voters, or likely voters. Each universe produces systematically different results. All-adult polls typically show the most Democratic-leaning results because the non-voting public skews younger and more diverse. Registered voter polls move closer to the actual electorate. Likely voter polls — restricted to people with a high probability of actually casting ballots — tend to show the most Republican-leaning results, reflecting the older, more educated, higher-income profile of consistent voters.
The Gallup likely voter screen, developed in the 1950s and still used widely, asks seven questions: Did you vote in the last election? Do you know where your polling place is? Have you voted in the precinct before? How much have you thought about the election? How closely are you following political news? Do you plan to vote? How certain are you that you will vote? Each answer that signals engagement or past participation counts as a yes, and respondents with five or more are classified as likely voters. Other models use a single screen (self-reported intent to vote) or a weighted probability score. The choice of screen can shift final results by 2-4 points.
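A cutoff screen of this kind reduces to counting qualifying answers. The sketch below is illustrative only: the question keys are hypothetical names for the seven items listed above, and how the "how much" and "how closely" answers get coded as qualifying is an assumption left to the caller.

```python
# Hypothetical keys for the seven screen questions.
SCREEN_QUESTIONS = (
    "voted_last_election",
    "knows_polling_place",
    "voted_in_precinct_before",
    "thought_about_election",
    "follows_political_news",
    "plans_to_vote",
    "certain_to_vote",
)

def is_likely_voter(answers, cutoff=5):
    """Classify a respondent from a dict mapping question key -> True/False.

    True means the answer was coded as qualifying (an assumption made
    upstream of this function).
    """
    missing = [q for q in SCREEN_QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"missing answers: {missing}")
    score = sum(bool(answers[q]) for q in SCREEN_QUESTIONS)
    return score >= cutoff

# Example respondent with five qualifying answers out of seven.
respondent = dict.fromkeys(SCREEN_QUESTIONS, True)
respondent["voted_in_precinct_before"] = False
respondent["follows_political_news"] = False
print(is_likely_voter(respondent))
```

Moving `cutoff` up or down by one changes who counts as the electorate, which is one way the same raw interviews can produce different toplines.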
Weighting: Correcting for Who Picks Up the Phone
| Demographic Variable | Typical Weight Target | Common Direction of Raw Skew | Method | Controversy Level |
|---|---|---|---|---|
| Age (18-34) | ~22% of LV electorate | Under-represented in raw sample | Census ACS data | Low |
| College education | ~38% of voters | Over-represented (college grads answer surveys more) | Census CPS Voting Supplement | Medium |
| Race/ethnicity | ~73% White non-Hispanic | Varies by mode | Census ACS data | Low |
| Party identification | ~33% D, 30% R, 37% I (varies) | Depends heavily on the pollster's assumed electorate | Prior election results or registration | High |
| Geography (region) | Census regional shares | Urban over-represented in online panels | Census data | Low |
| 2020 vote recall | ~51% Biden, ~47% Trump | Varies; some show too many Biden voters | Adjusted for known recall bias | High |
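
The simplest form of the adjustment in the table is post-stratification on a single variable: each respondent's weight is the group's target share divided by its share of the raw sample. The sketch below uses the ~38% college target from the table; the 55% raw college share is a made-up illustration, and the function names are hypothetical. It also computes Kish's approximate effective sample size, (Σw)² / Σw², to show the precision that weighting costs.

```python
from collections import Counter

def poststratify(sample, targets):
    """Return one weight per respondent: target share / raw sample share.

    sample:  list of group labels, one per respondent
    targets: dict mapping group label -> target population share
    """
    counts = Counter(sample)
    n = len(sample)
    return [targets[group] / (counts[group] / n) for group in sample]

def kish_effective_n(weights):
    """Kish's approximation of the effective sample size after weighting."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Illustrative raw sample: college graduates over-represented (assumed 55%).
sample = ["college"] * 550 + ["non_college"] * 450
targets = {"college": 0.38, "non_college": 0.62}

weights = poststratify(sample, targets)
weighted_college_share = (
    sum(w for w, g in zip(weights, sample) if g == "college") / sum(weights)
)

print(f"weighted college share: {weighted_college_share:.2f}")
print(f"effective sample size:  {kish_effective_n(weights):.0f} of {len(sample)}")
```

Production polls typically weight on several variables at once (for example by raking), which this one-variable sketch does not attempt; the more the weights vary, the smaller the effective sample becomes.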
Margin of Error: What It Actually Means
The margin of error is the most misunderstood concept in polling. When a poll shows Candidate A at 48% and Candidate B at 46% with a ±3-point margin of error, the ±3 applies to each candidate's share separately, not to the gap between them. The uncertainty on the gap is larger. Treating the two shares as independent estimates gives √2 times the individual margin, about ±4.2 points for a 3-point individual margin. Because both shares come from the same sample, where one candidate's gain is largely the other's loss, the margin on the lead is in practice closer to double the individual figure, nearly ±6 points here. The 2-point lead is the best single estimate of the race, but sampling error alone cannot rule out a tie or a narrow lead for Candidate B.
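The uncertainty on the gap can be checked numerically. This sketch assumes simple random sampling and uses the multinomial variance for the difference between two shares drawn from the same sample, (p_a + p_b − (p_a − p_b)²)/n. The function name is illustrative, and n = 1,067 is chosen because it is the sample size that gives roughly ±3 points on a single share.

```python
import math

def lead_margin(p_a, p_b, n, z=1.96):
    """Margin of error, in points, on the lead p_a - p_b within a single poll."""
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return 100 * z * math.sqrt(variance)

n = 1067  # gives about ±3 points on one candidate's share
single = 100 * 1.96 * math.sqrt(0.25 / n)
lead = lead_margin(0.48, 0.46, n)

print(f"margin on one share:        ±{single:.1f}")
print(f"margin on the lead:         ±{lead:.1f}")
print(f"sqrt(2) shortcut (indep.):  ±{math.sqrt(2) * single:.1f}")
```

The √2 shortcut is the right rule when comparing numbers from two separate polls; within one poll the two shares are negatively correlated, so the margin on the lead comes out larger.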
Furthermore, the stated margin of error only captures sampling error. It does not account for systematic errors: bad weighting, non-response bias, poor question wording, or social desirability effects. The "herding" phenomenon — where pollsters subtly adjust their results to match the polling consensus — is a form of systematic error that can cause all polls to be wrong in the same direction simultaneously, as happened in 2016 in Wisconsin and Michigan. The true uncertainty in any individual poll is roughly twice its stated margin of error when systematic error is included.