What is a voter file and how do campaigns use it?

A voter file is a database maintained by state election authorities containing registered voter information: name, address, party registration, and voting history. Campaigns purchase enhanced versions with demographic and consumer data appended, enabling them to model each voter's likely vote choice and turnout probability for targeted outreach.

What changed in campaign data use after Cambridge Analytica?

After the 2018 Cambridge Analytica scandal, Facebook drastically restricted third-party data access, forcing campaigns to rely more heavily on first-party data, state voter files, and consented opt-in lists. The industry shifted toward context-based targeting (interests, content consumption) rather than psychographic profiling from scraped social data.

How do campaigns use micro-targeting in 2026?

In 2026, campaigns use micro-targeting to identify persuadable voters (those with a modeled partisan lean score of 40-60 on a 0-100 scale) for personalized messaging on digital platforms, and to prioritize door-knocking lists for ground-game volunteers. AI-assisted models now score each voter on 20+ issue dimensions to optimize message delivery.

How Campaigns Use Data in 2026: Voter Files, Micro-Targeting, and Post-Cambridge Analytica

Voter Files: 175M registered voter records
Model Dimensions: 20+ issue scores per voter
Digital Share: 58% of ad spend digital in 2024
Persuasion Window: 40-60% modeled persuadable range

Key Findings

175 million voter records are enriched with 20+ modeled issue scores per voter; the persuadable targeting window is defined as a 40-60% modeled partisan lean score.
Machine learning produces two scores per voter — turnout probability and partisan lean — that determine who gets door-knocked, called, or targeted digitally.
Cambridge Analytica fallout: Facebook's 2018 API restrictions ended social-graph psychographic targeting, shifting campaigns toward context-based digital and first-party data.
Digital ad spend reached 58% of total 2024 campaign budgets; first-party email and phone data now commands a premium as third-party targeting options shrink.

Campaign Data Tools and Vendors, 2026 Cycle

Tool / Vendor	Party	Function	Data Source	2026 Usage
NGP VAN	Dem	CRM / voter contact	State voter files	Universal D
i360	Rep	Voter modeling & targeting	Koch network + voter files	Universal R
Catalist	Dem	Voter modeling / analytics	Proprietary + public	Major races
TargetSmart	Dem	Voter file enhancement	Consumer + voter data	Broad D use
Aristotle	Both	Voter data & compliance	State voter files	Both parties
Civis Analytics	Dem	AI voter modeling	Multi-source ML	Tier-1 races
Digital targeting (Meta/Google)	Both	Context + interest targeting	Platform first-party	All races

How Modern Campaigns Build and Use Voter Files

The foundation of data-driven campaigning in 2026 is the voter file: a state-maintained database of registered voters that records name, address, date of birth, party registration, and, crucially, voting history (whether a person voted, not how they voted). Campaigns purchase access to these files and then layer on commercially available consumer data to build enriched voter profiles. A typical enhanced voter file in a swing district includes 50-80 data points per voter: estimated household income, homeownership status, magazine subscriptions, consumer category purchases, vehicle ownership, estimated education level, and commercial survey responses. None of these factors individually predicts voting behavior well, but in combination they let machine learning models generate probabilistic vote-choice and turnout scores with meaningful predictive accuracy.
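As a rough illustration of how several weak signals can combine into one probabilistic score, the sketch below uses a hand-written logistic model. The feature names, weights, and intercept are hypothetical values chosen for illustration; production models are fit on historical data and use far more inputs.

```python
import math

# Hypothetical weights for illustration only. A real model would be
# trained on past turnout records rather than written by hand.
TURNOUT_WEIGHTS = {
    "voted_last_general": 1.6,
    "voted_last_midterm": 1.2,
    "homeowner": 0.4,
    "age_over_50": 0.5,
}
TURNOUT_INTERCEPT = -1.5  # assumed baseline log-odds of voting


def logistic(z: float) -> float:
    """Map a log-odds value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def turnout_score(voter: dict) -> float:
    """Return a 0-100 turnout score from binary (0/1) voter features."""
    z = TURNOUT_INTERCEPT + sum(
        weight * voter.get(feature, 0)
        for feature, weight in TURNOUT_WEIGHTS.items()
    )
    return round(100 * logistic(z), 1)


frequent_voter = {
    "voted_last_general": 1,
    "voted_last_midterm": 1,
    "homeowner": 1,
    "age_over_50": 1,
}
print(turnout_score(frequent_voter), turnout_score({}))
```

No single feature moves the score much on its own, but a voter with all four signals ends up far above one with none, which mirrors the point about combining factors.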

The practical output of this modeling is two key scores per voter: a turnout score (0-100, probability of voting) and a partisan lean score (0-100, probability of voting Democratic). Campaigns use these scores to prioritize their ground game and digital advertising. Door-knocking lists are generated by filtering for voters with a high turnout score and a partisan lean score of 40-60, the genuinely persuadable voters worth investing face-to-face time in. Low-turnout Democrats (high partisan lean score, low turnout score) get mobilization messaging. High-turnout Republicans (low partisan lean score, since the score measures the probability of voting Democratic) are excluded from a Democratic campaign's targeting resources. The efficiency gains from this targeting versus random canvassing are well documented: when properly targeted, contacted voters show 2-6 percentage points higher turnout than uncontacted voters.
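The filtering rules above can be sketched as a small function, written from a Democratic campaign's point of view. The 40-60 persuadable band comes from the text; the turnout cutoff of 70 and the segment labels are assumptions made for illustration, and voters the text does not address are left unassigned.

```python
HIGH_TURNOUT = 70          # assumed cutoff for a "high" turnout score
PERSUADABLE_BAND = (40, 60)  # partisan lean range described in the text


def assign_segment(turnout: float, lean: float) -> str:
    """Classify a voter from two 0-100 scores.

    turnout: modeled probability of voting.
    lean: modeled probability of voting Democratic.
    """
    low, high = PERSUADABLE_BAND
    if low <= lean <= high and turnout >= HIGH_TURNOUT:
        return "door_knock"   # likely voter who is persuadable
    if lean > high and turnout < HIGH_TURNOUT:
        return "mobilize"     # likely supporter who may not vote
    if lean < low and turnout >= HIGH_TURNOUT:
        return "exclude"      # likely opponent who will vote anyway
    return "unassigned"       # cases the text does not cover


voters = [("A", 85, 50), ("B", 30, 80), ("C", 90, 10)]
for voter_id, turnout, lean in voters:
    print(voter_id, assign_segment(turnout, lean))
```

A Republican campaign would apply the same logic with the lean comparisons reversed.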

Post-Cambridge Analytica, the major shift is the constraint on data sourcing. Facebook's 2018 API restrictions eliminated the ability to harvest social graph data for psychographic modeling at scale. Campaigns can no longer build personality profiles from social network connections in the way that Cambridge Analytica (and, less dramatically, mainstream Democratic and Republican data operations) did in 2014-2016. Instead, campaigns have shifted toward context-based digital targeting — reaching users based on the content they consume, the interests they've expressed, and their behaviors on the platform — rather than scraped social data. First-party email and phone data, carefully opted-in, has become more valuable. The net effect is that campaigns are more privacy-compliant but have somewhat reduced precision in identifying persuadable voters via digital channels, partially compensated by advances in ML modeling applied to voter files.

AI and Micro-Targeting in the 2026 Cycle

The 2026 cycle represents the first major deployment of large language model tools in campaign messaging optimization. Campaigns are using AI-assisted message testing to evaluate hundreds of message variants simultaneously, where survey testing previously took multiple weeks to cover a handful of message frames. AI tools allow campaigns to generate and test messages across 20+ issue dimensions (including the economy, healthcare, immigration, education, energy, and reproductive rights) and identify which frame produces the strongest persuasion effect among specific voter segments within days rather than weeks. The practical result is more precisely targeted persuasion messaging delivered to voters based on the issues they care most about, rather than the broadest message a campaign can find that tests well across all voters.
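One simple way to picture the comparison step in message testing: for each voter segment, measure support among respondents shown each message variant, subtract support in a control group that saw no message, and keep the variant with the largest lift. The sketch below assumes this lift-over-control design; the segment names, variant names, and counts are invented for illustration, and a real test would also check statistical significance before acting on a difference.

```python
from collections import defaultdict


def best_variant_by_segment(results, control="control"):
    """Pick the variant with the highest lift over control in each segment.

    results: iterable of (segment, variant, n_respondents, n_supporting).
    Returns {segment: (winning_variant, lift)} with lift as a fraction.
    """
    rates = defaultdict(dict)
    for segment, variant, n_respondents, n_supporting in results:
        rates[segment][variant] = n_supporting / n_respondents

    best = {}
    for segment, by_variant in rates.items():
        baseline = by_variant[control]
        lifts = {
            variant: rate - baseline
            for variant, rate in by_variant.items()
            if variant != control
        }
        winner = max(lifts, key=lifts.get)
        best[segment] = (winner, round(lifts[winner], 3))
    return best


# Invented survey counts, for illustration only.
survey = [
    ("seniors", "control", 200, 90),
    ("seniors", "drug_costs", 200, 104),
    ("seniors", "education", 200, 92),
    ("parents", "control", 200, 100),
    ("parents", "drug_costs", 200, 102),
    ("parents", "education", 200, 112),
]
print(best_variant_by_segment(survey))
```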

For digital advertising in particular, the combination of voter file data and AI message optimization is producing what campaigns call "personalization at scale": delivering distinct message variants to different voter segments simultaneously. A competitive House campaign in a suburban district might run 40-60 distinct ad variants simultaneously, each optimized for a different voter segment: one variant emphasizing prescription drug costs for senior women, another emphasizing small business regulation for male independent business owners, another emphasizing education funding for suburban parents. The targeting isn't based on party registration (campaigns can't target by registration on most platforms) but on modeled issue priorities derived from consumer behavior and content consumption patterns.
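The matching of ad variants to modeled issue priorities can be sketched as a lookup on each voter's highest-scoring issue. The issue keys, ad identifiers, and scores below are hypothetical, and the sketch assumes one ad variant per issue, whereas a real campaign would run several variants per segment.

```python
# Hypothetical mapping from a modeled issue priority to an ad variant.
AD_VARIANTS = {
    "prescription_drugs": "ad_rx_costs",
    "small_business": "ad_regulation",
    "education": "ad_school_funding",
}


def pick_variant(issue_scores: dict, default: str = "ad_general") -> str:
    """Return the ad variant for the voter's top-scoring covered issue.

    issue_scores maps issue name to a modeled 0-100 priority score.
    Issues without a matching ad are ignored; if none match, the
    default ad is used.
    """
    covered = {
        issue: score
        for issue, score in issue_scores.items()
        if issue in AD_VARIANTS
    }
    if not covered:
        return default
    top_issue = max(covered, key=covered.get)
    return AD_VARIANTS[top_issue]


senior = {"prescription_drugs": 88, "education": 35, "energy": 60}
parent = {"education": 91, "prescription_drugs": 40}
print(pick_variant(senior), pick_variant(parent), pick_variant({"energy": 70}))
```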

The Republican and Democratic data ecosystems operate largely in parallel rather than sharing infrastructure. Democrats anchor around NGP VAN (voter management), Catalist (voter modeling), and an expanding ecosystem of progressive data vendors. Republicans use i360 (Koch network), the RNC data operation, and commercial vendors. Both parties are aggressively investing in data infrastructure for 2026, recognizing that marginal improvements in targeting efficiency can translate directly to seat margins in a cycle expected to be decided by 2-3 points in dozens of competitive races.

What This Means for 2026

Data-driven targeting will likely determine outcomes in 10-15 competitive House and Senate races where margins come down to 1-3 percentage points. Campaigns with superior voter file quality, more accurate persuasion models, and better digital targeting execution can generate the equivalent of 2-4 percentage points of structural advantage in a close race. In cycles where the national environment is competitive and races cluster near 50/50, that technological edge can be decisive.
