
Persuasion at Scale: Why Bigger AI Doesn’t Mean Better Influence
As the world heads toward an AI-shaped future, there are plenty of reasons to be nervous: one of the most pressing is whether large language models (LLMs) can become persuasive enough to sway elections, shift public opinion, or flood the internet with robotic propaganda.
In 2024, that question didn’t feel theoretical. With nearly half the global population heading to the polls, concerns about AI-written political content became tangible. Headlines warned of state-sponsored influence campaigns using ChatGPT-like tools. Experts talked of “superhuman persuasion.” And many wondered: how far can this go?
In a new study published in the authoritative Proceedings of the National Academy of Sciences (PNAS) in March 2025, Paul Röttger (Bocconi Department of Computing Sciences), Ben Tappin (London School of Economics), Kobi Hackenburg, Scott Hale, Jonathan Bright, and Helen Margetts (all of the University of Oxford) set out to answer this question. Their conclusion: the AI persuasion apocalypse may not be imminent, at least not from simply making models bigger.
A giant experiment, one message at a time
The team ran one of the most ambitious experiments yet on AI-generated persuasion. They gathered 24 language models, ranging from small (70 million parameters) to huge (Claude-3-Opus and GPT-4), and asked each one to write short persuasive messages—around 200 words—on hot-button U.S. policy issues like immigration, healthcare, and criminal justice.
Then they showed these messages to nearly 26,000 Americans and compared how much attitudes shifted after reading AI-written content versus human-written messages or no message at all.
The result? AI can indeed persuade. On average, exposure to a single AI-written message nudged people’s opinions by 5.77 percentage points toward the stance being promoted. That’s about as much movement as a human-written message caused.
But, crucially, increasing model size brought only modest gains. As Paul Röttger puts it: “Current frontier models are only slightly more persuasive than models smaller in size by an order of magnitude or more.”
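To make these percentage-point figures concrete, here is a minimal sketch of how a single-message persuasion effect of this kind can be estimated: compare the average post-message attitude of people who read a message with that of a no-message control group. The data, column names, and attitude scale below are invented for illustration; the study’s actual analysis is more involved.

```python
import pandas as pd

# Hypothetical survey data: one row per participant, with the post-message
# policy attitude on a 0-100 scale and the condition they were assigned to
# ("ai", "human", or "control"). All values are illustrative, not real data.
df = pd.DataFrame({
    "condition": ["ai", "ai", "human", "control", "control", "ai"],
    "attitude":  [62.0, 58.5, 60.0, 51.0, 54.5, 66.0],
})

# Persuasive effect = mean attitude after reading a message minus the
# mean attitude of the no-message control group, in percentage points.
control_mean = df.loc[df["condition"] == "control", "attitude"].mean()
effects = (
    df[df["condition"] != "control"]
      .groupby("condition")["attitude"].mean()
      .sub(control_mean)
)
print(effects)  # on this toy data: ai ~ +9.4 pp, human ~ +7.25 pp
```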
Is bigger better?
At first glance, the finding might seem surprising. After all, LLMs tend to get better at most tasks as they get larger. So why doesn’t that happen here?
The answer seems to lie in what the researchers call “task completion”: whether the message is coherent, on topic, and grammatically correct. In essence, once a model can write a solid, readable, and relevant 200-word argument, making it bigger doesn’t help much more. The best models are already at the ceiling on this basic skill.
When the authors statistically adjusted for this factor, the relationship between model size and persuasiveness disappeared. The edge of the frontier models wasn’t due to some rhetorical magic—it was just that they were more likely to write a legible, on-target message in the first place.
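In regression terms, the adjustment works roughly like this: regress each model’s persuasive effect on its (log) size, then add task completion as a covariate and see how much of the size effect remains. The sketch below is a toy illustration with invented numbers and variable names, not the authors’ code, data, or exact specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative only: one row per model, with an invented persuasive effect
# (percentage points), parameter count, and the share of its messages that
# complete the task (coherent, on-topic, grammatical).
models = pd.DataFrame({
    "effect":          [3.75, 4.75, 5.50, 5.80, 5.90, 5.95],
    "params_billions": [0.07, 1.0, 7.0, 70.0, 175.0, 300.0],
    "task_completion": [0.55, 0.75, 0.90, 0.96, 0.98, 0.99],
})
models["log_params"] = np.log10(models["params_billions"])

# Without the covariate, size appears to predict persuasiveness...
print(smf.ols("effect ~ log_params", data=models).fit().params)

# ...but with task completion held constant, the size coefficient falls to
# roughly zero on this toy data, mirroring the adjustment described above.
print(smf.ols("effect ~ log_params + task_completion", data=models).fit().params)
```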
A persuasive ceiling?
This doesn’t mean we are safe from manipulative AI content. Even small models can already persuade about as well as humans. But it does mean that scaling up persuasion, simply making models bigger and expecting radically more influence, may not pan out. The researchers used different mathematical models to explore the shape of this “diminishing returns” curve. The best-fitting one suggests that even if we scale up from today’s 300-billion-parameter giants to models with 3 trillion parameters or more, we might gain less than 1 percentage point in persuasive power, hardly a game-changer.
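As a rough illustration of that kind of extrapolation, the sketch below fits one possible diminishing-returns curve (an exponential-saturation form, chosen here only for simplicity) to invented per-model effect estimates and asks how much would be gained by scaling from 300 billion to 3 trillion parameters. The functional form and every number are assumptions; the paper evaluates several candidate curves on its own data.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical per-model persuasive effects (pp) against parameter counts
# (in billions); not the study's actual estimates.
params_b = np.array([0.07, 0.4, 1.0, 7.0, 13.0, 70.0, 175.0, 300.0])
effect   = np.array([3.2, 3.9, 4.3, 5.0, 5.2, 5.5, 5.7, 5.8])

# One simple diminishing-returns form: the effect approaches a ceiling `c`
# as log10(parameters) grows.
def saturating(log_n, c, k, x0):
    return c * (1.0 - np.exp(-k * (log_n - x0)))

log_n = np.log10(params_b * 1e9)   # absolute parameter counts, log10 scale
popt, _ = curve_fit(saturating, log_n, effect, p0=[6.0, 0.5, 6.0])

# Extrapolate from a ~300B-parameter model to a hypothetical 3T one.
gain = saturating(np.log10(3e12), *popt) - saturating(np.log10(3e11), *popt)
print(f"predicted extra persuasion: {gain:.2f} pp")  # well under 1 pp here
```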
That has policy implications. As governments and companies develop safeguards for AI’s influence on public discourse, they may need to worry less about model size and more about deployment strategies—like message targeting, repetition, or interactive dialogue.
Why it matters now
In a world where people increasingly get their political content from algorithmic feeds, even small persuasive nudges matter, especially at scale. A 5-point swing, applied across millions of voters, can decide elections. The finding that small models can already match humans in single-message persuasion is both impressive and sobering. But there’s a flip side. If we know the ceiling, we can prepare. This study adds an important dose of empirical clarity to a debate that’s often dominated by speculation and sci-fi anxieties.
And as the authors note, their messages weren’t fine-tuned for persuasion. With more sophisticated techniques—like tailoring arguments to a person’s values or delivering them over time in conversation—the limits may shift. Future studies will need to probe how personalization, interactivity, and repetition shape LLM persuasion.
One last thought
Perhaps the most striking takeaway from the paper is not the limitation of AI, but its accessibility. The team found that even modestly sized, open-source models—trained on a relatively small dataset—could rival frontier models and humans in persuasive power.
The barrier to entry for building a persuasive AI is falling fast. As the authors warn, this means “the cost and complexity of training or accessing a persuasive language model is lower than might have previously been assumed.”