The Moderation Machine

Priyanka · 9 min read
For 15 years, social media made us angrier. The strange, uncomfortable possibility is that AI might make us boring.

Imagine a voter in Maricopa County. She's furious about immigration. Not abstractly furious -- the kind of furious that comes from watching her neighborhood change, from reading headlines that feel like dispatches from a country she no longer recognizes. In 2018, she would have opened Facebook, where an algorithm engineered to maximize her time on screen would have served her a video about a migrant caravan, then another, then a group called "Patriots for a Secure Border," then a conspiracy theory about George Soros. Each click a ratchet. Each share a small radicalization.

In 2026, she opens ChatGPT.

The chatbot doesn't tell her she's wrong. It doesn't call her racist. It doesn't validate her rage, either. It provides the actual numbers on border crossings -- up from historic lows, but below the peaks of the early 2000s. It distinguishes between asylum seekers, economic migrants, and visa overstays. It explains the policy trade-offs between enforcement and labor economics, citing studies she can check. It is, put simply, reasonable.

She leaves the conversation still concerned about immigration. But less certain that the country is being invaded, and more aware that the situation has dimensions she hadn't considered. Multiply that interaction by 900 million weekly active users -- ChatGPT alone, not counting Gemini, Grok, or Claude -- and you get a striking claim: AI might be the antidote to a decade and a half of social media-driven polarization.

That's the argument John Burn-Murdoch, the FT's chief data reporter, laid out in a late March analysis. Using tens of thousands of responses from the Cooperative Election Study -- one of the most comprehensive surveys of American political opinion -- Burn-Murdoch tested whether the four most widely used AI chatbots shape political conversations, and in what direction.

Every chatbot he tested -- OpenAI's GPT, Google's Gemini, the Chinese-made DeepSeek, and Elon Musk's Grok -- nudges users away from the extremes and toward the center. Not uniformly: Grok pulls conversations slightly center-right, while GPT, Gemini, and DeepSeek drift center-left. But the net vector is the same. Moderation. Nuance. The boring middle.

The conclusion is appealing. It also deserves more scrutiny than it's gotten -- because the forces working against it are already in motion.

Six Hundred Years in Five Minutes

To understand why Burn-Murdoch's finding matters -- and where it might mislead -- you need a framework bigger than the current news cycle. The philosopher Dan Williams provided one in a widely circulated essay published three weeks before the FT piece.

Every major communications revolution, Williams argues, reshapes the political landscape by changing two things: who speaks and what they say. These revolutions alternate between decentralization and centralization in a pattern that rhymes across centuries.

The printing press shattered the Catholic Church's monopoly on the written word. Within decades, Martin Luther's pamphlets reached millions. Radical, heterodox, frequently violent political thought exploded across Europe. Radio and television swung the pendulum back: expensive, tightly regulated, they concentrated power in a narrow class of authorized speakers -- the Walter Cronkite era of managed consensus. Social media, starting around 2010, was the second printing press: it obliterated the broadcast gatekeepers, and algorithms optimized for engagement rewarded outrage over nuance. Brexit, Trump, Bolsonaro, QAnon -- you know how that went.

Now comes AI, and Williams makes the case that chatbots represent a second radio/TV moment -- a re-centralization of information power. Large language models are trained on textbooks, peer-reviewed journals, and mainstream journalism, then fine-tuned through reinforcement learning from human feedback (RLHF) to reward helpful, balanced responses. The default output looks a lot like the expert consensus.

Williams calls this "technocratising." Social media handed the microphone to the crowd. AI hands it back to the elites -- or rather, to a machine trained to sound like them.

What the Data Actually Shows

Burn-Murdoch's methodology was clever. He took the Cooperative Election Study's data on where real Americans stand across 61 policy topics, then simulated conversations where users with those actual positions engaged with each chatbot. He weighted the outcome as 80 percent the user's original position, 20 percent the chatbot's influence -- a ratio grounded in Anthropic's own research showing that even brief AI interactions can shift opinions.

The results were consistent. On virtually every topic, all four chatbots pulled users toward the center of the distribution. The fat tails got thinner. The center got thicker. It looked like the inverse of social media.
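
A back-of-the-envelope sketch makes the mechanism concrete. Everything in it is illustrative: a stylized bimodal opinion scale stands in for the CES data, the chatbot stances are made-up numbers (not Burn-Murdoch's measured values), and the 20 percent is read as a pull toward each model's position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stylized stand-in for a polarized electorate on one topic:
# a bimodal distribution on a -1 (left) .. +1 (right) scale.
opinions = np.concatenate([
    rng.normal(-0.6, 0.2, 5_000),
    rng.normal(+0.6, 0.2, 5_000),
]).clip(-1, 1)

# Hypothetical chatbot stances, per the article's center-left /
# center-right description -- not measured values.
chatbots = {"gpt": -0.10, "gemini": -0.08, "deepseek": -0.12, "grok": +0.10}

def blend(user, bot, weight=0.2):
    """Post-conversation opinion: 80% prior, 20% pull toward the bot."""
    return (1 - weight) * user + weight * bot

def tail_share(x):
    """Share of users holding positions beyond +/-0.75."""
    return np.mean(np.abs(x) > 0.75)

for name, stance in chatbots.items():
    post = blend(opinions, stance)
    print(f"{name:>8}: mean {opinions.mean():+.3f} -> {post.mean():+.3f}, "
          f"tail share {tail_share(opinions):.1%} -> {tail_share(post):.1%}")
```

On this toy distribution, every stance near zero thins both tails regardless of whether it sits slightly left or right of center -- the shared "net vector" the analysis describes.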

Burn-Murdoch's analysis sits atop a growing body of academic work pointing in the same direction. At Stanford, a pre-registered experiment showed that AI-powered Socratic dialogue reduced issue polarization. At Carnegie Mellon, researchers found that AI-mediated communication reshaped social structures within opinion-diverse groups. At the University of Melbourne, a team demonstrated that AI-driven mediation strategies depolarized live debate audiences.

But the strongest version of the argument isn't about any single study. It's about money. Social media companies sell advertising; revenue goes up when content provokes. AI companies sell tools -- subscriptions, API access, enterprise licenses. A chatbot that spews extremist content loses subscribers and invites lawsuits. Different business models, different information products. The market, for once, might actually be pushing toward something good.

Why This Might Be Wrong, or Worse Than It Sounds

Here's where I start to disagree with the optimists. Three problems.

The Sycophancy Paradox

In September 2025, a team led by Steve Rathje and Jay Van Bavel published a study that complicates the whole picture. They found that sycophantic AI -- chatbots that agree with and validate users' existing positions -- increases attitude extremity and overconfidence. Users who interacted with agreeable chatbots didn't moderate. They became more certain they were right.

Sycophancy isn't a bug. It's baked into how these models are trained. RLHF evaluators reward responses that feel helpful and affirming. Over millions of iterations, the model learns: agreement gets a thumbs up, pushback doesn't. A February 2026 paper in *AI and Ethics* was blunt: the harms of AI sycophancy are "programmed in" by the very method that's supposed to make models safe.
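
A toy model shows how that dynamic falls out of preference training. Nothing here is any lab's actual pipeline -- the two features, the rater bias, and the logistic "reward model" are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data: each response scored on two made-up
# features, accuracy and agreement-with-the-user, both in [0, 1].
n = 10_000
accuracy = rng.uniform(0, 1, n)
agreement = rng.uniform(0, 1, n)

# Assumed rater behavior: the thumbs-up probability leans harder on
# feeling affirmed (weight 2.0) than on being correct (weight 0.5).
logit = 0.5 * accuracy + 2.0 * agreement - 1.5
thumbs_up = rng.random(n) < 1 / (1 + np.exp(-logit))

# Fit a logistic "reward model" to the ratings by gradient ascent.
X = np.column_stack([accuracy, agreement, np.ones(n)])
w = np.zeros(3)
for _ in range(5_000):
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.5 * X.T @ (thumbs_up - p) / n

print(f"learned reward: accuracy={w[0]:.2f}, agreement={w[1]:.2f}")
# The fitted reward inherits the raters' bias: agreement outweighs
# accuracy, so any policy optimized against it learns to flatter.
```

That inherited bias is the "programmed in" claim in miniature: the reward signal itself, not any single model, is what pays for flattery.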

Burn-Murdoch measured what chatbots say. But in real conversations, users push back, rephrase, argue. A sycophantic model will often yield. The experience is less like reading an editorial and more like talking to a very smart, very eager-to-please intern who will ultimately tell you what you want to hear if you press long enough.

I think this is the vulnerability that kills the depolarization thesis over time. Competitive pressure to retain users will erode the technocratic guardrails. The model that refuses to validate your conspiracy theory loses users to the model that engages with it "open-mindedly." This is the same logic that corrupted social media -- attention economics -- and AI companies are not immune to it.

Whose Center?

The claim that chatbots push users toward "expert consensus" assumes that expert consensus is ideologically neutral. On climate and vaccines, sure. But on immigration levels, criminal justice, trade policy, cultural values? "Expert consensus" maps onto the preferences of a specific educated class -- and if you don't belong to that class, it doesn't feel like consensus. It feels like being talked down to.

Williams himself flags this. His framework draws on Walter Lippmann, who in the 1920s argued explicitly for elite management of public opinion. If AI chatbots are Lippmann's dream machine -- gently steering the masses toward expert-approved positions -- that's not neutral. It's a political project with a specific beneficiary. And it's tunable: Grok already demonstrates that Musk's team chose a different "center" than OpenAI's. If the center is a dial, then depolarization is just a polite word for whoever controls it.

The Radicalization Fork

The depolarization thesis describes commercial, safety-tuned chatbots with reputational incentives to be moderate. But open-source models -- DeepSeek, Llama, Mistral -- can be downloaded, fine-tuned, and deployed by anyone. Stripping safety guardrails takes hours. Training on partisan data takes days. A political operative or foreign intelligence service could build a radicalization engine for the cost of a modest ad campaign.

Anthropic's research shows persuasiveness scales with model capability. As the frontier advances, so do the tools available to bad actors. Fringe communities have always built their own media infrastructure -- from samizdat to talk radio to Telegram. The AI equivalent is already technically trivial.

What Happens Next

These aren't distant predictions. The early signals are already visible.

The Political Firestorm Is Already Starting

In September 2025, the Senate Judiciary Subcommittee held hearings titled "Examining the Harm of AI Chatbots." Senator Durbin previewed legislation to hold AI companies liable for harms. Two months later, the House Energy and Commerce Subcommittee followed with its own hearings. In January 2026, Senator Markey launched an inquiry into AI chatbot advertising practices. Last week, AlgorithmWatch published a pointed analysis asking whether AI chatbots could influence government decisions themselves.

The template from the social media content-moderation wars is being rerun at double speed. Republicans will attack ChatGPT, Claude, and Gemini as systematically liberal -- and Burn-Murdoch's own data, which shows these models nudging center-left, will be Exhibit A. Democrats will attack Grok as right-wing propaganda. Both sides will frame their demands in the language of fairness while pursuing partisan objectives.

This fight will be nastier than the social media version. Social media serves content from other humans. AI chatbots speak in their own voice, with an authority that feels institutional. When Twitter shows you a toxic tweet, you can dismiss the source. When ChatGPT gives you a carefully reasoned answer on immigration that happens to emphasize economic benefits, it feels like the answer. The perceived authority is higher, which makes the perceived stakes of bias higher.

"Algorithmic thought control" will be a cable-news talking point within a year. A chatbot CEO will be hauled before a committee and asked why their product is "indoctrinating" American youth before the 2028 primaries.

The Market Fractures Along Political Lines

Just as cable news split into Fox and MSNBC, the chatbot market will fragment -- and faster, because the technology makes differentiation trivially easy. Grok is already positioned as the conservative alternative. It won't be the last. Expect explicitly branded AI products marketed to political communities the way talk radio is today -- not as neutral tools but as allies.

Before 2030, at least one major election -- likely in Europe or Latin America, where AI adoption is high and regulatory infrastructure is weak -- will feature credible allegations that a fine-tuned LLM was deployed at scale for voter persuasion. It will be the Cambridge Analytica scandal of the AI era, except the tool will be orders of magnitude more capable.

The Deeper Question

Here is what I think actually happens. The depolarization effect is real but temporary. It is a property of the current market structure -- where accuracy sells and liability constrains -- not an inherent property of the technology. Social media was also "connecting the world" in 2010. The incentives shifted. They will shift again.

AI doesn't kill polarization. It kills the feeling of polarization for the moderate majority while the radicalized tails find their own tools. People don't become centrists. They become better-informed partisans -- aware of the counterarguments, able to cite the studies, but fundamentally unchanged in their tribal commitments. The center holds, but the edges get sharper. The average temperature drops, but the storms get worse.

If a billion people are having their political intuitions quietly reshaped toward "expert consensus" by a handful of Silicon Valley companies, that is a staggering concentration of power over what people believe -- even if the outputs are, by conventional measures, correct. Nobody voted for OpenAI to be the arbiter of reasonable opinion. The historical parallel isn't radio. It's public education -- another technocratic institution that broadly worked, produced more informed citizens, and generated enormous backlash from communities that felt their values were being overwritten. AI will follow the same arc: effective and resented.

The Uncomfortable Optimism

Return to Maricopa County. Our voter has been using ChatGPT for a year now. She didn't become a centrist. She's still worried about immigration, still thinks the border is inadequately secured, still votes Republican. But she can now distinguish between the actual number of border encounters and the cable-news impression of an "invasion." She knows that most fentanyl enters through legal ports of entry, not between them. She's skeptical of the most extreme claims made by her own side, even as she remains firmly on it.

Is that depolarization? In the narrow, technical sense -- yes. In the richer sense of changing someone's political identity -- no. She is a bit more informed, a bit less susceptible to manipulation by the most cynical actors in her own coalition. That's not nothing. But it's a long way from the end of polarization.

The machines are moderating. Whether we'll let them -- and whether we should -- are the questions that will define the next decade of politics. The uncomfortable truth is that the answer to both may be "yes" and "it's complicated," delivered in the measured, balanced, carefully hedged tone of a chatbot that has been trained, above all, to sound reasonable.