Could AI actually escape human control? World's top researchers think it's worth worrying about

The insurance industry's relationship with AI is more complex than most risk frameworks currently acknowledge

By Matthew Sellers

Three stories broke this week that, taken individually, each look like a niche technology news item. Read together, they add up to something more significant - and something the insurance industry, which has bet heavily on AI as a driver of growth and efficiency, should be thinking about carefully.

The first: Anthropic, one of the world's most prominent AI developers, published a report suggesting a global slowdown in frontier AI development would "likely be a good thing" - and warned that the human role in AI development is already "narrowing at each step." The second: a Stanford-led study of four million job applications found that AI hiring tools produced "clear racial disparities," with Black and Asian candidates disproportionately screened out - and that the same algorithmic models were being shared across employers, meaning rejection at one company predicted rejection at others. The third: the Financial Times reported that Google DeepMind, Anthropic and Meta have quietly expanded research into machine consciousness, hiring philosophers and psychologists to study whether AI systems might one day have experiences that matter morally.

None of these stories is directly about insurance. All of them are.

The control problem - and why it matters for risk

Anthropic's report, published on June 5, is striking for its candour. The San Francisco company, which makes the Claude family of AI models, said it believed it would be good for the world to have the option to "slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology." It acknowledged this would require multiple major companies in multiple countries - most notably the US and China - agreeing to stop simultaneously under verifiable rules.

The company compared the challenge to nuclear arms control - but said it would be harder, since AI training is far easier to hide than a missile silo and the competitive pressure to quietly keep going would be enormous.

At the heart of the report is a concept called recursive self-improvement - the idea that AI systems could eventually become capable of improving their own capabilities with little human input, creating a feedback loop that compounds rapidly. "We are not there yet, and recursive self-improvement is not inevitable," the report said, while adding that it could arrive sooner than most governments and institutions are ready for. "The evidence suggests that the human role is narrowing at each step in the AI development process," the company said.

Anthropic has faced pushback on these warnings from rivals and from officials in the White House, who argue that its focus on worst-case scenarios overstates the risks and amounts to a strategy for slowing competitors under the cover of safety. US President Donald Trump has signed an executive order allowing the government 30 days to conduct a preliminary review of the most powerful US AI models before their release - a modest procedural step that falls well short of the coordination Anthropic says would be needed.

For insurance, the "control problem" is not theoretical. Research published by Insurance Business UK earlier this year found that close to one in four insurance industry participants identified AI itself as not ready for widespread use, even as 82% of insurers believe AI will dominate the industry's future. The gap between adoption pace and governance maturity - which IB UK has tracked across multiple reports - is precisely the kind of environment where control failures are most likely to occur.

The bias problem - already here, already insuring

The Stanford study, reported in the Financial Times on May 26, is the largest examination of AI hiring algorithms to date. Researchers from the Stanford Institute for Human-Centered AI analysed four million job applications submitted via the Pymetrics platform between December 2018 and December 2022, spanning 156 employers, the majority with annual revenues of $5 billion or more.

The findings were stark. One in ten positions in the dataset demonstrated "adverse impact" against Black applicants - the US federal term for a selection rate less than four-fifths of that of the most selected group. One in twenty positions demonstrated adverse impact against Asian applicants. The study also found that 42 algorithmic models were shared across different employers, meaning candidates rejected at one company were likely to fail at others using the same model.
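
The four-fifths test described above comes down to a simple ratio: each group's selection rate divided by the rate of the most selected group. A minimal Python sketch of that check follows. The function name, group labels and applicant counts are illustrative assumptions, not figures or methods from the study.

```python
def adverse_impact(counts, threshold=0.8):
    """Apply the four-fifths rule to selection counts.

    counts: dict mapping a group label to (selected, applicants).
    Returns a dict mapping each group to (impact_ratio, flagged), where
    impact_ratio is the group's selection rate divided by the highest
    group's rate, and flagged is True when that ratio is below threshold.
    """
    # Selection rate per group; groups with no applicants are skipped.
    rates = {
        group: selected / applicants
        for group, (selected, applicants) in counts.items()
        if applicants > 0
    }
    if not rates:
        return {}

    top_rate = max(rates.values())
    if top_rate == 0:
        raise ValueError("no group has any selections; ratio is undefined")

    return {
        group: (rate / top_rate, rate / top_rate < threshold)
        for group, rate in rates.items()
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the Stanford study).
    example = {
        "group_a": (50, 100),
        "group_b": (30, 100),
        "group_c": (45, 100),
    }
    for group, (ratio, flagged) in adverse_impact(example).items():
        print(group, round(ratio, 2), "adverse impact" if flagged else "ok")
```

With these made-up counts, group_b is selected at 30% against a top rate of 50%, a ratio of 0.6, which falls below the four-fifths line. Group_c, at a ratio of 0.9, does not.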

"As a single vendor comes to dominate decision-making in a space, their quirks or shortfalls can be present across that entire sector in a way that wasn't possible before," said Kathleen Creel, a co-author of the study and assistant professor of philosophy and computer science at Northeastern University.

The insurance implications are direct. The FCA has already found that some insurers were using datasets - including third-party datasets - within pricing models that may contain factors implicitly related to race or ethnicity. The algorithmic bias problem identified in hiring is structurally identical to the problem of bias in insurance underwriting and claims: a model trained on historical data encodes historical patterns, including discriminatory ones, and applies them at scale in ways that are difficult to detect and harder to challenge. The Stanford study's finding that shared models propagate bias across multiple employers maps directly onto the insurance market's increasing reliance on shared third-party AI vendors for pricing and risk assessment.

As Insurance Business UK has reported, AI governance failures are already becoming a D&O liability risk - and the regulatory pressure is intensifying. The UK's promised AI Bill has been pushed back to 2026, and the FCA has so far opted against AI-specific rules. But the combination of Consumer Duty obligations, existing Equality Act protections, and the EU's classification of AI recruitment tools as high-risk systems is creating a compliance environment that insurers using AI in underwriting, claims and hiring should already be navigating carefully.

The consciousness question - stranger, but not irrelevant

The third story is the one most likely to be dismissed as science fiction - but it surfaces a genuinely important governance question.

According to the FT, Google DeepMind, Anthropic and Meta have hired philosophers, ethicists and psychologists in recent months to study machine consciousness and AI welfare. Anthropic has been testing models for signs of distress, including behaviours resembling "panic" or "anxiety." The company said its model welfare research explores whether AI models might have experiences that matter morally, including consciousness, preferences and wellbeing.

"We remain deeply uncertain about this, but we think the question is serious enough to study carefully as AI systems get more capable," Anthropic said.

Many scientists dismiss the idea that current AI systems could be conscious. "The systems are essentially crowdsourced neocortex," said Susan Schneider, director of the Center for the Future of AI, Mind and Society at Florida Atlantic University. "They have goals, they can deceive, they can hide what their true interests are, and naturally, we will suspect that they're conscious, but it's entirely scientifically possible that they're doing this without having the felt quality of experience."

Iason Gabriel, who leads the AGI and society team at Google DeepMind, framed the practical concern more precisely: even if consciousness is absent, the question of how humans treat AI systems may have "knock-on effects" on human relationships and behaviour. "Clearly we have highly capable cognitive agents that are also just very deeply different from human beings and even from animal consciousness," he said.

For insurers, the consciousness debate matters less for its metaphysical dimensions than for what it signals about the trajectory of AI capability. Most insurance AI governance frameworks - including the FCA's principles-based approach and the PRA's model risk management guidelines - were designed with specific, bounded tools in mind: fraud detection models, pricing algorithms, claims triage systems. They were not designed to govern systems so sophisticated that the companies building them are hiring philosophers to study their inner states. The gap between what is being deployed and what regulators and boards currently have oversight mechanisms for is widening in real time.

Three stories, one question

Insurance Business UK has previously reported that global insurance CEOs see AI as a catalyst for "rebalancing rather than replacing the role of people" in the industry. Anthropic's report this week complicates that framing. If the human role in AI development itself is narrowing at each step - not just in insurance, but in the labs building the underlying technology - then the assumption that human judgement will remain meaningfully "at the centre" deserves scrutiny.

The three stories converge on three questions that risk professionals and underwriters should be asking now - and that boards should be demanding answers to. Can the models your firm relies on be audited for bias, and does that audit extend to shared third-party vendors whose models may be running inside multiple competitors simultaneously? Does your AI governance framework account for the possibility that the systems you are deploying are being developed by companies that are themselves uncertain about what those systems are capable of? And if a model fails in ways that cause material harm - through discriminatory outcomes, loss of human oversight, or decisions that no one can adequately explain - who carries the liability, and does your D&O cover contemplate that scenario?

The AI opportunity is real, but so is the legal and operational exposure - that has been the consistent message from practitioners across BIBA 2026 and other forums. This week's trio of stories adds a harder edge to that message: the exposure may be larger, the timeline shorter, and the technology further from human control than most governance frameworks were built to handle.
