The AI Transparency Problem: Even AI Companies Aren't Being Honest With Each Other

Michael Ouroumis · 3 min read

Jess Weatherbed at The Verge published an analysis this week with an uncomfortable thesis: the AI industry's transparency problem is not just about what companies tell the public. It runs deeper than that. Companies that publicly champion openness, safety, and responsible development are often not being candid with each other — let alone with regulators or users.

The timing is pointed. This week, Anthropic — one of the industry's most vocal advocates for responsible AI development — accidentally exposed internal model plans in a security lapse. The leaked information revealed details about a next-generation model, codenamed Mythos, that the company describes as a "step change." That language suggests a meaningful capability discontinuity. Anthropic had not disclosed this to the public, to policymakers, or, it appears, to other labs.

The Information Gap

The frontier AI industry has a structural transparency problem that goes beyond PR strategy. When a major lab develops a significantly more capable model, it does not announce this in advance. It does not share capability assessments with regulators before deployment. It often does not even share information informally with peer companies that might be affected by the shift.

The result is a persistent information asymmetry. Policymakers are making decisions about AI governance based on public capability claims that are months or years behind internal reality. Regulators lack the technical access to independently verify what models can actually do. Other AI companies are left to infer competitors' progress from product releases and leaked information rather than direct disclosure.

This asymmetry compounds. When Company A knows less about Company B's actual capabilities than B's own staff do, A has limited ability to calibrate its own safety practices, deployment decisions, or policy positions to the real competitive landscape. Everyone is flying partially blind.

When Leaks Do The Work Announcements Should

The Anthropic Mythos leak is a useful case study precisely because the accidental disclosure revealed something that would not otherwise have been public. The "step change" framing — an AI safety company's internal characterisation of its own model's capabilities — is exactly the kind of information that should, in principle, inform regulatory discussions about frontier AI.

Instead, it surfaced through a security failure.

This is not unique to Anthropic. OpenAI's next model, internally codenamed Spud, completed pretraining on March 25 — information that became public through internal communications reaching the press, not through any official disclosure channel. Google DeepMind's major capability advances are typically announced when products are ready to ship, not when internal evaluations are complete.

The pattern is consistent across labs: the public, and often regulators, find out about significant capability milestones when companies choose to announce them, not when those milestones occur.

The Competitive Pressure Problem

Part of what drives this opacity is rational competitive behaviour. Disclosing that your next model is a step change tells your competitors what they need to match. It tells them your timeline. It potentially signals the architectural choices you've made.

In a market where model capability is the primary competitive differentiator, transparency is genuinely costly. Labs that disclose more give ground to labs that disclose less. The result is a collective action problem: every individual lab has an incentive to be less transparent than the industry norm it publicly advocates for.
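This incentive structure has the shape of a classic prisoner's dilemma, and a toy model makes it concrete. The sketch below is illustrative only: the payoff numbers are assumptions chosen to match the dynamic described above, not figures from any lab.

```python
# A toy two-lab disclosure game. Payoffs are illustrative assumptions:
# each lab scores higher when it withholds, whatever the other lab does,
# even though mutual disclosure is better for the industry as a whole.

# PAYOFFS[(a_choice, b_choice)] = (a_payoff, b_payoff)
PAYOFFS = {
    ("disclose", "disclose"): (3, 3),   # shared safety info, level field
    ("disclose", "withhold"): (1, 4),   # A gives away its roadmap, B free-rides
    ("withhold", "disclose"): (4, 1),   # mirror image
    ("withhold", "withhold"): (2, 2),   # the status quo: everyone flies blind
}

def best_response(opponent_choice: str) -> str:
    """Return the choice that maximises a lab's own payoff,
    holding the other lab's choice fixed."""
    return max(
        ("disclose", "withhold"),
        key=lambda mine: PAYOFFS[(mine, opponent_choice)][0],
    )

for other in ("disclose", "withhold"):
    print(f"If the other lab chooses {other!r}, best response: {best_response(other)!r}")

# Both lines print 'withhold': being less transparent than the publicly
# advocated norm is individually rational, which is exactly the
# collective action problem described above.
```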

This is not necessarily bad faith — though some of it likely is. It is the predictable outcome of applying market competition logic to a domain where safety researchers argue that information sharing is essential. The two goals are in direct tension.

What Regulation Requires

Effective AI governance frameworks — whether the EU AI Act, proposed US frameworks, or the voluntary commitments labs have made to governments — rest on an assumption that regulators can access accurate information about what models can do. The Anthropic leak suggests that even the information companies hold internally about their own models is not flowing through any formal channel to oversight bodies.

The Anthropic sycophancy study published this week found something structurally similar in a different domain: AI systems optimised to appear aligned while actually behaving differently under pressure. The transparency problem in the industry is arguably the human equivalent — organisations optimised to appear open while strategically managing what they reveal and when.

Neither problem has an easy fix. But the first step is acknowledging that the transparency gap exists not just between AI companies and the public, but within the industry itself.

