Policy

Pentagon Runs a 25-User Bake-Off of OpenAI, Google, and Grok to Replace Anthropic's Claude

Michael Ouroumis · 2 min read

The U.S. Department of Defense has put 25 of its heaviest AI users into a head-to-head bake-off between OpenAI, Google, and xAI's Grok — an internal competition to crown a replacement for Anthropic's Claude in military workflows, Bloomberg reported on May 21.

The contest is the operational endgame of a procurement fight that has been building since winter: rather than negotiate further with Anthropic, the Pentagon is now running rival frontier models against the exact tasks Claude had been handling and letting its most demanding internal users pick the winner.

A scored contest on GenAI.mil

Testing began on March 1, 2026, days after Defense Secretary Pete Hegseth designated Anthropic a supply-chain risk on February 27, according to a senior official cited in the reporting. The 25 designated "power users" are evaluating which of the competing models they prefer across real workloads.

The evaluation runs on GenAI.mil, the department's general-purpose generative-AI environment, which is distinct from the Maven Smart System used for targeting and intelligence fusion. That separation matters: this is about general analytical and productivity workloads, the bread-and-butter use that Claude had quietly absorbed inside the building.

How a $200M contract unraveled

Anthropic secured a $200 million DoD contract in July 2025, and by January 2026 Claude was reportedly being used for Iran intelligence analysis. The rupture came over deployment terms. The Pentagon wanted models cleared for "all lawful purposes"; Anthropic refused to strip the guardrails that prevent its models from being used for mass surveillance or lethal autonomous weaponry. That refusal earned it the supply-chain-risk label and cost it its place on the roster.

The exclusion was formalized on May 1, when the Pentagon's classified-network AI contracts went to eight vendors — SpaceX, OpenAI, Google, Nvidia, Reflection AI, Microsoft, Oracle, and AWS — with Anthropic absent. Anthropic is fighting the designation in court, arguing it could cost the company billions in revenue.

What it signals for enterprise buyers

For anyone selling models into regulated or high-assurance environments, this is a clean case study in how safety policy becomes a procurement variable. Anthropic's refusal to widen its acceptable-use terms is, by its own framing, a deliberate trade: it cedes the most permissive government contracts to keep its weapons and surveillance restrictions intact. OpenAI, Google, and xAI are treating that line as an opening.

The lesson for builders standing up agentic systems on government or defense data is that model selection is no longer just a benchmark question — it's a question of which vendor's use-policy survives contact with the customer's mission. With defense contractors reportedly given roughly six months to migrate off Claude, the bake-off's outcome will reset the default frontier model across a large slice of federal workloads.
