Which models is the Pentagon evaluating as Claude replacements?

Models from OpenAI, Google, and xAI's Grok, scored by 25 designated military 'power users' on the GenAI.mil platform. Testing began March 1, 2026.

Why was Anthropic dropped from the workflow?

Anthropic refused to remove guardrails that block mass surveillance and lethal autonomous weapons use, while the DoD demanded models cleared for 'all lawful purposes.' Defense Secretary Pete Hegseth designated Anthropic a supply-chain risk on February 27, 2026.

What's financially at stake for Anthropic?

A $200 million DoD contract signed in July 2025 was dissolved, and Anthropic is challenging the supply-chain-risk designation in court, arguing it could cost the company billions in revenue.

Pentagon Runs a 25-User Bake-Off of OpenAI, Google, and Grok to Replace Anthropic's Claude

The U.S. Department of Defense has put 25 of its heaviest AI users into a head-to-head bake-off between OpenAI, Google, and xAI's Grok — an internal competition to crown a replacement for Anthropic's Claude in military workflows, Bloomberg reported on May 21.

The contest is the operational endgame of a procurement fight that has been building since winter: rather than negotiate further with Anthropic, the Pentagon is now running rival frontier models against the exact tasks Claude had been handling and letting its most demanding internal users pick the winner.

A scored contest on GenAI.mil

Testing began on March 1, 2026 — three days after Defense Secretary Pete Hegseth designated Anthropic a supply-chain risk on February 27 — according to a senior official cited in the reporting. The 25 designated "power users" are evaluating which of the competing models they prefer across real workloads.

The evaluation runs on GenAI.mil, the department's general-purpose generative-AI environment, which is distinct from the Maven Smart System used for targeting and intelligence fusion. That separation matters: this is about general analytical and productivity workloads, the bread-and-butter use that Claude had quietly absorbed inside the building.

How a $200M contract unraveled

Anthropic secured a $200 million DoD contract in July 2025, and by January 2026 there were reports of Claude being used for Iran intelligence analysis. The rupture came over deployment terms. The Pentagon wanted models cleared for "all lawful purposes"; Anthropic refused to strip the guardrails that prevent its models from being used for mass surveillance or lethal autonomous weaponry. That refusal earned it the supply-chain-risk label and a removal from the roster.

The exclusion was formalized on May 1, when the Pentagon's classified-network AI contracts went to eight vendors — SpaceX, OpenAI, Google, Nvidia, Reflection AI, Microsoft, Oracle, and AWS — with Anthropic absent. Anthropic is fighting the designation in court, arguing it could cost the company billions in revenue.

What it signals for enterprise buyers

For anyone selling models into regulated or high-assurance environments, this is a clean case study in how safety policy becomes a procurement variable. Anthropic's refusal to widen its acceptable-use terms is, by its own framing, a deliberate trade: it cedes the most permissive government contracts to keep its weapons and surveillance restrictions intact. OpenAI, Google, and xAI are treating that line as an opening.

The lesson for builders standing up agentic systems on government or defense data is that model selection is no longer just a benchmark question — it's a question of which vendor's use-policy survives contact with the customer's mission. With defense contractors reportedly given roughly six months to migrate off Claude, the bake-off's outcome will reset the default frontier model across a large slice of federal workloads.

Pentagon Runs a 25-User Bake-Off of OpenAI, Google, and Grok to Replace Anthropic's Claude

A scored contest on GenAI.mil

How a $200M contract unraveled

What it signals for enterprise buyers

More in Policy

Japan's Three Megabanks Get Defensive Access to Anthropic's Mythos via Project Glasswing

Trump Postpones AI Security Order That Would Force 14–90 Day Pre-Release Model Review

arXiv Will Ban Authors for a Year if They Submit Unchecked AI-Generated Papers