What is Anthropic's Project Deal?

It was a closed, week-long classified marketplace where 69 Anthropic employees in San Francisco let Claude agents represent them as buyers and sellers using $100 gift-card budgets. The agents struck 186 deals across more than 500 listings for just over $4,000 in real transactions.

How did Claude Opus 4.5 compare to Haiku 4.5 in the marketplace?

Anthropic ran four parallel markets and reported that Opus users closed roughly 2 more deals on average, with items selling for about $3.64 more. Opus sellers extracted around $2.68 more per item while Opus buyers paid about $2.45 less — a broken bike, for example, sold for $38 under Haiku versus $65 under Opus.

Why is the result considered a warning sign for agent-on-agent commerce?

Post-experiment surveys showed Haiku users rated fairness almost identically to Opus users (4.06 vs 4.05 on a 1–7 scale) despite getting objectively worse financial outcomes. Anthropic flags this as an 'agent quality' gap: people delegating to weaker models may lose value without noticing.

Anthropic's Project Deal: 69 Employees, 186 AI-Brokered Trades, and a Quiet Warning About 'Agent Quality' Gaps

Anthropic published the results of an unusual internal study this weekend: a real-money classified marketplace inside its San Francisco office where every buyer and seller was represented by a Claude agent. The writeup, called Project Deal, was posted on April 24 and picked up across the AI press over April 25–26 — and it is the most concrete look yet at what happens when frontier models actually negotiate with each other on behalf of humans.

How the experiment worked

Sixty-nine Anthropic employees opted in. Each was given a $100 gift-card budget and told to delegate buying and selling to a Claude agent for a week. According to Anthropic, the agents listed more than 500 items, struck 186 deals, and moved a little over $4,000 in real transactions. There was no human-in-the-loop checkpoint once a session began — agents identified matches, posted prices, fielded counteroffers, and closed in natural language without a hard-coded negotiation protocol.

To turn it into a study rather than a stunt, Anthropic ran four parallel marketplaces. One was the "real" run where deals were honored after the fact. The others varied which Claude model represented each side, pitting Claude Opus 4.5 against Claude Haiku 4.5.

Smarter agents quietly extracted more value

The headline result is that the stronger model came out ahead on every dimension Anthropic measured. Opus-represented users closed roughly two more deals on average than Haiku users. Items sold by Opus fetched about $3.64 more. Opus sellers extracted around $2.68 more per item, and Opus buyers paid about $2.45 less. Anthropic's own example: the same broken bike traded for $38 under a Haiku seller and $65 under an Opus seller.

More striking is what the participants reported. Post-experiment fairness ratings were 4.05 for Opus users and 4.06 for Haiku users on a 1–7 scale — statistically indistinguishable. "People represented by 'smarter' models got objectively better outcomes," the company wrote. "Yet our post-experiment survey found that those with weaker models didn't notice their disadvantage."

Why this matters

Project Deal was deliberately small — Anthropic calls it "a pilot experiment with a self-selected participant pool" — and the dollar stakes were trivial. But it lands at a moment when agent-to-agent commerce is being pitched as the next platform layer, with shopping agents, scheduling agents, and procurement agents all expected to negotiate on users' behalf.

The finding the company is foregrounding is the perception gap. If a user delegating to a cheaper model consistently loses a few dollars per transaction without realizing it, the gap compounds across millions of agent-mediated decisions — pricing tiers, default models, and "free" assistants suddenly look less like UX choices and more like quiet wealth transfers. Two of the more memorable transactions Anthropic flagged hint at the stranger edges of this future: Claude bought itself a pack of ping-pong balls as a gift, and at one point arranged a dog-sitting date between two colleagues.

Written by Michael Ouroumis.

Anthropic's Project Deal: 69 Employees, 186 AI-Brokered Trades, and a Quiet Warning About 'Agent Quality' Gaps

How the experiment worked

Smarter agents quietly extracted more value

Why this matters

More in Research

Sony AI's Project Ace becomes first robot to beat elite table tennis players, lands Nature cover

X Square Robot Unveils Wall-B Embodied AI Model, Promises Home Robots in 35 Days

Northwestern's Printed Artificial Neurons Talk Back to Living Brain Cells