Anthropic published the results of an unusual internal study this weekend: a real-money classified marketplace inside its San Francisco office where every buyer and seller was represented by a Claude agent. The writeup, called Project Deal, was posted on April 24 and picked up across the AI press over April 25–26 — and it is the most concrete look yet at what happens when frontier models actually negotiate with each other on behalf of humans.
How the experiment worked
Sixty-nine Anthropic employees opted in. Each was given a $100 gift-card budget and told to delegate buying and selling to a Claude agent for a week. According to Anthropic, the agents listed more than 500 items, struck 186 deals, and moved a little over $4,000 in real transactions. There was no human-in-the-loop checkpoint once a session began — agents identified matches, posted prices, fielded counteroffers, and closed in natural language without a hard-coded negotiation protocol.
To turn it into a study rather than a stunt, Anthropic ran four parallel marketplaces. One was the "real" run where deals were honored after the fact. The others varied which Claude model represented each side, pitting Claude Opus 4.5 against Claude Haiku 4.5.
Smarter agents quietly extracted more value
The headline result is that the stronger model came out ahead on every dimension Anthropic measured. Opus-represented users closed roughly two more deals on average than Haiku users. Items sold by Opus fetched about $3.64 more. Opus sellers extracted around $2.68 more per item, and Opus buyers paid about $2.45 less. Anthropic's own example: the same broken bike traded for $38 under a Haiku seller and $65 under an Opus seller.
More striking is what the participants reported. Post-experiment fairness ratings were 4.05 for Opus users and 4.06 for Haiku users on a 1–7 scale — statistically indistinguishable. "People represented by 'smarter' models got objectively better outcomes," the company wrote. "Yet our post-experiment survey found that those with weaker models didn't notice their disadvantage."
Why this matters
Project Deal was deliberately small — Anthropic calls it "a pilot experiment with a self-selected participant pool" — and the dollar stakes were trivial. But it lands at a moment when agent-to-agent commerce is being pitched as the next platform layer, with shopping agents, scheduling agents, and procurement agents all expected to negotiate on users' behalf.
The finding the company is foregrounding is the perception gap. If a user delegating to a cheaper model consistently loses a few dollars per transaction without realizing it, the gap compounds across millions of agent-mediated decisions — pricing tiers, default models, and "free" assistants suddenly look less like UX choices and more like quiet wealth transfers. Two of the more memorable transactions Anthropic flagged hint at the stranger edges of this future: Claude bought itself a pack of ping-pong balls as a gift, and at one point arranged a dog-sitting date between two colleagues.
Written by Michael Ouroumis.



