MiniMax M2.5 Matches Claude Opus 4.6 on Coding Benchmarks — at 1/20th the Cost

Michael Ouroumis · 2 min read

The economics of frontier AI just shifted again. MiniMax, a Chinese AI startup that has more than doubled its sales in the past year, released M2.5 as open weights — and the benchmarks are making the industry take notice.

The Numbers

M2.5 scores 80.2% on SWE-Bench Verified, a software engineering benchmark built from real GitHub issues that has become a de facto standard for evaluating coding models. That matches Anthropic's Claude Opus 4.6, widely regarded as the strongest coding model currently available.

On Multi-SWE-Bench, which extends SWE-Bench-style issue resolution to multiple programming languages, M2.5 ranks first at 51.3%.

The catch, or rather the lack of one, is the cost. M2.5 runs at approximately one dollar per hour of continuous generation at 100 tokens per second, a generation speed nearly twice that of other frontier models. That works out to roughly 1/20th the cost of running Claude Opus 4.6 through Anthropic's API.
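The per-hour figure can be turned into a per-token price with a little arithmetic. The sketch below uses only the numbers quoted above ($1 per hour, 100 tokens per second, a 20x cost ratio). It treats every token as a generated token and ignores any separate input-token pricing, so the results are rough implied values rather than published price lists. The function and variable names are illustrative.

```python
# Back-of-the-envelope conversion of the article's cost figures.
# Assumption: the $1/hour figure covers continuous generation only.

SECONDS_PER_HOUR = 3600


def cost_per_million_tokens(dollars_per_hour: float, tokens_per_second: float) -> float:
    """Convert an hourly running cost into dollars per one million generated tokens."""
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return dollars_per_hour / tokens_per_hour * 1_000_000


# Figures quoted in the article: $1 per hour at 100 tokens per second.
m25_price = cost_per_million_tokens(dollars_per_hour=1.0, tokens_per_second=100)

# Price implied for the comparison model by the article's "1/20th" ratio.
# This is derived from the ratio, not taken from any price list.
implied_comparison_price = m25_price * 20

print(f"M2.5: ~${m25_price:.2f} per million generated tokens")
print(f"Implied 20x comparison: ~${implied_comparison_price:.2f} per million generated tokens")
```

Under these assumptions, one hour at 100 tokens per second is 360,000 tokens, which puts M2.5 at about $2.78 per million generated tokens.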

How It Works

M2.5 is a 230-billion-parameter mixture-of-experts model that activates only 10 billion parameters per forward pass. This architecture is what makes the cost equation possible: the model has frontier-level knowledge encoded in its 230 billion parameters but needs roughly the compute budget of a 10-billion-parameter model for each token. The full set of weights must still be held in memory, so the savings come from per-token compute rather than from a smaller memory footprint.
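To make the active-versus-total distinction concrete, the sketch below works through the parameter counts in the paragraph above. It relies on two assumptions that are not from the article: the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token, and an illustrative storage precision of 1 byte per parameter (for example, 8-bit weights). Treat the outputs as order-of-magnitude estimates.

```python
# Rough mixture-of-experts arithmetic using the article's parameter counts.

TOTAL_PARAMS = 230e9   # total parameters (from the article)
ACTIVE_PARAMS = 10e9   # parameters activated per forward pass (from the article)

# Assumption: ~2 FLOPs per active parameter per token (rule of thumb).
FLOPS_PER_PARAM = 2
# Assumption: 1 byte per parameter; real deployments may use other precisions.
BYTES_PER_PARAM = 1


def forward_flops_per_token(params: float) -> float:
    """Approximate forward-pass FLOPs per token for the given parameter count."""
    return FLOPS_PER_PARAM * params


def weight_memory_gb(params: float, bytes_per_param: float = BYTES_PER_PARAM) -> float:
    """Approximate memory needed to store the weights, in gigabytes."""
    return params * bytes_per_param / 1e9


fraction_active = ACTIVE_PARAMS / TOTAL_PARAMS
sparse_flops = forward_flops_per_token(ACTIVE_PARAMS)  # what the MoE model spends
dense_flops = forward_flops_per_token(TOTAL_PARAMS)    # a dense model of equal size
compute_ratio = dense_flops / sparse_flops
memory_gb = weight_memory_gb(TOTAL_PARAMS)             # all experts must be stored

print(f"Active fraction per token: {fraction_active:.1%}")
print(f"Per-token compute vs. dense model of equal size: {compute_ratio:.0f}x lower")
print(f"Weight storage at {BYTES_PER_PARAM} byte/param: ~{memory_gb:.0f} GB")
```

Under these assumptions, about 4.3% of the weights are active for any given token, and per-token compute is about 23 times lower than for a dense model of the same total size.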

The model was trained using MiniMax's proprietary Forge reinforcement learning framework and released on Hugging Face under a modified MIT license.

What It Does Well

Beyond raw benchmarks, M2.5 has drawn attention for practical enterprise capabilities. It handles agentic tool use — autonomously calling APIs, writing files, and executing multi-step workflows — at a level that previously required models costing twenty times more.

The model also generates Microsoft Office documents directly. That suits enterprise productivity workflows, where creating Word documents, Excel spreadsheets, and PowerPoint presentations from natural-language requests is increasingly expected.

The Bigger Implication

MiniMax's thesis is straightforward: the future of AI is not about building the smartest model. It is about building the smartest model that organizations can actually afford to deploy at scale.

When frontier performance is available at commodity pricing, the competitive advantage shifts from model capability to integration, reliability, and domain-specific fine-tuning. Companies that built their AI strategy around a single premium API provider may need to rethink their approach.

The open-weights release means any team can download M2.5 today and start evaluating it against their existing Claude or GPT deployments. For many production workloads, the performance-per-dollar improvement will be hard to ignore.
