
DeepMind's AlphaCode 3 Beats 99% of Competitive Programmers

Michael Ouroumis · 2 min read

Google DeepMind has released AlphaCode 3, an AI system that achieves a 99th percentile rating on Codeforces, the world's largest competitive programming platform. The result means AlphaCode 3 can solve problems that stump all but the top 1% of human competitive programmers — a significant leap from AlphaCode 2, which reached the 85th percentile in 2024.

How It Works

AlphaCode 3 uses a three-stage architecture that goes well beyond simple code generation. First, a large language model analyzes the problem statement and generates a structured reasoning plan. Second, a tree-of-thought search explores multiple solution paths in parallel, evaluating trade-offs between algorithmic approaches. Third, a learned code verifier checks each candidate solution against generated test cases and selects the most likely correct answer.
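The three stages can be pictured as a plan-generate-verify loop. The sketch below is purely illustrative: the function names (`plan_fn`, `generate_fn`, `verify_fn`) are placeholders standing in for the components the article describes, not anything from DeepMind's actual system, and the tree-of-thought search is collapsed into repeated sampling for brevity.

```python
# Hypothetical sketch of the three-stage pipeline; all names are
# stand-ins, not DeepMind APIs.
def solve(problem, plan_fn, generate_fn, verify_fn, n_candidates=100):
    """Plan, sample candidate solutions, then pick the verifier's favorite."""
    # Stage 1: the language model produces a structured reasoning plan.
    plan = plan_fn(problem)
    # Stage 2: explore multiple solution paths (collapsed here into
    # repeated sampling conditioned on the plan).
    candidates = [generate_fn(problem, plan) for _ in range(n_candidates)]
    # Stage 3: a learned verifier scores each candidate; the
    # highest-scoring one becomes the final submission.
    return max(candidates, key=lambda code: verify_fn(problem, code))
```

In a real system the verifier would be a trained model; here it is just any callable that maps a candidate to a score.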

The system generates up to 10,000 solution candidates per problem, then filters them down to a final submission. DeepMind says the verifier is the key innovation — it was trained on millions of correct and incorrect solutions and can predict with high accuracy whether a given solution will pass all test cases.
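The filtering step described above can be sketched as a two-pass funnel: execute every candidate against generated test cases, discard any that fail, then let the learned verifier rank the survivors. This is a minimal illustration under assumed interfaces; `run_candidate` and `score` are hypothetical placeholders, not DeepMind functions.

```python
# Hedged sketch of verifier-based filtering: hard test-case checks first,
# then a learned score to rank what remains.
def filter_and_rank(candidates, test_cases, run_candidate, score):
    """Keep candidates that pass all generated tests, best-scored first."""
    survivors = []
    for code in candidates:
        # A candidate survives only if it matches the expected output
        # on every generated (input, output) pair.
        if all(run_candidate(code, inp) == out for inp, out in test_cases):
            survivors.append(code)
    # The learned verifier's score orders the surviving candidates.
    return sorted(survivors, key=score, reverse=True)
```

The head of the returned list would be the final submission; an empty list signals that no candidate cleared the generated tests.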

Results in Detail

On a held-out set of 200 recent Codeforces problems spanning all difficulty levels, AlphaCode 3 solved 167 within the time limits. It performed nearly perfectly on problems rated below 2000 Elo and maintained strong performance up to 2800 Elo — a level that corresponds to Grandmaster rank.

The system struggled most with problems requiring novel mathematical insights or creative construction techniques that had no close precedent in its training data. DeepMind acknowledged this as an open challenge.

"AlphaCode 3 is exceptionally strong at problems that require applying known techniques in complex combinations," said Oriol Vinyals, VP of Research at DeepMind. "The frontier is problems that require genuinely new ideas."

Why It Matters

Competitive programming problems are synthetic, but the skills they test — algorithmic reasoning, edge case handling, optimization under constraints — are directly relevant to real-world software engineering. DeepMind says the reasoning techniques developed for AlphaCode 3 will feed into future Gemini models, improving their ability to handle complex, multi-step coding tasks.

The result also raises questions about the future of competitive programming as a human discipline. If AI systems can outperform nearly all participants, the nature of these competitions may need to evolve.

Not a Product — Yet

AlphaCode 3 remains a research system. It is not available as an API or integrated into any Google product. DeepMind says the computational cost of generating thousands of candidates per problem makes it impractical for real-time use in its current form. However, the underlying techniques — particularly the learned verifier — are being adapted for more efficient deployment in future Gemini releases.

