
DeepMind's AlphaCode 3 Beats 99% of Competitive Programmers

Michael Ouroumis · 2 min read

Google DeepMind has released AlphaCode 3, an AI system that achieves a 99th percentile rating on Codeforces, the world's largest competitive programming platform. The rating places AlphaCode 3 ahead of 99% of human competitive programmers — a significant leap from AlphaCode 2, which reached the 85th percentile in 2024.

How It Works

AlphaCode 3 uses a three-stage architecture that goes well beyond simple code generation. First, a large language model analyzes the problem statement and generates a structured reasoning plan. Second, a tree-of-thought search explores multiple solution paths in parallel, evaluating trade-offs between algorithmic approaches. Third, a learned code verifier checks each candidate solution against generated test cases and selects the most likely correct answer.
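DeepMind has not published implementation details, but the three stages described above can be sketched in pseudocode-style Python. Every name below (`generate_plan`, `tree_of_thought_search`, `verify`, the `Candidate` type) is an illustrative assumption, not DeepMind's API; the stage bodies are stand-in stubs.

```python
from dataclasses import dataclass

# Hypothetical sketch of the three-stage pipeline: plan -> search -> verify.
# All names and stage bodies are illustrative stubs, not DeepMind's code.

@dataclass
class Candidate:
    plan: str
    code: str
    verifier_score: float = 0.0

def generate_plan(problem: str) -> str:
    # Stage 1: an LLM analyzes the problem statement and emits a
    # structured reasoning plan (here, a placeholder string).
    return f"plan for: {problem}"

def tree_of_thought_search(plan: str, branches: int = 3) -> list[Candidate]:
    # Stage 2: explore several solution paths in parallel, each
    # representing a different algorithmic approach.
    return [Candidate(plan, f"solution_{i}") for i in range(branches)]

def verify(cand: Candidate) -> float:
    # Stage 3: a learned verifier scores each candidate against
    # generated test cases; a placeholder score stands in here.
    return len(cand.code) / 10.0

def solve(problem: str) -> Candidate:
    plan = generate_plan(problem)
    candidates = tree_of_thought_search(plan)
    for c in candidates:
        c.verifier_score = verify(c)
    # Submit the candidate the verifier considers most likely correct.
    return max(candidates, key=lambda c: c.verifier_score)
```

The key structural point is that verification is a separate, learned stage: candidates are ranked by a model rather than by heuristics.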

The system generates up to 10,000 solution candidates per problem, then filters them down to a final submission. DeepMind says the verifier is the key innovation — it was trained on millions of correct and incorrect solutions and can predict with high accuracy whether a given solution will pass all test cases.

Results in Detail

On a held-out set of 200 recent Codeforces problems spanning all difficulty levels, AlphaCode 3 solved 167 (83.5%) within the contest time limits. It performed nearly perfectly on problems rated below 2000 Elo and maintained strong performance up to 2800 Elo — a level that corresponds to Grandmaster rank.

The system struggled most with problems requiring novel mathematical insights or creative construction techniques that had no close precedent in its training data. DeepMind acknowledged this as an open challenge.

"AlphaCode 3 is exceptionally strong at problems that require applying known techniques in complex combinations," said Oriol Vinyals, VP of Research at DeepMind. "The frontier is problems that require genuinely new ideas."

Why It Matters

Competitive programming problems are synthetic, but the skills they test — algorithmic reasoning, edge case handling, optimization under constraints — are directly relevant to real-world software engineering. DeepMind says the reasoning techniques developed for AlphaCode 3 will feed into future Gemini models, improving their ability to handle complex, multi-step coding tasks.

The result also raises questions about the future of competitive programming as a human discipline. If AI systems can outperform nearly all participants, the nature of these competitions may need to evolve.

Not a Product — Yet

AlphaCode 3 remains a research system. It is not available as an API or integrated into any Google product. DeepMind says the computational cost of generating thousands of candidates per problem makes it impractical for real-time use in its current form. However, the underlying techniques — particularly the learned verifier — are being adapted for more efficient deployment in future Gemini releases.

More in Research

AI2 Releases OLMo Hybrid: Combining Transformers and RNNs for 2x Data Efficiency

The Allen Institute for AI releases OLMo Hybrid, a fully open 7B model that blends transformer attention with linear recurrent layers, achieving the same accuracy as OLMo 3 using 49% fewer tokens.

8 hours ago · 2 min read

Stanford Study: AI Tutoring Doubled Student Test Scores in Six Months

A Stanford-led randomized controlled trial finds that students using AI tutoring systems for 30 minutes daily scored twice as high on standardized math assessments as a control group, the strongest evidence yet for AI in education.

1 day ago · 3 min read

Oxford AI System Predicts Heart Attacks Up to 10 Years in Advance With 92% Accuracy

Researchers at the University of Oxford have developed CardioSense, an AI system that analyzes routine blood tests and ECG data to predict cardiac events up to a decade before they occur.

3 days ago · 3 min read