
GLM-5.1 Cracks Code Arena Top 3, First Open-Weight Model to Do So

Michael Ouroumis · 2 min read

Z.ai's GLM-5.1 has become the first open-weight model to reach the top three on Code Arena, the human-judged coding leaderboard that has long been dominated by closed frontier systems from Anthropic, OpenAI and Google. The Chinese lab — formerly known as Zhipu AI — confirmed the milestone this week after its new flagship posted a 1530 Elo score on the board.

A new benchmark for open models

Released on Hugging Face on April 7, 2026 under the permissive MIT License, GLM-5.1 is a 754-billion-parameter Mixture-of-Experts model targeting agentic engineering and long-horizon coding work. Its Code Arena placement on April 10 puts it behind only Anthropic's claude-opus-4-6-thinking (1548) and claude-opus-4-6 (1542), while sitting ahead of every GPT-5 and Gemini 3 model on the agentic webdev leaderboard.

The jump is unusually large for a point release. Reports indicate GLM-5.1 gained roughly 90 Elo points over GLM-5 and around 100 over Kimi K2.5 Thinking, the prior best open entrant. On SWE-Bench Pro, third-party coverage places GLM-5.1 ahead of both Claude Opus 4.6 and GPT-5.4 on a number of tasks.

Why Code Arena matters

Unlike synthetic benchmarks, Code Arena ranks models through head-to-head comparisons judged by developers on real coding prompts. A top-three finish there is harder to game than a leaderboard saturated with closed-book multiple-choice questions, and it is the metric many engineering teams watch when picking a default coding model.
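For context on what those Elo gaps mean in practice, here is a minimal sketch using the standard Elo expected-score formula (an assumption: Code Arena's exact rating methodology may differ, but arena-style leaderboards are typically built on this model):

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# The 18-point gap between claude-opus-4-6-thinking (1548) and
# GLM-5.1 (1530) implies only a ~52.6% expected head-to-head win rate.
print(round(expected_win_rate(1548, 1530), 3))  # → 0.526

# A ~100-point gap, like GLM-5.1's reported lead over the prior best
# open entrant, corresponds to roughly a 64% expected win rate.
print(round(expected_win_rate(1530, 1430), 3))  # → 0.64
```

In other words, even a top-three finish reflects models that trade wins frequently; the ~100-point jump over the previous open-weight leader is the more decisive margin.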

Open weights, permissive license

The release is distributed on Hugging Face at zai-org/GLM-5.1 under the MIT License — one of the most permissive in open source. Enterprises and researchers can fine-tune, modify and redistribute the weights commercially without royalties, which removes one of the last structural advantages frontier labs have enjoyed against open alternatives.

That combination — frontier-tier ranking plus an unrestricted license — is what makes this release different from previous open-weight milestones. Earlier Chinese-origin models such as DeepSeek-V3 and Qwen built strong benchmark stories, but none had cleared Code Arena's top three.

Implications for the frontier

For closed-lab incumbents, GLM-5.1 puts pressure on pricing power. When a freely downloadable model beats GPT-5.4 and Gemini 3.1 Pro on a human-judged coding board, the premium enterprises pay for API access has to be justified on things other than raw capability — latency, safety tooling, context length, or integration depth.

For open-source advocates, the release is a proof point that the capability gap between open and closed frontier models has narrowed materially in 2026. The question now is whether U.S. and European labs respond by accelerating their own open-weight releases, or by leaning further into proprietary agentic tooling where model weights alone no longer decide the winner.

GLM-5.1 is available today on Hugging Face, and Z.ai is hosting the model through its own API for teams that prefer not to self-host.
