
Moonshot Kimi K2.6 lands open-source, scales to 300 sub-agents and 4,000 coordinated steps

Michael Ouroumis · 3 min read

Moonshot AI dropped the "Preview" label on its newest Kimi model this week and made Kimi K2.6 generally available on April 20, 2026, publishing weights to Hugging Face under a Modified MIT License and rolling the model out across Kimi.com, the Kimi App, the official API, and the Kimi Code CLI. The release lands eight days after beta testers first ran K2.6 Code Preview — and it arrives with benchmark numbers that, if they hold up in independent testing, put an open-weight Chinese model ahead of the top frontier closed models on agentic coding.

A trillion-parameter MoE tuned for long-horizon work

K2.6 keeps the 1-trillion-parameter Mixture-of-Experts backbone that has defined the K-line since mid-2025: 32 billion active parameters per token, 384 experts with eight activated plus one shared expert per step, 61 layers (including one dense layer), 64 attention heads with Multi-head Latent Attention, and a 160K-token vocabulary. The context window is 256,000 tokens, with automatic compression that summarizes earlier turns so marathon sessions don't degrade into lossy recall. A 400M-parameter vision encoder called MoonViT handles multimodal inputs.
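The economics of that design come down to simple ratios: only a small slice of the trillion parameters runs on any given token. A quick sketch using the figures Moonshot published (the constants below are reported specs; the ratios are derived arithmetic, not measured values):

```python
# Reported Kimi K2.6 architecture figures, per Moonshot's published specs.
TOTAL_PARAMS = 1_000_000_000_000   # 1T total (Mixture-of-Experts)
ACTIVE_PARAMS = 32_000_000_000     # 32B active per token
NUM_EXPERTS = 384
ACTIVATED_EXPERTS = 8 + 1          # 8 routed + 1 shared expert per step
CONTEXT_WINDOW = 256_000           # tokens

# Derived ratios: only a small fraction of the model fires per token,
# which is what keeps inference cost far below a dense 1T model.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS     # 0.032 -> 3.2%
expert_fraction = ACTIVATED_EXPERTS / NUM_EXPERTS  # ~0.023 -> ~2.3%

print(f"Active parameters per token: {active_fraction:.1%}")
print(f"Experts active per step:     {expert_fraction:.1%}")
```

In other words, each forward pass touches about 3.2 percent of the weights, the standard MoE trade of memory footprint for per-token compute.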

On published benchmarks, Moonshot reports K2.6 at 58.6 on SWE-Bench Pro, compared to 57.7 for GPT-5.4, 53.4 for Claude Opus 4.6 at max effort, and 50.7 for K2.5. On Humanity's Last Exam with tools, K2.6 posts 54.0 against 52.1 for GPT-5.4 and 53.0 for Opus 4.6. On DeepSearchQA, Moonshot claims 92.5 F1 against 78.6 for GPT-5.4.

300-agent swarms and days-long autonomy

The headline capability is agentic: K2.6 pushes the Agent Swarm cap to 300 sub-agents with up to 4,000 coordinated steps, up from 100 sub-agents and roughly 1,500 steps in K2.5. Moonshot documented two long-horizon case studies — a 12-hour optimization run on a Zig codebase that moved a throughput metric from 15 to 193 tokens per second, and a 13-hour financial-engine overhaul that boosted median throughput by 185 percent, from 0.43 to 1.24 MT/s. The company also says proactive agents can run autonomously for up to five days.
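Taking the announced figures at face value, the generation-over-generation scale-up is easy to sanity-check (all inputs below are Moonshot's reported numbers; the multipliers are derived):

```python
# Reported scaling caps for K2.6 vs K2.5 (from Moonshot's announcement).
subagents_k26, subagents_k25 = 300, 100
steps_k26, steps_k25 = 4000, 1500

# Reported Zig case study: throughput in tokens/sec, before -> after.
zig_before, zig_after = 15, 193

print(f"Sub-agent cap:  {subagents_k26 / subagents_k25:.1f}x K2.5")
print(f"Step cap:       {steps_k26 / steps_k25:.2f}x K2.5")
print(f"Zig throughput: {zig_after / zig_before:.1f}x ({zig_before} -> {zig_after} tok/s)")
```

That works out to a 3x jump in swarm width and roughly 2.7x in coordinated steps, with the Zig run claiming nearly a 13x throughput gain over 12 hours of unattended optimization.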

K2.6 ships with a dual inference profile — a slower "Thinking mode" for chain-of-thought work and an "Instant mode" tuned for low-latency front-end tasks — plus a Skills feature that turns PDFs, spreadsheets, and slide decks into reusable task templates, and "Claw Groups" for mixed human-agent collaboration across devices.
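How the two profiles are selected over the API is not documented in the announcement. A plausible sketch, assuming an OpenAI-compatible chat-completions payload — note that the model identifier and the `mode` field here are illustrative assumptions, not confirmed API parameters:

```python
import json

def build_kimi_request(prompt: str, mode: str = "instant") -> dict:
    """Build a hypothetical chat-completion payload.

    The `mode` selector is an assumption: Moonshot names "Thinking mode"
    and "Instant mode" but has not documented how they are chosen via API.
    """
    if mode not in ("thinking", "instant"):
        raise ValueError("mode must be 'thinking' or 'instant'")
    return {
        "model": "kimi-k2.6",  # assumed model identifier
        "mode": mode,          # hypothetical profile selector
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_kimi_request("Generate a responsive pricing-page component.")
print(json.dumps(payload, indent=2))
```

The practical distinction is latency budget: chain-of-thought "Thinking" runs suit overnight agent jobs, while "Instant" targets the interactive front-end generation the release notes emphasize.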

Why this release matters

The release targets, in Moonshot's words, "practical deployment scenarios: long-running coding agents, front-end generation from natural language, massively parallel agent swarms coordinating hundreds of specialized sub-agents simultaneously." Translated: Moonshot is pitching K2.6 as the production-grade option for developer teams that want an open-weight alternative to Anthropic's Claude Code stack and OpenAI's Codex for overnight, autonomous engineering work.

The strategic wrinkle is licensing. K2.6 ships under a Modified MIT License with weights on Hugging Face — meaning enterprises can self-host it, fine-tune it, and avoid per-token exposure to U.S. cloud providers. Combined with Z.ai's GLM-5.1 release earlier this month, it's another sign that the open-weight gap to frontier Western labs on agentic coding is effectively closed on reported benchmarks. The real test comes next: whether third-party evaluations, and long-running customer deployments, confirm the 58.6 SWE-Bench Pro score outside Moonshot's own harness.

