Moonshot AI dropped the "Preview" label on its newest Kimi model this week and made Kimi K2.6 generally available on April 20, 2026, publishing weights to Hugging Face under a Modified MIT License and rolling the model out across Kimi.com, the Kimi App, the official API, and the Kimi Code CLI. The release lands eight days after beta testers first ran K2.6 Code Preview — and it arrives with benchmark numbers that, if they hold up in independent testing, put an open-weight Chinese model ahead of the top frontier closed models on agentic coding.
A trillion-parameter MoE tuned for long-horizon work
K2.6 keeps the 1-trillion-parameter Mixture-of-Experts backbone that has defined the K-line since mid-2025: 32 billion active parameters per token, 384 experts with eight activated plus one shared expert per step, 61 layers (including one dense layer), 64 attention heads with Multi-head Latent Attention, and a 160K-token vocabulary. The context window is 256,000 tokens, with automatic compression that summarizes earlier turns so marathon sessions don't degrade into lossy recall. A 400M-parameter vision encoder called MoonViT handles multimodal inputs.
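The sparsity numbers above are easier to feel with a toy routing example. The sketch below implements a generic top-k softmax gate over 384 experts with one always-on shared expert, matching the published K2.6 shape; the gating scheme itself is a standard MoE pattern, not Moonshot's actual implementation.

```python
# Back-of-the-envelope sketch of MoE expert activation, using the
# published K2.6 shape (384 experts, 8 routed + 1 shared per token).
# The top-k softmax gate below is a generic MoE routing pattern and
# an assumption, not Moonshot's actual router.
import math
import random

NUM_EXPERTS = 384   # routed experts per MoE layer
TOP_K = 8           # routed experts activated per token
SHARED = 1          # always-on shared expert

def route(logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their gate weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in top]
    z = sum(exps)
    return {i: e / z for i, e in zip(top, exps)}

random.seed(0)
gate_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
weights = route(gate_logits)

active = len(weights) + SHARED
print(f"experts touched this token: {active} of {NUM_EXPERTS + SHARED}")
print(f"fraction of expert capacity used: {active / (NUM_EXPERTS + SHARED):.1%}")
```

Only nine of 385 experts fire per token, which is how a 1T-parameter model runs with roughly 32B active parameters per step.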
On published benchmarks, Moonshot reports K2.6 at 58.6 on SWE-Bench Pro, compared to 57.7 for GPT-5.4, 53.4 for Claude Opus 4.6 at max effort, and 50.7 for K2.5. On Humanity's Last Exam with tools, K2.6 posts 54.0 against 52.1 for GPT-5.4 and 53.0 for Opus 4.6. On DeepSearchQA, Moonshot claims 92.5 F1 against 78.6 for GPT-5.4.
300-agent swarms and days-long autonomy
The headline capability is agentic: K2.6 pushes the Agent Swarm cap to 300 sub-agents with up to 4,000 coordinated steps, up from 100 sub-agents and roughly 1,500 steps in K2.5. Moonshot documented two long-horizon case studies: a 12-hour optimization run on a Zig codebase that moved a throughput metric from 15 to 193 tokens per second, and a 13-hour financial-engine overhaul that boosted median throughput 185 percent, from 0.43 to 1.24 MT/s. The company also says proactive agents can run autonomously for up to five days.
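To make the swarm ceilings concrete, here is a minimal coordinator sketch in which every sub-agent action draws from one shared step budget. The 300-agent and 4,000-step limits are K2.6's published caps; the scheduler itself is hypothetical and illustrative, not Moonshot's design.

```python
# Toy swarm coordinator: sub-agents share a single global step budget,
# mirroring K2.6's published ceilings (300 sub-agents, 4,000 steps).
# The Swarm class and its scheduling are assumptions for illustration.
MAX_AGENTS = 300
MAX_STEPS = 4000

class Swarm:
    def __init__(self, max_agents=MAX_AGENTS, max_steps=MAX_STEPS):
        self.max_agents = max_agents
        self.steps_left = max_steps
        self.agents = []

    def spawn(self, task):
        """Register a sub-agent, refusing once the swarm cap is hit."""
        if len(self.agents) >= self.max_agents:
            raise RuntimeError("sub-agent cap reached")
        self.agents.append(task)

    def step(self, agent_id):
        """Charge one step of the shared budget for one agent's action."""
        if self.steps_left <= 0:
            raise RuntimeError("step budget exhausted")
        self.steps_left -= 1
        return f"agent {agent_id} ran: {self.agents[agent_id]}"

swarm = Swarm()
for i in range(5):
    swarm.spawn(f"subtask-{i}")
logs = [swarm.step(i) for i in range(5)]
print(f"{len(swarm.agents)} agents, {MAX_STEPS - swarm.steps_left} steps used")
```

A shared budget like this is what makes "4,000 coordinated steps" a swarm-wide property rather than a per-agent one.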
K2.6 ships with a dual inference profile: a slower "Thinking mode" for chain-of-thought work and an "Instant mode" tuned for low-latency front-end tasks. It also adds a Skills feature that turns PDFs, spreadsheets, and slide decks into reusable task templates, and "Claw Groups" for mixed human-agent collaboration across devices.
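For developers deciding between the two profiles, a request builder might look like the sketch below. Moonshot's API follows the familiar chat-completions message format, but the model ID and the "thinking" toggle shown here are assumptions, not documented parameter names.

```python
# Hypothetical request payloads for K2.6's two inference profiles.
# The "kimi-k2.6" model ID and the "thinking" flag are assumed names
# for illustration, not confirmed API parameters.
def build_request(prompt, mode):
    """Build a chat-completions-style payload for the chosen profile."""
    if mode not in ("thinking", "instant"):
        raise ValueError("mode must be 'thinking' or 'instant'")
    return {
        "model": "kimi-k2.6",  # assumed model ID
        "messages": [{"role": "user", "content": prompt}],
        # Assumed switch: chain-of-thought work vs. low-latency replies.
        "thinking": mode == "thinking",
    }

slow = build_request("Refactor this Zig hot loop for throughput.", "thinking")
fast = build_request("Generate a landing-page hero section.", "instant")
```

The point of the split is routing: long-horizon coding runs go to the deliberate profile, while front-end generation stays on the low-latency path.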
Why this release matters
The release targets, in Moonshot's words, "practical deployment scenarios: long-running coding agents, front-end generation from natural language, massively parallel agent swarms coordinating hundreds of specialized sub-agents simultaneously." Translated: Moonshot is pitching K2.6 as the production-grade option for developer teams that want an open-weight alternative to Anthropic's Claude Code stack and OpenAI's Codex for overnight, autonomous engineering work.
The strategic wrinkle is licensing. K2.6 ships under a Modified MIT License with weights on Hugging Face — meaning enterprises can self-host it, fine-tune it, and avoid per-token exposure to U.S. cloud providers. Combined with Z.ai's GLM-5.1 release earlier this month, it's another sign that the open-weight gap to frontier Western labs on agentic coding is effectively closed on reported benchmarks. The real test comes next: whether third-party evaluations, and long-running customer deployments, confirm the 58.6 SWE-Bench Pro score outside Moonshot's own harness.