
AI21 Labs Releases Jamba 2, a Hybrid SSM-Transformer That Matches GPT-5 at One-Fifth the Cost

Michael Ouroumis · 2 min read

AI21 Labs has released Jamba 2, a 398-billion-parameter model that takes a fundamentally different approach to architecture by interleaving Mamba-style state space model (SSM) layers with traditional transformer attention layers. The result matches GPT-5 and Claude Sonnet 4.5 on major reasoning benchmarks while running inference at roughly one-fifth the cost.

How the Hybrid Architecture Works

Pure transformer models compute attention across all tokens in a sequence, creating quadratic scaling costs as context windows grow. Jamba 2 replaces a significant portion of these attention layers with SSM layers based on the Mamba architecture, which process sequences in linear time by maintaining a compressed state representation instead of attending to every previous token.
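The scaling gap described above can be sketched with a toy per-layer cost model. The constants below (model width, SSM state size) are illustrative assumptions, not Jamba 2's actual dimensions; only the growth rates matter.

```python
# Illustrative only: rough per-layer compute for attention vs. an SSM layer.
# Constants are arbitrary assumptions; the point is O(n^2) vs. O(n) growth.

def attention_cost(seq_len: int, d_model: int = 1024) -> int:
    # Self-attention scores every token against every other token:
    # roughly O(n^2 * d) work per layer.
    return seq_len * seq_len * d_model

def ssm_cost(seq_len: int, d_state: int = 16, d_model: int = 1024) -> int:
    # A Mamba-style SSM scans the sequence once, folding each token into a
    # fixed-size compressed state: roughly O(n * d_state * d) work per layer.
    return seq_len * d_state * d_model

for n in (4_096, 65_536, 262_144):
    ratio = attention_cost(n) / ssm_cost(n)
    print(f"{n:>7} tokens: attention / SSM cost ratio = {ratio:,.0f}x")
```

With these assumptions the per-layer gap is simply `seq_len / d_state`, so it widens linearly as the context grows.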

The attention layers that remain handle tasks where precise token-to-token relationships matter — retrieval, exact matching, and fine-grained reasoning. The SSM layers handle long-range dependency tracking, summarization, and general language modeling. AI21 reports that this division of labor is what makes the cost reduction possible without sacrificing quality.
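Structurally, this division of labor amounts to interleaving sparse attention layers among mostly SSM layers. A minimal sketch follows; the one-attention-layer-in-eight ratio is an assumption for illustration, since the article does not give Jamba 2's exact layer mix.

```python
# Structural sketch of a hybrid SSM-transformer stack. The attn_every=8
# ratio is an illustrative assumption, not AI21's published configuration.

def build_hybrid_stack(n_layers: int, attn_every: int = 8) -> list[str]:
    """Interleave one attention layer among every `attn_every` layers;
    the rest are Mamba-style SSM layers."""
    return [
        "attention" if (i % attn_every) == attn_every - 1 else "ssm"
        for i in range(n_layers)
    ]

# Most layers are SSM (long-range tracking, summarization, general LM);
# the sparse attention layers handle exact token-to-token lookups.
stack = build_hybrid_stack(16)
print(stack)
```

Because only the sparse attention layers pay the quadratic cost, the stack's total compute stays close to linear in sequence length.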

Benchmark Results

On MMLU-Pro, HumanEval+, and MATH-500, the 398B Jamba 2 scores within striking distance of both GPT-5 and Claude Sonnet 4.5. Where the model pulls ahead is on long-document tasks. With a 256K context window and linear-time SSM layers handling the bulk of long-range processing, Jamba 2 outperforms all competitors on multi-document QA, long-form summarization, and needle-in-a-haystack retrieval at extreme context lengths.

AI21 claims the cost advantage compounds at longer contexts. At 256K tokens, Jamba 2 inference is roughly 8x cheaper than a comparable pure-transformer model because the SSM layers avoid the quadratic attention blowup entirely.
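A toy whole-model cost comparison shows why a fixed fraction of attention layers bounds the achievable saving, and why the advantage compounds toward that bound as context grows. All constants here are assumptions for illustration, not AI21's published numbers.

```python
# Toy cost model: pure-transformer stack vs. a hybrid with a small fraction
# of attention layers. All constants are illustrative assumptions.

def layer_costs(n: int, d: int = 1024, d_state: int = 16):
    attn = n * n * d       # attention layer: quadratic in sequence length
    ssm = n * d_state * d  # SSM layer: linear in sequence length
    return attn, ssm

def hybrid_vs_pure(n: int, n_layers: int = 64, attn_frac: float = 0.125):
    """Ratio of pure-transformer cost to hybrid cost at context length n."""
    attn, ssm = layer_costs(n)
    pure = n_layers * attn
    hybrid = n_layers * (attn_frac * attn + (1 - attn_frac) * ssm)
    return pure / hybrid

for n in (8_192, 262_144):
    print(f"{n:>7} tokens: pure / hybrid cost ratio = {hybrid_vs_pure(n):.1f}x")
```

At long contexts the SSM layers' linear cost becomes negligible next to the remaining attention layers, so the ratio approaches `1 / attn_frac`; with one attention layer in eight, that ceiling is 8x, matching the order of the claimed saving.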

Three Model Sizes

The Jamba 2 family ships in three tiers.

The open-weight Mini release gives developers and researchers access to the hybrid architecture for experimentation and fine-tuning, following the trend set by DeepSeek R2 and other recent open-weight releases.

Why It Matters

Jamba 2 is the strongest evidence yet that pure transformer architectures may not be the final answer. The hybrid SSM-transformer approach addresses the two biggest pain points in LLM deployment — inference cost and long-context performance — without requiring the kind of hardware breakthroughs that GPU manufacturers are racing to deliver. If these efficiency gains hold at scale, other labs will face pressure to adopt similar hybrid designs or explain why they are paying five times more for equivalent results.

