NVIDIA has officially launched the Rubin platform — its most ambitious chip architecture to date and the successor to the record-breaking Blackwell generation. The platform comprises six co-designed chips built from the ground up to work as a single AI supercomputer.
Six Chips, One System
Unlike previous generations where the GPU did most of the heavy lifting, Rubin is a tightly integrated six-chip platform:
- Vera CPU — NVIDIA's custom ARM-based processor
- Rubin GPU — The next-generation AI accelerator
- NVLink 6 Switch — High-speed chip-to-chip interconnect
- ConnectX-9 SuperNIC — Network interface for data center fabric
- BlueField-4 DPU — Data processing unit for infrastructure offload
- Spectrum-6 Ethernet Switch — Data center networking
The key idea is extreme codesign: all six chips were developed together to eliminate bottlenecks between compute, memory, and networking. The result is 3.6 TB/s of NVLink bandwidth per GPU and 260 TB/s of total connectivity across a Vera Rubin NVL72 rack.
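As a rough sanity check, the two bandwidth figures can be related with simple arithmetic. The sketch below assumes the total refers to a rack-scale system with 72 GPUs; that GPU count is an assumption for illustration, and the names `PER_GPU_TBPS`, `GPUS_PER_RACK`, and `aggregate_bandwidth_tbps` are hypothetical.

```python
# Per-GPU bandwidth quoted in the article, in TB/s.
PER_GPU_TBPS = 3.6

# Assumption: the total is quoted for a 72-GPU rack-scale system.
GPUS_PER_RACK = 72


def aggregate_bandwidth_tbps(per_gpu_tbps: float, gpu_count: int) -> float:
    """Return the summed per-GPU bandwidth across all GPUs, in TB/s."""
    return per_gpu_tbps * gpu_count


total = aggregate_bandwidth_tbps(PER_GPU_TBPS, GPUS_PER_RACK)
print(f"{total:.1f} TB/s")  # prints 259.2 TB/s
```

Under that assumption the sum comes to about 259 TB/s, which is consistent with the rounded 260 TB/s figure.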
The Numbers That Matter
Compared to Blackwell, Rubin delivers:
- Up to 10x reduction in inference cost per token
- Up to 4x fewer GPUs needed to train mixture-of-experts (MoE) models
- 2x raw performance increase over Blackwell
For enterprises running large language models in production, the 10x inference cost reduction is the headline number. A workload that costs $100,000/month on Blackwell could run for about $10,000/month on Rubin, or the same budget could buy 10x more throughput.
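The cost arithmetic can be sketched in a few lines. This is a simplified model that assumes the full 10x per-token reduction applies uniformly to the whole workload, which real deployments may not achieve; the function names are hypothetical and used only for illustration.

```python
def monthly_cost_after_reduction(current_cost: float, reduction_factor: float) -> float:
    """Cost of the same workload once per-token cost drops by reduction_factor."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return current_cost / reduction_factor


def throughput_at_same_budget(current_tokens: float, reduction_factor: float) -> float:
    """Tokens served for the same spend once per-token cost drops by reduction_factor."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return current_tokens * reduction_factor


# Same workload, lower bill: $100,000/month at a 10x reduction.
print(monthly_cost_after_reduction(100_000, 10))  # prints 10000.0

# Same bill, more work: 1 billion tokens/month becomes 10 billion.
print(throughput_at_same_budget(1_000_000_000, 10))  # prints 10000000000
```

The two functions are the two readings of the same claim: hold the workload fixed and the bill shrinks, or hold the bill fixed and the served volume grows.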
Who Gets It First
Rubin is already in full production, with partner products shipping in the second half of 2026. AWS, Google Cloud, Microsoft Azure, and Oracle Cloud will be among the first to offer Rubin-based instances. Cloud partners CoreWeave, Lambda, Nebius, and Nscale are also in the first wave.
Built for Agentic AI
NVIDIA is positioning Rubin specifically for the agentic AI workloads that are defining 2026: autonomous reasoning systems that chain multiple model calls together, maintain long contexts, and interact with external tools. These workloads are inference-heavy and latency-sensitive, which is exactly what Rubin's architecture is optimized for.
GTC Preview
With NVIDIA's GTC conference set for March 16, CEO Jensen Huang has hinted at "several new chips the world has never seen before." Whether that means Rubin Ultra variants or something entirely new, the AI compute race shows no signs of slowing down.
Rubin doesn't just raise the performance ceiling — it fundamentally changes the economics of running AI at scale.



