NVIDIA CEO Jensen Huang used his two-hour GTC 2026 keynote on March 16 in San Jose to lay out the company's most ambitious roadmap yet — projecting $1 trillion in combined purchase orders for its Blackwell and Vera Rubin chip architectures through 2027. That figure doubles a previous estimate of $500 billion and underscores how aggressively hyperscalers and enterprises are investing in AI infrastructure.
The Inference Inflection
A central theme of the keynote was what Huang called the shift from training to inference as the primary driver of AI compute demand. With frontier models now deployed at massive scale, the cost and speed of generating tokens, rather than training them, are becoming the industry's bottleneck.
To address this, Huang formally introduced the Groq 3 Language Processing Unit (LPU), the first chip to emerge from NVIDIA's $20 billion acquisition of inference startup Groq. Each Groq 3 LPU contains approximately 500 MB of stacked SRAM, and a full Groq LPX rack holds 256 LPUs with roughly 128 GB of aggregate on-chip memory and 640 TB/s of scale-up bandwidth. According to NVIDIA, the Groq LPX rack can improve tokens-per-watt performance by a factor of 35 when paired with Rubin GPUs.
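The rack-level figures quoted above are consistent with the per-chip numbers. A back-of-envelope check, using only the article's figures and assuming decimal units (1 GB = 1000 MB):

```python
# Sanity check of the Groq LPX rack figures quoted in the article.
# All inputs come from the article; decimal units (1 GB = 1000 MB) are assumed.

LPUS_PER_RACK = 256
SRAM_PER_LPU_MB = 500           # ~500 MB of stacked SRAM per Groq 3 LPU
RACK_BANDWIDTH_TB_S = 640       # aggregate scale-up bandwidth per rack

# 256 LPUs x 500 MB = 128,000 MB, i.e. roughly 128 GB of on-chip memory
aggregate_sram_gb = LPUS_PER_RACK * SRAM_PER_LPU_MB / 1000

# Dividing rack bandwidth evenly implies 2.5 TB/s per LPU (an inference,
# not a figure NVIDIA quoted)
per_lpu_bandwidth_tb_s = RACK_BANDWIDTH_TB_S / LPUS_PER_RACK

print(f"Aggregate on-chip memory: {aggregate_sram_gb:.0f} GB")       # 128 GB
print(f"Implied per-LPU bandwidth: {per_lpu_bandwidth_tb_s} TB/s")   # 2.5 TB/s
```

The aggregate memory matches the stated ~128 GB exactly; the per-LPU bandwidth split is an even-division estimate, not an NVIDIA figure.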
Vera Rubin in Full
The Vera Rubin platform — named after the astronomer whose galaxy rotation measurements provided key evidence for dark matter — now comprises seven chips across five rack-scale systems. The flagship Vera Rubin NVL72 integrates 72 Rubin GPUs and 36 Vera CPUs connected through a massive NVLink copper spine, effectively functioning as a single GPU. NVIDIA says the system delivers 10x more performance per watt than its predecessor, Grace Blackwell.
Kyber: The Next Leap
Perhaps the most forward-looking reveal was Kyber, NVIDIA's next-generation rack architecture. Kyber rotates compute trays 90 degrees to a vertical orientation, packing 144 GPUs into a single rack for significantly higher density and lower latency. At full scale, the NVL576 configuration — with 576 GPUs across 144 packages — is expected to deliver 14 times the performance of the current GB300 NVL72 for both training and inference workloads.
Kyber will first appear in Vera Rubin Ultra systems, which NVIDIA expects to ship in 2027.
What It Means
The $1 trillion projection signals that NVIDIA sees no slowdown in AI infrastructure spending. With hyperscalers like AWS and Microsoft already committed to deploying Vera Rubin systems at scale, the company is positioning itself at the center of an inference-driven economy where token generation — not model training — defines the next wave of AI value creation.