Industry

Nvidia's Vera Rubin Rack Hits $7.8M as Memory Costs Surge 435%

Michael Ouroumis · 2 min read

A single next-generation Nvidia Vera Rubin VR200 NVL72 rack will cost hyperscalers around $7.8 million, with memory alone accounting for roughly $2 million, a 435% jump over the prior generation, according to Morgan Stanley Research estimates that circulated this week. The rack price is nearly double the ~$4 million of a GB300 NVL72, and it reframes the AI buildout as being as much a memory-procurement problem as a GPU one.

The memory bill is the story

Morgan Stanley's bill-of-materials breakdown shows memory rising from roughly $374,000 in the Grace Blackwell generation to over $2 million in Vera Rubin — about a 435% increase, and now ~25% of the rack's total cost. (Tom's Hardware framed the jump as 485% in its headline, but the underlying figure cited is 435%.)
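
The percentage and the BOM share quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the rounded figures reported in this article; the constant and function names are illustrative, and because the inputs are approximate ("roughly", "over"), the outputs are approximate too.

```python
# Rounded figures as reported in the article (Morgan Stanley Research estimates).
MEMORY_COST_PRIOR_USD = 374_000    # Grace Blackwell generation, "roughly"
MEMORY_COST_RUBIN_USD = 2_000_000  # Vera Rubin, "over $2 million" (lower bound used)
RACK_COST_RUBIN_USD = 7_800_000    # VR200 NVL72 rack, "around"


def percent_increase(old: float, new: float) -> float:
    """Return the increase from old to new as a percentage of old."""
    return (new - old) / old * 100.0


def share_of_total(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    return part / total * 100.0


memory_increase_pct = percent_increase(MEMORY_COST_PRIOR_USD, MEMORY_COST_RUBIN_USD)
memory_share_pct = share_of_total(MEMORY_COST_RUBIN_USD, RACK_COST_RUBIN_USD)

print(f"Memory cost increase: {memory_increase_pct:.0f}%")   # about 435%
print(f"Memory share of rack cost: {memory_share_pct:.1f}%")  # about 25.6%
```

With these inputs the increase comes out at about 435% and the share at just over 25%, which matches the figures cited in the text.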

Two structural shifts drive it. Each VR200 NVL72 carries 54TB of LPDDR5X, roughly triple the 17TB in a GB200 NVL72. And every rack now ships with $1 million or more of 3D NAND storage — a line item that was effectively zero in GB200. Layer HBM4 on the Rubin GPUs on top, and memory becomes the fastest-growing slice of the BOM at a moment when DRAM and HBM are already supply-constrained.

Where the rest goes

The Rubin GPUs remain the single largest line item at roughly $4 million aggregate per rack — Nvidia is expected to charge about $55,000 per Rubin GPU and $5,000 per Vera CPU when selling in volume inside VR200 chassis. Higher-end estimates from Tom's Hardware put fully configured racks as high as $8.8 million, with thin margins for the server OEMs assembling them. Vera Rubin is in production, with first shipments expected in Q3 2026 and volume ramp in Q4.
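
As a rough sanity check on the "roughly $4 million" GPU figure, the per-unit prices above can be multiplied out. This sketch assumes 72 Rubin GPUs per rack (inferred from the NVL72 name) and 36 Vera CPUs per rack; neither count is stated in the article, so treat both as assumptions rather than reported numbers.

```python
# Per-unit prices as reported in the article (volume pricing inside VR200 chassis).
RUBIN_GPU_PRICE_USD = 55_000
VERA_CPU_PRICE_USD = 5_000
RACK_COST_USD = 7_800_000

# Assumptions, NOT stated in the article: unit counts per rack.
GPUS_PER_RACK = 72  # inferred from the "NVL72" name
CPUS_PER_RACK = 36  # assumed: one Vera CPU per two Rubin GPUs

gpu_total_usd = GPUS_PER_RACK * RUBIN_GPU_PRICE_USD
cpu_total_usd = CPUS_PER_RACK * VERA_CPU_PRICE_USD
gpu_share_pct = gpu_total_usd / RACK_COST_USD * 100.0

print(f"GPU aggregate: ${gpu_total_usd:,}")            # $3,960,000
print(f"CPU aggregate: ${cpu_total_usd:,}")            # $180,000
print(f"GPU share of rack cost: {gpu_share_pct:.1f}%")  # about 50.8%
```

Under those assumed counts, the GPUs alone come to just under $4 million, or about half of a $7.8 million rack.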

What changes for buyers

For anyone modeling cost-per-token or planning capacity against a fixed capex envelope, the takeaway is blunt: memory inflation, not compute silicon, is now the swing variable. A 435% memory increase inside one generation means the DRAM/HBM supercycle, already visible in sold-out HBM4 lines at SK Hynix, Samsung, and Micron, is being passed straight through to rack prices. Hyperscalers spending $600 billion-plus on 2026 capex will get fewer racks per dollar than roadmaps assumed a year ago.
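
To make "fewer racks per dollar" concrete, here is a minimal sketch that compares whole-rack counts under a fixed budget at the two rack prices quoted in this article. The $1 billion budget is a hypothetical chosen for illustration, and the comparison deliberately ignores per-rack performance differences between generations, so it says nothing about cost-per-token.

```python
# Rack prices as reported in the article (rounded estimates).
GB300_NVL72_RACK_USD = 4_000_000
VR200_NVL72_RACK_USD = 7_800_000

# Hypothetical capex envelope, chosen only for illustration.
BUDGET_USD = 1_000_000_000


def racks_for_budget(budget_usd: int, rack_cost_usd: int) -> int:
    """Return how many whole racks fit inside the budget."""
    return budget_usd // rack_cost_usd


gb300_racks = racks_for_budget(BUDGET_USD, GB300_NVL72_RACK_USD)
vr200_racks = racks_for_budget(BUDGET_USD, VR200_NVL72_RACK_USD)
fewer_racks_pct = (gb300_racks - vr200_racks) / gb300_racks * 100.0

print(f"GB300 NVL72 racks: {gb300_racks}")         # 250
print(f"VR200 NVL72 racks: {vr200_racks}")         # 128
print(f"Reduction in rack count: {fewer_racks_pct:.1f}%")  # 48.8%
```

Swapping in a different budget or updated rack prices only requires changing the constants at the top.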

The practical implication for infra teams: lock memory-sensitive procurement early, and treat per-rack TCO — not headline FLOPS — as the planning unit. With memory at a quarter of the BOM and climbing, the marginal cost of inference capacity is increasingly set by the DRAM market, not by Nvidia's compute roadmap.
