A single next-generation Nvidia Vera Rubin VR200 NVL72 rack will cost hyperscalers around $7.8 million, with memory alone accounting for roughly $2 million, a 435% jump in memory cost over the prior generation, according to Morgan Stanley Research estimates that circulated this week. The rack price is nearly double the ~$4 million of a GB300 NVL72, and it reframes the AI buildout as being as much a memory-procurement problem as a GPU one.
The memory bill is the story
Morgan Stanley's bill-of-materials breakdown shows memory rising from roughly $374,000 in the Grace Blackwell generation to over $2 million in Vera Rubin — about a 435% increase, and now ~25% of the rack's total cost. (Tom's Hardware framed the jump as 485% in its headline, but the underlying figure cited is 435%.)
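The two headline ratios can be checked from the quoted figures. The sketch below is illustrative only: it treats "over $2 million" as exactly $2 million and uses the $7.8 million rack estimate, so the results are rounded approximations rather than Morgan Stanley's own calculation. The function names are made up for this example.

```python
# Rounded estimates quoted in the article (USD per rack).
PRIOR_MEMORY_USD = 374_000    # Grace Blackwell generation memory cost
RUBIN_MEMORY_USD = 2_000_000  # "over $2 million", assumed here to be exactly $2M
RUBIN_RACK_USD = 7_800_000    # VR200 NVL72 rack estimate


def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


def pct_share(part: float, whole: float) -> float:
    """Part as a percentage of whole."""
    return part / whole * 100.0


memory_increase = pct_increase(PRIOR_MEMORY_USD, RUBIN_MEMORY_USD)
memory_share = pct_share(RUBIN_MEMORY_USD, RUBIN_RACK_USD)

print(f"Memory cost increase: {memory_increase:.1f}%")  # 434.8%
print(f"Memory share of rack: {memory_share:.1f}%")     # 25.6%
```

The result rounds to the 435% cited in the underlying estimate and to roughly a quarter of the rack price.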
Two structural shifts drive it. Each VR200 NVL72 carries 54TB of LPDDR5X, roughly triple the 17TB in a GB200 NVL72. And every rack now ships with $1 million or more of 3D NAND storage, a line item that was effectively zero in GB200. Add the HBM4 on the Rubin GPUs, and memory becomes the fastest-growing slice of the BOM at a moment when DRAM and HBM are already supply-constrained.
Where the rest goes
The Rubin GPUs remain the single largest line item at roughly $4 million in aggregate per rack: Nvidia is expected to charge about $55,000 per Rubin GPU and $5,000 per Vera CPU when selling in volume inside VR200 chassis. Higher-end estimates from Tom's Hardware put fully configured racks as high as $8.8 million, with thin margins for the server OEMs assembling them. Vera Rubin is in production, with first shipments expected in Q3 2026 and a volume ramp in Q4.
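As a rough check on the GPU line item, the sketch below multiplies the quoted per-GPU price by a GPU count. The count of 72 is an assumption inferred from the NVL72 name, not a figure stated in the estimates above, and the share is measured against the $7.8 million rack estimate.

```python
GPUS_PER_RACK = 72          # assumption: inferred from the "NVL72" name
GPU_PRICE_USD = 55_000      # expected per-GPU price quoted above
RACK_PRICE_USD = 7_800_000  # VR200 NVL72 rack estimate

gpu_total_usd = GPUS_PER_RACK * GPU_PRICE_USD
gpu_share_pct = gpu_total_usd / RACK_PRICE_USD * 100.0

print(f"GPU aggregate per rack: ${gpu_total_usd:,}")  # $3,960,000
print(f"GPU share of rack: {gpu_share_pct:.1f}%")     # 50.8%
```

Under that assumption the aggregate lands just under the roughly $4 million cited, or about half of the rack price.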
What changes for buyers
For anyone modeling cost-per-token or planning capacity against a fixed capex envelope, the takeaway is blunt: memory inflation, not compute silicon, is now the swing variable. A 435% memory increase inside one generation means the DRAM/HBM supercycle, already visible in sold-out HBM4 lines at SK Hynix, Samsung, and Micron, is being passed straight through to rack prices. Hyperscalers spending $600 billion-plus on 2026 capex will get fewer racks per dollar than roadmaps assumed a year ago.
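To make the racks-per-dollar point concrete, here is a minimal sketch that counts whole racks under a fixed budget at the two quoted rack prices. The $1 billion envelope and the helper name are hypothetical, chosen only for illustration. The comparison counts racks, not delivered performance, which differs between generations.

```python
def racks_for_budget(budget_usd: int, rack_price_usd: int) -> int:
    """Whole racks affordable within a fixed budget (hypothetical helper)."""
    return budget_usd // rack_price_usd


BUDGET_USD = 1_000_000_000    # hypothetical capex envelope, not from the estimates
GB300_RACK_USD = 4_000_000    # ~$4 million GB300 NVL72 figure quoted above
VR200_RACK_USD = 7_800_000    # ~$7.8 million VR200 NVL72 estimate

gb300_racks = racks_for_budget(BUDGET_USD, GB300_RACK_USD)
vr200_racks = racks_for_budget(BUDGET_USD, VR200_RACK_USD)

print(f"GB300 NVL72 racks: {gb300_racks}")  # 250
print(f"VR200 NVL72 racks: {vr200_racks}")  # 128
```

At these prices the same envelope buys about half as many racks, which is why per-rack cost, rather than rack count, becomes the natural planning unit.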
The practical implication for infra teams: lock memory-sensitive procurement early, and treat per-rack TCO — not headline FLOPS — as the planning unit. With memory at a quarter of the BOM and climbing, the marginal cost of inference capacity is increasingly set by the DRAM market, not by Nvidia's compute roadmap.
By Michael Ouroumis



