NVIDIA has begun shipping its Blackwell Ultra GPUs to major cloud providers, marking the start of the next cycle of AI infrastructure upgrades. AWS, Microsoft Azure, and Google Cloud Platform are among the first recipients, with instances expected to be available to customers within weeks.
The Numbers
Blackwell Ultra delivers substantial improvements over the previous Hopper generation:
- 4x inference throughput for large language models
- 2.5x training performance per watt
- 192GB HBM3e memory per GPU, up from 80GB on H100
- 1.8TB/s memory bandwidth
- NVLink 5 interconnect supporting up to 576 GPUs in a single domain
These specs could translate into substantial cost savings for companies running AI at scale. An inference workload that previously required a cluster of 100 H100s could, going by the throughput figures above, potentially run on roughly 25 Blackwell Ultra units.
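The cluster-sizing arithmetic can be sketched in a few lines. This is a back-of-the-envelope estimate based on the vendor's headline speedup figure, not a measured benchmark; the function name and numbers are illustrative.

```python
import math

def gpus_needed(baseline_gpus: int, speedup: float) -> int:
    """Minimum newer-generation GPUs required to match the aggregate
    throughput of a baseline cluster, given a per-GPU speedup factor."""
    return math.ceil(baseline_gpus / speedup)

# 100 H100s, assuming the quoted 4x inference throughput per GPU:
print(gpus_needed(100, 4.0))  # -> 25
```

Real sizing would also account for memory capacity per GPU, interconnect topology, and utilization, so treat this as a first-order estimate only.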
Cloud Provider Plans
AWS
Amazon is deploying Blackwell Ultra in new P6 instances, available initially in us-east-1 and eu-west-1. The instances will support up to 8 GPUs per node with 400Gbps networking.
Microsoft Azure
Azure is integrating the GPUs into its ND-series virtual machines, with tight integration into Azure AI Studio for model training and deployment.
Google Cloud
GCP is offering Blackwell Ultra through its A4 accelerator-optimized instances, with integration into Vertex AI for managed model serving.
Supply Constraints
Despite the shipments, supply remains tight. NVIDIA CEO Jensen Huang acknowledged on a recent earnings call that demand continues to outstrip supply, with lead times extending to several months for large orders. The company has ramped production at TSMC's facilities in Taiwan, but the AI infrastructure buildout shows no signs of slowing.
What This Means for AI Development
The performance improvements in Blackwell Ultra lower the cost floor for training and serving large models. Startups that previously couldn't afford to train competitive models may find the economics more favorable, potentially leading to more competition in the foundation model space.
For inference-heavy applications such as chatbots, code assistants, and real-time translation, the 4x throughput improvement means significantly lower per-query costs, which could accelerate the rollout of AI features in consumer products. NVIDIA's hardware dominance has been a key factor in the company becoming the first to reach a $5 trillion valuation; Meta's 1.3 million GPU deal illustrates the staggering demand.
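How a throughput gain flows into per-query economics can be shown with a toy calculation. The hourly rates and query rates below are hypothetical placeholders (cloud pricing for these instances has not been cited here); only the 4x throughput ratio comes from the article.

```python
def cost_per_query(hourly_rate: float, queries_per_hour: float) -> float:
    """Serving cost per query for a GPU instance at a given throughput."""
    return hourly_rate / queries_per_hour

# Hypothetical: same instance price, 4x the queries served per hour.
h100_cost = cost_per_query(hourly_rate=10.0, queries_per_hour=10_000)
ultra_cost = cost_per_query(hourly_rate=10.0, queries_per_hour=40_000)
print(h100_cost / ultra_cost)  # -> 4.0 (per-query cost falls 4x)
```

If the newer instances are priced above the older ones, the realized per-query saving shrinks proportionally, so the 4x figure is an upper bound at equal pricing.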