Industry

AWS and NVIDIA Announce Million-GPU Deployment in Expanded AI Infrastructure Partnership

Michael Ouroumis · 2 min read

Amazon Web Services and NVIDIA announced a significantly expanded partnership at GTC 2026, with AWS committing to deploy more than one million NVIDIA GPUs across its cloud regions — a scale of AI infrastructure investment that underscores just how aggressively hyperscalers are racing to meet enterprise AI demand.

A Million GPUs and Counting

The headline number is staggering: over one million NVIDIA GPUs deployed across AWS regions starting in 2026. The commitment makes AWS one of the largest single customers for NVIDIA's AI accelerators and positions Amazon's cloud division to capture a growing share of enterprise AI workloads that require massive compute.

The deployment spans multiple NVIDIA GPU generations. Notably, AWS will be the first major cloud provider to offer NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs on Amazon EC2, giving developers access to the latest Blackwell architecture for inference and training workloads.

Nemotron Models Come to Bedrock

Beyond hardware, the partnership brings NVIDIA's open AI models directly into Amazon's managed AI platform. NVIDIA Nemotron 3 Super — a hybrid mixture-of-experts model with 120 billion total parameters but only 12 billion active per forward pass — is coming to Amazon Bedrock. The model is designed for complex multi-agent workloads including software development, cybersecurity triage, and extended reasoning tasks.

The smaller Nemotron 3 Nano is already available on Amazon Bedrock, and is also available separately within Salesforce Agentforce on NVIDIA's own infrastructure. Critically, developers will soon be able to fine-tune Nemotron models directly on Bedrock using reinforcement fine-tuning, lowering the barrier to customizing open models for enterprise-specific use cases.
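For developers, invoking a Bedrock-hosted model goes through the standard `bedrock-runtime` Converse API. A minimal sketch, assuming a hypothetical model identifier (the real ID will appear in the Bedrock model catalog once Nemotron is listed):

```python
# Hedged sketch of calling a Nemotron model through Amazon Bedrock's
# Converse API. The model ID below is a placeholder, not a confirmed
# identifier -- check the Bedrock catalog for the actual value.
MODEL_ID = "nvidia.nemotron-3-nano-v1"  # hypothetical ID

def build_converse_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble the keyword arguments for bedrock-runtime's converse()."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.2},
    }

def invoke(prompt: str) -> str:
    """Send the request to Bedrock (requires AWS credentials and model access)."""
    import boto3  # deferred import so the request builder stays dependency-free
    client = boto3.client("bedrock-runtime")
    response = client.converse(**build_converse_request(prompt))
    return response["output"]["message"]["content"][0]["text"]
```

Because Bedrock models share this interface, swapping from Nano to Super for heavier agentic workloads would only require changing the model ID.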

Infrastructure Innovations

The collaboration extends into infrastructure optimization. AWS is integrating NVIDIA NIXL for interconnect acceleration to support disaggregated large language model inference on its Elastic Fabric Adapter network. The companies also highlighted 3x faster Apache Spark performance using Amazon EMR on Elastic Kubernetes Service with EC2 G7e instances — a meaningful improvement for data engineering pipelines that feed AI workloads.
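The announcement does not name the software stack behind the Spark speedup, but GPU-accelerated Spark typically runs through NVIDIA's RAPIDS Accelerator plugin. As an illustrative (and assumed) configuration, the relevant settings look like this; the jar name and script are placeholders:

```
# Hedged sketch: enabling the RAPIDS Accelerator for Apache Spark.
# That this is the stack behind the G7e numbers is an assumption,
# not something the announcement confirms.
spark-submit \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=0.25 \
  --jars rapids-4-spark.jar \
  my_pipeline.py
```

With the plugin enabled, supported SQL and DataFrame operations are transparently offloaded to the GPU, which is where speedups of the magnitude cited tend to come from.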

The Bigger Picture

This partnership fits within NVIDIA CEO Jensen Huang's broader GTC narrative about the "inference inflection" — the idea that AI has crossed the threshold where it can do productive work at scale, and the bottleneck is now infrastructure supply rather than model capability.

For enterprises evaluating where to run AI workloads, the AWS-NVIDIA expansion means more GPU availability, tighter model integration, and a clearer path from prototyping on managed services to production-scale deployment. As AI spending accelerates — industry analysts project total global AI spending to reach $2.5 trillion by the end of 2026, with AI infrastructure accounting for approximately $1.37 trillion of that total — partnerships of this scale will determine which cloud platforms capture the next wave of enterprise AI adoption.

