Industry

AWS and NVIDIA Announce Million-GPU Deployment in Expanded AI Infrastructure Partnership

Michael Ouroumis · 2 min read

Amazon Web Services and NVIDIA announced a significantly expanded partnership at GTC 2026, with AWS committing to deploy more than one million NVIDIA GPUs across its cloud regions — a scale of AI infrastructure investment that underscores just how aggressively hyperscalers are racing to meet enterprise AI demand.

A Million GPUs and Counting

The headline number is staggering: over one million NVIDIA GPUs deployed across AWS regions starting in 2026. The commitment makes AWS one of the largest single customers for NVIDIA's AI accelerators and positions Amazon's cloud division to capture a growing share of enterprise AI workloads that require massive compute.

The deployment spans multiple NVIDIA GPU generations. Notably, AWS will be the first major cloud provider to offer NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs on Amazon EC2, giving developers access to the latest Blackwell architecture for inference and training workloads.

Nemotron Models Come to Bedrock

Beyond hardware, the partnership brings NVIDIA's open AI models directly into Amazon's managed AI platform. NVIDIA Nemotron 3 Super — a hybrid mixture-of-experts model with 120 billion total parameters but only 12 billion active per forward pass — is coming to Amazon Bedrock. The model is designed for complex multi-agent workloads including software development, cybersecurity triage, and extended reasoning tasks.
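The "120 billion total, 12 billion active" figure reflects how mixture-of-experts models work: a router selects a small subset of expert sub-networks per token, so only a fraction of the weights run on any forward pass. Here is a minimal toy sketch of that idea — the expert count, routing, and parameter sizes below are hypothetical illustrations chosen to mirror the 10:1 ratio in the article, not NVIDIA's actual architecture.

```python
import random

# Hypothetical configuration mirroring a 120B-total / 12B-active ratio.
n_experts = 10           # number of expert sub-networks (illustrative)
top_k = 1                # experts activated per token (illustrative)
params_per_expert = 12   # parameters per expert, in billions (illustrative)

random.seed(0)

# A router scores every expert for the current token...
router_logits = [random.gauss(0, 1) for _ in range(n_experts)]

# ...but only the top-k highest-scoring experts actually execute.
chosen = sorted(range(n_experts), key=lambda i: router_logits[i])[-top_k:]

total_params = n_experts * params_per_expert   # 120 (billions)
active_params = len(chosen) * params_per_expert  # 12 (billions)

print(f"total: {total_params}B, active per forward pass: {active_params}B")
```

The practical upshot is that inference cost scales with the active parameters (here 12B), while total capacity scales with all experts combined (120B) — which is why such models can be competitive to serve despite their large headline size.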

The smaller Nemotron 3 Nano is already available on Amazon Bedrock, and is also available separately within Salesforce Agentforce on NVIDIA's own infrastructure. Critically, developers will soon be able to fine-tune Nemotron models directly on Bedrock using reinforcement fine-tuning, lowering the barrier to customizing open models for enterprise-specific use cases.

Infrastructure Innovations

The collaboration extends into infrastructure optimization. AWS is integrating NVIDIA NIXL for interconnect acceleration to support disaggregated large language model inference on its Elastic Fabric Adapter network. The companies also highlighted 3x faster Apache Spark performance using Amazon EMR on Elastic Kubernetes Service with EC2 G7e instances — a meaningful improvement for data engineering pipelines that feed AI workloads.

The Bigger Picture

This partnership fits within NVIDIA CEO Jensen Huang's broader GTC narrative about the "inference inflection" — the idea that AI has crossed the threshold where it can do productive work at scale, and the bottleneck is now infrastructure supply rather than model capability.

For enterprises evaluating where to run AI workloads, the AWS-NVIDIA expansion means more GPU availability, tighter model integration, and a clearer path from prototyping on managed services to production-scale deployment. As AI spending accelerates — industry analysts project total global AI spending to reach $2.5 trillion by the end of 2026, with AI infrastructure accounting for approximately $1.37 trillion of that total — partnerships of this scale will determine which cloud platforms capture the next wave of enterprise AI adoption.

