Industry

Deloitte Warns AI's Inference Era Will Drive Even Higher Computing Demand

Michael Ouroumis

The artificial intelligence industry's shift from model training to real-world deployment will not ease pressure on computing infrastructure — it will intensify it, according to a new report from Deloitte published on March 18.

The consulting firm's 2026 Technology, Media & Telecommunications Predictions report challenges a widely held assumption that the move toward AI inference would reduce the industry's appetite for computing power and data center capacity.

The Inference Paradox

Deloitte projects that inference — the process of running trained AI models to generate outputs — will account for roughly two-thirds of all AI computing by 2026, up from about one-third in 2023. But rather than easing infrastructure demands, this shift is creating what the firm describes as an inference paradox.

Two emerging techniques are driving the surge. Post-training scaling methods, which refine models after initial training, can consume approximately 30 times the computing resources needed to train the original model. Test-time scaling, where models perform additional computation during each query to improve response quality, can require more than 100 times the computing power of a basic inference task.

The Numbers

The financial implications are staggering. Deloitte estimates global spending on AI data centers will reach approximately $400 billion in 2026, with that figure potentially climbing to $1 trillion annually by 2028.

High-performance AI chips, which can cost more than $30,000 each, are expected to account for around $200 billion in spending this year alone, with the overall AI chip market projected to cross $400 billion by 2028.

The market for inference-optimized chips — a category that includes products from companies like Groq and custom silicon from cloud providers — will grow to over $50 billion in 2026. However, Deloitte stresses this will supplement rather than replace demand for high-end GPUs.

Where Inference Actually Happens

Contrary to expectations that AI inference would migrate to edge devices and consumer hardware, most inference workloads will continue running in data centers and on-premises enterprise servers. The power, memory, and latency requirements of advanced AI models make edge deployment impractical for the majority of enterprise use cases.

"The world likely needs all the data centres and enterprise on-premises AI factories that are currently being planned and all the electricity that these facilities will need," the report states.

What It Means

For investors and infrastructure planners, the message is clear: the AI computing buildout is far from peaking. Despite efficiency improvements in chip design and model architecture, demand for AI computing is growing four to five times each year and is expected to maintain that pace through 2030, with significant implications for global energy consumption and capital allocation.

