Industry

Nvidia Is Building a Secret Inference Chip With Groq Tech — And OpenAI Is the First Customer

Michael Ouroumis · 2 min read

Nvidia has dominated AI training for years. Now it wants to own inference too — and it's using $20 billion worth of acquired technology to do it.

The Secret Chip

According to a report from SiliconANGLE, Nvidia is preparing to unveil a new inference-focused processor at its annual GTC developer conference in San Jose later this month. The chip integrates the Language Processing Unit (LPU) architecture that Nvidia licensed from Groq Inc. in December for $20 billion, a deal that also brought Groq's founding CEO Jonathan Ross and President Sunny Madra to Nvidia.

Groq's LPU architecture takes a fundamentally different approach to inference. Instead of repurposing GPUs designed for training, LPUs are built from the ground up to decode language model outputs with dramatically lower latency and energy consumption.

OpenAI Signs On First

The biggest signal of the chip's potential: OpenAI has already committed as the lead customer. The deal includes a massive purchase of dedicated inference capacity, backed by a $30 billion investment from Nvidia into OpenAI's infrastructure. That's not a research partnership — it's a production-scale commitment.

For OpenAI, which runs ChatGPT for over 900 million users, inference costs dwarf training costs. A chip purpose-built for fast, efficient model serving could meaningfully change the economics of running frontier models at consumer scale.
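The scale argument above can be made concrete with a back-of-envelope calculation. This is a minimal sketch with hypothetical figures (queries per user, cost per query, and training cost are all assumptions for illustration, not reported numbers; only the 900 million user figure comes from the article):

```python
# Back-of-envelope: why inference spend dwarfs training spend at consumer scale.
# All cost figures below are hypothetical placeholders, not reported numbers.

def annual_inference_cost(users, queries_per_user_per_day, cost_per_query):
    """Yearly serving cost for a chat service, in dollars."""
    return users * queries_per_user_per_day * cost_per_query * 365

users = 900_000_000          # from the article
queries_per_day = 5          # assumed average per user
cost_per_query = 0.002       # assumed dollars per query on GPU inference

serving = annual_inference_cost(users, queries_per_day, cost_per_query)
training = 1_000_000_000     # assumed one-time frontier training run ($1B)

print(f"Annual inference: ${serving / 1e9:.1f}B vs training: ${training / 1e9:.1f}B")
```

Even with these rough assumptions, yearly serving cost lands in the billions and exceeds the one-time training run, so a chip that cuts cost per query by even a modest fraction moves the economics meaningfully.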

Why Inference Matters Now

The AI industry has reached an inflection point. Training the biggest models still requires enormous GPU clusters, but the real cost center has shifted. Every ChatGPT response, every Copilot suggestion, every Claude conversation is an inference workload. Companies are spending more on running models than building them.

Nvidia currently controls over 90% of the GPU market for AI training, but inference is more competitive. AMD, Intel, AWS custom silicon, and startups like Cerebras are all targeting the inference market. The Groq acquisition gives Nvidia a purpose-built architecture rather than just optimizing existing GPUs.

What to Watch at GTC

GTC 2026 runs later this month and is expected to be Nvidia's biggest product launch since the Blackwell architecture. Beyond the inference chip, CEO Jensen Huang is expected to detail the full Rubin platform roadmap and new software tools for agentic AI workloads.

The inference chip could reshape how AI companies budget their compute. If it delivers on the efficiency promises of Groq's LPU architecture, running frontier models could get a lot cheaper.

