Industry

Google Splits Eighth-Gen TPUs Into Training and Inference Chips for Agentic Era

Michael Ouroumis · 3 min read

Google used the opening keynote at Cloud Next '26 on April 22 to declare the arrival of what it called the 'agentic era' of AI — and to unveil the silicon it believes will power it. The company announced its eighth-generation Tensor Processing Units, splitting the line into two purpose-built chips: TPU 8t for training frontier models and TPU 8i for serving them at scale. The message to Nvidia was unsubtle: Google now intends to compete head-on at both ends of the AI workload.

Two chips instead of one

For seven generations, Google's TPU roadmap produced a single accelerator that tried to balance training and inference. With TPU 8, the company abandoned that compromise. The training-focused TPU 8t scales to 9,600 chips in a single superpod with two petabytes of shared high-bandwidth memory and 121 ExaFlops of compute, delivering close to three times the per-pod performance of the previous Ironwood generation. Google also claims 97% 'goodput' — the share of time the chips spend doing useful work rather than stalling on failures or communication overhead.
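As a rough sanity check on the figures above (this arithmetic is ours, not Google's), the claimed pod numbers imply roughly 12.6 petaflops per chip, and the 97% goodput figure shaves only a few exaflops off the pod's useful throughput:

```python
# Back-of-envelope check of the stated TPU 8t superpod figures:
# 121 exaflops across 9,600 chips at 97% goodput (claims from the keynote).

POD_FLOPS = 121e18   # 121 exaflops per superpod (claimed)
POD_CHIPS = 9_600    # chips per superpod (claimed)
GOODPUT = 0.97       # fraction of time spent on useful work (claimed)

per_chip_pflops = POD_FLOPS / POD_CHIPS / 1e15    # peak compute per chip
effective_exaflops = POD_FLOPS * GOODPUT / 1e18   # goodput-adjusted pod compute

print(f"per-chip peak: {per_chip_pflops:.1f} PFLOPs")         # ~12.6 PFLOPs
print(f"useful pod compute: {effective_exaflops:.1f} EFLOPs") # ~117.4 EFLOPs
```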

The TPU 8i, tuned for inference, takes a different shape. Each chip carries 288 GB of high-bandwidth memory, 384 MB of on-chip SRAM (roughly triple the prior generation), and 19.2 Tb/s of interconnect bandwidth. Google positions it as offering about 80% better performance-per-dollar for inference workloads than Ironwood, and says the two chips together deliver up to 2x better performance-per-watt versus the previous generation.

Virgo Network and the fabric behind the chips

Alongside the accelerators, Google introduced a new data-center networking architecture it calls Virgo Network. According to reporting on the keynote, Virgo provides roughly a 4x increase in bandwidth per accelerator over the previous generation and can link up to 134,000 TPU 8t chips through a non-blocking fabric with up to 47 petabits per second of bisection bandwidth. That scale matters because agentic workloads — long-running reasoning chains, multi-tool pipelines, continuous background tasks — stress interconnect and memory bandwidth far more than classic chatbot traffic.
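To put the 47 Pb/s claim in per-chip terms (again, our own estimate, not a Google figure), assume the worst case where every chip on one side of the bisection sends across it simultaneously:

```python
# Rough per-chip share of Virgo's claimed bisection bandwidth, assuming
# all chips on one side of the cut send across it at once.
# The 47 Pb/s and 134,000-chip figures are claims from the keynote.

BISECTION_BPS = 47e15   # 47 petabits per second (claimed)
MAX_CHIPS = 134_000     # maximum TPU 8t chips in one fabric (claimed)

# Half the chips sit on each side of the bisection cut.
per_chip_gbps = BISECTION_BPS / (MAX_CHIPS / 2) / 1e9

print(f"~{per_chip_gbps:.0f} Gb/s per chip across the bisection")  # ~701 Gb/s
```

Even under that pessimistic assumption, each chip would still see several hundred gigabits per second of cross-fabric bandwidth, which is the kind of headroom long-running agentic pipelines would need.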

Implications: Nvidia, Anthropic, and the cost curve

The split architecture is Google's sharpest attempt yet to undercut Nvidia on the economics of inference, which is increasingly where AI dollars are actually spent. Google's cloud business has been riding a wave of TPU demand from Anthropic, which the company confirmed earlier this year would have access to up to one million TPU chips and more than a gigawatt of capacity in 2026. A cheaper, denser inference chip makes that commitment more defensible — and gives Google a clearer story to tell enterprise customers weighing Nvidia GPUs against custom silicon.

For developers, the takeaway is narrower but still meaningful: the chips that will run the next wave of autonomous agents, long-context assistants, and multi-step workflows are starting to look different from the ones that trained them. Google is betting that bifurcation — and the 'later in 2026' rollout window it set today — will be enough to keep pace as rivals push their own custom accelerators. Nvidia will not concede the inference market without a fight, but after today, it has a visibly sharper challenger.

