How does Gemini 3.2 Flash price against GPT-5.5?

$0.25 per million input tokens and $2.00 per million output tokens, which Google's comparison materials peg at roughly one-fifteenth to one-twentieth of GPT-5.5's per-token cost.

Where does 3.2 Flash sit on capability benchmarks?

Reportedly ~92% of GPT-5.5 on coding and reasoning, and beating Gemini 3.1 Pro on creative coding tasks (including single-pass complex SVG generation) per blind Arena evaluations from developers who tested the model after its silent May 5 appearance.

Which production surfaces is the model deployed into?

Search, Maps, YouTube, Docs, Gmail, and Chrome simultaneously, plus Vertex AI and Google AI Studio for API access. Latency lands mostly under 200 ms.

Google ships Gemini 3.2 Flash at I/O 2026, undercuts GPT-5.5 by 15-20x on inference cost

Google opened Google I/O 2026 at Shoreline Amphitheatre on Tuesday with Gemini 3.2 Flash as the headline model, priced at $0.25 per million input tokens and $2.00 per million output tokens. Developers had already extracted those pricing entries from Google AI Studio metadata before the keynote when the model surfaced quietly in the iOS Gemini app and AI Studio on May 5 with no press release. Google's positioning today: cost-per-inference at roughly one-fifteenth to one-twentieth of GPT-5.5, query latencies most often under 200 ms.

Deployment scope is the announcement

The price tag matters because of where 3.2 Flash is going: simultaneously into Search, Maps, YouTube, Docs, Gmail, and Chrome at billion-user scale. Sundar Pichai's pitch isn't a leaderboard win — it's that the model is cheap enough to serve under every Google surface without an obvious unit-economics blowup. Search-ad revenue is, in effect, subsidizing a frontier-tier Flash model down to a price point Anthropic and OpenAI cannot match without margin compression.

How it stacks against GPT-5.5

Google's comparison materials peg 3.2 Flash at roughly 92% of GPT-5.5 on coding and reasoning. That's the capability gap Google is willing to live with given the per-token delta. Pre-keynote, blind Arena evaluations from developers running the leaked model flagged a Flash-tier system beating Gemini 3.1 Pro on creative coding — including complex single-pass SVG generation that 3.1 Pro failed without retries.

For builders, the practical question is whether a 92%-of-frontier model at 1/15th the cost wins on total cost of ownership. For most production agent workflows — RAG pipelines, agent loops, batch transformations where output volume and retry budgets dominate the bill — the answer is yes.

What else shipped today

The I/O lineup also confirms:

Aluminium OS, the unified Android-plus-ChromeOS platform branded Googlebooks, shipping this autumn from Acer, ASUS, Dell, HP, and Lenovo with Gemini embedded at the OS level.
Android XR smart-glasses preview with hardware partners Samsung (codename "Jinju," Snapdragon AR1, ~50g), Warby Parker, XREAL, and Gentle Monster. Gemini 2.5 Pro powers real-time translation, navigation, and visual understanding via a paired Android phone.
Firebase AI Logic moves to general availability for the Gemini API.
Gemma 4 confirmed as the open-weights track — 31B dense and 26B MoE variants under Apache 2.0, with the 31B currently ranked #3 on the Arena AI text leaderboard among open models.

Implications for builders

The biggest signal is pricing pressure. Anthropic's Opus 4.7 tokenizer hike and OpenAI's recent API moves had reset the cost floor upward for frontier-tier work. Gemini 3.2 Flash resets it back down, and Google has the search-ad subsidy to keep it there. Expect downstream price moves from competitors over the next two earnings cycles — and re-architecting opportunities for any agent stack currently paying GPT-5.5 rates on tasks that don't actually need the top tier.

Google ships Gemini 3.2 Flash at I/O 2026, undercuts GPT-5.5 by 15-20x on inference cost

Deployment scope is the announcement

How it stacks against GPT-5.5

What else shipped today

Implications for builders

More in Models

Google's Gemini 3.5 Flash Beats the Pro Tier on Agent Benchmarks — and Ships a Managed Agents API

Thinking Machines Lab Debuts 'Interaction Models' — Mira Murati's First Step Into Frontier AI

OpenAI Ships GPT-Realtime-2 With Live Translation and Streaming Whisper, Pushing Voice Agents Toward GPT-5 Reasoning