Research

Google Says It Found the First AI-Built Zero-Day Exploit in the Wild

Michael Ouroumis · 2 min read

Google's Threat Intelligence Group (GTIG) said on May 11, 2026, that it had identified — for the first time — a zero-day exploit it believes was developed with the help of artificial intelligence, staged for use by a "prominent" cybercrime group in a live operation. Google said the planned attack was blocked before it could become what GTIG described as a potential "mass exploitation event."

What GTIG found

The exploit was a Python script designed to bypass two-factor authentication on a "popular open-source, web-based system administration tool" that Google declined to name. GTIG said it worked with the vendor to close the vulnerability and disrupt the campaign before the zero-day could be used widely. The underlying bug was a semantic logic flaw — the kind of reasoning error large language models are comparatively good at spotting — rather than the memory-corruption issues that traditional fuzzing typically surfaces.
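To make the distinction concrete, here is a hypothetical sketch of what a semantic logic flaw in a 2FA check can look like. This is not the actual vulnerability — Google has not disclosed it — and the function names are invented for illustration; the point is that the buggy code is memory-safe and syntactically clean, so only reasoning about the authentication logic reveals the hole:

```python
from typing import Optional

# Hypothetical illustration of a semantic logic flaw in a 2FA check.
# NOT the real vulnerability; Google has not disclosed details.

def verify_2fa(stored_code: str, submitted_code: Optional[str]) -> bool:
    """Buggy version of a one-time-code check."""
    # BUG: a missing or empty code is treated as "2FA not enrolled",
    # but the branch applies to every account -- so an attacker who
    # simply omits the field bypasses the second factor entirely.
    if not submitted_code:
        return True
    return submitted_code == stored_code

def verify_2fa_fixed(stored_code: str, submitted_code: Optional[str]) -> bool:
    """Fixed version: a missing code is a failed check."""
    return bool(submitted_code) and submitted_code == stored_code
```

No fuzzer crash flags a bug like this — every input path executes safely — which is why flaws of this shape tend to be found by reading and reasoning about the code, a task language models are comparatively suited to.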

How they could tell AI was involved

GTIG's analysts pointed to tell-tale fingerprints in the code: an abundance of "educational" docstrings, a hallucinated CVSS score, and a tidy, textbook-Pythonic structure characteristic of LLM training data. Google said it does not believe its own Gemini model was used; which model produced the exploit code is unclear.
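Fingerprints like these are, in principle, machine-detectable. The sketch below is an illustrative heuristic only — the markers and weights are assumptions, not GTIG's method — scoring a Python source file on docstring density and on inline CVSS references, which hand-written exploit code rarely carries:

```python
import ast

# Illustrative heuristic -- NOT GTIG's actual detection method.
# Scores source text for LLM-style tells: heavy "educational"
# docstrings plus inline CVSS severity metadata.

def llm_style_score(source: str) -> float:
    tree = ast.parse(source)
    defs = [n for n in ast.walk(tree)
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef,
                              ast.ClassDef))]
    if not defs:
        return 0.0
    documented = sum(1 for n in defs if ast.get_docstring(n))
    score = documented / len(defs)   # docstring density, 0..1
    if "CVSS" in source:             # hallucinated severity metadata
        score += 0.5
    return score

sample = '''
def bypass():
    """Educational example: demonstrates the auth flow. CVSS 9.8."""
    pass
'''
```

A real triage pipeline would combine many more signals, but even this crude score separates the textbook-tidy sample above from a bare, undocumented script.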

"For the first time, GTIG has identified a threat actor using a zero-day exploit that we believe was developed with AI," the group wrote. John Hultquist, GTIG's chief analyst, added: "There's a misconception that the AI vulnerability race is imminent. The reality is that it's already begun" — and warned the finding is likely "the tip of the iceberg," saying "for every zero-day we can trace back to AI, there are probably many more out there."

Part of a wider pattern

The report situates the case alongside other state-linked abuse of AI for vulnerability research that Google has tracked: a Chinese espionage cluster (UNC2814) trying to coax Gemini into analyzing TP-Link and other embedded-device firmware for bugs, and a North Korea-linked group (APT45) sending thousands of prompts to validate proof-of-concept exploits for known (n-day) flaws. Google also referenced activity tied to clusters it tracks as APT27, UNC5673 and UNC6201. In most of those cases the actors were chasing efficiency gains; the new case is different because AI appears to have helped uncover a previously unknown vulnerability.

Why it matters

Defenders have long argued AI cuts both ways — accelerating patching and code review as well as attacks. This is the clearest public evidence yet that the offensive side is no longer hypothetical. It will sharpen pressure on AI providers to harden guardrails around exploit generation, on software vendors to invest in AI-assisted code auditing of their own, and on policymakers weighing disclosure rules for AI-discovered bugs. It also carries a quieter lesson for defenders: GTIG caught this one through code analysis and coordinated disclosure before it did damage — a reminder that AI-generated artifacts can be noisy enough to detect.

