AI Offensive Cyber Capabilities Are Doubling Every 5.7 Months, Safety Researchers Find

Michael Ouroumis, 2 min read
A new study from AI safety research firm Lyptus Research has found that the offensive cybersecurity capabilities of AI systems have been doubling roughly every 5.7 months since 2024, a sharp acceleration from the 9.8-month doubling time observed since 2019.
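To put the two doubling times on a common scale, they can be converted into growth per year. The sketch below is plain arithmetic on the figures quoted above, not a calculation taken from the study; the function name is illustrative.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Factor by which a quantity grows in 12 months, given its doubling time in months."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (12.0 / doubling_months)


# Doubling times as quoted in the article.
for label, months in [("since 2019", 9.8), ("since 2024", 5.7)]:
    print(f"{label}: about {annual_growth_factor(months):.1f}x per year")
```

On these numbers, a 9.8-month doubling time works out to roughly 2.3x growth per year, while a 5.7-month doubling time works out to roughly 4.3x.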

The findings, published on April 5 and based on the METR time-horizon methodology, paint a sobering picture of how quickly AI systems are gaining the ability to autonomously discover and exploit software vulnerabilities.

From 30 Seconds to Three Hours

The study evaluated 291 offensive cybersecurity tasks, grounded in a new human expert study involving ten professional security practitioners. Researchers measured how long equivalent tasks would take skilled humans to complete, then tested how well AI models could solve them.

The results were striking. The time horizon, meaning the task length (measured in the time a skilled human needs) at which a model succeeds 50 percent of the time, grew from roughly 30 seconds with GPT-2 in 2019 to approximately three hours with today's frontier models, Claude Opus 4.6 and GPT-5.3 Codex, when given a two-million-token compute budget.
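The 50 percent threshold can be illustrated with a logistic curve, the kind of fit used in METR-style time-horizon analyses. This is a minimal sketch under assumptions: the model form (success probability as a logistic function of log2 human task time), the function names, and the example coefficients are all hypothetical and are not taken from the Lyptus study.

```python
import math


def success_probability(human_minutes: float, intercept: float, slope: float) -> float:
    """Modelled chance of success on a task that takes a human `human_minutes`."""
    z = intercept + slope * math.log2(human_minutes)
    return 1.0 / (1.0 + math.exp(-z))


def time_horizon_minutes(intercept: float, slope: float) -> float:
    """Task length at which the modelled success probability is exactly 0.5."""
    if slope >= 0:
        raise ValueError("slope must be negative: longer tasks should be harder")
    # p = 0.5 when intercept + slope * log2(t) = 0, so t = 2 ** (-intercept / slope)
    return 2.0 ** (-intercept / slope)


# Hypothetical coefficients, chosen only so that the horizon lands at 180 minutes.
slope = -0.8
intercept = -slope * math.log2(180)

horizon = time_horizon_minutes(intercept, slope)
print(f"50% time horizon: {horizon:.0f} minutes")
print(f"P(success) on a 30-minute task: {success_probability(30, intercept, slope):.2f}")
```

Under this model, tasks shorter than the horizon are solved more often than not and longer tasks less often, which is why a single number can summarize a model's position on the difficulty scale.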

When the budget was increased to ten million tokens, GPT-5.3 Codex pushed that ceiling even further, achieving a 10.5-hour time horizon compared to 3.1 hours at the lower budget. This suggests that the true capability frontier may be significantly higher than standard benchmarks indicate.
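One way to gauge the size of that jump is to express it in doublings, and then in months of trend-line progress. The arithmetic below uses only figures quoted in this article; applying the 5.7-month doubling time to a change in token budget is an assumption made here for illustration, not a claim from the study, and the function names are hypothetical.

```python
import math


def doublings_between(low_hours: float, high_hours: float) -> float:
    """Number of doublings needed to go from low_hours to high_hours."""
    return math.log2(high_hours / low_hours)


def equivalent_months(doublings: float, doubling_months: float) -> float:
    """Months of progress that many doublings represent at a given doubling time."""
    return doublings * doubling_months


# GPT-5.3 Codex horizons quoted in the article: 3.1 h at 2M tokens, 10.5 h at 10M tokens.
d = doublings_between(3.1, 10.5)
print(f"{10.5 / 3.1:.1f}x longer horizon = {d:.2f} doublings")
print(f"about {equivalent_months(d, 5.7):.0f} months at a 5.7-month doubling time")
```

Read this way, the fivefold increase in token budget lengthened the horizon by about 3.4x, or roughly 1.76 doublings, which is comparable to some ten months of progress on the recent trend line.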

Open-Source Models Trailing by Months

The study also found that open-source models consistently lag behind their closed-source counterparts by approximately 5.7 months — roughly one doubling period. While this gap provides a buffer, it also means capabilities that are exclusive to frontier labs today will likely be widely available within half a year.

Why It Matters

The shortening of the doubling time from 9.8 months to 5.7 months since 2024 suggests that recent advances in reasoning, agentic tool use, and code generation have disproportionately benefited offensive cyber applications. Tasks that once required hours of human expertise, such as reconnaissance, vulnerability discovery, and exploit crafting, are increasingly within reach of automated systems.

Researchers cautioned that their findings likely underestimate actual progress, since performance jumps significantly when models are given more computational resources. The gap between benchmark results and real-world capability may be wider than previously assumed.

Implications for Defense

The study underscores the urgency of investing in AI-powered defensive cybersecurity tools. As Ledger CTO Charles Guillemet separately warned this week, AI-generated code and increasingly sophisticated malware demand a shift toward formal verification — using mathematical proofs to validate code — rather than relying solely on traditional security audits.

With offensive AI capabilities on this trajectory, the cybersecurity community faces a narrowing window to build defenses that can keep pace. The full dataset is available on GitHub and Hugging Face for independent verification.

The research adds to a growing body of evidence that AI safety evaluations need to account for rapid capability gains, particularly in high-stakes domains where the gap between helpful automation and dangerous exploitation is razor-thin.
