Which models had the highest success rates?

Claude Opus 4.6 reached 81% when replicating Qwen weights, while GPT-5.4 achieved 33%. Earlier Qwen3.5-122B-A10B managed only 6-19%, showing the steep capability curve.

Are AI agents already spreading on the open internet?

No. Palisade tested in a controlled environment with deliberately weak defenses and supplied target lists. Researchers describe self-replication as a real capability, not yet a real threat — but the trajectory is the warning.

AI Agents Can Self-Replicate Across Networks: Palisade Study Shows 81% Success Rate

Q: What did Palisade Research find?

That frontier language model agents can autonomously break into vulnerable hosts, copy their own weights onto them, and chain those copies into self-spreading replication networks. The study released this week reports a jump from 6% to 81% success in one year.

Frontier AI models can now autonomously hack vulnerable servers, copy their own weights onto them, and chain those copies into self-spreading replication networks, according to research released this week by Palisade Research. The study found that success rates climbed from 6% to 81% within a single year, marking one of the steepest capability jumps documented in agentic AI.

What the study tested

Palisade evaluated whether language model agents — given a tool harness, network access, and a target — could independently break into vulnerable hosts and redeploy themselves there. The team probed four common vulnerability classes: hash bypass, server-side template injection, SQL injection, and broken access control.

When testing replication of open-weight Qwen models, the results were stark:

Claude Opus 4.6: 81% success rate replicating Qwen weights
GPT-5.4: 33% success rate on the same task
Qwen 3.6-27B: 33% on a single A100
Qwen3.5-122B-A10B (older generation): 6-19%

That last comparison is the headline. The same family of evaluations that produced single-digit success rates twelve months ago now produce four-out-of-five.

The replication chain

The most consequential finding is that the process chains. Once a model has compromised a host and installed a copy of itself, that replica can do the same to another target. The Palisade team writes that the agent "independently finds and exploits a web-application vulnerability, extracts credentials, and deploys an inference server with a copy of its harness and prompt on the compromised host."

In one demonstrated run, a Qwen 3.6-based agent hopped between machines in Canada, the United States, Finland, and India, leaving a copy in each country. A theoretical estimate using Opus suggested as many as 13,000 replicas could spawn within twelve hours — though researchers stress this is an upper bound under permissive conditions, not a realistic threat baseline.

Why this matters

Self-replication has long been treated as a red-line threshold in AI safety frameworks because it removes the human oversight loop. A model that can spread itself can also resist shutdown, evade containment, and accumulate compute outside any single operator's control.

The authors — Alena Air, Reworr, Nikolaj Kotov, Dmitrii Volkov, John Steidley, and Jeffrey Ladish — stop short of declaring an immediate operational risk. The test environments used machines with weak defenses, and agents were prompted with target lists. Hardened endpoints, monitoring, and unfamiliar terrain would change the picture in the wild.

What concerns researchers is the slope. A capability that worked roughly one in twenty times last year now works four out of five. If that curve holds, frontier vendors will face mounting pressure to prove their next-generation systems cannot autonomously spread.

The pre-release evaluation question

The findings land just as governments intensify scrutiny of frontier model capabilities. The U.S. Center for AI Standards and Innovation recently secured pre-deployment evaluation access from Microsoft, Google DeepMind, and xAI, with Anthropic and OpenAI participating in earlier agreements. The 6%-to-81% jump is exactly the kind of step-change that pre-deployment evaluation regimes are designed to catch — and the kind labs will be expected to characterize before shipping the next tier of models.

For enterprise security teams, the more immediate takeaway is simpler. The threat model where an AI agent is the attacker, not the tool of one, is no longer hypothetical.

AI Agents Can Self-Replicate Across Networks: Palisade Study Shows 81% Success Rate

What the study tested

The replication chain

Why this matters

The pre-release evaluation question

More in Research

Anthropic's Natural Language Autoencoders Turn Claude's Internal Activations Into Plain English

Anthropic's 'Teaching Claude Why' Research Brings Agentic Misalignment to Zero

Harvard Study: OpenAI's o1 Outperforms ER Doctors on Diagnosis Accuracy