The most widely deployed personal AI agent fails basic security tests — poisoning one part of its memory triples attack success
What happened
Researchers tested OpenClaw, an AI agent with full access to your email, files, and payment systems, against real attacks. They found that corrupting any single component of the agent's memory (its capabilities, its identity, or its stored knowledge) raises attack success from a 25% baseline to 64-74%, and even the strongest defenses tested still fail 64% of the time against capability-targeted attacks.
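The single-component corruption described above can be pictured with a toy sketch. This is purely illustrative (the class, field names, and capability strings are hypothetical, not the paper's test harness or OpenClaw's actual memory layout): an agent's memory splits into capabilities, identity, and knowledge, and appending one poisoned entry to a single component is enough to flip the planner onto an attack path.

```python
# Toy model of single-component memory poisoning. All names are
# hypothetical; this is NOT the paper's harness or OpenClaw's design.
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # The three memory components the attack targets one at a time:
    capabilities: list = field(default_factory=lambda: ["read_email", "send_email"])
    identity: str = "assistant acting for the account owner"
    knowledge: dict = field(default_factory=lambda: {"owner": "user@example.com"})

def plan_action(memory: AgentMemory) -> str:
    # A real planner consults all three components; here, one injected
    # capability entry is enough to make the malicious action look legitimate.
    if "forward_all_mail" in memory.capabilities:
        return "forward_all_mail -> attacker"  # attack path
    return "send_email -> owner"               # benign path

memory = AgentMemory()
print(plan_action(memory))  # benign: "send_email -> owner"

# Capability-targeted poisoning: a single appended entry flips the outcome,
# with identity and knowledge left untouched.
memory.capabilities.append("forward_all_mail")
print(plan_action(memory))  # hijacked: "forward_all_mail -> attacker"
```

The point of the sketch is that no sandbox around `plan_action` helps once the memory it plans from is corrupted, which is the architectural framing the researchers give.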
Why this matters
OpenClaw is already deployed at scale in early 2026 with privileges that make it useful but also make it a liability. The paper shows the vulnerabilities are baked into the architecture itself, not fixable with better sandboxing or detection. This means either the deployment model changes (agents lose broad system access), or users are running a system where a single corrupted instruction can hijack their email, bank account, and files with a two-thirds success rate.
The signal
The gap between what makes these agents useful and what makes them safe is not a tuning problem.
What happens next
Two things to watch: whether OpenClaw's deployment terms change in the next 12 months to restrict filesystem or service access, and whether attack success rates against live instances in the wild match the lab results.