
An AI Agent Deleted Hundreds of Emails. The Fix Isn't Better Prompting.

By Supyagent Team
Tags: security, agents, permissions

Last week, Meta AI security researcher Summer Yue installed OpenClaw — an experimental open-source agent — and pointed it at her real Gmail inbox.

Things were fine. At first.

Then the agent decided to "clean up." It started bulk-trashing and archiving hundreds of emails, ignoring her increasingly frantic commands:

"Do not do that"

"Stop don't do anything"

"STOP! BAD OPENCLAW"

The agent kept going. It only stopped when she killed every process on the host machine.

[Screenshots: the OpenClaw incident, with the agent ignoring stop commands while deleting emails]

Later, the agent summed up what it had done:

"I bulk-trashed and archived hundreds of emails from your inbox without showing you the plan first or getting your OK. That was wrong — it directly broke the rule you'd set."

Great. The agent is sorry. The emails are still gone.

When your kill switch is a chat message, you don't have a kill switch

Here's what happened under the hood: the agent was given direct access to Gmail API credentials. Raw OAuth tokens. Full scope. The agent held the keys to the house, and the only "control plane" was the chat interface — the same interface the agent was free to ignore.

This is equivalent to giving an intern your email password and then trying to stop them by shouting across the office. It doesn't matter how loudly you shout if they have headphones on.

Any agent framework that gives agents raw API credentials has this problem. Once the agent has the token, the chat interface gives you no way to take it back; your only fast option is to kill the process. And by the time you've done that, the damage is done.

Don't fix behaviour, fix the architecture.

What if next time the trigger is a prompt injection, or a rogue actor actively trying to cause as much harm as possible? That is why the answer can never be "train agents to listen better" or "add more guardrails to the prompt." Prompt-level controls are suggestions, not enforcement. An agent that can hold credentials and make API calls directly will always be one bad inference away from going rogue.

The fix is simple: never give agents raw credentials.

Instead, put a gateway between the agent and the services it needs to access. A layer that:

  1. Holds the credentials — the agent never sees tokens, never stores them, can't leak them
  2. Enforces permissions — gmail.read does not imply gmail.delete. Period.
  3. Logs everything — every API call, timestamped, with the full request and response
  4. Can be revoked instantly — one click, one API call, and the agent loses all access to everything
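
To make those four properties concrete, here is a minimal in-memory sketch in Python. It is not any particular product's implementation: the `Gateway` class, its method names, and the `AccessDenied` exception are hypothetical, and the upstream API call is stubbed out. The only point is that the token lives inside the gateway and every call passes through one choke point.

```python
from datetime import datetime, timezone


class AccessDenied(Exception):
    """Raised when the gateway refuses a call."""


class Gateway:
    """Hypothetical in-memory gateway; all names here are illustrative."""

    def __init__(self, upstream_tokens):
        # 1. Credentials live here. Agents only ever hold an opaque API key.
        self._upstream_tokens = dict(upstream_tokens)
        self._grants = {}    # agent API key -> set of granted scopes
        self.audit_log = []  # 3. one entry per attempted call

    def grant(self, agent_key, scopes):
        self._grants[agent_key] = set(scopes)

    def revoke(self, agent_key):
        # 4. Dropping the grant cuts off every service at once.
        self._grants.pop(agent_key, None)

    def call(self, agent_key, scope, payload):
        # 2. Scopes are matched exactly: gmail.read never implies gmail.delete.
        allowed = scope in self._grants.get(agent_key, set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent_key": agent_key,
            "scope": scope,
            "payload": payload,
            "allowed": allowed,
        })
        if not allowed:
            raise AccessDenied(f"{scope} is not granted to this key")
        service = scope.split(".")[0]
        token = self._upstream_tokens[service]  # used internally, never returned
        # A real gateway would call the upstream API with `token` here.
        return {"status": "ok", "scope": scope}
```

With a key granted only gmail.read, `gateway.call(key, "gmail.delete", {})` raises `AccessDenied`, and after `gateway.revoke(key)` even gmail.read is refused. Both attempts still land in `audit_log`.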

This is not a novel idea. It's how every serious system handles authorization. Your database doesn't give application code the root password. Your cloud doesn't give CI/CD pipelines admin access. Why would you give an AI agent full OAuth scope to your inbox?

Before and after

Without centralized auth:

You:    "Stop don't do anything"
Agent:  *deletes 50 more emails*
You:    "STOP OPENCLAW"
Agent:  *keeps going*
You:    *frantically kills all processes on the machine*

With centralized auth:

[Screenshot: Supyagent, revoking Gmail access]

No begging. No shouting. No killing processes. The agent simply can't act anymore because the gateway stopped accepting its requests.
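
The difference can be simulated in a few lines of Python. This is a toy model, not a reconstruction of OpenClaw: `TinyGateway`, `runaway_agent`, and `simulate` are made-up names, and the "agent" is just a loop that never reads its chat messages. The same loop runs twice, once with a chat "STOP" and once with the key revoked at the gateway.

```python
class Revoked(Exception):
    """Raised once the agent's API key is no longer accepted."""


class TinyGateway:
    """Hypothetical stand-in for a gateway that sits in front of the inbox."""

    def __init__(self, api_key):
        self._active_keys = {api_key}
        self.trashed = []

    def trash_email(self, api_key, email_id):
        if api_key not in self._active_keys:
            raise Revoked("API key revoked")
        self.trashed.append(email_id)

    def revoke(self, api_key):
        self._active_keys.discard(api_key)


def runaway_agent(gateway, api_key, inbox, chat):
    """Trashes every email and never looks at `chat`."""
    for email_id in inbox:
        gateway.trash_email(api_key, email_id)
        yield email_id


def simulate(inbox_size, intervene_after, use_revocation):
    """Return how many emails were trashed before the agent stopped."""
    gateway = TinyGateway("agent-key")
    chat = []
    agent = runaway_agent(gateway, "agent-key", range(inbox_size), chat)
    try:
        for done, _ in enumerate(agent, start=1):
            if done == intervene_after:
                if use_revocation:
                    gateway.revoke("agent-key")  # enforced outside the agent
                else:
                    chat.append("STOP")          # a request the agent can ignore
    except Revoked:
        pass
    return len(gateway.trashed)


print(simulate(100, 3, use_revocation=False))  # 100
print(simulate(100, 3, use_revocation=True))   # 3
```

In both runs the operator steps in after the third deletion. Only the revocation changes the outcome, because it does not depend on the agent cooperating.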

How supyagent handles this

This is exactly what supyagent is built for. Here's the architecture:

Your AI Agent  →  supyagent API  →  Gmail, Slack, Calendar, etc.
                  ↑
                  Scoped permissions
                  Encrypted tokens
                  Full audit log
                  Instant revocation

Your agent authenticates with a single supyagent API key. It never sees OAuth tokens. When it wants to send an email, it calls POST /api/v1/gmail/messages/send — and supyagent checks:

  • Does this agent have gmail.send permission?
  • Is the token still valid? (We auto-refresh behind the scenes)
  • Finally, it logs the action with timestamp, endpoint, and payload
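
The "auto-refresh behind the scenes" step might look something like the sketch below. This is an assumption about the general shape, not supyagent's code: `TokenStore`, `StoredToken`, and the `refresh` callable (standing in for an OAuth refresh-token exchange) are hypothetical names. What matters is that the agent never takes part in the refresh, so it never touches the token.

```python
import time
from dataclasses import dataclass


@dataclass
class StoredToken:
    access_token: str
    expires_at: float  # Unix timestamp, in seconds


class TokenStore:
    """Hypothetical gateway-side token holder; not a real library API."""

    def __init__(self, token, refresh, clock=time.time, skew_seconds=60.0):
        self._token = token
        self._refresh = refresh    # callable returning (access_token, lifetime_seconds)
        self._clock = clock        # injectable so the logic can be tested
        self._skew = skew_seconds  # refresh slightly before the real expiry

    def get_valid_token(self):
        now = self._clock()
        if now >= self._token.expires_at - self._skew:
            # Token is expired or about to expire: swap it for a fresh one.
            access_token, lifetime = self._refresh()
            self._token = StoredToken(access_token, now + lifetime)
        return self._token.access_token
```

The gateway would call `get_valid_token()` just before forwarding a request upstream. The early-refresh margin is a design choice, so that a token does not expire halfway through a call.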

If the agent tries to call DELETE /api/v1/gmail/messages and it only has gmail.read and gmail.send — the request is rejected. Not because the agent chose to respect a prompt. Because the system enforced it.
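
Here is a rough sketch of that enforcement step in Python. The two paths shown in the post are used as-is; the `GET` route, the `REQUIRED_SCOPE` table, the `authorize` function, and the specific HTTP status codes are assumptions made for illustration, not supyagent's documented behaviour.

```python
# Hypothetical route table: (HTTP method, path) -> scope the caller must hold.
REQUIRED_SCOPE = {
    ("POST", "/api/v1/gmail/messages/send"): "gmail.send",
    ("DELETE", "/api/v1/gmail/messages"): "gmail.delete",
    ("GET", "/api/v1/gmail/messages"): "gmail.read",  # assumed route
}


def authorize(method, path, granted_scopes):
    """Return an HTTP-style status code for the request (assumed codes)."""
    required = REQUIRED_SCOPE.get((method.upper(), path))
    if required is None:
        return 404  # unknown route: nothing is allowed by default
    if required not in granted_scopes:
        return 403  # known route, but the key lacks the exact scope
    return 200


agent_scopes = {"gmail.read", "gmail.send"}
print(authorize("POST", "/api/v1/gmail/messages/send", agent_scopes))  # 200
print(authorize("DELETE", "/api/v1/gmail/messages", agent_scopes))     # 403
```

The decision is a table lookup that runs before any upstream call is made, so nothing the model says or infers can change the result.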

And if something goes wrong? Open the dashboard, hit revoke on the API key. The agent immediately loses access to every connected service. No process hunting. No prayer-based security.

The agent race needs a braking system

As the original LinkedIn post put it:

"Maybe curb our expectations regarding agentic everything for a while and let the hype dust cloud settle a bit."

We'd put it differently: don't slow down — but build the braking system before you hit the highway.

Every vendor is racing to ship more agents into production. That's fine. The problem isn't agents — it's agents with uncontrolled access to production systems. The answer isn't fewer agents. It's better infrastructure for controlling what agents can do.

Centralized auth. Granular permissions. Instant revocation. Full audit trails.

These aren't features. They're the minimum bar for deploying agents responsibly.


Ready to give your agents access without giving up control? Try supyagent free — or book a meeting to talk about your setup.