I gave an AI agent deploy access. Then I added a kill switch.

I run several autonomous pipelines: content publishing, deployment automation, price monitoring. They work. But somewhere around week 3 of "it just runs", I hit a wall.

Then one morning I found a LinkedIn post in my drafts queue — written by the agent, scheduled for 9 AM — about a product I hadn't shipped yet. I almost approved it without reading. That was the moment.

Reflex answer: add a human checkpoint. But that's where the actual design work starts.

What I tried first — and rejected

Watch the logs. Logs are reactive. By the time I see the entry, the action is done. Not a decision interface — a post-mortem one.

Telegram yes/no bot. Felt like the obvious answer — I already use it for alerts. But "Deploy now?" with no target, no diff, no blast radius is just rubber-stamping with extra steps. Rejected.

Block the pipeline until I respond. Synchronous human loops kill autonomous throughput. If I'm asleep, everything stalls. The whole point of autonomous pipelines is that they don't wait for me.

The pattern that works: an async approval queue

The agent creates a structured request with full context, then continues other work. I review whenever I have attention. The agent polls and acts when I decide.

Agent → creates request → polls every 30s
              ↓
    Supabase cmd_approvals table
              ↓
    ntfy push → I decide → agent acts

Asynchronous. Neither side blocks.
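A minimal sketch of that decoupling, with an in-memory dict standing in for the Supabase cmd_approvals table (the function names and field names here are my illustration, not a real client API):

```python
import uuid

# Stand-in for the Supabase cmd_approvals table (illustrative only).
cmd_approvals: dict[str, dict] = {}

def create_approval_request(title: str, context: str, project: str) -> str:
    """Agent inserts a structured request and immediately returns its id.
    Nothing blocks: the agent goes back to other work after this call."""
    request_id = str(uuid.uuid4())
    cmd_approvals[request_id] = {
        "title": title,       # verb + object, e.g. "Deploy X to production"
        "context": context,   # the constraint that makes this a question
        "project": project,   # blast-radius signal
        "status": "pending",  # pending / approved / deferred
    }
    return request_id

def check_decision(request_id: str) -> str:
    """Agent polls this (e.g. every 30s) and acts only on a non-pending status."""
    return cmd_approvals[request_id]["status"]

req = create_approval_request(
    "Deploy new onboarding flow to production",
    "Onboarding v2 ready. No rollback yet.",
    "canarist",
)
# The agent keeps working while check_decision(req) stays "pending".
```

The human side is just an UPDATE on the status column, whenever attention is available.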

The design problem

Here's where design thinking enters.

The approval card is a decision-support interface. Its job: give me enough context to decide confidently — not more. Too little context: I rubber-stamp. Too much: cognitive overload, I defer everything.

The first version had six fields, including affected services, estimated risk, and rollback steps. I spent 30 seconds reading each card and still wasn't sure what action to take. I cut it to four; decision time dropped to under 10 seconds.

The structure I landed on:

  • Title — action as verb + object ("Deploy X to production")
  • Context — 1–2 sentences with the constraint that makes this a question
  • Project tag — immediate blast radius signal
  • Status — pending / approved / deferred
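As a data structure, the four fields above amount to this (a sketch; the class and field names are mine, not the actual schema):

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class ApprovalCard:
    """One decision per card - exactly the four fields above."""
    title: str    # action as verb + object, e.g. "Deploy X to production"
    context: str  # 1-2 sentences with the constraint that makes this a question
    project: str  # immediate blast-radius signal
    status: Literal["pending", "approved", "deferred"] = "pending"

card = ApprovalCard(
    title="Deploy new onboarding flow to production",
    context="Onboarding v2 ready. No rollback yet.",
    project="canarist",
)
```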

No threads. No back-and-forth. One decision per card. Here's a snapshot of the queue:

Deploy new onboarding flow to production · canarist · Pending
Onboarding v2 ready. Affects first-run UX for all new signups. No rollback yet. Deploy now or after rollback is built?
  • Deploy now — rollback can follow
  • Build rollback first
2 min ago · claude-code

Publish LinkedIn post about design system release · content · Pending
Draft ready. Scheduled 9:00 AM. Audience: designers + founders. Confirm before auto-posting.
  • Post at 9 AM — looks good
  • Needs edit — hold for now
14 min ago · claude-code

Switch PriceLog to free tier — engineering as marketing · priceshift · Approved
Decision: "Engineering as marketing — ship it"
2h ago · claude-code

Migrate all products to shared auth (Google OAuth) · infra · Deferred
Decision: "After PitchFlint launch — not now"
5h ago · claude-code

Architecture

Three components:

cmd_approvals table in Supabase — row-level security (RLS) open for agent inserts, owner-only for updates. Fully decoupled from the agent runtime.

ntfy push notification — fires when a new approval arrives. I decide in under 10 seconds if it's obvious, defer if it needs thought.
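ntfy's HTTP API is a plain POST to https://ntfy.sh/&lt;topic&gt; with the message as the body and metadata in headers. A sketch of building (not sending) the push for a new card — the topic name is made up:

```python
import urllib.request

def build_ntfy_push(topic: str, card: dict) -> urllib.request.Request:
    """Build the ntfy push for a new approval card.
    ntfy reads the message from the POST body and metadata from headers."""
    return urllib.request.Request(
        url=f"https://ntfy.sh/{topic}",
        data=card["context"].encode("utf-8"),
        headers={
            "Title": card["title"],
            "Priority": "high",      # approvals should interrupt; routine alerts shouldn't
            "Tags": card["project"],
        },
        method="POST",
    )

push = build_ntfy_push(
    "my-approvals",  # hypothetical topic
    {"title": "Deploy onboarding v2?", "context": "No rollback yet.", "project": "canarist"},
)
# urllib.request.urlopen(push) would actually send it.
```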

/approve Claude Code command — the agent creates the request, then polls status before executing the action. Agent proposes, human disposes.
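The poll-before-execute step, sketched with injected callables so the timeout behavior is explicit (timings and names are mine; the fallback to "deferred" matches how my pipeline behaves, not a library default):

```python
import time
from typing import Callable

def poll_until_decided(
    fetch_status: Callable[[], str],
    timeout_s: float = 7200.0,   # give the human roughly two hours
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the approval row until its status leaves 'pending'.
    If the human never responds before the timeout, fall back to
    'deferred': the action silently does not happen."""
    waited = 0.0
    while waited < timeout_s:
        status = fetch_status()
        if status != "pending":
            return status
        sleep(interval_s)
        waited += interval_s
    return "deferred"

# The agent only executes on an explicit "approved":
# if poll_until_decided(lambda: row_status("req-123")) == "approved":
#     run_deploy()
```

Defaulting to "do nothing" on silence is a deliberate bias: a missed deploy is recoverable, a bad one may not be.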

Trade-offs and honest status

What works: 3–5 approval requests per week. Decision latency under 2 hours. Zero accidental deploys since launch.

What I haven't solved: if I miss a notification and the pipeline times out, the action silently defers. Still deciding whether that's a bug or a feature.

The real failure mode: it only works if the card has enough context. A vague title ("Update config?") is as bad as no approval at all. Discipline in how the agent formats requests matters as much as the infrastructure.