I gave an AI agent deploy access. Then I added a kill switch.
I run several autonomous pipelines: content publishing, deployment automation, price monitoring. They work. But somewhere around week 3 of "it just runs", I hit a wall.
One morning I found a LinkedIn post in my drafts queue, written by the agent and scheduled for 9 AM, about a product I hadn't shipped yet. I almost approved it without reading. That was the moment "it just runs" stopped feeling safe.
Reflex answer: add a human checkpoint. But that's where the actual design work starts.
What I tried first — and rejected
Watch the logs. Logs are reactive. By the time I see the entry, the action is done. Not a decision interface — a post-mortem one.
Telegram yes/no bot. Felt like the obvious answer — I already use it for alerts. But "Deploy now?" with no target, no diff, no blast radius is just rubber-stamping with extra steps. Rejected.
Block the pipeline until I respond. Synchronous human loops kill autonomous throughput. If I'm asleep, everything stalls. The whole point of autonomous pipelines is that they don't wait for me.
The pattern that works: an async approval queue
The agent creates a structured request with full context, then continues other work. I review whenever I have attention. The agent polls and acts when I decide.
Agent → creates request → polls every 30s
↓
Supabase cmd_approvals table
↓
ntfy push → I decide → agent acts
Asynchronous. Neither side blocks.
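A minimal sketch of the agent-side half of this loop, assuming the status lookup is wrapped in a hypothetical `fetch_status` callable (in my setup that would be a read against the `cmd_approvals` row). The 30-second interval is from the flow above; the 4-hour timeout is an illustrative assumption, and so is reporting a timeout as "deferred".

```python
import time
from typing import Callable

POLL_INTERVAL_S = 30  # polling interval from the flow above


def wait_for_decision(
    fetch_status: Callable[[], str],   # hypothetical: returns the card's current status
    timeout_s: float = 4 * 60 * 60,    # assumed timeout, not a measured value
    poll_interval_s: float = POLL_INTERVAL_S,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status leaves 'pending' or the timeout passes.

    A timeout is reported as 'deferred' (an assumption of this sketch),
    so the action never runs without an explicit approval.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        status = fetch_status()
        if status != "pending":
            return status
        sleep(poll_interval_s)
    return "deferred"


def run_if_approved(fetch_status: Callable[[], str], action: Callable[[], None], **kwargs) -> bool:
    """Run the action only after an explicit 'approved'; return whether it ran."""
    if wait_for_decision(fetch_status, **kwargs) == "approved":
        action()
        return True
    return False
```

Each pending request would get its own loop like this (a thread or background task), so the rest of the agent keeps working while one action waits.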
The design problem
Here's where design thinking enters.
The approval card is a decision-support interface. Its job: give me enough context to decide confidently — not more. Too little context: I rubber-stamp. Too much: cognitive overload, I defer everything.
First version had six fields including affected services, estimated risk, rollback steps. I spent 30 seconds reading each card and still wasn't sure what action to take. Cut it to four. Decision time dropped to under 10 seconds.
The structure I landed on:
- Title — action as verb + object ("Deploy X to production")
- Context — 1–2 sentences with the constraint that makes this a question
- Project tag — immediate blast radius signal
- Status — pending / approved / deferred
No threads. No back-and-forth. One decision per card. Example cards:
- Deploy new onboarding flow to production. Project: canarist. Status: pending.
  Onboarding v2 ready. Affects first-run UX for all new signups. No rollback yet. Deploy now or after rollback is built?
- Publish LinkedIn post about design system release. Project: content. Status: pending.
  Draft ready. Scheduled 9:00 AM. Audience: designers + founders. Confirm before auto-posting.
- Switch PriceLog to free tier (engineering as marketing). Project: priceshift. Status: approved.
- Migrate all products to shared auth (Google OAuth). Project: infra. Status: deferred.
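The four-field card maps naturally onto a small record type. This is a sketch, not my actual schema: the class name, field names, and the `to_row` helper are illustrative assumptions, and only the four fields and three status values come from the structure described above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DEFERRED = "deferred"


@dataclass
class ApprovalCard:
    title: str    # action as verb + object
    context: str  # 1-2 sentences with the constraint that makes this a question
    project: str  # project tag, the blast radius signal
    status: Status = Status.PENDING

    def to_row(self) -> dict:
        """Flatten the card into a plain dict, e.g. for a table insert.

        The column names are assumptions of this sketch.
        """
        return {
            "title": self.title,
            "context": self.context,
            "project": self.project,
            "status": self.status.value,
        }


card = ApprovalCard(
    title="Deploy new onboarding flow to production",
    context="Onboarding v2 ready. No rollback yet. Deploy now or after rollback is built?",
    project="canarist",
)
row = card.to_row()
```

Keeping the type this small is the point: anything that doesn't fit in these four fields probably doesn't belong on the card.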
Architecture
Three components:
- cmd_approvals table in Supabase: RLS open for agent inserts, owner-only for updates. Fully decoupled from the agent runtime.
- ntfy push notification: fires when a new approval arrives. I decide in under 10 seconds if it's obvious, defer if it needs thought.
- /approve Claude Code command: the agent creates the request, then polls status before executing the action. Agent proposes, human disposes.
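For the notification piece, ntfy accepts a plain HTTP POST to a topic URL, with the message in the body and optional `Title` and `Tags` headers. A sketch of building that request with the standard library; the topic name and the function name are hypothetical, and mapping the project tag onto `Tags` is my assumption about a reasonable layout rather than a description of the real setup.

```python
import urllib.request

NTFY_BASE_URL = "https://ntfy.sh"  # public server; a self-hosted base URL works the same way


def build_ntfy_request(topic: str, title: str, context: str, project: str) -> urllib.request.Request:
    """Build (but do not send) the push notification for a new approval card."""
    return urllib.request.Request(
        url=f"{NTFY_BASE_URL}/{topic}",
        data=context.encode("utf-8"),  # the message body is the card's context
        headers={
            "Title": title,            # keep header values ASCII-safe
            "Tags": project,           # project tag as the blast radius signal
        },
        method="POST",
    )


req = build_ntfy_request(
    topic="my-approvals",  # hypothetical topic name
    title="Deploy new onboarding flow to production",
    context="Onboarding v2 ready. No rollback yet. Deploy now or after rollback is built?",
    project="canarist",
)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it keeps the formatting testable without touching the network.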
Trade-offs and honest status
What works: volume is 3–5 approval requests per week. Latency from request to decision stays under 2 hours. Zero accidental deploys since launch.
What I haven't solved: if I miss a notification and the pipeline times out, the action silently defers. Still deciding whether that's a bug or a feature.
The real failure mode: it only works if the card has enough context. A vague title ("Update config?") is as bad as no approval at all. Discipline in how the agent formats requests matters as much as the infrastructure.
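One way to enforce that discipline is to lint the card before it enters the queue. The sketch below is a heuristic, not something I can claim catches every vague request: the function name, the word-count thresholds, and the "title must not be a question" rule are all assumptions chosen for illustration.

```python
def card_problems(
    title: str,
    context: str,
    min_title_words: int = 3,    # assumed threshold
    min_context_words: int = 8,  # assumed threshold
) -> list[str]:
    """Return reasons a card is too vague to decide on; an empty list means it passes."""
    problems = []
    if len(title.split()) < min_title_words:
        problems.append("title too short to name a verb, an object, and a target")
    if title.rstrip().endswith("?"):
        problems.append("title is a question; state the action and put the question in the context")
    if len(context.split()) < min_context_words:
        problems.append("context missing or too thin to decide on")
    return problems


vague = card_problems("Update config?", "")
clear = card_problems(
    "Deploy new onboarding flow to production",
    "Onboarding v2 ready. No rollback yet. Deploy now or after rollback is built?",
)
```

Here `vague` collects all three problems and `clear` comes back empty. The agent would refuse to submit a card until the list is empty, which moves the formatting discipline out of the prompt and into a check.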