Safe defaults
Agents are fast. That’s good when they do something right and bad when they do something wrong — errors scale too. The Prysm:ID stack limits blast radius in three layers: authentication scoped to the human, human confirmation on destructive operations, and strict REST API parity (no MCP “special modes” the API itself wouldn’t allow).
Layer 1 — Authentication: the agent acts as you
The MCP server (`@prysmid/mcp`) authenticates via device flow (RFC 8628). You sign in once on auth.prysmid.com and the server caches an `access_token` (~12 hours) and a `refresh_token` (~30 days) at `~/.config/prysmid-mcp/token.json`.
Consequences:
- The agent can only operate workspaces you have access to. It’s not a universal “service key” — it’s your credentials delegated via OAuth, so the server-side applies the same validations the dashboard does.
- Every workspace endpoint validates membership. Even if the agent guesses someone else’s workspace slug, the handler returns 404 (not 403, to avoid leaking existence) — same treatment as the dashboard.
- You can revoke any time. Sign out from Prysm:ID or delete `token.json` and the agent loses auth.
- The audit log tags every action with `actor=user:<your-email>`. There’s no separate `bot` actor by design: you know what happened because the tool did it on your behalf.
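The 404-instead-of-403 behavior described above can be sketched as a single lookup that treats "does not exist" and "caller is not a member" identically. This is a minimal illustration, not the Prysm:ID handler: the `Workspace` shape, the `lookupWorkspace` function, and the example emails are all assumptions.

```typescript
// Hypothetical shapes: the real Prysm:ID data model is not shown on this page.
type Workspace = { slug: string; members: Set<string> };

type LookupResult =
  | { status: 200; workspace: Workspace }
  | { status: 404 };

// Returns 404 both when the slug does not exist and when the caller is not
// a member, so the response never reveals whether a workspace exists.
function lookupWorkspace(
  workspaces: Map<string, Workspace>,
  slug: string,
  callerEmail: string,
): LookupResult {
  const workspace = workspaces.get(slug);
  if (workspace === undefined || !workspace.members.has(callerEmail)) {
    return { status: 404 };
  }
  return { status: 200, workspace };
}

const workspaces = new Map<string, Workspace>([
  ["acme", { slug: "acme", members: new Set(["ana@example.com"]) }],
]);

// A member sees the workspace; a non-member and a wrong guess look the same.
const own = lookupWorkspace(workspaces, "acme", "ana@example.com");
const foreign = lookupWorkspace(workspaces, "acme", "eve@example.com");
const missing = lookupWorkspace(workspaces, "nope", "eve@example.com");
```

Because the non-member and the unknown-slug cases share one code path, an agent that guesses slugs cannot use the status code to enumerate workspaces.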
Layer 2 — Human confirmation on destructive ops
The official handoff prompts (Claude Code ES/EN, Antigravity ES/EN) explicitly instruct the agent:
> For destructive actions (`delete_workspace`, `delete_oidc_app`, `delete_idp`), ask EXPLICIT confirmation before each call — I’d lose users + apps + IdPs irreversibly.
The agent doesn’t auto-confirm. If you say “no” or “wait”, the agent stops.
Operations the prompt marks as destructive
| Tool | Why |
|---|---|
| `delete_workspace` | Irreversible. Drops the entire Zitadel instance + DNS + SMTP. |
| `delete_oidc_app` | Breaks any active logins of that app. |
| `delete_idp` | If it was the only IdP and password+register was off, you’d leave the workspace without a sign-in method. |
| `delete_user` | Loses sessions, historical attribution, etc. |
| `set_spending_cap` when it reduces below current usage | May cut billing abruptly. |
Operations that DON’T require confirmation (and why)
| Action | Why |
|---|---|
| Any reversible creation (`create_workspace`, `create_oidc_app`, `add_idp`) | If the agent creates one too many, you delete it. Low cost. |
| Reads (`list_*`, `get_*`) | Obvious. |
| `update_branding` | Changing logo / colors is non-destructive; reversible in one call. |
| `update_login_policy` | Curated tools auto-promote the per-org override without touching other orgs; changes are reversible via PATCH. |
| `set_spending_cap` upward | Paying more never broke a customer (cost: leaves overage cap higher, reversible). |
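The two tables above amount to a small decision rule, which can be sketched as a client-side gate. This is only an illustration of the classification: in the source, the confirmation step lives in the handoff prompt, and the `ToolCall` shape, the `cap` argument name, and the `currentUsage` parameter are assumptions made for this sketch.

```typescript
// Tool names come from the tables; the argument shape and the
// currentUsage parameter are assumptions made for this sketch.
type ToolCall = { tool: string; args: Record<string, unknown> };

const ALWAYS_DESTRUCTIVE = new Set([
  "delete_workspace",
  "delete_oidc_app",
  "delete_idp",
  "delete_user",
]);

// Decides whether a human must confirm the call before it is sent.
function requiresConfirmation(call: ToolCall, currentUsage: number): boolean {
  if (ALWAYS_DESTRUCTIVE.has(call.tool)) return true;
  if (call.tool === "set_spending_cap") {
    const cap = call.args["cap"];
    // Only a cap below current usage is treated as destructive.
    return typeof cap === "number" && cap < currentUsage;
  }
  // Creations, reads, and reversible updates pass without confirmation.
  return false;
}

// Runs the call only if it is safe or the human explicitly approved it.
function gate(call: ToolCall, currentUsage: number, approved: boolean): "run" | "blocked" {
  return (!requiresConfirmation(call, currentUsage) || approved) ? "run" : "blocked";
}
```

For example, `gate({ tool: "delete_idp", args: {} }, 0, false)` is blocked until the human approves, while raising a spending cap passes straight through.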
Layer 3 — Strict REST API parity
The MCP server has no “special modes.” Every MCP tool ultimately makes the same HTTP call the dashboard would to api.prysmid.com. This means:
- There’s no MCP endpoint the browser couldn’t hit. If the dashboard won’t let you delete X without a modal confirmation, the agent can’t silently do it either — the handler 4xx’s with the same validation.
- Rate limits are the same. A runaway agent trying to create 1000 workspaces hits the same cap on the REST API, not a special MCP cap.
- There’s a single audit log. Both dashboard and agent activity appear in the same feed with the same `actor=user:<email>`.
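One way to picture the parity claim is an MCP tool that is nothing more than a thin wrapper over a shared request builder. The sketch below is an assumption-heavy illustration: only the host api.prysmid.com comes from the text, while the URL path, the bearer header layout, and the function names are hypothetical.

```typescript
// The host api.prysmid.com is from the text; the path and header layout are
// assumptions for illustration only.
type ApiRequest = { method: string; url: string; headers: Record<string, string> };

const API_BASE = "https://api.prysmid.com";

// Single request builder shared by every caller: there is no MCP-only variant.
function buildRequest(method: string, path: string, accessToken: string): ApiRequest {
  return {
    method,
    url: `${API_BASE}${path}`,
    headers: { Authorization: `Bearer ${accessToken}` },
  };
}

// Hypothetical MCP tool: a thin wrapper that adds nothing to the REST call.
function deleteOidcAppTool(workspace: string, appId: string, accessToken: string): ApiRequest {
  return buildRequest("DELETE", `/workspaces/${workspace}/oidc-apps/${appId}`, accessToken);
}

// What the tool sends is exactly what any other REST client would send.
const viaTool = deleteOidcAppTool("acme", "app-1", "token-123");
const direct = buildRequest("DELETE", "/workspaces/acme/oidc-apps/app-1", "token-123");
```

Since the tool has no private endpoint or extra header, server-side validation, rate limiting, and audit logging cannot tell the two requests apart.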
Auditing what your agent did
In app.prysmid.com → audit (once the panel ships; today via the API), filter by:
- Your email as actor → see everything that happened “as you”, including agent actions.
- The agent’s session time window → all agent activity gets grouped.
There’s no separate `via_mcp` flag in the audit log because the agent is you via device flow. If you need to tell agent activity apart, keep a sandbox workspace separate from your prod one, and only let agents work in the sandbox until you trust them.
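Until the panel ships, the two filters can be applied to entries fetched from the API. The sketch below assumes a hypothetical `AuditEntry` shape with an ISO 8601 `at` timestamp; only the `user:<email>` actor format comes from the text, and the sample entries are invented for illustration.

```typescript
// Hypothetical audit entry shape; only the user:<email> actor format is from the text.
type AuditEntry = { actor: string; action: string; at: string }; // at: ISO 8601 timestamp

// Keeps entries made "as you" inside the agent's session time window (inclusive).
function agentWindow(entries: AuditEntry[], email: string, from: string, to: string): AuditEntry[] {
  const start = Date.parse(from);
  const end = Date.parse(to);
  return entries.filter((e) => {
    const t = Date.parse(e.at);
    return e.actor === `user:${email}` && t >= start && t <= end;
  });
}

// Invented sample data for the sketch.
const log: AuditEntry[] = [
  { actor: "user:ana@example.com", action: "create_workspace", at: "2025-01-10T09:05:00Z" },
  { actor: "user:ana@example.com", action: "update_branding", at: "2025-01-10T14:30:00Z" },
  { actor: "user:bob@example.com", action: "add_idp", at: "2025-01-10T09:10:00Z" },
];

const duringSession = agentWindow(log, "ana@example.com", "2025-01-10T09:00:00Z", "2025-01-10T10:00:00Z");
```

The time window is what separates agent activity from your own dashboard clicks, since both carry the same actor.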
Roadmap
- Approval queues: for teams where an agent proposes changes and a different human approves (not the one talking to the agent). In design.
- Customer-facing service accounts: machine keys with scoped permissions, explicit expiration, audit `actor=key:<id>`. Pending.
- Fine-grained per-tool scopes: today the `access_token` has a broad scope (everything your account can see). We’re debating whether a `read-only` scope for side-by-side agents makes sense.
Philosophy
The agent is a first-class user. But “first class” doesn’t mean “no guardrails”. It means the same respect and intelligent distrust you’d apply to a new colleague with prod access.
The gates aren’t there to slow you down — they’re there so when your agent hallucinates one day, the blast radius is recoverable, not terminal.