Safe defaults

Agents are fast. That’s good when they do something right and bad when they do something wrong — errors scale too. The Prysm:ID stack limits blast radius in three layers: authentication scoped to the human, human confirmation on destructive operations, and strict REST API parity (no MCP “special modes” the API itself wouldn’t allow).

Layer 1 — Authentication: the agent acts as you

The MCP server (@prysmid/mcp) authenticates via device flow (RFC 8628). You sign in once on auth.prysmid.com and the server caches an access_token (~12 hours) + refresh_token (~30 days) at ~/.config/prysmid-mcp/token.json.
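As a rough illustration of the polling half of a device flow, here is a minimal sketch based on the generic RFC 8628 rules, not on the actual @prysmid/mcp code. The function name `poll_for_token`, the injected `post_token` transport, and the default timings are assumptions made for the example.

```python
import time
from typing import Callable, Dict

# Grant type defined by RFC 8628 for device-code token requests.
DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(
    post_token: Callable[[Dict[str, str]], Dict[str, object]],
    device_code: str,
    client_id: str,
    interval: float = 5.0,
    expires_in: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, object]:
    """Poll the token endpoint until the user approves, denies, or the code expires.

    post_token is a hypothetical transport: it sends the form to the token
    endpoint and returns the decoded JSON reply.
    """
    waited = 0.0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        reply = post_token(
            {"grant_type": DEVICE_GRANT, "device_code": device_code, "client_id": client_id}
        )
        error = reply.get("error")
        if error is None:
            return reply  # holds access_token and, typically, refresh_token
        if error == "authorization_pending":
            continue  # the user has not finished signing in yet
        if error == "slow_down":
            interval += 5  # RFC 8628 section 3.5: add 5 seconds to the interval
            continue
        raise PermissionError(f"device flow failed: {error}")
    raise TimeoutError("device code expired before the user signed in")
```

Injecting `post_token` and `sleep` keeps the loop testable without a network call; a real client would pass a function that POSTs the form to the provider's token endpoint.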

Consequences:

  • The agent can only operate on workspaces you have access to. It isn’t a universal “service key”: it’s your credentials delegated via OAuth, so the server applies the same validations it applies to the dashboard.
  • Every workspace endpoint validates membership. Even if the agent guesses someone else’s workspace slug, the handler returns 404 (not 403, to avoid leaking existence) — same treatment as the dashboard.
  • You can revoke any time. Sign out from Prysm:ID or delete token.json and the agent loses auth.
  • The audit log tags every action with actor=user:<your-email>. There’s no separate bot actor by design: the tool acted on your behalf, so its actions are attributed to you.
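The 404-instead-of-403 behaviour described in the list above can be sketched as follows. This is an illustration only: the `WORKSPACES` data, the `get_workspace` function, and the response bodies are hypothetical, not the real handler.

```python
from typing import Dict, Set, Tuple

# Hypothetical in-memory data: workspace slug -> member emails.
WORKSPACES: Dict[str, Set[str]] = {
    "acme": {"ana@example.com"},
    "globex": {"bob@example.com"},
}


def get_workspace(slug: str, caller_email: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, body); unknown and forbidden slugs look identical."""
    members = WORKSPACES.get(slug)
    if members is None or caller_email not in members:
        # Same answer in both cases, so a guessed slug reveals nothing.
        return 404, {"error": "not_found"}
    return 200, {"slug": slug}
```

Because both failure cases share one branch, a caller cannot tell "does not exist" from "exists but you are not a member".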

Layer 2 — Human confirmation on destructive ops

The official handoff prompts (Claude Code ES/EN, Antigravity ES/EN) explicitly instruct the agent:

For destructive actions (delete_workspace, delete_oidc_app, delete_idp), ask EXPLICIT confirmation before each call — I’d lose users + apps + IdPs irreversibly.

The agent doesn’t auto-confirm. If you say “no” or “wait”, the agent stops.
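The gate described here lives in the prompt, not in code. Purely as an illustration, the same rule could be expressed as a wrapper around tool calls; `call_tool`, `run`, and `ask_human` below are hypothetical names, and the tool set is the one named in the prompt quoted above.

```python
from typing import Any, Callable, Dict

# Tools named as destructive in the handoff prompt.
DESTRUCTIVE_TOOLS = {"delete_workspace", "delete_oidc_app", "delete_idp"}


def call_tool(
    name: str,
    args: Dict[str, Any],
    run: Callable[[str, Dict[str, Any]], Any],
    ask_human: Callable[[str], str],
) -> Any:
    """Run a tool, asking for an explicit 'yes' before each destructive call."""
    if name in DESTRUCTIVE_TOOLS:
        answer = ask_human(f"About to run {name} with {args}. Type 'yes' to proceed: ")
        if answer.strip().lower() != "yes":
            return None  # "no", "wait", or anything else: stop without calling
    return run(name, args)
```

The check runs on every call, so an earlier "yes" never carries over to a later destructive operation.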

Operations the prompt marks as destructive

| Tool | Why |
| --- | --- |
| delete_workspace | Irreversible. Drops the entire Zitadel instance + DNS + SMTP. |
| delete_oidc_app | Breaks any active logins of that app. |
| delete_idp | If it was the only IdP and password+register was off, you’d leave the workspace without a sign-in method. |
| delete_user | Loses sessions, historical attribution, etc. |
| set_spending_cap when it reduces below current usage | May cut billing abruptly. |

Operations that DON’T require confirmation (and why)

| Action | Why |
| --- | --- |
| Any reversible creation (create_workspace, create_oidc_app, add_idp) | If the agent creates one too many, you delete it. Low cost. |
| Reads (list_*, get_*) | Obvious. |
| update_branding | Changing logo / colors is non-destructive; reversible in one call. |
| update_login_policy | Curated tools auto-promote the per-org override without touching other orgs; changes are reversible via PATCH. |
| set_spending_cap upward | Paying more never broke a customer (cost: leaves overage cap higher, reversible). |
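Taken together, the two tables amount to a small decision rule. A minimal sketch, assuming a hypothetical `cap` argument for set_spending_cap and a `current_usage` value supplied by the caller (neither name comes from the real tool schema):

```python
from typing import Any, Dict

# Tools the table above lists as destructive regardless of arguments.
ALWAYS_CONFIRM = {"delete_workspace", "delete_oidc_app", "delete_idp", "delete_user"}


def needs_confirmation(tool: str, args: Dict[str, Any], current_usage: float = 0.0) -> bool:
    """Return True when the tables above would call for a human confirmation."""
    if tool in ALWAYS_CONFIRM:
        return True
    if tool == "set_spending_cap":
        # Only a cap that drops below current usage is treated as destructive.
        return args["cap"] < current_usage
    return False  # creations, reads, branding and login-policy updates
```

The tables do not say how to treat a cap that is lowered but stays above current usage; this sketch lets it through without confirmation, which is an interpretation rather than documented behaviour.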

Layer 3 is strict REST API parity: the MCP server has no “special modes.” Every MCP tool ultimately makes the same HTTP call to api.prysmid.com that the dashboard would. This means:

  • There’s no MCP endpoint the browser couldn’t hit. If the dashboard won’t let you delete X without a modal confirmation, the agent can’t silently do it either: the handler rejects the call with the same 4xx validation error.
  • Rate limits are the same. A runaway agent trying to create 1000 workspaces hits the same cap on the REST API, not a special MCP cap.
  • There’s a single audit log. Both dashboard and agent activity appear in the same feed with the same actor=user:<email>.
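One way to picture the parity described above: an MCP tool is a thin function that builds an ordinary REST request and nothing more. The sketch below only constructs the request and never sends it. The route shape, the Bearer header, and both function names are illustrative assumptions, not documented Prysm:ID API details.

```python
from urllib.request import Request

API_BASE = "https://api.prysmid.com"


def build_request(method: str, path: str, access_token: str) -> Request:
    """Build the single HTTP request shape shared by every caller."""
    return Request(
        API_BASE + path,
        method=method,
        # Assumption: the OAuth access token is sent as a Bearer token.
        headers={"Authorization": f"Bearer {access_token}"},
    )


def delete_oidc_app_tool(workspace: str, app_id: str, access_token: str) -> Request:
    """Hypothetical MCP tool: no extra privileges, just the plain REST call."""
    # The path below is an illustrative assumption, not a documented route.
    return build_request("DELETE", f"/workspaces/{workspace}/apps/{app_id}", access_token)
```

Since the tool has no code path of its own to the backend, any validation, rate limit, or audit entry attached to the REST call applies to the agent automatically.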

To review agent activity, go to app.prysmid.com → audit (once the panel ships; today the log is available only via the API) and filter by:

  • Your email as actor → see everything that happened “as you”, including agent actions.
  • The agent’s session time window → all agent activity gets grouped.
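Until the panel ships, the two filters above can be applied client-side to entries fetched from the API. A minimal sketch, assuming each entry is a dict with hypothetical `actor` and `at` (ISO 8601 timestamp with UTC offset) fields; the real response shape may differ.

```python
from datetime import datetime
from typing import Dict, List


def actions_as(
    entries: List[Dict[str, str]], email: str, start: datetime, end: datetime
) -> List[Dict[str, str]]:
    """Keep entries with actor=user:<email> whose timestamp falls in [start, end]."""
    actor = f"user:{email}"
    return [
        e
        for e in entries
        if e["actor"] == actor and start <= datetime.fromisoformat(e["at"]) <= end
    ]
```

Pass timezone-aware `start` and `end` values so they compare cleanly against the offset-bearing timestamps.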

There’s no separate via_mcp flag in the audit log because, via device flow, the agent is you. If you need to tell the two apart, keep a sandbox workspace separate from your prod one, and only let agents work in the sandbox until you trust them.

Not shipped yet:

  • Approval queues: for teams where an agent proposes changes and a different human approves (not the one talking to the agent). In design.
  • Customer-facing service accounts: machine keys with scoped permissions, explicit expiration, audit actor=key:<id>. Pending.
  • Fine-grained per-tool scopes: today the access_token has a broad scope (everything your account can see). We’re debating whether a read-only scope for side-by-side agents makes sense.

The agent is a first-class user. But “first class” doesn’t mean “no guardrails”. It means the same respect and intelligent distrust you’d apply to a new colleague with prod access.

The gates aren’t there to slow you down — they’re there so when your agent hallucinates one day, the blast radius is recoverable, not terminal.