Trust Levels
A five-step scale (Observer, Advisor, Junior, Senior, Autonomous) that defines what an AI agent is allowed to do, and what it has to prove before it gets more authority.
What it is
Trust levels are a ladder of permissions for AI agents. Instead of deciding “do we let AI touch our design system: yes or no?”, you assign each agent a level: Observer (read-only), Advisor (suggests), Junior (acts with review), Senior (acts with spot checks), or Autonomous (acts alone in a defined scope). Each level comes with an explicit bar the agent must clear before it is promoted.
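One way to make the ladder concrete is to write it down as data. The TypeScript sketch below is an illustration, not part of any real tool: the five level names come from the text, while the action names and the `MIN_LEVEL` mapping are assumptions made for the example.

```typescript
// Illustrative sketch only. The level names come from the text;
// the action names and the mapping below are hypothetical.
const LEVELS = ["Observer", "Advisor", "Junior", "Senior", "Autonomous"] as const;
type TrustLevel = (typeof LEVELS)[number];

type Action =
  | "report"
  | "proposeDiff"
  | "openPullRequest"
  | "mergeWithSpotChecks"
  | "mergeUnattended";

// Lowest level at which each action is allowed (assumed mapping).
const MIN_LEVEL: Record<Action, TrustLevel> = {
  report: "Observer",
  proposeDiff: "Advisor",
  openPullRequest: "Junior",
  mergeWithSpotChecks: "Senior",
  mergeUnattended: "Autonomous",
};

// Position of a level on the ladder: Observer is 0, Autonomous is 4.
function rank(level: TrustLevel): number {
  return LEVELS.indexOf(level);
}

// An agent may perform an action only if its level is at least the minimum.
function isAllowed(level: TrustLevel, action: Action): boolean {
  return rank(level) >= rank(MIN_LEVEL[action]);
}

console.log(isAllowed("Advisor", "proposeDiff")); // true
console.log(isAllowed("Advisor", "openPullRequest")); // false
```

Keeping the mapping in one table means a permission question becomes a lookup rather than a judgment call made inside each agent.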
Why this matters for designers
Most AI failures in design system work are authority failures, not capability failures: an agent was allowed to change something it should only have been allowed to flag. Trust levels turn a vague anxiety (“can we trust it?”) into an operational question (“what level is this agent at, and what does the evidence say about promoting it?”).
Try it
The fastest way to understand trust levels is to feel the difference. Same agent, same finding. Switch the level and watch the blast radius change:
Same agent. Same finding. Five levels of authority.
AID's audit agent just found drift: Button.tsx uses the hardcoded hex #3B82F6 in 3 places, and the token color.action.primary exists for exactly this. What happens next depends entirely on the agent's trust level.
Observer
Scanned 87 components in 4 minutes. Report ready.
Weekly audit report:
You read the report. That is the whole loop, and that is the point. You are learning what this agent reliably catches and what it misses.
I am at Observer level. I cannot modify files or even propose a diff. Promote me to Advisor once my reports have been accurate for two weeks.
Advisor
Suggested fix: replace #3B82F6 with color.action.primary in 3 places. Exact diff below, PR description drafted, but I cannot open the PR.
Proposed diff (3 occurrences):
The call is yours. The agent supplies direction; you supply execution.
You made the edit in your own editor and shipped it. Agent file writes: still zero.
Noted. I will keep flagging it in reports, but the decision stays with you.
Junior
Opened PR #214: fix(button): replace 3 hardcoded hex values with color.action.primary. Scope: tokens only. +3 −3.
Checks passed: token schema ✓ · visual diff ✓ · contrast ✓
Review required. Nothing merges without you.
Merged. The agent did the typing; you did the deciding. The review took 40 seconds; making the fix yourself would have taken 15 minutes.
Understood. Closing the PR. I added "this hex is intentional in Button" to CLAUDE.md so I will not propose it again.
AGENTS_PAUSED=true: every loop stopped in under 30 seconds. This is why autonomy is safe to grant: the faster you can stop it, the more you can allow.
Senior
03:12: Opened PR #214 in an approved low-risk category: token fixes.
03:13: Verification passed: token schema ✓ · visual diff ✓ · contrast ✓
03:14: Auto-merged. You were asleep.
08:30: Morning digest is waiting.
A 12-second read. You skim, you move on. The system updated itself overnight.
Autonomous
Scheduled loop: every 6 hours. Run #41 starting.
Compared Figma variables to code tokens. Drift found in Button → fix drafted → PR opened.
Parity check passed. Merging. The verifier is the only thing that can pass or block work; the agent cannot override it.
Run #41 complete. Next run in 6 hours.
Your queue: empty. You have not reviewed a parity PR in 3 months. Supervision did not disappear; it moved into the verifier and the kill switch.
Same agent, same finding, five different blast radii. The level, not the model, decides what happens next.
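The walkthrough can be condensed into a single decision function. The TypeScript sketch below is an illustration under assumptions: the three checks and the `AGENTS_PAUSED` flag appear in the walkthrough, while the function, type, and outcome names are hypothetical, and gating every write-capable level on the verifier is one reading of the text rather than a documented rule.

```typescript
// Illustrative sketch only: names and outcome strings are hypothetical.
type Level = "Observer" | "Advisor" | "Junior" | "Senior" | "Autonomous";

// The three checks shown in the walkthrough.
interface Checks {
  tokenSchema: boolean;
  visualDiff: boolean;
  contrast: boolean;
}

type Outcome =
  | "paused"
  | "report"
  | "suggest-diff"
  | "open-pr-await-review"
  | "auto-merge"
  | "blocked-by-verifier";

// The verifier passes only when every check passes; the agent cannot override it.
function verifierPasses(checks: Checks): boolean {
  return checks.tokenSchema && checks.visualDiff && checks.contrast;
}

// Decide what happens to one finding. The environment is passed in as a plain
// object so the example does not depend on any runtime-specific global.
function nextStep(
  level: Level,
  checks: Checks,
  env: Record<string, string | undefined>,
): Outcome {
  // Kill switch first: nothing runs while agents are paused.
  if (env["AGENTS_PAUSED"] === "true") return "paused";
  if (level === "Observer") return "report";
  if (level === "Advisor") return "suggest-diff";
  // Junior and above write changes, so the verifier gates them.
  if (!verifierPasses(checks)) return "blocked-by-verifier";
  return level === "Junior" ? "open-pr-await-review" : "auto-merge";
}

const allGreen: Checks = { tokenSchema: true, visualDiff: true, contrast: true };
console.log(nextStep("Junior", allGreen, {})); // open-pr-await-review
console.log(nextStep("Senior", allGreen, { AGENTS_PAUSED: "true" })); // paused
```

Checking the pause flag before anything else is deliberate in this sketch: the stop path should not depend on the agent's level or on the state of the checks.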
How it works in practice
- Every agent starts at Level 0 (Observer): it reads and reports, nothing else.
- Promotion is based on the agent's track record, such as the accuracy of its findings and its false-positive rate.
- Each level is scoped: an agent can be Senior on token files and Observer everywhere else.
- Demotion is always on the table, and the kill switch is not optional.
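The scoping and promotion rules above can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions: the scope names, the `TrackRecord` shape, and every threshold are hypothetical (the text gives no numbers), and the false-positive rate here is simply the share of reviewed findings that a human rejected.

```typescript
// Illustrative sketch only: scope names, record shape, and thresholds are assumed.
const LEVELS = ["Observer", "Advisor", "Junior", "Senior", "Autonomous"] as const;
type LevelName = (typeof LEVELS)[number];

// Per-scope levels. Any scope not listed falls back to Observer.
type ScopedLevels = Record<string, LevelName>;

function levelFor(scopes: ScopedLevels, scope: string): LevelName {
  return scopes[scope] ?? "Observer";
}

interface TrackRecord {
  findings: number; // findings a human has reviewed
  confirmed: number; // findings the reviewer agreed with
}

// Assumed thresholds, chosen only to make the example concrete.
const MIN_FINDINGS = 20;
const PROMOTE_AT_OR_BELOW = 0.05;
const DEMOTE_ABOVE = 0.2;

// Share of reviewed findings that were rejected; 1 when there is no evidence yet.
function falsePositiveRate(record: TrackRecord): number {
  if (record.findings === 0) return 1;
  return (record.findings - record.confirmed) / record.findings;
}

// Move one step up or down the ladder, or stay put when evidence is thin or mixed.
function reviewLevel(current: LevelName, record: TrackRecord): LevelName {
  if (record.findings < MIN_FINDINGS) return current;
  const index = LEVELS.indexOf(current);
  const rate = falsePositiveRate(record);
  if (rate > DEMOTE_ABOVE) return LEVELS[Math.max(index - 1, 0)] ?? current;
  if (rate <= PROMOTE_AT_OR_BELOW) {
    return LEVELS[Math.min(index + 1, LEVELS.length - 1)] ?? current;
  }
  return current;
}

const auditAgent: ScopedLevels = { tokens: "Senior" };
console.log(levelFor(auditAgent, "tokens")); // Senior
console.log(levelFor(auditAgent, "components")); // Observer
console.log(reviewLevel("Observer", { findings: 40, confirmed: 39 })); // Advisor
```

Moving one step at a time and defaulting unknown scopes to Observer keeps mistakes cheap: an agent never jumps straight to unattended merges, and a forgotten scope entry fails toward read-only.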