Do I need the API to do this, or does Claude.ai work?

Claude.ai works for the first three versions. To use prompt caching and a persistent system prompt across many runs, you want the API or the Anthropic Console.

Why does Claude misread my Figma export?

Almost always because you gave it no context. The model tries to reason from pixels alone. Tell it what design system it is looking at, what components exist, and what good looks like, and the answers change completely.

How many examples should I give Claude?

Start with three to five hand-labeled gray cases. The easy ones are not worth your time. The hard ones are where examples earn their cost.

How to Prompt Claude to Audit a Figma Design

Five versions of one audit prompt

01
Bare prompt

No context. Claude guesses the domain.
02
Task + tone

Domain locked. Confidence calibrated.
03
System prompt

Tokens, components, rules cached.
04
Analysis order

Form before sketch. Components first.
05
Output format

XML tags. Parseable. No preamble.

What this guide covers

Most prompts fail the same way: you gave the model nothing to work with, then blamed the model.

This guide shows the fix. I take a bare prompt asking Claude to review a Figma screenshot, and walk it through five versions until it produces a clean, structured audit you could paste into a ticket. The method is the same one Anthropic’s Applied AI team uses internally. I am translating it from a Swedish car insurance form (their example) into a design audit (your job).

The setup

You have a screenshot of a Figma frame. You want Claude to tell you:

Which tokens are used correctly
Which values look hardcoded
Which components drift from the design system
What to fix first

Five iterations. Each one fixes something specific.

Version 1: the bare prompt

Review this design and tell me what is wrong.
[screenshot attached]

What you get: a generic critique that mentions spacing, contrast, and “consider using a design system.” Claude is guessing because you gave it nothing to anchor on. This is the equivalent of the Anthropic example where Claude thought a Swedish car form was a skiing accident.

Version 2: add task context and tone

You are reviewing a screen from a SaaS dashboard against our design system.
Stay factual. If you cannot tell whether something is correct, say so.
Do not invent token names you have not been shown.

[screenshot attached]

What changes: Claude stops making things up and starts flagging what it cannot verify. The output gets more cautious, which is what you want from an auditor.

Version 3: put the design system in the system prompt

Move the stable context (tokens, components, rules) into the system prompt. Move the screen into the user prompt. The system prompt is for things that never change. The user prompt is for things that do.

System prompt:

You are a design system auditor. You review screens against the system below.

<design_system>
  <tokens>
    color.background.primary: #FFFFFF
    color.background.surface: #F5F5F5
    color.text.primary: #171717
    color.text.secondary: #525252
    color.border.default: #E5E5E5
    spacing.xs: 4px
    spacing.sm: 8px
    spacing.md: 16px
    spacing.lg: 24px
    spacing.xl: 32px
    radius.sm: 4px
    radius.md: 8px
  </tokens>

  <components>
    Button: 40px height, radius.md, spacing.md horizontal padding. Three variants: primary, secondary, ghost.
    Input: 40px height, radius.sm, color.border.default 1px stroke.
    Card: radius.md, color.background.surface, spacing.lg internal padding.
  </components>

  <rules>
    No raw hex values in any element. All colors must reference a token.
    No off-grid spacing. All spacing is a multiple of 4px and should map to a token.
    No off-system component variants. If a button does not match the three variants above, flag it.
  </rules>
</design_system>

If a value cannot be matched to a token, flag it as <hardcoded>.
If a component does not match the system, flag it as <drift>.
If you cannot tell, say so.

User prompt:

Audit this screen.
[screenshot attached]

What changes: Claude stops asking what your design system looks like and starts comparing against it. The output names specific tokens. Hardcoded values get flagged.

Version 4: tell Claude the order of analysis

A human auditor does not look at colors, spacing, components, and accessibility all at once. Neither should Claude. Add a step-by-step block to the system prompt:

<analysis_order>
1. First, identify every component on the screen. List them.
2. For each component, check the variant against the system. Flag drift.
3. For each component, check the colors used. Flag hardcoded values.
4. Then check spacing. Flag off-grid values.
5. Last, check overall layout and accessibility.
</analysis_order>

What changes: the audit becomes structured and consistent. Claude stops mixing concerns. You can read the output top to bottom and act on it.

Version 5: lock the output format

Drop this at the end of the system prompt:

<output_format>
Wrap your final audit in <audit> tags.
Inside, use these sections in order:
  <components_found>
  <drift_issues>
  <hardcoded_values>
  <spacing_issues>
  <priority_fixes>

Each issue must include: what, where, recommended_fix.
Do not include preamble. Start directly with <audit>.
</output_format>

What changes: the output is now parseable. You can pipe it into a script, paste it into Linear, or post it to Slack. The preamble is gone.

What you have now

A repeatable prompt that turns a screenshot into a structured audit. The system prompt is reusable across every screen in the product. The user prompt is one line plus an attachment.

This is the method, not the magic. The magic is that the method works.

Beyond Figma: auditing live pages

A screenshot is fine when the design lives in Figma. When it lives at a URL, you have better options. Both let you audit the real thing, not a static image of it.

Firecrawl: clean markdown plus screenshot from any URL

Firecrawl takes a URL and returns clean markdown and a full-page screenshot. You feed both into the same audit prompt. Markdown gives Claude the actual text content (so it can flag copy issues, missing alt text, semantic structure). The screenshot gives Claude the visual layer.

This is the move when:

You are auditing a competitor or an inspiration site
You are checking a deployed staging URL
You want to audit dozens of pages without manually exporting each one

The user prompt becomes one line plus a Firecrawl-fetched bundle:

Audit this page using the design system rules in the system prompt.
<page_url>https://example.com/pricing</page_url>
<page_markdown>[firecrawl markdown]</page_markdown>
[full-page screenshot attached]

Playwright: programmatic screenshots at every breakpoint

Playwright automates a real browser. It can screenshot every component, every page, and every breakpoint without you clicking through. Pair it with this audit prompt and you have a script that audits your whole product on every deploy.

Two starter guides:

Automate Browser Testing Without Writing Code, the no-code path, good for designers running their first audit script
Automate Component Screenshots with Playwright, the component-level version, better for design system maintenance

The pattern: Playwright captures the screenshots, you batch them through the audit prompt, and the structured XML output lands in a markdown report or a Linear ticket. The audit goes from “I look at one screen for 20 minutes” to “the script audits 80 screens overnight.”

What to try next

Add 3 to 5 few-shot examples of past audits to the system prompt for tricky cases (see Few-Shot Examples for Component Naming)
Turn on extended thinking when an audit goes wrong, and read the scratchpad to see why (see Extended Thinking as a Design System Debugger)
Lift the system prompt into a reusable template (see Design System Work System Prompt)
Wire it into a real browser flow with Playwright or pull live pages with Firecrawl

Source

This walkthrough adapts the iterative method shown in Anthropic’s Prompting 101 session by Hannah and Christian from the Applied AI team. Their example was an insurance claim form. I translated the same five-version arc into design audit work because the underlying logic is identical: invariant context goes in the system prompt, variable content goes in the user prompt, order of analysis matters, output formatting closes the loop.

Finished this lesson?

Mark it complete to track your progress through "Agentic Design Systems".

Lesson

13 / 16

Progress

81%

How to Prompt Claude to Audit a Figma Design

Bare prompt

Task + tone

System prompt

Analysis order

Output format

What this guide covers

The setup

Version 1: the bare prompt

Version 2: add task context and tone

Version 3: put the design system in the system prompt

Version 4: tell Claude the order of analysis

Version 5: lock the output format

What you have now

Beyond Figma: auditing live pages

Firecrawl: clean markdown plus screenshot from any URL

Playwright: programmatic screenshots at every breakpoint

What to try next

Source

Finished this lesson?

Build a Component Audit Skill for Claude Code

Audit 500 Components in 10 Minutes

Bare prompt

Task + tone

System prompt

Analysis order

Output format

What this guide covers

The setup

Version 1: the bare prompt

Version 2: add task context and tone

Version 3: put the design system in the system prompt

Version 4: tell Claude the order of analysis

Version 5: lock the output format

What you have now

Beyond Figma: auditing live pages

Firecrawl: clean markdown plus screenshot from any URL

Playwright: programmatic screenshots at every breakpoint

What to try next

Source

Finished this lesson?

Create an account to continue

Read this next

Audit a Design System in 30 Minutes

Build a Component Audit Skill for Claude Code

Audit 500 Components in 10 Minutes