The model is the only reviewer who reads every line of the file. Rohan Kartik.

The model is the only reviewer who reads every line of the file.

Most teams I have worked with treat AI in design as a generation problem. Generate the screen, generate the icon, generate the copy. The interesting work is somewhere else. Generation is fast and shallow. Review is slow and deep, and review is where products live or die. Generation gets the headline. Review is where the model actually earns its keep.

I ran a WCAG 2.1 AA audit on a fintech file recently, with the model and Figma’s MCP doing the reading. One hour. Fifteen issues surfaced. Two critical, six high, four medium, three low.

The critical findings were embarrassing in the way only pre-launch findings can be. A button component still had the placeholder text “Label” instead of “Proceed.” A screen reader would have read it as “Label button.” A privacy policy screen contained legal text sourced from a third-party conference site, with references to a Finuture Conference LLC still in the body. Both should have been caught by a human, and they were not, because humans get tired and humans do not read placeholder text the way a screen reader will.

The high-severity findings were systemic. One neutral token, used across forty screens, was passing through every placeholder, every legal consent line, every “watch video” link. On white, it gave a 3.3:1 contrast ratio. WCAG requires 4.5:1. A token swap, applied across the system, fixed three findings in one move.

The model did not decide what to do about any of these. It did not decide whether the Proceed button should be filled or outlined. It did not name the next iteration. The senior reviewer in the room still does that work. What the model did was read the file completely, without fatigue, without stake, and without the human tendency to skip placeholder text after the fortieth screen.

The honest limitation is that the model is only as sharp as the question. A bad prompt yields a generic checklist. The skill that survives the augmentation is knowing what to ask.

The shift is to stop measuring AI in design by what it can generate and start measuring it by what it can review. Generation gets attention. Review keeps the product from shipping with placeholder text in the button.