Make Your AI Code-Review Like a Senior

Four ordered review passes, real severity tiers, and the rule that makes an AI review feel senior instead of a wall of nitpicks.

June 12, 2026 | 10 min read | Tags: ai code review, code review, claude code, cursor, developer workflow

MaxtDesign

Engineering

[Image: Overhead macro of a set of stacked metal document sorting trays on a dark slate desk, papers filed into separate ordered slots, one slot catching cool directional light.]

Ask most AI tools to review a pull request and you get a flat list. Twenty bullet points, all formatted the same, all sounding equally urgent. Rename this variable. Add a comment here. Consider extracting a helper. Somewhere in the middle, buried between a spacing nit and a suggestion to add a docstring, is the line that actually matters: this endpoint trusts a user-supplied ID and runs a query with it. The model found it. It just filed it next to the docstring, so you skimmed past it.

That is not how a senior engineer reviews code. A senior does not hand you a list. They triage. They tell you the one thing that stops the merge, and they tell it to you first. The nitpicks come later, if they come at all. The skill is not finding more issues. It is ordering them so the important one cannot hide.

The good news is that this is a method, not a personality trait, and you can hand the method to an AI. Here is the one I use. It works in Claude Code, Cursor, Codex, or a plain chat window with a diff pasted in.

The flat-list problem

The default failure of AI review is not that it misses things. Modern models are good at spotting issues. The failure is that they present every finding at the same altitude. A SQL injection and an inconsistent quote style arrive as sibling bullet points, and the reader has to do the triage the reviewer should have done.

Worse, length reads as thoroughness. A review with twenty items feels more diligent than a review with two, so the model is quietly rewarded for padding. You end up scrolling a wall of low-value polish suggestions, your attention thins out, and the one finding that should have blocked the merge gets the same three seconds as "this could be a const."

Fixing this is not about a smarter model. It is about giving the model a structure that makes hiding impossible: a fixed order of passes, and severity tiers that force a verdict.

Review in lanes, in order

A senior review runs as four passes, always in the same order, and the order is the whole point:

1. Security: auth and authorization, input validation, secrets in the diff, injection, SSRF, anything that trusts data it should not. This runs first because a security hole outranks every other category of problem combined.
2. Correctness: logic errors, off-by-ones, unhandled edge cases, error paths that swallow failures, concurrency mistakes. The code does what it says only if this pass is clean.
3. Performance: N+1 queries, allocations in hot paths, blocking calls on the request thread, accidental quadratic loops. Real, but it never outranks a correctness bug.
4. Readability: naming, structure, dead code, control flow that takes three reads to follow. This pass runs last because it is the cheapest to defer and the easiest to let crowd out the rest.

The ordering is not decoration. It encodes a priority a senior holds without thinking about it: a naming nit never gets to distract from an auth hole. When the passes are unordered, the model is free to lead with the readability finding because it is the easiest to articulate. When the order is fixed, security has to be addressed before readability is even mentioned.
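If you post-process a model's findings yourself, the fixed order is easy to enforce in code. The sketch below is a minimal illustration, not part of any tool: the `Finding` class, its fields, and the sample file paths are hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Fixed lane order from the article; position in the list is the priority.
LANES = ["Security", "Correctness", "Performance", "Readability"]


@dataclass
class Finding:
    lane: str      # one of LANES
    location: str  # hypothetical "file:line" string
    note: str


def order_by_lane(findings: list[Finding]) -> list[Finding]:
    """Sort findings so earlier lanes always come first.

    sorted() is stable, so findings inside one lane keep their input order.
    """
    return sorted(findings, key=lambda f: LANES.index(f.lane))


findings = [
    Finding("Readability", "api/users.py:12", "rename `d` to `user_data`"),
    Finding("Security", "api/users.py:40", "query built from a user-supplied ID"),
    Finding("Performance", "api/users.py:55", "N+1 query inside the loop"),
]

ordered = order_by_lane(findings)
for f in ordered:
    print(f"[{f.lane}] {f.location}: {f.note}")
```

However the model happened to list them, the security finding now leads and the rename trails.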

Severity tiers force a verdict

Within each pass, every finding gets a tier. Four of them, and the names matter because they map to an action, not a feeling:

Block: do not merge. This is the only tier that stops the train. If a review has zero Blocks, nothing in it is a hard stop: the change can merge once any Fix-before-merge items are handled.
Fix-before-merge: must be addressed, but it is not a hard stop on its own. The distinction from Block is real: Block means "this is dangerous"; Fix-before-merge means "this is wrong, and the fix is easy."
Nit: optional polish. Take it or leave it.
Praise: what was done well, and why.

Tiers do the work that a flat list refuses to do: they commit. A model that has to label something Block or Nit can no longer hide behind "you might want to consider." It has to take a position on whether this is allowed to ship.
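One way to see why tiers force a verdict is to write the mapping down. This is a sketch under assumptions: the `Tier` enum and `verdict` function are hypothetical names, and the three verdict words are the ones used by the prompt scaffold later in this article.

```python
from enum import Enum


class Tier(Enum):
    BLOCK = "Block"
    FIX_BEFORE_MERGE = "Fix-before-merge"
    NIT = "Nit"
    PRAISE = "Praise"


def verdict(tiers: list[Tier]) -> str:
    """Collapse the tiers found in a review into a one-word verdict."""
    if Tier.BLOCK in tiers:
        return "Block"
    if Tier.FIX_BEFORE_MERGE in tiers:
        return "Fix-before-merge"
    # Only Nits and Praise are left, and neither holds up a merge.
    return "Approve"


print(verdict([Tier.NIT, Tier.PRAISE, Tier.FIX_BEFORE_MERGE]))
```

The function has no branch for "you might want to consider." Every finding lands in a tier, and the tiers decide the outcome.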

The lane method

A senior works top to bottom and stops you at the first lane that matters. The order is the product: a security hole outranks a naming nit every time, and nits stay hidden until every Block is cleared.

The rule that makes it senior

Here is the move that separates a senior review from a thorough one: if there is a Block, the nits do not get shown at all.

Think about why. If your change has a SQL injection in it, you do not care that a variable could be renamed. Showing you the nit alongside the Block actively hurts, because it dilutes the one message that matters into a list of five, and now you are deciding which of five things to fix instead of fixing the one thing that is dangerous. A senior would not even mention the rename. They would say "stop, this is exploitable, fix that, come back."

So the rule is mechanical: when any Block exists, suppress every Nit. Output one line in their place, something like "Nits suppressed, address Blocks first," and move on. The nits are not gone; they are just not your problem yet. Once the Blocks are cleared and you ask for another review, the polish suggestions come back. This single rule does more to make AI review feel senior than any amount of model tuning, because it enforces the one thing juniors and untuned models both get wrong: they say everything they noticed instead of only what you need right now.
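Because the rule is mechanical, it fits in a few lines. The sketch below assumes findings arrive as hypothetical `(tier, message)` pairs. It prints the suppression line only when at least one Nit was actually hidden, which is a design choice of this example rather than something the method dictates.

```python
SUPPRESSED = "Nits suppressed, address Blocks first."


def render(findings: list[tuple[str, str]]) -> list[str]:
    """Return the lines to show, hiding every Nit when any Block exists."""
    has_block = any(tier == "Block" for tier, _ in findings)
    hidden_nits = has_block and any(tier == "Nit" for tier, _ in findings)

    lines = [
        f"{tier}: {message}"
        for tier, message in findings
        if not (has_block and tier == "Nit")
    ]
    if hidden_nits:
        lines.append(SUPPRESSED)
    return lines


review = [
    ("Block", "orders.py:88 builds SQL from a user-supplied ID"),
    ("Nit", "orders.py:12 variable could be renamed"),
    ("Praise", "upload path handles the partial write case"),
]
for line in render(review):
    print(line)
```

Remove the Block from `review` and the Nit comes back, matching the behavior described above.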

Always name a strength

One more rule, and it is not a courtesy. Every review names at least one specific thing the code did well, and specific is the operative word. Not "looks good." Something like "the error handling on the upload path covers the partial write case, which is the one most people miss."

Praise that specific is a signal that the reviewer actually read the code rather than pattern-matching for problems. It also tells you what to keep doing, which a list of complaints never does. The anti-pattern here is the review that only criticizes. It trains you to dread reviews and teaches you nothing about what good looks like in this codebase. Force the strength, and make it earn its place.

A prompt scaffold you can paste today

None of this needs special tooling. Here is the method as a prompt you can drop in front of any diff, in any AI tool, right now:

Review this diff as a senior engineer. Run four passes in this exact
order and do not reorder them: Security, Correctness, Performance,
Readability.

For each finding, assign one tier:
- Block (do not merge)
- Fix-before-merge (must address, not a hard stop)
- Nit (optional polish)
- Praise (something done well, be specific)

Rules:
1. Cite file and line for every finding.
2. If ANY Block exists, do not show Nits at all. Replace them with the
   single line "Nits suppressed, address Blocks first."
3. End with a one-word verdict: Approve, Fix-before-merge, or Block.
4. Always include at least one specific Praise.

Diff:
<paste your diff here>

That scaffold alone will change the shape of every review you get. The findings stop arriving as a flat list and start arriving as a decision. If you want the per-instruction reasoning behind why this phrasing works (the constraint stacking, the explicit output shape), our companion piece on prompt engineering for code covers the patterns underneath it.
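If you would rather script the scaffold than paste it by hand, a small template function is enough. This is a minimal sketch: `build_review_prompt` is a hypothetical helper name, and sending the result to a model is deliberately left out because that depends on the tool you use.

```python
REVIEW_PROMPT = """\
Review this diff as a senior engineer. Run four passes in this exact
order and do not reorder them: Security, Correctness, Performance,
Readability.

For each finding, assign one tier:
- Block (do not merge)
- Fix-before-merge (must address, not a hard stop)
- Nit (optional polish)
- Praise (something done well, be specific)

Rules:
1. Cite file and line for every finding.
2. If ANY Block exists, do not show Nits at all. Replace them with the
   single line "Nits suppressed, address Blocks first."
3. End with a one-word verdict: Approve, Fix-before-merge, or Block.
4. Always include at least one specific Praise.

Diff:
{diff}
"""


def build_review_prompt(diff: str) -> str:
    """Fill the scaffold with a diff; the caller sends it to a model."""
    if not diff.strip():
        raise ValueError("empty diff: nothing to review")
    # str.replace instead of str.format, so braces in the diff are left alone.
    return REVIEW_PROMPT.replace("{diff}", diff)


prompt = build_review_prompt("--- a/x.py\n+++ b/x.py\n+cache = {}\n")
print(prompt)
```

You could feed it the output of `git diff` captured however you prefer.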

Where a single prompt stops being enough

The scaffold is genuinely useful, and for most changes it is all you need. But there is a class of diff where one reviewer, however well-prompted, is the wrong shape for the job: the high-stakes change. A diff that touches authentication, payments, a webhook handler, secret management, or a database migration is not a single-reviewer problem. In a real team, those changes pull in a second and third set of eyes on purpose, because the cost of missing something is measured in incidents, not nitpicks.

The senior instinct is to escalate based on concrete triggers, not vibes. Auth or payment code in the diff. A migration file present. More than a few hundred lines changed, or more than a handful of files touched. When one of those fires, a single reviewer should stop and convene a panel: the reviewer chairs, a security pass and a performance pass run alongside, and the change does not merge until all three agree. That is the same multi-agent pattern that makes orchestration worth its overhead, applied to the one place it clearly pays off: the review that, if it is wrong, wakes you up at 2am.
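Concrete triggers can be checked before any reviewer runs. The sketch below is illustrative only. The article gives no exact numbers, so the thresholds (`MAX_LINES`, `MAX_FILES`) are hypothetical placeholders you would tune, and the path keywords are a crude guess at how sensitive code is named in a repository.

```python
from dataclasses import dataclass

# Hypothetical thresholds standing in for "a few hundred lines" and
# "a handful of files". Tune them for your own codebase.
MAX_LINES = 300
MAX_FILES = 5

# Assumed path keywords. Substring matching is crude: "auth" also matches
# "author.py", so treat a hit as a prompt to look, not as proof.
SENSITIVE_HINTS = ("auth", "payment", "webhook", "secret")


@dataclass
class DiffStats:
    paths: list[str]
    lines_changed: int


def escalation_triggers(diff: DiffStats) -> list[str]:
    """Return the reasons this diff should get a review panel, if any."""
    lowered = [p.lower() for p in diff.paths]
    triggers = []
    if any(hint in p for p in lowered for hint in SENSITIVE_HINTS):
        triggers.append("touches auth, payments, webhooks, or secrets")
    if any("migration" in p for p in lowered):
        triggers.append("migration file present")
    if diff.lines_changed > MAX_LINES:
        triggers.append(f"more than {MAX_LINES} lines changed")
    if len(diff.paths) > MAX_FILES:
        triggers.append(f"more than {MAX_FILES} files touched")
    return triggers


small = DiffStats(paths=["README.md"], lines_changed=3)
risky = DiffStats(paths=["src/auth/login.py", "db/migrations/0042_add_col.sql"],
                  lines_changed=40)
print(escalation_triggers(small))
print(escalation_triggers(risky))
```

An empty list means the single reviewer is enough. Anything else is the signal to convene the panel.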

You can wire this yourself. Run the review prompt, and when the diff trips one of those triggers, run a second focused security-only pass and a third performance-only pass, then reconcile the three. It is more steps, and on the changes that warrant it, the extra steps are the entire point.
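Reconciling the three passes can be as simple as letting the strictest verdict win. This sketch reads "does not merge until all three agree" as "merge only when every pass says Approve". The `reconcile` function and the pass names are hypothetical.

```python
# Verdict words from the scaffold, ranked from least to most severe.
SEVERITY = {"Approve": 0, "Fix-before-merge": 1, "Block": 2}


def reconcile(verdicts: dict[str, str]) -> str:
    """Return the strictest verdict across all review passes."""
    if not verdicts:
        raise ValueError("no verdicts to reconcile")
    unknown = [v for v in verdicts.values() if v not in SEVERITY]
    if unknown:
        raise ValueError(f"unrecognized verdict(s): {unknown}")
    return max(verdicts.values(), key=lambda v: SEVERITY[v])


panel = {
    "general": "Approve",
    "security": "Block",
    "performance": "Fix-before-merge",
}
print(reconcile(panel))
```

A single Block from any pass blocks the change, no matter how many Approves sit next to it.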

The method, packed

The lane method is one of the skills in the Senior Solo Coder Skillpack, a set of 31 AI skills for developers who ship solo. The version in the pack runs the escalation so you do not have to remember it: ask for a review on a diff that touches auth or payments and it pulls the security and performance passes into the same review on those exact triggers, so the high-stakes findings are in front of you before you sign off. You still decide what merges. It also reads a profile of your stack so the per-framework checks are already tuned to what you actually use. The point of the pack is that the skills cooperate: a reviewer that knows when to call for backup is worth more than one that waits to be asked. The lane method above is yours regardless. Paste the scaffold, get a review that triages instead of lists, and you have most of the value today.

