
LLM code review doesn't replace human review. It multiplies it.

The position paper that ships with v1.0 — what an LLM catches, what it doesn't, and why a human reviewer's job changes shape rather than disappearing.

· Muhammet Şafak

The two questions I get most often since shipping v1.0: “Is this going to put human reviewers out of a job?” and “Can a CLI really stand in for a code review practice that took years to build?”

Both are the wrong question. They share the same flawed premise — that an LLM either does code review or doesn’t, either replaces a human or doesn’t.

The reality is less dramatic and considerably more useful. An LLM doesn’t replace code review. It multiplies it. Those are two different things, and this post is a statement of intent meant to draw the line between them clearly.

What a human reviewer catches

The most valuable thing a senior reviewer catches is “this function answers the wrong question.” A tool that inspects code mechanically catches internal inconsistencies — it doesn’t catch a solution to the wrong problem.

When a reviewer opens a PR, three layers of inspection run in their head:

  1. Do these lines do correctly what they are doing? (mechanical correctness)
  2. Is what they’re doing the work they should be doing? (intent correctness)
  3. Is this work right for this product at this moment? (strategic correctness)

An LLM is competent at the first and partially competent at the second. It is not competent at the third. The answer to the third layer doesn’t live in the code; it lives in the roadmap that got decided last Tuesday, in the two customer escalations from last month, in the three priorities the team chose for this quarter. No prompt carries that context at sufficient density.

Mentoring a junior developer, spending an hour pair-programming over five lines of code, making the call that “this corner is worth refactoring”: none of these are an LLM’s job. That’s the senior reviewer’s real value. The rest, as we’ll see, is noise.

What an LLM catches

The other half of the honesty: a human reviewer gets around to a PR roughly once every four hours. An LLM returns its findings on the same PR in roughly eight seconds.

When that speed difference translates into a quality difference — especially in the “small, frequently repeated mistakes” class — the LLM’s consistency reaches a level no human reviewer can match. Three concrete examples:

  • Pattern violation. The repo uses pgx everywhere; one file just slipped back to database/sql. A reviewer catches this, but starts to slip after the 30th PR. The LLM catches it on the 3,000th with the same attention.
  • Typical bug shape. A panic instead of a return immediately after an err != nil check. A tired reviewer misses it. The LLM doesn’t get tired.
  • Consistency drift. The codebase uses slog throughout, but a new file slipped in a log.Printf. Detail-level stuff — exactly where a human reviewer’s attention naturally fades, and where the LLM stays sharp.
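
To make the second bullet concrete, here is a minimal Go sketch of that bug shape next to its fix. The function names parsePortPanics and parsePort are hypothetical, invented for this illustration; they are not taken from CommitBrief or from any real repository.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePortPanics shows the bug shape: the error is checked, but the
// branch panics instead of returning, so bad input crashes the process.
// (Hypothetical example function.)
func parsePortPanics(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		panic(err) // the line a tired reviewer skims past
	}
	return port, nil
}

// parsePort is the fix: wrap the error and hand it back to the caller.
// (Hypothetical example function.)
func parsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	return port, nil
}

func main() {
	// Only the fixed version is called here; the panicking one would
	// terminate the program on the same input.
	if _, err := parsePort("80a"); err != nil {
		fmt.Println("handled:", err)
	}
}
```

Both versions compile and both pass a happy-path test, which is why this class of mistake tends to survive until someone, or something, reads the error branch carefully.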

These details actively prevent a senior reviewer from doing the work that matters. The reviewer wants to focus on “is the business layer right?” but their eye keeps drifting to “where is this half-finished import?” Offloading that mechanical checking to the LLM both lightens the reviewer’s load and frees attention for their deeper observations.

The zeroth-reviewer position

This is why I don’t position CommitBrief as a “second reviewer.” I position it as the zeroth reviewer.

When you say “second reviewer” you logically imply it absorbs part of the first reviewer’s job — which is the start of replacing humans. A zeroth reviewer, by contrast, is the filter that runs before any human reviewer is involved at all — before git push, before the PR is opened, on your own machine, with your own rules.

The zeroth reviewer’s job:

  • Catch mechanical inconsistencies
  • Flag pattern violations
  • Warn on typical bug shapes
  • Answer “is there anything worth talking about in this PR?” in eight seconds

The senior reviewer’s job:

  • Confirm the code solves the right problem
  • Judge fit between this solution and this project at this moment
  • Mentor the junior developer
  • Spot the refactor opportunity

The second list cannot be automated. The first list mostly can.

Concretely: CommitBrief defaults to --staged scope. That isn’t accidental. Making “changes not yet committed” the default is the architectural manifestation of the zeroth-reviewer position:

git add internal/auth/session.go
commitbrief

The output of those two lines is not a review. It’s a self-review. Before the PR opens, before the commit is locked in, on your own machine. With --json output the same flow plugs into CI — you can wire a gate that fails on a severity threshold. Even that isn’t bypassing the human reviewer; it’s pre-answering “is this PR worth opening?”

How the reviewer’s role changes

When a team wires the zeroth reviewer in correctly, the senior reviewer’s workload doesn’t shrink. It changes. The reviewer who used to write “fix the import order” comments now writes “you used context.Background() here instead of the incoming context.Context — did you think through the cancellation behavior?”

That is not less work. It’s higher-value work that requires more thought. The senior’s time on the PR stays roughly the same; a larger share of that time goes to something only the senior can actually do.

There’s a second dimension worth naming: code review burnout. The senior’s day-after-day fatigue of writing the same comment in slightly different words is a real problem, and it shows up in teams as a quiet productivity loss. When the zeroth reviewer takes over the mechanical repetition, the work left over for the senior is less corrosive. That’s an underrated win for keeping team morale sustainable.

And — most importantly — it’s a win for the junior developer. When a junior hears “cancellation behavior” instead of “fix the import order,” they’re being mentored. Knowledge is being transferred. Review comments are teaching someone something again.


LLM review doesn’t replace human review. It raises the focus of human review.

I designed CommitBrief with that stance. That’s why it’s a CLI, not a GitHub App. That’s why the default scope is --staged. That’s why OUTPUT.md is open to personal preference. That’s why you can choose between six providers — four API providers plus subprocess wrappers around Claude Code and Gemini CLI — and that’s why the Ollama path is fully offline.

This post is a position statement, not a tool pitch. Over the coming months I’ll work through what that position implies in practice — the pre-commit pattern, COMMITBRIEF.md design, the three-layer filter, the multi-provider strategy. But every one of those posts will lean on this stance.

A concrete first step: brew install CommitBrief/tap/commitbrief, then commitbrief --staged before you open your next PR. Tell me — in the issue tracker — which finding categories you find valuable and which you treat as noise. That signal is what will steer this series.

