Why Most AI Code Reviewers Miss the Point

Gijs van de Nieuwegiessen•June 4, 2026•5 Min Read

TL;DR: The AI agent sees the diff. It does not see the task. That missing context is the root cause of most painful review cycles. The fix: include the task context.

Most AI code review tools are perfectly fine at what they do.

They catch obvious bugs. They flag naming inconsistencies. They push back on approaches they would have done differently. For a lot of reviews, that is enough.

But there is one thing they consistently miss. It causes more friction than anything else in the review cycle.

They cannot see why a decision was made.

The problem is not the reviewer

When a developer makes a trade-off, they usually have a reason.

Maybe the cleaner approach would have required touching three other parts of the system they were told not to change. Maybe the "obviously better" pattern does not work with the auth middleware that was added six months ago. Maybe the task had a hard deadline and a specific scope, and they made a call.

The reviewer sees none of that.

The reviewer sees an approach that looks wrong and flags it. An AI agent picks up those comments and applies the suggested fix — reverting approach A, switching to approach B. Except approach A was correct. The reviewer just did not know why.

This is not a judgment problem. The reviewer is not bad at its job. It is working with incomplete information. It sees the diff. It does not see the task.

Sean Goedecke described it well: the highest-impact code review comments come not from the diff itself, but from understanding the rest of the system.¹ Understanding the reason the code exists in the first place is the part that is almost always missing.

The diff is the wrong unit of review

Here is the structural problem.

Code lives in git. Context lives in your project management tool. These two things almost never talk to each other at review time.

When an AI reviewer opens a pull request, it sees filenames, line changes, and a PR description that usually says something like "fixes #423." What #423 actually was, what constraints were in play, what alternatives were considered — none of that is in the diff.

So it reviews the code it can see, against the patterns it knows, and flags things that look wrong from that angle. Often those things were not wrong. They were correct given context the reviewer did not have. The PR discussion thread is where that context could live — but only if someone put it there.

The fix is not complicated

You do not need to rebuild your process.

You need to connect the review to the task it came from.

The simplest version: include the task ID in the PR title. Not buried in a description, not in a comment — in the title. Something like [OH-1234] Refactor payment retry logic. That makes it a one-click lookup for any reviewer agent to understand what this was supposed to do, what was in scope, and what was explicitly out of scope.
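As a rough illustration of why the title convention helps, here is a minimal sketch of how a reviewer agent might read the task ID out of a PR title. The bracketed `[OH-1234]` shape comes from the example above; the regular expression and the function name `extract_task_id` are assumptions made for this sketch, not part of any One Horizon tooling.

```python
import re
from typing import Optional

# Assumed convention: a bracketed key such as "[OH-1234]" at the very start
# of the PR title. Adjust the pattern if your tracker uses a different shape.
TASK_ID_PATTERN = re.compile(r"^\s*\[([A-Z][A-Z0-9]*-\d+)\]")


def extract_task_id(pr_title: str) -> Optional[str]:
    """Return the task ID from a PR title, or None if the title has none."""
    match = TASK_ID_PATTERN.match(pr_title)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_task_id("[OH-1234] Refactor payment retry logic"))
    print(extract_task_id("fixes #423"))
```

A title without the prefix returns `None`, which gives the reviewer a clear signal that it is working without task context rather than silently guessing.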

The better version: use a tool that pulls that context automatically. This is what One Horizon does.

When your tasks live in One Horizon and your pull requests reference an initiative, bug, or task ID, a reviewer with access to the One Horizon MCP can pull the full task record into their review context: what the task was, what constraints were noted, what decisions were logged.

The reviewer stops guessing and starts reviewing against intent.

We use a review prompt for this ourselves, paired with a repo skill and One Horizon skills that pull task context directly into the review. When the reviewer has access to the initiative or bug record, with its constraints and intent, the review is grounded in what the feature was supposed to do, not just what the code looks like. That changes what gets flagged and what gets ignored.
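To make the idea concrete, here is a minimal sketch of putting task context in front of the diff before a reviewer sees it. Everything in it is hypothetical: the `TaskRecord` fields, the `build_review_prompt` function, and the prompt wording are assumptions for illustration. It is not our actual review prompt, and it does not call the One Horizon MCP. It only shows the shape of the input once the task record has been fetched by whatever means you use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskRecord:
    """Hypothetical shape of a task record; real fields will differ."""
    task_id: str
    summary: str
    constraints: List[str] = field(default_factory=list)
    decisions: List[str] = field(default_factory=list)


def _bullets(items: List[str]) -> str:
    """Render a list as plain-text bullets, with a marker for empty lists."""
    if not items:
        return "- (none recorded)"
    return "\n".join(f"- {item}" for item in items)


def build_review_prompt(task: TaskRecord, diff: str) -> str:
    """Combine task context and the diff into one review prompt."""
    return (
        f"Task {task.task_id}: {task.summary}\n\n"
        f"Constraints:\n{_bullets(task.constraints)}\n\n"
        f"Logged decisions:\n{_bullets(task.decisions)}\n\n"
        "Review the diff against the task above. Do not flag a choice "
        "that a listed constraint or decision already explains.\n\n"
        f"Diff:\n{diff}"
    )


if __name__ == "__main__":
    task = TaskRecord(
        task_id="OH-1234",
        summary="Refactor payment retry logic",
        constraints=["Do not change the auth middleware"],
    )
    print(build_review_prompt(task, "(diff goes here)"))
```

The design choice worth noting is the order: the task, its constraints, and its logged decisions come before the diff, so the reviewer reads the intent first and the code second.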

What changes when context travels with the code

Fewer false comments.

That is the most immediate effect. The reviewing agent stops flagging decisions that were correct given constraints it did not know about. An AI agent picking up those comments stops reverting things that should not be reverted. The whole loop gets tighter.

But there is a second effect that matters more over time.

The review becomes useful signal instead of noise. When the reviewer can see the task, it reviews against the actual goal. It flags real risks — the kind that only make sense once you understand what the feature was supposed to do. That is the review that actually improves the code.

Most teams try to fix this by adding more process: better PR templates, mandatory descriptions, structured commit messages. It helps at the edges.

The root problem is that the reviewer is missing the context that would let them do the review well.

Give them that, and most of the friction resolves itself.

That is the missing layer. One Horizon is built to be exactly that.

¹ Sean Goedecke. "Mistakes I see engineers making in their code reviews." https://www.seangoedecke.com/good-code-reviews/
