What Is AI Slop?

AI slop is not "anything made with AI." It is the stuff that looks finished before anyone has actually understood it: cheap to generate, expensive to verify, and usually someone else's review problem.
A pull request lands from an agent.
At first, it all looks reassuring: the summary is clean, the file names make sense, the tests are green.
Ten minutes later, you realize it duplicated an abstraction you already had, ignored a constraint buried in an old incident, and added 400 lines nobody wants to own.
From a developer's point of view, that's AI slop. Not because the code was AI-generated. Because it looked done long before it was actually understood.
The term stuck because it names something real. It gives people a useful name for a pattern that now shows up everywhere: content, code, specs, tickets, status updates, and summaries that look finished from a distance but fall apart the moment someone has to rely on them.
AI slop is output with responsibility stripped out
Merriam-Webster now explicitly defines one sense of "slop" as low-quality digital content produced, usually in quantity, by AI.1
Useful, but I think Simon Willison's framing is better for actual work: not all AI-generated content is slop. It becomes slop when it is mindlessly generated, barely reviewed, and pushed onto other people who did not ask for it.2
That difference matters.
Bad content existed long before LLMs. Spam existed. Clickbait existed. Empty consulting prose existed too.
AI changed the math.
Now you can make something that looks credible in seconds. So the temptation is no longer "can I make this?" It is "can I get away with shipping this without thinking very hard?"
For developers, this is where the definition gets practical. Slop is not just a bad blog post or a cursed Facebook image.
Here is a boring but very real version of it. An agent opens a PR called "improve retry handling."
The description says it standardizes resilience across services. The tests pass. The diff is tidy, but the repo already has a retry helper. The patch adds a second one. It also touches a billing path where retries are risky, and the PR never mentions idempotency once.
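To make that concrete, here is a hedged sketch of the duplication pattern, in Python. Every name here is invented for illustration; the point is that the second helper re-implements an existing one, and that blindly retrying a non-idempotent billing call is exactly the risk the PR never mentions.

```python
import time

# Existing helper, e.g. somewhere like shared/http — callers already
# rely on its backoff contract.
def retry(fn, attempts=3, base_delay=0.1):
    """Retry fn with exponential backoff, re-raising on the final failure."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** i)

# What the hypothetical agent PR adds: a second, near-identical helper.
# Worse, if it wraps a billing call that is not idempotent, a retry after
# a timeout can double-charge — the exact concern the PR never raises.
def retry_with_resilience(fn, max_tries=5):
    for i in range(max_tries):
        try:
            return fn()
        except Exception:
            if i == max_tries - 1:
                raise
```

Nothing in the new helper is wrong in isolation, which is why the diff reads clean; the problem only appears when you know the repo already has `retry` and that one call site must never be retried.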
Nothing in it is obviously broken. That's the trap.
You still have to read every line because the output sounds more trustworthy than it deserves.
It is also:
- the PR description that sounds thoughtful but says nothing about why the change exists
- the generated spec that expands a small problem into three pages of padded prose
- the bug summary that reads smoothly but cannot be tied back to logs, tasks, or an actual incident
- the code patch that looks plausible but does not fit the repo it landed in
The common trait is not the tool; what is missing is judgment.
The internal version is the one that hurts
The public version of AI slop is annoying; the internal version is expensive.
It wastes review time, pollutes context, and creates the feeling of progress without much actual progress underneath it.
I do not think there is a paper out there formally measuring "slop," but there are adjacent signals pointing in the same direction.
DORA's 2025 report makes the broad systems point cleanly: AI mostly acts as an amplifier. It magnifies the strengths and weaknesses of the system you already have.3
GitClear's large-scale analysis found more churn and less reuse as AI-assisted coding rose, which is pretty close to what you'd expect when generation gets cheap and review discipline does not rise with it.4
Put those two together and the risk is pretty obvious.
My read is simple: if your team already has vague specs, weak ownership, and poor traceability, AI will not fix that. It will help you produce more artifacts with the same weaknesses baked in.
That's the real developer problem with slop: it is cheap to produce and expensive to absorb.
One person saves fifteen minutes. Three other people lose an hour trying to verify what they were handed.
How to spot AI slop in a dev workflow
The easiest way to spot slop is to stop asking whether it was made with AI and start asking what kind of review burden it creates.
I would use five tests.
1. It sounds local, but ignores local reality
This is the classic one.
The output uses the right language but misses the repo's actual abstractions, naming patterns, or historical constraints. It can talk fluently about caching, auth, retries, or migrations while still violating the exact conventions your system depends on.
That's not intelligence. It's syntax cosplay. If the summary could fit any repo, it probably fits none.
2. It is specific in syntax and vague in intent
AI slop often looks impressively detailed until you ask one level deeper.
Why this approach? What tradeoff did we choose? What previous decision does this respect? What can it safely touch and what must stay stable?
If the artifact gets blurry the moment those questions show up, you are not looking at finished work. You are looking at a polished first draft.
3. It expands the surface area faster than it sharpens the result
Slop loves volume.
More helpers. More wrapper functions. More bullets. More sections. More tickets. More "comprehensive" documentation.
The artifact gets bigger, so everybody gets to pretend the work got sharper.
But if the output keeps getting longer while the problem stays the same size, somebody is paying for that later.
Usually the reviewer.
4. It has no provenance
Good work leaves a trail behind it.
What ticket triggered this? What incident or request does it answer? Which logs, docs, customer notes, or commits support the summary? Which discussion made the tradeoff?
Slop has none of that.
It appears as a free-floating artifact with no evidence attached.
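The provenance test is even roughly machine-checkable. Here is a minimal sketch; the reference formats (JIRA-style ticket IDs, `INC-` incident IDs, commit SHAs) are assumptions for illustration, not any real tracker's conventions.

```python
import re

# Patterns for the kinds of references a provenance trail usually contains.
# These formats are illustrative assumptions, not a standard — adapt them
# to whatever your tracker and VCS actually emit.
PROVENANCE_PATTERNS = [
    r"\b[A-Z]{2,}-\d+\b",    # ticket IDs like PROJ-123
    r"\bINC-\d+\b",          # incident IDs
    r"\b[0-9a-f]{7,40}\b",   # commit SHAs (crude; can false-positive)
]

def has_provenance(text: str) -> bool:
    """Return True if the artifact references at least one traceable source."""
    return any(re.search(p, text) for p in PROVENANCE_PATTERNS)
```

A check like this will not tell you the links are honest, but it cheaply flags the free-floating artifact with no evidence attached at all.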
5. It makes review harder, not easier
This one matters most.
If I need to reverse-engineer assumptions, diff through avoidable boilerplate, or throw most of it away to salvage the useful 20%, the tool did not save time. It moved the labor.
That's slop.
What good AI use looks like instead
The alternative is not "never use AI". That would be shallow too. A better rule is simpler: use AI to compress labor, not to outsource judgment.
Good AI-assisted work usually has a few visible traits.
- the scope is clear before generation starts
- the output is smaller, not just longer
- assumptions are stated instead of hidden
- claims can be checked
- the reviewer can understand the why without playing detective
A good AI-assisted PR usually feels narrower, not broader.
It says: use the existing helper in shared/http, do not touch the billing path, this came from incident X and task Y, and here is what the reviewer should verify.
Very different from "comprehensive refactor of retry handling" followed by 600 lines of fresh boilerplate.
So I do not think the best AI workflow is the one that generates the most.
It is the one that leaves the cleanest trace:
- smaller diffs
- clearer specs
- fewer invented abstractions
- sharper summaries
- better links between tasks, decisions, commits, and outcomes
If AI is genuinely helping, the artifact should become easier to trust.
The standard I would use on a dev team
If an AI-generated artifact has no owner, no source trail, no constraint story, and no review path, reject it.
Not "clean it up later." Reject it.
That's the bar.
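As a rough sketch, that bar can even be expressed as a gate. The field names below are assumptions about how a team might structure artifact metadata, not a prescription; the point is that the rejection rule is mechanical once the fields exist.

```python
# Hypothetical reviewer gate for AI-generated artifacts. Field names are
# assumptions about how your team might structure PR or spec metadata.
REQUIRED_FIELDS = ("owner", "sources", "constraints", "review_path")

def review_gate(artifact: dict) -> tuple[bool, list[str]]:
    """Accept only if every accountability field is present and non-empty."""
    missing = [f for f in REQUIRED_FIELDS if not artifact.get(f)]
    return (len(missing) == 0, missing)
```

The useful part is not the three lines of logic; it is that "no owner, no source trail, no constraint story, no review path" stops being a vibe and becomes a list you can point to when rejecting.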
AI is supposed to remove drudge work while keeping judgment visible, not flood the system with plausible-looking work. If the output makes review harder, it is not acceleration. It is waste.
If you are adopting agents, build the system that keeps specs, tasks, commits, PRs, and decisions connected tightly enough that the output stays legible.
That's the layer we care about at One Horizon.