
Co-Change Analysis

"I forgot to update the test." "I missed the migration." "The serializer needed to change too." The class of bug that passes code review and CI and shows up in production three weeks later isn't a logic bug. It's a missing co-change. Scrubby catches them, automatically, by reading the same git history your team has been writing for years.

The class of bug it catches

Every codebase has files that should change together. When you update a database model, you probably need a migration. When you change an API endpoint, the client code likely needs updating too. These relationships aren't enforced by the compiler — they live in your team's collective knowledge and surface only when someone with enough context happens to be reviewing the PR.

When that someone isn't around, the half-finished change merges. The migration ships without the model update, or vice versa. CI passes. The bug appears in staging or, worse, in production. Then your team spends an afternoon figuring out what was missing, fixing it, and writing a postmortem that says "we should have caught this in review."

That entire failure mode is mechanical. Detecting it doesn't require judgment. It requires a list of file pairs that historically change together and a check that both sides of each pair are present in the changeset. That's what co-change analysis is.
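To make "mechanical" concrete, here is a minimal sketch of that check in Python. It is an illustration, not Scrubby's implementation: the function name, the file paths, and the shape of the pair list are all hypothetical.

```python
from typing import Iterable, List, Tuple


def missing_co_changes(
    changeset: Iterable[str],
    pairs: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (present, missing) for every pair with exactly one side changed."""
    changed = set(changeset)
    gaps = []
    for a, b in pairs:
        if a in changed and b not in changed:
            gaps.append((a, b))
        elif b in changed and a not in changed:
            gaps.append((b, a))
    return gaps


# Hypothetical pairs and changeset, for illustration only.
known_pairs = [
    ("app/models/user.py", "app/serializers/user.py"),
    ("app/api/orders.py", "tests/test_orders.py"),
]
staged = ["app/models/user.py", "README.md"]

print(missing_co_changes(staged, known_pairs))
# [('app/models/user.py', 'app/serializers/user.py')]
```

A pair with neither side in the changeset is ignored, and so is a pair with both sides present. Only the one-sided case is worth a reviewer's attention.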

How Scrubby builds the model

During indexing, Scrubby reads your commit history and tracks which files changed in each commit. No author data, no PII gets stored — only what changed and when. From that history, Scrubby identifies pairs of files that change together in more than 50% of commits where either file appears.
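If you want to see what "only what changed and when" looks like in practice, the sketch below pulls exactly that out of a repository using plain git log. It is an assumption about how such an indexer could work, not a description of Scrubby's internals; the Commit type, the "@@" marker, and both function names are invented for the example.

```python
import subprocess
from typing import List, NamedTuple

MARKER = "@@"  # arbitrary prefix so commit lines are easy to tell from file paths


class Commit(NamedTuple):
    sha: str
    timestamp: int        # committer date, seconds since the epoch
    files: frozenset      # paths touched by the commit


def parse_log(text: str) -> List[Commit]:
    """Parse `git log --name-only --pretty=format:@@%H %ct` output."""
    commits = []
    header = None
    files = set()
    for line in text.splitlines():
        if line.startswith(MARKER):
            if header is not None:
                commits.append(Commit(header[0], header[1], frozenset(files)))
            sha, ts = line[len(MARKER):].split()
            header, files = (sha, int(ts)), set()
        elif line.strip() and header is not None:
            files.add(line)
    if header is not None:
        commits.append(Commit(header[0], header[1], frozenset(files)))
    return commits


def read_history(repo: str) -> List[Commit]:
    """Run git in `repo` and return one Commit per log entry. No author fields are requested."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", f"--pretty=format:{MARKER}%H %ct"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_log(out)
```

The format string asks git only for the hash and the commit timestamp, so author names and emails never leave the repository. A production parser would also need to handle paths that git quotes (or use the -z output mode), which this sketch skips.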

The 50% threshold is high enough to filter out coincidental co-changes. A pair of files that happened to change together a few times because of one big refactor won't get flagged, because every commit that touches only one of them pulls the ratio back down. A pair that changes together routinely, across many small commits over months, will.
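Here is one way to read the 50% rule as code: for each pair, divide the number of commits that touched both files by the number that touched either one, and keep the pair if the ratio is above the threshold. This is a sketch of that reading, with a hypothetical function name and made-up history, not Scrubby's actual scoring code.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Tuple


def co_change_pairs(
    commits: Iterable[Iterable[str]],
    threshold: float = 0.5,
) -> Dict[Tuple[str, str], float]:
    """Score each file pair as commits-with-both / commits-with-either."""
    file_count = Counter()
    pair_count = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_count.update(unique)
        # Quadratic in commit size; a real indexer would likely cap huge commits.
        pair_count.update(combinations(unique, 2))

    flagged = {}
    for (a, b), both in pair_count.items():
        either = file_count[a] + file_count[b] - both
        score = both / either
        if score > threshold:
            flagged[(a, b)] = score
    return flagged


# Made-up history: model and serializer usually move together;
# utils/dates.py joined them once, in a larger commit.
history = [
    ["models/user.py", "serializers/user.py"],
    ["models/user.py", "serializers/user.py"],
    ["models/user.py", "serializers/user.py", "utils/dates.py"],
    ["utils/dates.py"],
    ["utils/dates.py"],
    ["utils/dates.py"],
    ["models/user.py"],
]

print(co_change_pairs(history))
# {('models/user.py', 'serializers/user.py'): 0.75}
```

The model and serializer changed together in 3 of the 4 commits that touched either, so they score 0.75. The pairs involving utils/dates.py score 1/7 and 1/6 and are dropped, which is the "one coincidental commit" case the threshold is meant to filter.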

Why git history is enough signal

You might think a static-analysis tool could spot these relationships from imports or type annotations. Sometimes it can — but the most important co-change pairs are exactly the ones that aren't visible in the static graph. A model and its migration don't import each other. An endpoint and its test live in separate trees. Git history is the only place these relationships show up reliably, and it's already there in your repo.

Where you see it

On every PR. The Scrubby GitHub App runs co-change analysis on every changeset. Missing co-changes appear as a list in the PR comment — "these files have changed together in 89% of the last 47 commits, but only one of them is in this changeset." The reviewer (or the author) decides whether the omission is intentional and either adds the file or moves on.

Pre-push, in your editor. Ask your AI agent to run scrubby_review_changeset on your staged files before you push. Same engine, same threshold, same findings — only earlier in your loop. You catch the missing file locally instead of in PR review.

Common catches

What if the model gets it wrong?

Co-change analysis improves with more git history, and it adapts as your team's practice changes. If two files used to be coupled but you've intentionally decoupled them, new commits won't include both, and the pair's co-change score will drift below 50% over time. The flag goes away on its own.
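To get a feel for how fast that drift happens, take the numbers from the PR comment example: a pair seen together in 42 of 47 commits (about 89%). The sketch below counts how many new one-sided commits it takes before the score is no longer above 50%. It assumes the score is computed over all commits that touch either file, with no sliding window or dismissal weighting; the function name is hypothetical.

```python
def solo_commits_until_unflagged(both: int, either: int, threshold: float = 0.5) -> int:
    """Count one-sided commits needed before both / (either + n) stops exceeding threshold."""
    n = 0
    while both / (either + n) > threshold:
        n += 1
    return n


# 42 co-changes out of 47 commits touching either file is roughly 89%.
print(solo_commits_until_unflagged(42, 47))
# 37
```

Under those assumptions, a strongly coupled pair takes a few dozen decoupled commits to age out on its own, which is why dismissing an intentional gap on the PR is the faster route.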

If Scrubby flags a co-change gap that is intentional — you really did mean to change one without the other — dismiss it on the PR. That signal flows back into the learning loop and weakens the pair.

The bug class where someone updates a model but forgets the serializer disappears almost entirely once Scrubby is in your loop. It's not magic. It's git history doing the work it was always able to do.

Limitations

No history, no co-change. A brand-new repo with three commits doesn't have enough signal to form pairs. Conventions and domain analysis still work; co-change needs evidence to operate.

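File renames. Git does not record renames: a git mv and a manual delete-and-add produce the same commit, and a rename can only be inferred from how similar the old and new contents are. Scrubby tries to detect renames with similarity heuristics, but a file that is moved and heavily edited in the same commit can break the chain, and very large renames may need a re-index after the dust settles.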
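A similarity heuristic for renames can be sketched in a few lines. The version below compares each deleted file's content against each added file's content and pairs them greedily, best match first. It uses Python's difflib and a 0.5 cutoff (the same default git uses for its own rename detection) purely for illustration; it is neither git's algorithm nor a description of what Scrubby does, and the function name and paths are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict


def match_renames(
    deleted: Dict[str, str],
    added: Dict[str, str],
    min_similarity: float = 0.5,
) -> Dict[str, str]:
    """Map old path -> new path for delete/add pairs whose contents look alike."""
    candidates = []
    for old_path, old_text in deleted.items():
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= min_similarity:
                candidates.append((score, old_path, new_path))

    renames = {}
    used_new = set()
    # Greedy: take the most similar pair first, never reuse a path.
    for score, old_path, new_path in sorted(candidates, reverse=True):
        if old_path not in renames and new_path not in used_new:
            renames[old_path] = new_path
            used_new.add(new_path)
    return renames


# Hypothetical commit: one file deleted, two added.
deleted = {"app/user_model.py": "class User:\n    name: str\n    email: str\n"}
added = {
    "app/models/user.py": "class User:\n    name: str\n    email: str\n    active: bool\n",
    "docs/notes.md": "Release notes for the spring cleanup.\n",
}

print(match_renames(deleted, added))
# {'app/user_model.py': 'app/models/user.py'}
```

When a match is found, the old path's co-change history can be carried over to the new path. When the content changed too much to clear the cutoff, the pair looks like an unrelated delete and add, which is the broken-chain case described above.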

Force-pushed history. Squashing or rewriting history will reset the co-change view of the affected commits on the next index. The model rebuilds; nothing is lost permanently.

Stop shipping the change you forgot.
