Conventions
Every codebase has unwritten rules. They live in the heads of the engineers who've been there longest and surface only during code review when someone says "we don't do it that way here." Scrubby extracts those rules from the actual code your team has been writing and uses them to review every change — against your team's practice, not against a style guide somebody might have written down two years ago.
The unwritten-rules problem
A ten-person team accumulates a few dozen unwritten conventions. A fifty-person org working across multiple services can easily have hundreds. They cover everything from how you name files to how you structure error handling to which directory a new service goes in. None of it is in a linter rule. Most of it isn't in a style guide. All of it shows up in review comments, on PR after PR.
AI agents have this problem at its most extreme. A model can produce syntactically correct, well-typed code that passes every linter and CI check, but still violate the architectural intent of the system. The output looks fine until a reviewer who's been there long enough notices the code isn't shaped right for that part of the codebase. That reviewer then writes the same comment they wrote on three other PRs this month.
How Scrubby extracts conventions
During indexing, Scrubby samples code from each segment (a cluster of related files within a domain) and runs a set of pattern analyzers over it. The analyzers look for consistent shape across the segment — not exact text matches, but recurring structural patterns.
Each detected convention gets a confidence score from 0 to 1. A pattern that's followed in 95% of files in a segment gets high confidence. A pattern that's followed in 60% of files gets medium confidence. Conventions scoring above 0.7 are prioritized in reviews; lower-confidence ones are still tracked but surfaced only when the evidence is strong.
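To make the scoring concrete, here is a minimal sketch that assumes the confidence score is simply the share of files in a segment that follow a pattern. Only the 0 to 1 range, the 95% and 60% examples, and the 0.7 cutoff come from the description above; the function names and the adoption-fraction formula are assumptions for illustration, not Scrubby's actual analyzer.

```python
# 0.7 is the prioritization cutoff stated in the text.
PRIORITY_THRESHOLD = 0.7


def confidence(files_following: int, files_total: int) -> float:
    """Assumed scoring: fraction of files in the segment that follow the pattern."""
    if files_total <= 0:
        return 0.0
    return files_following / files_total


def is_prioritized(score: float) -> bool:
    """Conventions scoring above the cutoff are prioritized in reviews."""
    return score > PRIORITY_THRESHOLD


high = confidence(19, 20)    # pattern followed in 95% of files
medium = confidence(12, 20)  # pattern followed in 60% of files

print(is_prioritized(high))    # the 95% pattern clears the cutoff
print(is_prioritized(medium))  # the 60% pattern does not
```

Under this assumption, the 60% pattern would still be tracked; it just would not be prioritized.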
The eight categories Scrubby learns
- Naming — snake_case vs camelCase, class naming, file naming patterns.
- Imports — absolute vs relative, barrel files, autoloading patterns.
- Structure — file and method length, public/private separation, module organization.
- Error handling — specific vs generic rescue, custom error classes, error reporting patterns.
- Testing — test framework, factory usage, mocking style, assertion patterns.
- Documentation — comment density, docstring style, inline comment patterns.
- State management — how state is stored and accessed across the codebase.
- API design — endpoint patterns, parameter handling, response formatting.
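One way to picture a learned convention is as a small record that ties a segment to one of the eight categories above. The sketch below is hypothetical: the class names, field names, and the example segment are illustrative assumptions, not Scrubby's internal data model.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The eight categories listed above."""
    NAMING = "naming"
    IMPORTS = "imports"
    STRUCTURE = "structure"
    ERROR_HANDLING = "error_handling"
    TESTING = "testing"
    DOCUMENTATION = "documentation"
    STATE_MANAGEMENT = "state_management"
    API_DESIGN = "api_design"


@dataclass(frozen=True)
class Convention:
    segment: str        # hypothetical segment identifier
    category: Category  # which of the eight categories the pattern belongs to
    description: str    # human-readable statement of the observed pattern
    confidence: float   # score from 0 to 1


# Illustrative record only; the segment name and score are made up.
example = Convention(
    segment="billing/services",
    category=Category.ERROR_HANDLING,
    description="Rescue specific error classes rather than using a generic rescue",
    confidence=0.92,
)
```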
Descriptive, not prescriptive
This is the part that surprises people. Scrubby doesn't enforce a popular style guide. It enforces your team's actual practice. If your codebase consistently uses a pattern that contradicts the canonical advice, Scrubby will follow your team. The source of truth is what shows up in the code, every day, across the segment.
Why this matters
Style guides go stale. Linter rules ossify. Scrubby stays current because its source of truth is the code itself. When the team's practice shifts, the conventions shift with it — the next index re-extracts them from the new state of the code.
Where conventions show up
In your AI editor. When the agent calls scrubby_review on a file before editing it, the response includes the conventions that apply to that file's segment. The code the agent writes lines up with the patterns your team already uses, instead of needing to be rewritten in review.
On every PR. Scrubby checks each changed file against the conventions for its segment. Violations are flagged with inline GitHub suggestions that show the expected pattern, ready to commit directly from the PR.
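The per-file check on a PR can be sketched roughly as follows. Everything here is a simplified assumption: the `ConventionCheck` and `Violation` types, the `review_changes` function, and the regex-based snake_case test are stand-ins chosen for brevity. The real analyzers look at structural shape rather than text matches, and this sketch does not produce GitHub suggestions.

```python
import re
from typing import Callable, NamedTuple


class ConventionCheck(NamedTuple):
    description: str
    follows: Callable[[str], bool]  # True when the file content follows the pattern


class Violation(NamedTuple):
    path: str
    description: str


# Crude stand-in for a naming analyzer: collect Python function names.
FUNCTION_DEF = re.compile(r"^\s*def\s+([A-Za-z_]\w*)", re.MULTILINE)


def uses_snake_case_functions(source: str) -> bool:
    """True when every function name in the source is lowercase."""
    return all(name == name.lower() for name in FUNCTION_DEF.findall(source))


def review_changes(
    changed_files: dict[str, str],
    segment_of: dict[str, str],
    checks_by_segment: dict[str, list[ConventionCheck]],
) -> list[Violation]:
    """Check each changed file against the conventions of its own segment."""
    violations = []
    for path, source in changed_files.items():
        segment = segment_of.get(path)
        for check in checks_by_segment.get(segment, []):
            if not check.follows(source):
                violations.append(Violation(path, check.description))
    return violations


changed = {
    "billing/invoice.py": "def sendInvoice():\n    pass\n",
    "billing/tax.py": "def compute_tax():\n    pass\n",
}
segments = {"billing/invoice.py": "billing", "billing/tax.py": "billing"}
checks = {
    "billing": [ConventionCheck("Function names use snake_case", uses_snake_case_functions)]
}

found = review_changes(changed, segments, checks)
for violation in found:
    print(f"{violation.path}: {violation.description}")
```

The point of the shape is that conventions are looked up by segment, so the same file content can pass in one part of the codebase and be flagged in another.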
When conventions feel wrong
If Scrubby flags a convention you disagree with, it's usually one of three things:
- The convention is stale. The team has moved on, but enough old code still uses the pattern that confidence remains high. The next index will adjust the score as the code changes.
- The segment is misclustered. Two distinct subsystems lumped into one segment can produce conflicting patterns. Re-indexing after directory restructures usually fixes it.
- It's a genuine false positive. Dismiss it on the PR; that signal flows back into the learning loop and weakens the convention's association with that segment.
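The misclustering case is easy to see with a little arithmetic. The sketch below assumes, purely for illustration, that a convention's score is the fraction of files in a segment that follow the pattern; the subsystem names and file counts are made up.

```python
def adoption(files: list[bool]) -> float:
    """Assumed score: fraction of files that follow the pattern."""
    return sum(files) / len(files) if files else 0.0


# True means the file follows the pattern (say, absolute imports).
subsystem_a = [True] * 18 + [False] * 2  # 18 of 20 files follow it
subsystem_b = [True] * 1 + [False] * 19  # 1 of 20 files follows it

# If both subsystems are lumped into one segment, the signal blurs.
merged = subsystem_a + subsystem_b

score_a = adoption(subsystem_a)
score_b = adoption(subsystem_b)
score_merged = adoption(merged)
print(score_a, score_b, score_merged)
```

Taken separately, each subsystem has a clear practice: the first follows the pattern in 90% of its files and the second avoids it in 95% of its files. Merged, the pattern lands just under 50%, so neither team's practice is represented well. That is why separating the segments tends to resolve this kind of conflict.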
Conventions aren't rules someone wrote down. They're patterns Scrubby observed across the actual code your team has been writing for months or years — and they get applied with the same context wherever a review happens.