If you’re a CTO, VP of Engineering, or operating exec responsible for technology, you’re likely looking at the same uncomfortable picture everyone else is. Your company has invested heavily in AI tooling, your engineers are using it constantly, and when the board asks what return you’re getting on that investment, the honest answer is usually some version of “the team feels faster, but I can’t put a clean number on it.”
That gap between AI activity and AI outcomes is the headline story of 2026. Gartner has flagged it directly: roughly 74% of companies have not yet achieved tangible value from AI initiatives, despite near-universal adoption. The MIT report on enterprise AI rollouts has been even sharper, suggesting 95% of generative AI pilots fail to deliver measurable ROI.
Scrubby is built around a thesis about why this gap exists, and what to do about it. This post lays it out from the seat you’re sitting in.
Why your AI investment isn’t compounding
The uncomfortable truth is that most AI coding tools are individual productivity multipliers layered onto a team-level system that wasn’t designed to absorb them. An engineer using Cursor or Claude Code may be writing code 30 to 50% faster. But the rest of the system, including code review, onboarding, and deployment risk, wasn’t redesigned around that velocity. So the gains get absorbed in the gaps. Faster code that doesn’t fit the codebase still has to be reviewed and rewritten. AI-generated PRs that look reasonable still introduce subtle convention violations that compound into tech debt. Onboarding still takes months because the unwritten rules of your codebase still aren’t documented anywhere queryable.
The deeper problem is security. AI-authored code written without strong codebase context has been measured to have 2.74× more security vulnerabilities than human-authored code. The bottleneck isn’t model quality. The bottleneck is context. Your AI tools don’t know your codebase, so everything they produce has to be re-evaluated by your most senior people before it can be trusted.
What changes when codebase intelligence is in the loop
Scrubby is a codebase intelligence layer that sits between your AI tools and your codebases. It builds a structured understanding of each repo (covering domains, connections, conventions, co-change patterns, and change velocity) and makes that knowledge available wherever AI is being used. Engineers’ editors get it via MCP, and every pull request gets it via the GitHub App.
For executives, there are three practical effects.
1. Your AI investments start producing measurable outcomes. When AI-generated code fits the codebase on the first pass, the gains stop getting absorbed by review and rework. PR cycle time drops. Defect rates drop. The “AI made us faster” feeling starts showing up in throughput dashboards.
2. You get architectural visibility you’ve never had. Scrubby’s domain map is generated from your actual code, rather than from a diagram someone drew in 2023. It shows you the architectural regions of every repo, how strongly they’re coupled, and where the change velocity is concentrated. For a CTO who needs to talk credibly about the structure of the platform to a board, an acquirer, or the security team, this is concrete, queryable, and current.
3. You get a real signal for engineering health. Scrubby tracks which domains are changing, how fast, and how often they change together. You can see when a domain that should be stable is suddenly hot, and you can see when AI-generated changes are introducing convention violations that future engineers will have to fix.
The architecture you didn’t know you had
Most CTOs we talk to don’t have a good answer to the question “how is this codebase actually organized?” They have an org chart, and they have whatever architecture diagrams the team last updated. Neither of those reliably matches the structure of the code itself, and the gap between intended architecture and actual architecture is where the worst tech debt lives.
Scrubby gives you the actual architecture. The domains it discovers are the ones that emerge from the code your team has actually been writing. The connections it surfaces are the ones that show up in import graphs and commit history, weighted by how much they actually matter. The conventions it extracts are the ones your team is enforcing in practice, instead of the ones the style guide claims.
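To make "connections weighted by commit history" concrete, here is a minimal sketch of one common way to score co-change coupling: count how often two domains are touched in the same commit and normalize with a Jaccard ratio. This illustrates the general idea and is not Scrubby's actual algorithm. The `commits` data, the domain names, and the `co_change_coupling` function are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each commit reduced to the set of domains it touched.
commits = [
    {"billing", "accounts"},
    {"billing", "accounts"},
    {"billing"},
    {"search"},
    {"accounts", "search"},
]

def co_change_coupling(commits):
    """Return {(a, b): score} for every domain pair that changed together.

    The score is a Jaccard ratio: commits touching both domains divided by
    commits touching either one. Pairs that never co-change are omitted.
    """
    touched = Counter()   # commits touching each domain
    together = Counter()  # commits touching each pair of domains
    for domains in commits:
        touched.update(domains)
        together.update(combinations(sorted(domains), 2))
    return {
        (a, b): n / (touched[a] + touched[b] - n)
        for (a, b), n in together.items()
    }

coupling = co_change_coupling(commits)
for pair, score in sorted(coupling.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A high score between two domains that the architecture diagram shows as independent is exactly the kind of gap between intended and actual structure described above.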
For a CTO who’s been around the block, this is the kind of map that makes a thousand decisions easier. Should we extract this domain into its own service? What’s the blast radius if we change this contract? Questions like these stop being multi-week investigations and become queries.
How AI ROI actually shows up
Jellyfish’s AI Impact framework and Worklytics’ guidance on AI metrics both make the same point: activity-based measures (like number of AI-assisted commits or AI suggestions accepted) don’t translate to boardroom-relevant outcomes. The metrics that matter are the ones that connect AI usage to delivery, including cycle time and defect rates.
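As a rough illustration of what outcome-based measurement can look like, the sketch below computes median PR cycle time and a defect rate from a list of pull request records. The `PullRequest` shape, the `caused_defect` flag, and the sample data are assumptions made for this example. In practice these fields would come from your own Git host and incident tracker, and how you attribute a defect to a PR is a judgment call.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median

@dataclass
class PullRequest:
    opened: datetime
    merged: datetime
    caused_defect: bool  # hypothetical flag, e.g. linked to a later bug fix

def median_cycle_hours(prs):
    """Median time from PR opened to PR merged, in hours."""
    return median((pr.merged - pr.opened).total_seconds() / 3600 for pr in prs)

def defect_rate(prs):
    """Share of merged PRs later tied to a defect."""
    return sum(pr.caused_defect for pr in prs) / len(prs)

# Hypothetical sample: four merged PRs from one week.
prs = [
    PullRequest(datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 17), False),
    PullRequest(datetime(2026, 1, 6, 9), datetime(2026, 1, 7, 9), True),
    PullRequest(datetime(2026, 1, 8, 9), datetime(2026, 1, 8, 13), False),
    PullRequest(datetime(2026, 1, 9, 9), datetime(2026, 1, 9, 21), False),
]

print(median_cycle_hours(prs), defect_rate(prs))
```

Tracking these two numbers before and after a tooling change gives a board-ready comparison that a count of accepted AI suggestions cannot.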
Scrubby’s effect on those metrics is structural. Because the AI tooling in your engineers’ editors is producing better code on the first pass, cycle time drops. Because PR review is catching convention violations and missed co-changes before merge, defect rates drop. These aren’t projections. They’re the mechanical consequence of giving your AI tools the codebase context they were missing.
If you’ve been struggling to put a real number on your AI ROI, the answer in many cases is to add the missing context layer and watch the numbers start showing up.
The risk story
The flip side of AI velocity is AI risk. As Fortune put it earlier this year, trust is becoming the bottleneck on AI-assisted development. Code that ships fast but isn’t trustworthy isn’t actually shipping fast. It’s accumulating an invisible debt that gets paid in incidents and rewrites.
Scrubby reduces this risk in a few specific ways. It catches convention violations before merge, so the code that ships fits the codebase. It flags missing co-changes (the bug class where a model is updated but its serializer isn’t), so half-finished changes don’t get deployed. It surfaces domain boundary crossings, so changes that touch areas the author didn’t expect to touch get a second look.
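The missing co-change check can be pictured with a small sketch. Assume a set of rules, learned from history or written by hand, saying which files usually change together. A PR that touches one side of a rule but not the other gets flagged. The rule table, the file paths, and the `missing_co_changes` function are hypothetical and are not Scrubby's implementation.

```python
# Hypothetical rules: when the key file changes, the listed files usually change too.
CO_CHANGE_RULES = {
    "app/models/invoice.py": {"app/serializers/invoice.py"},
}

def missing_co_changes(changed_files, rules=CO_CHANGE_RULES):
    """Return {source: [expected partner files absent from the PR]}."""
    changed = set(changed_files)
    missing = {}
    for source, partners in rules.items():
        if source in changed:
            absent = partners - changed
            if absent:
                missing[source] = sorted(absent)
    return missing

# A PR that updates the model but forgets the serializer is flagged.
print(missing_co_changes(["app/models/invoice.py"]))
```

A flag like this is a prompt for a second look and not a hard failure, since some model changes legitimately leave the serializer untouched.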
For a CTO thinking about the risk profile of AI-assisted development, this is the difference between trusting AI output blindly and having a structured review layer that catches the class of mistake AI is most prone to make.
What you can take to the board
A short list, in roughly the language a board cares about:
- A defensible ROI story for AI. Cycle time, defect rate, and throughput numbers that are demonstrably moving because AI tools have the context they need.
- An architectural map of every repo. Generated from the code itself, kept current automatically, and queryable by you and the team.
- A risk-mitigation layer for AI-assisted development. Convention enforcement, co-change checking, and domain-aware review on every PR.
- An onboarding story. New engineers reaching productivity faster, because the unwritten rules of every codebase are surfaced in real time.
- A stable foundation for whatever comes next. As AI tooling continues to evolve, the codebase intelligence layer underneath it becomes more valuable, not less.
The bigger picture
The teams winning with AI right now are the ones treating it as a system instead of a feature. They’re investing in the context layer that makes AI tools effective, and they’re investing in the visibility layer that turns AI activity into AI outcomes.
Scrubby is the layer that does both at once. For a CTO trying to convert AI investment into measurable engineering progress, it’s the kind of foundational capability that pays for itself the first time a regression doesn’t ship.
Sources: