Software Complexity Metrics: The Definitive Guide for Team Leads
payment_processor.py has a cyclomatic complexity of 47. Should you refactor it?
You have no idea. And neither does anyone else looking at that number.
Cyclomatic complexity measures the number of linearly independent paths through a piece of code. It's been the go-to metric since Thomas McCabe introduced it in 1976. It's also completely useless in isolation.
I've seen teams obsess over reducing complexity scores while their most critical bugs came from simple functions that changed every week. I've watched engineers refactor low-complexity code that nobody touched, while ignoring the tangled mess in the authentication layer that three people were simultaneously modifying.
The problem isn't that complexity metrics are wrong. They're just incomplete. What actually predicts bugs and maintenance nightmares is the intersection of complexity, change frequency, and team dynamics.
Cyclomatic complexity counts decision points and adds one. A function with three single-condition if-statements and two loops scores 6; a function with no branching at all scores 1. Fair enough.
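To make the counting concrete, here is a minimal sketch that approximates the score for a Python snippet using the standard-library ast module. Which node types count as a decision point is my assumption; real tools differ on details such as comprehensions and match statements, so expect small disagreements with whatever your linter reports. The ship function is a made-up example.

```python
import ast

# Node types treated as one decision point each (an assumption; tools differ).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity of a snippet: decision points + 1."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands, so two extra branches
            decisions += len(node.values) - 1
    return decisions + 1


# Hypothetical function with three if-statements and two loops.
SAMPLE = '''
def ship(order, items):
    if order is None:
        return 0
    total = 0
    for item in items:
        if item.in_stock:
            total += item.price
    while total > 1000:
        total -= 100
    if order.express:
        total += 15
    return total
'''

print(cyclomatic_complexity(SAMPLE))  # 3 ifs + 2 loops + 1 = 6
```

The number tells you how many branches a reader has to hold in their head. It says nothing about how often anyone actually has to read the function.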
But consider two functions:
Function A: Complexity 25, changed twice in three years, owned by one developer who wrote it.
Function B: Complexity 8, changed 47 times this quarter, touched by five different developers.
Which one is riskier?
Function B will bite you. Every single time.
Traditional complexity tools report that Function A is the problem. They'll flag it red in your dashboard. Some overzealous linter will demand you refactor it. Meanwhile, Function B—your actual time bomb—sits there looking innocent with its complexity score of 8.
This is why code review backlog analysis often misses the real issues. You're measuring the wrong thing.
The Three Dimensions That Actually Matter
Code health exists in three dimensions: complexity, churn, and ownership.
Complexity tells you how hard code is to understand. High cognitive load, lots of branches, nested logic. This matters. A complex function requires more working memory to reason about.
Churn tells you how often code changes. Files that change frequently are either actively developed (expected churn) or unstable (problem churn). The difference is everything.
Ownership tells you how knowledge is distributed. Code touched by many people without a clear owner accumulates inconsistencies. Code owned by one person becomes a bottleneck.
The magic happens at the intersections:
High complexity + high churn: Your bugs live here. Complex code that keeps changing breaks. Constantly.
High complexity + concentrated ownership: Bus factor risk. One person understands this. They quit, you're screwed.
Low complexity + high churn: Feature boundary problem. Either the abstraction is wrong, or requirements keep shifting.
High complexity + low churn: Probably fine. Complex code that doesn't change is just complicated, not dangerous.
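The four intersections above can be sketched as a small classifier. The cut-offs (complexity above 15, more than 10 changes per quarter, ownership share above 0.8) are illustrative assumptions borrowed from figures used elsewhere in this article, and FileStats is a hypothetical container, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    complexity: int         # cyclomatic complexity
    churn: int              # changes in the measurement window
    top_owner_share: float  # fraction of changes made by the main contributor


# Illustrative cut-offs only; tune them against your own incident history.
HIGH_COMPLEXITY = 15
HIGH_CHURN = 10
CONCENTRATED = 0.8


def classify(stats: FileStats) -> list[str]:
    """Return every intersection label that applies to a file or function."""
    labels = []
    is_complex = stats.complexity > HIGH_COMPLEXITY
    is_churning = stats.churn > HIGH_CHURN
    if is_complex and is_churning:
        labels.append("bug hotspot")
    if is_complex and stats.top_owner_share > CONCENTRATED:
        labels.append("bus factor risk")
    if not is_complex and is_churning:
        labels.append("feature boundary problem")
    if is_complex and not is_churning:
        labels.append("probably fine")
    return labels


# Function A and Function B from earlier (the 0.3 share for B is a guess).
print(classify(FileStats(complexity=25, churn=2, top_owner_share=1.0)))
print(classify(FileStats(complexity=8, churn=47, top_owner_share=0.3)))
```

A function can carry more than one label, which is the point: Function A comes out as both "bus factor risk" and "probably fine", and that is a more honest picture than a single red flag.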
Real Numbers from Real Codebases
I analyzed incident reports from a 200-developer organization last year. They had traced 83 production bugs back to specific code sections. The breakdown:
67% originated in code with both high complexity (>15 cyclomatic) AND high churn (>10 changes/quarter)
21% came from simple code with very high churn (>25 changes/quarter)
12% came from complex, stable code
The worst offender? A 380-line authentication middleware function. Cyclomatic complexity of 31. Changed 64 times in six months by 11 different developers. It caused nine separate incidents.
The codebase had 47 functions with higher complexity that caused zero problems. They were stable. They were understood. They worked.
How to Actually Measure Code Health
Start by mapping your complexity hotspots. Run your favorite tool—McCabe, SonarQube, whatever. Get the numbers. This is baseline data.
Now overlay churn. Pull git history for the last 90 days. Count commits per file. Weight by lines changed if you want precision, but honestly, commit count alone works.
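A minimal sketch of the commit count, assuming git is on your PATH and you run it against a local clone. The function names are mine. The parsing is split from the git call so you can check it on a pasted log before pointing it at a real repository.

```python
import subprocess
from collections import Counter


def count_commits_per_file(log_output: str) -> Counter:
    """Count how many commits touched each path.

    Expects the output of `git log --name-only --pretty=format:`, which is
    one file path per line with blank lines between commits.
    """
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def churn_last_90_days(repo_path: str = ".") -> Counter:
    """Run git in repo_path and return commit counts per file."""
    result = subprocess.run(
        ["git", "log", "--since=90 days ago", "--name-only", "--pretty=format:"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_commits_per_file(result.stdout)
```

Calling churn_last_90_days(".").most_common(20) gives you the twenty most frequently changed files. Renamed files show up under both the old and the new path, so treat the counts as approximate.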
Create a scatter plot: complexity on the X-axis, churn on the Y-axis. The top-right quadrant is your problem zone.
Then add ownership concentration. For each file, calculate:
A score above 0.8 means one person owns it. Below 0.3 means it's a free-for-all.
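The exact formula isn't reproduced here, so treat the following as one plausible reading rather than the definition: the share of a file's commits made by its single most active author. I picked it because it fits the thresholds above (0.8 means one person made four out of five commits). The function name is an assumption.

```python
from collections import Counter


def ownership_concentration(commit_authors: list[str]) -> float:
    """Share of a file's commits made by its most active author (0.0 to 1.0).

    commit_authors holds one entry per commit that touched the file.
    """
    if not commit_authors:
        return 0.0
    counts = Counter(commit_authors)
    top_author_commits = counts.most_common(1)[0][1]
    return top_author_commits / len(commit_authors)


# Hypothetical history: nine commits by one person, one by another.
print(ownership_concentration(["ana"] * 9 + ["raj"]))  # 0.9
```

The author list for a file can come from git log --format=%an -- path/to/file, one name per line. Other definitions (lines authored, a Herfindahl-style index) give different numbers, so keep thresholds and formula paired.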
Now you have something useful. High complexity + high churn + diffuse ownership? That's your critical list.
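As a rough sketch, the critical list is just a filter over the three numbers. The default cut-offs, the file paths, and the sort key (complexity times churn) are all illustrative assumptions, not a calibrated risk score.

```python
def critical_list(files, complexity_min=15, churn_min=10, ownership_max=0.3):
    """Return paths that are complex, frequently changed, and diffusely owned.

    files maps path -> (complexity, churn, ownership_concentration).
    Results are ordered by complexity * churn, highest first (an arbitrary
    but simple way to rank the list).
    """
    hits = [
        (path, complexity * churn)
        for path, (complexity, churn, ownership) in files.items()
        if complexity > complexity_min
        and churn > churn_min
        and ownership < ownership_max
    ]
    return [path for path, _ in sorted(hits, key=lambda h: h[1], reverse=True)]


# Hypothetical measurements for four files.
files = {
    "auth/middleware.py": (31, 64, 0.2),
    "billing/legacy_service.py": (40, 2, 0.95),
    "api/orders_controller.py": (12, 89, 0.25),
    "search/ranker.py": (22, 18, 0.28),
}
print(critical_list(files))  # ['auth/middleware.py', 'search/ranker.py']
```

Note what the filter leaves out: the low-complexity file with 89 changes never makes the list. Simple code with extreme churn needs its own check.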
This is where platforms like glue.tools become essential. You can manually pull git logs and calculate complexity, but doing this across thousands of files every week is insane. Glue indexes your codebase continuously and maps these intersections automatically, showing you which functions are actually risky based on the three-dimensional view.
The Feature Boundary Problem
Here's something nobody talks about: churn patterns tell you where your abstractions are wrong.
If a simple function changes constantly, the problem isn't the code. It's that the feature boundary is poorly defined. Requirements keep shifting, or the abstraction doesn't match the domain model.
Look at your high-churn, low-complexity code. Group it by feature area. You'll find clusters.
That cluster in your user service that keeps changing? Your product team hasn't figured out user roles yet. They keep adding special cases.
Those constant tweaks to your pricing calculator? Your pricing model doesn't match your code structure. Every new pricing rule requires changes in five different places.
This is architectural debt, not code debt. Refactoring the functions won't help. You need to restructure how features are organized.
Team Dynamics and Code Ownership
The ownership dimension reveals team problems early.
Code with low ownership concentration (many contributors, no clear owner) correlates with:
Inconsistent patterns and style
Duplicated logic
Defensive coding (nobody trusts the existing code)
Slower reviews (nobody feels responsible)
Code with too-high ownership concentration creates:
Knowledge silos
Review bottlenecks (only one person can approve changes)
Bus factor risks
Career progression blockers (junior devs can't learn)
The sweet spot is around 0.6-0.7 ownership concentration. One person wrote most of it, but others have contributed enough to understand it.
When you see ownership shifting rapidly (multiple new contributors in a short period), that's a leading indicator. Either you're onboarding people to this area (good), or the original owner left and everyone's confused (bad).
Glue's team insights surface these patterns across your entire codebase, showing you where knowledge is dangerously concentrated and where too many cooks are creating chaos.
Actionable Thresholds for Team Leads
Stop using arbitrary complexity limits. Use combined thresholds:
High complexity + low churn + high ownership (complex but stable)
Low complexity regardless of churn (simple code is resilient)
What This Looks Like in Practice
I worked with a team that had a 2,300-line service class. Complexity through the roof. Every static analysis tool screamed about it.
They spent three months planning a refactor. Big project, lots of architecture discussions.
Then we looked at the churn. Eight commits in two years. All minor config updates. Owned by one senior engineer who knew it cold.
We killed the refactor project.
Instead, we focused on a 200-line controller with a complexity of 12 that had been changed 89 times in six months. It looked fine to every code analysis tool. But six different developers had touched it, none owned it, and it had caused four production incidents.
We spent two weeks on that controller. Clarified the feature boundaries, split it into three focused pieces, assigned clear ownership. Incidents stopped.
Integrating Metrics into Your Workflow
Don't make this a dashboard-checking ritual. Integrate metrics into your existing process.
In pull requests, surface the complexity+churn+ownership score for changed files. Make it visible when someone's about to modify your riskiest code.
In sprint planning, factor code health into estimates. A feature touching high-risk code needs more time, more careful review, better testing.
In architecture reviews, use ownership concentration to identify knowledge gaps. If your payment system is 90% owned by one person, that's an architecture problem disguised as a people problem.
Glue's MCP integration with Cursor and Claude puts this data directly in your editor. When you're writing code, you see the health metrics for the files you're touching. Before you add another conditional to that already-complex function, you know its change history and ownership pattern.
The Metrics That Actually Matter
Forget vanity metrics. Here's what predicts problems: