Dealing with technical debt

We’re drowning in technical debt. We have a mountain to climb and don’t really know where to start. Sound familiar? For many of us working on legacy code bases this is the day-to-day reality. But what to do about it?

How did we get here?

Technical debt is always the fault of those “other guys”. Those idiot developers that were here a few years ago. Morons. Obviously couldn’t code their way out of a game of life if their on-going existence depended on it.

I hate to tell you but: we are those other guys. The decisions we make today will look foolish tomorrow. We’ll have more information then, a different perspective; we’ll know how the product and technology were going to evolve. We can’t know that today, so many of our decisions will turn out to be wrong.

Where to start

Classes are like best-selling novels – some are spectacularly more popular / more debt-laden than others. One class, one package, one module – will be much worse than the others. There’ll be a handful of classes, packages etc… that are much worse than all the rest.

How does this happen? Well, one class ends up with a bit of technical debt. Next time I come to change that class, I’m too lazy to fix the debt, so I just hack something together to get the new feature done. Then next time round, there’s a pile of debt – only that guy’s too busy to fix it so he adds in a couple of kludges and leaves it. Before you know it, this one class has become a ten thousand line monster that’s pure technical debt.

It’s like the broken-windows theory – if code is already crappy, its much easier to just make it a little more crappy. If the code’s clean, it’s a big step to add in a hack. So little by little, technical debt accumulates in areas that were already full of debt. I suspect technical debt in code follows a power law – most classes have a little bit of debt, but a few are really shitty, with one diabolical class in particular:

technical debt graph

Where to start? Resist the temptation to make easy changes to relatively clean classes – start with the worst offender. It will be the hardest to fix, it might take a long time – but you’ll get the best bang-for-buck by fixing the most debt-heavy piece of crap code. If you can fix the worst offender, your debt will have to find somewhere else to hide.

The 80/20 rule

There’s a cliché that 80% of the cost of software is maintenance, only 20% is the initial build.

Let’s imagine a team that has 1200 hours a month to spend on useful work. For that 1200 hours of useful work, we’ll spend four times that much over the lifetime of the software maintaining it – from the 80/20 rule. Although we completed 1200 hours of feature work this month, we committed ourselves to 4800 hours of maintenance over the lifetime of the code.

That means next month, we have to spend a little of the 4800 hours of maintenance, with the rest of the time spent on useful, feature-adding work. However, adding new features commits us to even more maintenance work. The following month, we’ve got nearly twice the code to maintain so spend nearly twice the amount of time maintaining it and even less time producing value-adding features. Month-by-month we spend more and more time dealing with the crap that was there before and less and less time adding new features.

productivity graph

Does this sound familiar? This is what technical debt feels like. After a couple of years the pace has dropped; you’re spending half your time refactoring and fixing the junk that was there before. “If only we could get rid of this technical debt”, you cry.

What is technical debt?

We can all point to examples of crappy, debt-laden code we’ve seen. But what’s the impact of technical debt? Technical debt is simply an inability to quickly make changes to an existing system. This is the cost to the business of technical debt – what should be quick changes take an unpredictably long time.

What do we do when we remove technical debt? We generalise and find more abstract solutions. We clarify and simplify. We remove duplication and unnecessary complexity.

The net effect of reducing technical debt, is to reduce inventory.

Perhaps the amount of code – our inventory – is a good approximation for the amount of technical debt in a system. If I’m confronted with a million lines of code and need to make a change, it will probably take a while. However, if I’m only confronted by 1000 lines of code the change will be much quicker. But, if I’m confronted by zero lines of code, then there’s zero cost – I can do whatever I like. The cost of making a change to a system is roughly proportional to the size of the system. Large, complex systems take longer to make changes to than small, self-contained ones.

All code is a liability – the more code you have, the bigger the debt. When we’re paying back technical debt – are we really just reducing inventory? Is what feels like technical debt actually interest payments on all the inventory we hold?

What are the options?

Big bang

One option is to down-tools and fix the debt. Not necessarily throw everything out and rewrite, but spend some time cleaning up the mess. The big bang approach to dealing with technical debt. It’s pretty unusual for the business to agree to a plan like this – no new features for a year? Really? With no new features for a year what would all those product managers do all day?

From the 80/20 rule, the lifetime cost for any piece of code is four times what it cost to create. If it took three months to make, it will take a year to pay back. So wait, we’re gonna down tools for a year and only pay back three months of technical debt? Seriously? We’ll be marginally better off – but we’ll still be in a debt-laden-hell-hole and we’ll have lost a year’s worth of features. No way!

Dedicated Team

Even if you try to do big bang, it ends up becoming the dedicated team approach. As a compromise, you get a specific team together to fix the debt, meanwhile everyone else carries on churning out new features. One team are removing debt; while another team are re-adding it. What are the chances that debt is being removed faster than it’s being added? Exactly. Nil.

It makes sense – you need a team removing debt four times bigger than the team adding new features just to stay still.

Boy Scout

You could adopt a policy of trying to remove technical debt little and often – the boy scout approach. On every single development task try and remove debt near where you’re working. If there are no tests, add some. If the tests are poor, improve them. If the code’s badly factored, refactor it. The boy scout rule – leave the camp cleaner than you found it.

This is generally much easier to sell, there’s only minimal impact on productivity: it’s much cheaper to make changes to a part of the system you understand and are already working in than to open up whole new ones. But over time you can massively slow down the rate at which debt grows. Inevitably the system will still grow, inevitably the amount of debt will increase. But if you can minimise the maintenance cost you’ll keep the code small and nimble for as long as possible.


If we can lessen the maintenance cost of code even just a little we can save ourselves a fortune over the life of the code. If we can reduce the multiple to just three times the initial cost, so our 1200 hours work only costs 3600 hours in maintenance, we’ve saved enough development capacity to build another feature of the same size! For free! Hey, product manager, if we do our job better it’ll take no longer and you’ll get free features. Who doesn’t want free features?

If we can create well-crafted, DRY, SOLID code with good test coverage we have a good chance of minimising the lifetime maintenance cost. This is the best way we can keep our productivity up, to try and avoid getting mired in technical debt and keep the code base responsive to changing requirements. It’s the only way we can remain productive and agile.

Frankly, anything else is just unprofessional. If you’re deliberately committing your company to spend excessive amounts maintaining your shitty code – what the fuck, exactly, are they paying you for?

Measuring Code

How good is your code? If you’re like the other 80% of above average developers, then I bet your code is pretty awesome. But are you sure? How can you tell? Or perhaps you’re working on a legacy code base – just how bad is the code? And is it getting better? Code metrics provide a way of measuring your code – some people love ’em, but some hate ’em.

The Good

Personally I’ve always found metrics very useful. For example – code coverage tools like Emma can give you a great insight into where you do and don’t have test coverage. Before embarking on an epic refactor of a particular package, just how much coverage is there? Maybe I should increase the test coverage before I start tearing the code apart.

Another interesting metric can be lines of code. While working in a legacy code base (and who isn’t?), if you can keep velocity consistent (so you’re still delivering features) but keep the volume of inventory the same or less, then you’re making the code less crappy while still delivering value. Any idiot can implement a feature by writing bucket loads of new code, but it takes real craftsmanship to deliver new features and reduce the size of the code base.

The Bad

The problem with any metric is who consumes it. The last thing you want is for an over eager manager to start monitoring it.

You can’t control what you can’t measure
— Tom DeMarco

Before you know it, there’s a bonus attached to the number of defects raised. Or there’s a code coverage target everyone is “encouraged” to meet.

As soon as there’s management pressure on a metric, smart people will game the system. I’ve lost count of the number of times I’ve seen people gaming code coverage metrics. In an effort to please a well meaning but fundamentally misguided manager, developers end up writing tests with no assertions. Sure, the code ran and didn’t blow up. But did it do the right thing? Who knows! And if you introduce bugs, will your tests catch it? Hell, no! So your coverage is useless.

The target was met but the underlying goal – improving software quality – has not only been missed, it’s now harder to meet in future.

The Ugly

The goal of any metric is to measure something useful about the code base. Take code coverage, for example – really what we’re interested in is defect coverage. That is, out of the universe of all possible defects in our code, how many would cause a failure in at least one test? That’s what we want to know – how protected are we against regressions in the code base.

The trouble is, how can I measure “the universe of all possible defects” in a system? Its basically unknowable. Instead, we use code coverage as an approximation. Given that tests assert the code did the right thing, the percentage of code that has been executed is a good estimation of the likelihood of bugs being caught by them. If my tests execute 50% of the code, at best I can catch bugs in 50% of the code. If there are bugs in the other 50%, there’s zero chance my tests will find them. Code coverage is an upper bound on test coverage. But, if your tests are shoddy, test coverage can be much lower. To the point where tests with no assertions are basically useless.

And this is the difficulty with metrics: measuring what really matters – the quality of our software – is hard, if not impossible. So instead we have to measure what we can, but it isn’t always clear how that relates to our underlying goal.

But what does it mean?

There are some excellent tools out there like Sonar that give you a great overview of your code using a variety of common metrics. The trouble often is that developers don’t know (or care) what they mean. Is a complexity of 17.0 / class good or bad? I’m 5.6% tangled – but maybe there’s a good reason for that. What’s a reasonable target for this code base? And is LCOM4 a good thing or a bad thing? It sounds like a cancer treatment, to be honest.

Sure, if I’m motivated enough I can dig in and figure out what each metric means and we can try and agree reasonable targets and blah blah blah. C’mon, I’m busy delivering business value. I don’t have time for that crap. It’s all just too subtle so it gets ignored. Except by management.

A Better Way

Surely there’s got to be a better way to measure “code quality”?

1. Agree

Whatever you measure, its important the team agree and understand what it means. If there’s a measure half the team don’t agree with, then its unlikely it will get better. Some people will work towards improving it, others won’t so will let it get worse. The net effect is likely to be heartache and grief all round.

2. Measure What’s Important

Downward chartYou don’t have to measure the “standard” things – like code coverage or cyclomatic complexity. As long as the team agree its a useful thing to measure, everyone agrees it needs improving and can commit to improving it – then its a useful measure.

A colleague of mine at youDevise spent his 10% time building a tool to track and graph various measures of our code base. But, rather unusually, these weren’t the usual metrics that the big static analysis tools gather – these were much more tightly focused, much more specific to the issues we face. So what kind of things can you measure easily yourself?

  • If you have a god class, why not count the number of lines in the file? Less is better.
  • If you have a 3rd party library you’re trying to get rid of, why not count the number of references to it.
  • If you have a class you’re trying to eliminate, why not count the number of times its imported?

These simple measures represent real technical debt we want to remove – by removing technical debt we will be improving the quality of our code base. They can also be incredibly easy to gather, the most naive approach only needs grep & wc.

It doesn’t matter what you measure, as long as the team believe whatever you do measure should be improved; then it gives you an insight into the quality of your code base, using a measure you care about.

3. Make It Visible

Finally, put this on a screen somewhere – next to your build status is good. That way everyone can see how you’re doing and gets a constant reminder that quality is important. This feedback is vital – you can see when things are getting better and, just as importantly, when things start to slip and the graph veers ominously upwards.

Keep It Simple, Stupid

Code quality is such an abstract concept its impossible to measure. Instead, focus on specific things you can measure easily. The simpler the metric is to understand the easier it is to improve. If you have to explain what a metric means you’re doing it wrong. Try and focus on just a few things at any one time – if you’re tracking 100 different metrics its going to be sheer luck that on average they’re all getting better. If we instead focus on half a dozen, I can remember them – the very least I’ll do is not let them get worse; and if I can, they’ll be clear in my mind so I can improve them.

Do you use metrics? If so, what do you measure? If not, do you think there’s something you could measure?

The true cost of technical debt

Whether you like to think of it as technical debt or an unhedged call option we’re all surrounded by bad code, bad decisions and their lasting impact on our day to day lives. But what is the long term impact of these decisions? Are we really making prudent choices? Martin Fowler talks about the four classes of technical debt – from reckless and deliberate to inadvertent and prudent.

Deliberate reckless debt

Deliberate reckless technical debt is just that: developers (or their managers) allowing decisions to be made that offer no upside and only downside – e.g. abandoning TDD or not doing any design. Whichever way you look at it, this is just plain unprofessional. If the developers aren’t capable of making sensible choices, then management should have stepped in to bring in people that could. Michael Norton labels this “cruft not technical debt“. If we’re getting no benefit and simply giving ourselves an excuse to write crappy code then it’s not technical debt, it’s just crap.

Inadvertent debt

Inadvertent technical debt is tricky. If we didn’t know any better, how could we have done any differently? Perhaps industry standards or best practices have moved on. I’m sure once upon a time EJBs were seen as a good idea, now they look like pure technical debt. Today’s best practice so easily becomes tomorrow’s code smell.

Or perhaps we’d never worked in this domain before, if we had the domain knowledge when we started – maybe the design would have turned out different. Sometimes technical debt is inevitable and unavoidable.

Prudent deliberate debt

Then there’s prudent, deliberate debt. Where we make a conscious choice to add technical debt. This is a pretty common decision:

We need to recognise the revenue this quarter, no matter what

We’ve got marketing initiatives lined up so we’ve got to hit that date

We’ve committed to a schedule so don’t have time for rework

Sometimes, we have to make compromises: by doing a sub-standard job now, we get the benefit of finishing faster but we pay the price later.

Unlike the other types of technical debt, this is specifically a technical compromise. We’ve made a conscious decision to leave the code in a worse state than we should do. We know this will slow us down later; we know we need to come back and fix it in “phase 2”; but to hit the date, we accept compromise.

Is it always the right decision?

1. Compromise is always faster in the short-term and slower in the long-term

Given a choice between what we need now and some unspecified “debt” to deal with later – the obvious choice is always to accept compromise.

2. Each compromise is minor, but they compound

Unlike how most people experience “debt”, technical debt compounds. Each compromise we accept increases the cost of all the existing debt as well. This means the cost of each individual compromise may be small, but together they can have a massive impact.

First of all I decide not to refactor some code. Next time round, because its not well factored, it’s harder to test, so I skip some unit tests. Third time round, my test coverage is poor, so I don’t have the confidence to refactor extensively. Now it’s getting really hard to test and really hard to change. Little by little, but faster and faster, my code is deteriorating. Each compromise piles on the ones that went before and amplifies them. Each decision is minor, but they add up to a big headache.

3. It’s hard to quantify the long term cost

When we agree not to rework a section of code, when we agree to leave something half-finished, when we agree to rush something through with minimal tests: it’s very difficult to estimate the long term cost. Sure, we can estimate what it would take to do it right – we know what the principal is. But how can we estimate the interest payments? Especially when they compound.

How can I estimate the time wasted figuring out the complexity next time? The time lost because a bug is harder to track down. The extra time it takes because I can’t refactor the code so easily next time. The extra time it takes because I daren’t refactor the code next time; and the extra debt this forces me to take on. How can I possibly quantify this?

At IMVU they found they:

underestimated the long-term costs [of technical debt] by at least an order of magnitude

Is it always wrong?

There are obviously cases where it makes sense to add technical debt; or at least, to do something half-assed. If you want to get a new feature in front of customers to judge whether its valuable, it doesn’t need to be perfect, it just needs to be out there quickly. If the feature isn’t useful for users, you remove it. You’ve saved the cost of building it “perfectly” and quickly gained the knowledge you needed. This is clearly a cost-effective, lean way to manage development.

But what if the feature is successful? Then you need a plan for cleaning it up; for refactoring it; making sure it’s well tested and documented sufficiently. This is the debt. As long as you have a plan for removing it, whether the feature is useful or not – then it’s a perfectly valid approach. What isn’t an option is leaving the half-assed feature in the code base for years to come.

The difference is having a plan to remove the debt. How often do we accept compromise with a vague promise to “come back and fix later” or my personal favourite “we’ll fix that in phase 2”. My epitaph should be “now working on phase 2”.

Leaving debt in the code with no plan to remove it is like the guy who pays the interest on one credit card with another. You’re letting the debt mount up, not dealing with it. Without a plan to repay the debt, you will eventually go bankrupt.

The Risk of Technical Debt

Perhaps the biggest danger of technical debt is the risk that it represents. Technical debt makes our code more brittle, less easy to change. As @bertvanbrakel said when we discussed this recently:

Technical debt is a measure of code inflexibility

The harder it is to change, the more debt laden our code. With this inflexibility, comes the biggest risk of all: that we cannot change the code fast enough. What if the competitive or regulatory environment suddenly changes? If a new competitor launches that completely changes our industry, how long does it take us to catch up? If we have an inflexible code base, what will be left of our business in 2, 3 or 4 years when we finally catch up?

While this is clearly a worst case scenario, this lack of flexibility – this lack of innovation – hurts companies little by little. Once revolutionary companies become staid, unable to react and release only derivative products. As companies find themselves unable to innovate and keep up with the ever changing landscape – they risk becoming irrelevant. This is the true cost of technical debt.

Without a plan to repay the technical debt; with no way to reliably estimate the long term cost of not repaying it, are we really making prudent, deliberate choices? Isn’t the decision to add technical debt simply reckless?