VW’s rogue software developers

So Michael Horn has thrown a couple of software developers under the proverbial bus by blaming them for the defeat device at the centre of the emissions scandal. Now, it is clearly ridiculous to suggest that a couple of rogue individuals single-handedly saved VW’s clean diesel engine program and nobody else had any idea what was going on. However, I think it is fair to say that a couple of software developers did know what was going on and did nothing.

Unless VW is unlike every other organisation I’ve ever known, it is inconceivable that nobody outside the dev team would have known what was going on. It’s a pretty rare organisation that leaves software developers to just bash away at the keyboard and dream up some cool stuff. Almost everywhere, programmers are managed, project managed and product managed to make sure they keep churning out the good stuff. Developers aren’t given free rein to just make up emissions test defeating software for fun. What was this, VW’s equivalent of 10% time?

Let’s cut the developers some slack then – they were probably just doing what they had been told to. I’ve worked in large organisations that sailed pretty close to regulatory lines and I can well imagine that this was just one change in amongst hundreds that were in what might generously be called a “grey area”. However, did they know that the software was going to be used to cheat emissions tests? Did they know this would leave their product in breach of the law in some countries? Or were they just ignorant fools?

Maybe they didn’t know what they were doing. Maybe the exact details of the goals of the software were kept secret from them – this is entirely possible. If we assume some people in management were aware of what was being done, and of its legal implications, then every effort would have been made not to commit any details to paper and to limit the number of people who had the full picture. Where possible, the developers would be given a very specific set of requirements which would lead them to implement the right thing, without them necessarily understanding the eventual impact. With an unquestioning workforce an amazing amount can be achieved while only a handful of people understand the full story.

However, this is not to excuse the developers: we are not mindless automatons, we are intelligent creatures. We are capable of questioning why. In fact, as a professional developer, I think it is my duty to ask why. If I don’t understand how a requirement fits into the environment, how can I possibly be sure I’m building it right? I think it is up to each of us to ensure we know how our software will be used. This is not to make us the world’s social conscience – but to make us better developers.

Now if they did know what the software was to be used for, they are complicit in this law-breaking. They understood what they were doing and understood it would be against the law. And yet they did it anyway. It is not sufficient to argue that they were just “following orders”. Many people throughout history were “just following orders” and through their hands great evils were perpetrated. Now a breach of the Clean Air Act is no Holocaust, but the individuals involved must bear some of the responsibility for what they have done.

But we take no responsibility in this industry. We happily churn out rubbish code that is full of bugs “because management told us to”. We will happily churn out law-breaking software “because management told us to”. When will we start taking some responsibility for our actions? When will we show some professional standards? This doesn’t mean that we should be held accountable for every single defect in every line of code. But if I’ve not followed best practices and my code has an issue which costs my customer money, am I liable? If I’d done a better job would the code have had the same issue? Maybe, maybe not. Who takes responsibility for standards in software? Is it the customer’s responsibility? Or is it about time we took responsibility? About time we showed some pride in our work. About time we showed some professionalism.

Dealing with technical debt

We’re drowning in technical debt. We have a mountain to climb and don’t really know where to start. Sound familiar? For many of us working on legacy code bases this is the day-to-day reality. But what to do about it?

How did we get here?

Technical debt is always the fault of those “other guys”. Those idiot developers that were here a few years ago. Morons. Obviously couldn’t code their way out of a Game of Life if their ongoing existence depended on it.

I hate to tell you but: we are those other guys. The decisions we make today will look foolish tomorrow. We’ll have more information then, a different perspective; we’ll know how the product and technology were going to evolve. We can’t know that today, so many of our decisions will turn out to be wrong.

Where to start

Classes are like best-selling novels – some are spectacularly more popular (or, in our case, more debt-laden) than others. There’ll be a handful of classes, packages or modules that are much worse than all the rest, and usually one that is far worse than any other.

How does this happen? Well, one class ends up with a bit of technical debt. Next time I come to change that class, I’m too lazy to fix the debt, so I just hack something together to get the new feature done. Then next time round, there’s a pile of debt – only that guy’s too busy to fix it so he adds in a couple of kludges and leaves it. Before you know it, this one class has become a ten thousand line monster that’s pure technical debt.

It’s like the broken-windows theory – if code is already crappy, it’s much easier to just make it a little more crappy. If the code’s clean, it’s a big step to add in a hack. So little by little, technical debt accumulates in areas that were already full of debt. I suspect technical debt in code follows a power law – most classes have a little bit of debt, but a few are really shitty, with one diabolical class in particular:

[Figure: technical debt graph]
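To see how the broken-windows effect could concentrate debt, here is a toy simulation. It is a sketch under stated assumptions, not a measurement of any real code base: each new hack lands on a class with probability proportional to the debt already there plus one. That preferential-attachment rule is my stand-in for the power law I suspect, and the function name, class count and hack count are all made up for illustration.

```python
import random


def accumulate_debt(num_classes=50, hacks=1000, seed=0):
    """Toy model: return per-class debt counts, worst offender first.

    Assumption (hypothetical): a hack is more likely to land on a class
    that already carries debt, because adding one more kludge there
    feels cheap.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    debt = [0] * num_classes
    for _ in range(hacks):
        # Weight each class by existing debt + 1, so clean classes
        # still have a small chance of picking up their first hack.
        weights = [d + 1 for d in debt]
        target = rng.choices(range(num_classes), weights=weights, k=1)[0]
        debt[target] += 1
    return sorted(debt, reverse=True)


if __name__ == "__main__":
    result = accumulate_debt()
    print("worst five classes:", result[:5])
    print("median class:", result[len(result) // 2])
```

If the suspicion holds, the worst few classes should carry far more debt than the median class, which is the shape the graph is trying to show.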

Where to start? Resist the temptation to make easy changes to relatively clean classes – start with the worst offender. It will be the hardest to fix, it might take a long time – but you’ll get the best bang-for-buck by fixing the most debt-heavy piece of crap code. If you can fix the worst offender, your debt will have to find somewhere else to hide.

The 80/20 rule

There’s a cliché that 80% of the cost of software is maintenance, only 20% is the initial build.

Let’s imagine a team that has 1200 hours a month to spend on useful work. For that 1200 hours of useful work, we’ll spend four times that much over the lifetime of the software maintaining it – from the 80/20 rule. Although we completed 1200 hours of feature work this month, we committed ourselves to 4800 hours of maintenance over the lifetime of the code.

That means next month, we have to spend a little of the 4800 hours of maintenance, with the rest of the time spent on useful, feature-adding work. However, adding new features commits us to even more maintenance work. The following month, we’ve got nearly twice the code to maintain so spend nearly twice the amount of time maintaining it and even less time producing value-adding features. Month-by-month we spend more and more time dealing with the crap that was there before and less and less time adding new features.

[Figure: productivity graph]
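The month-by-month squeeze can be sketched in a few lines. This is a rough model, not a forecast. The 1200 hours and the four-times multiple come from the 80/20 rule above, but the idea that maintenance is spread evenly over a ten-year lifetime is purely my assumption, as are the function and parameter names.

```python
def simulate_capacity(capacity=1200.0, multiple=4.0,
                      lifetime_months=120, months=24):
    """Return a list of (feature_hours, maintenance_hours), one per month.

    Assumptions (hypothetical): each hour of feature work creates
    `multiple` hours of maintenance, paid back evenly over the next
    `lifetime_months` months.
    """
    feature_log = []  # feature hours delivered in each earlier month
    history = []
    for _ in range(months):
        # Only code still within its assumed lifetime needs maintaining.
        live = feature_log[-lifetime_months:]
        maintenance = min(capacity, sum(live) * multiple / lifetime_months)
        feature = capacity - maintenance
        feature_log.append(feature)
        history.append((feature, maintenance))
    return history


if __name__ == "__main__":
    for month, (feature, maintenance) in enumerate(simulate_capacity(), 1):
        print(f"month {month:2d}: features {feature:7.1f}h, "
              f"maintenance {maintenance:7.1f}h")
```

Under these assumed defaults, maintenance overtakes feature work a little under two years in. Change the assumed lifetime and the crossover moves, but the downward curve stays.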

Does this sound familiar? This is what technical debt feels like. After a couple of years the pace has dropped; you’re spending half your time refactoring and fixing the junk that was there before. “If only we could get rid of this technical debt”, you cry.

What is technical debt?

We can all point to examples of crappy, debt-laden code we’ve seen. But what’s the impact of technical debt? Technical debt is simply an inability to quickly make changes to an existing system. This is the cost to the business of technical debt – what should be quick changes take an unpredictably long time.

What do we do when we remove technical debt? We generalise and find more abstract solutions. We clarify and simplify. We remove duplication and unnecessary complexity.

The net effect of reducing technical debt is to reduce inventory.

Perhaps the amount of code – our inventory – is a good approximation for the amount of technical debt in a system. If I’m confronted with a million lines of code and need to make a change, it will probably take a while. However, if I’m only confronted by 1000 lines of code the change will be much quicker. But, if I’m confronted by zero lines of code, then there’s zero cost – I can do whatever I like. The cost of making a change to a system is roughly proportional to the size of the system. Large, complex systems take longer to make changes to than small, self-contained ones.

All code is a liability – the more code you have, the bigger the debt. When we’re paying back technical debt – are we really just reducing inventory? Is what feels like technical debt actually interest payments on all the inventory we hold?

What are the options?

Big bang

One option is to down-tools and fix the debt. Not necessarily throw everything out and rewrite, but spend some time cleaning up the mess. The big bang approach to dealing with technical debt. It’s pretty unusual for the business to agree to a plan like this – no new features for a year? Really? With no new features for a year what would all those product managers do all day?

From the 80/20 rule, the lifetime cost for any piece of code is four times what it cost to create. If it took three months to make, it will take a year to pay back. So wait, we’re gonna down tools for a year and only pay back three months of technical debt? Seriously? We’ll be marginally better off – but we’ll still be in a debt-laden-hell-hole and we’ll have lost a year’s worth of features. No way!

Dedicated Team

Even if you try to do big bang, it ends up becoming the dedicated team approach. As a compromise, you get a specific team together to fix the debt, while everyone else carries on churning out new features. One team are removing debt while another team are re-adding it. What are the chances that debt is being removed faster than it’s being added? Exactly. Nil.

It makes sense – by the 80/20 rule, every hour of feature work creates four hours of maintenance, so you need a debt-removal team four times bigger than the feature team just to stay still.

Boy Scout

You could adopt a policy of trying to remove technical debt little and often – the boy scout approach. On every single development task try and remove debt near where you’re working. If there are no tests, add some. If the tests are poor, improve them. If the code’s badly factored, refactor it. The boy scout rule – leave the camp cleaner than you found it.

This is generally much easier to sell. There’s only minimal impact on productivity: it’s much cheaper to make changes to a part of the system you understand and are already working in than to open up whole new ones. And over time you can massively slow down the rate at which debt grows. Inevitably the system will still grow, inevitably the amount of debt will increase. But if you can minimise the maintenance cost you’ll keep the code small and nimble for as long as possible.

Professionalism

If we can lessen the maintenance cost of code even just a little we can save ourselves a fortune over the life of the code. If we can reduce the multiple to just three times the initial cost, so our 1200 hours of work only costs 3600 hours in maintenance instead of 4800, we’ve saved enough development capacity to build another feature of the same size! For free! Hey, product manager, if we do our job better it’ll take no longer and you’ll get free features. Who doesn’t want free features?
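For anyone who wants to check the sums, here is the arithmetic as a tiny sketch. The function name and defaults are mine; the figures are the ones used above.

```python
def maintenance_saving(build_hours, old_multiple=4.0, new_multiple=3.0):
    """Lifetime maintenance hours saved by lowering the maintenance multiple."""
    return build_hours * (old_multiple - new_multiple)


if __name__ == "__main__":
    saved = maintenance_saving(1200)
    print(f"hours saved over the life of the code: {saved}")
```

Going from four times to three times saves exactly as many hours as the original build took. The sketch deliberately ignores the maintenance that the extra feature would itself incur.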

If we can create well-crafted, DRY, SOLID code with good test coverage we have a good chance of minimising the lifetime maintenance cost. This is the best way we can keep our productivity up, to try and avoid getting mired in technical debt and keep the code base responsive to changing requirements. It’s the only way we can remain productive and agile.

Frankly, anything else is just unprofessional. If you’re deliberately committing your company to spend excessive amounts maintaining your shitty code – what the fuck, exactly, are they paying you for?