Friction in Software

Friction can be a very powerful force when building software. What our tools and processes make easier or harder dramatically influences how we work. I’d like to discuss three areas where I’ve seen friction at work: dependency injection, code reviews and technology selection.

DI Frameworks

A few years ago a colleague and I discussed this and came to the conclusion that the reason most DI frameworks suck (I’m looking in particular at you, Spring) is that they make adding new dependencies so damned easy! There’s absolutely no friction: maybe a little XML (shudder), or just a tiny little attribute. It’s so easy!

So when we started a new, greenfield project, we decided to put our theory to the test and introduced just a little bit of friction to dependency injection. I’ve written before about the basic scheme we adopted and the AOP endpoint it reached. But the end result was, I believe, very successful. After a couple of years of development we still had of the order of only 10-20 dependencies. The friction we’d introduced was light (add a couple of lines to a single class), but it was sufficient to act as a constant reminder not to just add a new dependency because it was easy.
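To give a flavour of it – this is a sketch with invented names, not the original project’s scheme – imagine all the wiring living in a single hand-written class:

    // Sketch only (hypothetical names). All wiring lives in one class;
    // every new dependency is a visible, hand-typed line or two here.
    interface OrderRepository {}
    interface PaymentGateway {}

    class OrderService {
        OrderService(OrderRepository repository, PaymentGateway gateway) {}
    }

    public class ApplicationWiring {
        // Adding a collaborator means a new factory method plus a new
        // constructor argument: trivial friction, but impossible to miss.
        public OrderService orderService() {
            return new OrderService(orderRepository(), paymentGateway());
        }

        private OrderRepository orderRepository() {
            return new OrderRepository() {};
        }

        private PaymentGateway paymentGateway() {
            return new PaymentGateway() {};
        }
    }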

Code Reviews

I was reminded of this recently when discussing code reviews. I have mixed feelings about code reviews: I’ve seen them work well, and it is better to have code reviews than not to have them; but it’s better still to pair program. But not all teams, not all developers, like pair programming – so code reviews exist. The trouble with code reviews is they can provide a form of friction.

If you & I are pairing on a piece of work, we will discuss the various trade-offs as we go: do we spend time on this, do we refactor that, etc. The constant judgements about what warrants attention and what can be left for another day are verbalised and agreed. In general I find the code written while pairing is high in quality but also remains tightly focused on task. The long rambling refactors I’ve been guilty of in the past disappear, and the lazy “quick hacks” we all try and explain away to ourselves aren’t so easy to gloss over when pairing.

But code reviews exist outside of this dynamic. In the cold light of the following day, someone uninvolved reviews your work and passes judgement on whether they think it’s up to scratch. It’s easy to see why this becomes combative: rather than being collaborative it can be seen as a judgement being passed, on not only the code but the author, too.

When reviewing code it is easy to set a very high bar – higher than you might set for yourself, and higher than you might have agreed when pairing. Now, does this mean the comments aren’t valid? Absolutely not! You’re right, there is a test case missing here; although my change is unrelated, I should have added it. And you’re right, this code is a mess; it was a mess before I got here and I only made a simple edit; but you’re right, I should have tidied it up. Everyone should practice code gardening.

These are all perfectly valid comments. But they create a form of friction. When I worked on a team that relied on these code reviews, you knew you were going to get comments: so you kept the commit small, to minimize the diff. A small diff minimizes the number of extra tests you could be asked to write. A small diff keeps most of the existing mess out of the review, so you won’t be asked to start refactoring.

Now, this seems dysfunctional: we’re deliberately optimizing for a smooth passage through the review process, instead of optimizing for code quality. Worse than this, though, was what never happened: refactoring commits. Looking back, I realise that the only code reviews I saw (as both reviewer and reviewee) were for feature changes. There were never any code reviews submitted purely for technical debt reduction. Sure, there’d be some individual refactoring commits in amongst the feature changes. But never any dedicated, multi-commit sessions whose sole aim was to improve the code base. Which was a shame because, like any legacy code base, there was scope for improvement.

Compare this to teams that don’t do code reviews, where I’ve tended to see more effort spent on reducing technical debt. Without the fear of an endless cycle of review comments, developers are free to embark on refactoring efforts (which may or may not even work out!) – at least they can try. Code reviews, by contrast, provide a form of friction that might actually hurt code quality in the long run.

Technology Selection

I was talking to another colleague recently who is convinced that Hibernate is still the best way to get data in and out of a relational database. I can’t really work out how to persuade people they’re wrong – surely using Hibernate is enough to persuade you? Especially in a large, legacy code base – the pain that Hibernate causes is obvious. Yet plenty of people still believe in Hibernate. There are even people that still believe in Spring. Whether or not they still believe in the tooth fairy is unclear.

But I think technology selection is another area where friction is important. When contemplating moving away from something well-known and well used in industry like Spring or Hibernate there is a lot of friction. There are new technologies to learn, new approaches to understand and new risks to manage. This all adds friction, so sometimes it’s easiest just to stick with what we know. Sometimes it really is the right choice – the technology you have expertise in is the one you’ll be most productive in immediately. But there are longer term questions too, which are much harder to answer: will the team eventually be more productive using technology X than technology Y?

Friction in software is a powerful process: we’re very lazy creatures, constantly trying to optimise. Anything that slows us down or gets in our way quickly gets side-stepped or worked around. We can use this knowledge as a tool to guide developer behaviour; but sometimes we need to be aware of how friction can change behaviours for the worse as well.

Copy & paste driven development

Software development is rife with copy & paste: all of us resort to copy and paste coding sometimes. We know we probably shouldn’t, but we do it anyway. It’s like the industry’s dirty little secret: we mainly just copy and paste code from the internet or from somewhere else in the code base, then bash it till it works.

But maybe, just maybe, the fact that we all rely on this from time to time should be telling us something?

The good

Sometimes copy & paste coding can be a good thing. A while ago I was pairing with a colleague when we did what I’d call “search & replace coding”. I love code golf: in tools like Eclipse, IntelliJ or ReSharper there is always an optimal way to make each change, letting the tools do as much of the typing as possible. So it was with fascination that I watched my colleague pull off an interesting bit of code golf using search & replace.

The change was around extending an existing class with a load of new fields. We had a basic class, with a couple of sample fields, and a test that verified something simple about the class – say that it could be serialized to JSON successfully. We needed to add a boatload of new fields to the class (don’t ask). This involved five separate tasks which, at a macro level, had a lot in common:

  • Adding each field
  • Initialising each field in the constructor
  • Modifying the test to setup sample values for each field
  • Modifying the test to pass the sample value for each field to the constructor
  • Writing an assert that each field had successfully serialized and deserialized

My colleague demonstrated that we could write the list of fields once. Then, copy & paste, search & replace – we have a list of parameters for the constructor. By carefully crafting the search term and a suitable replacement term you can do some limited meta-programming. Taking the list of fields and transforming it into the actual lines of code you need in each instance. The list of fields replaced one way gives you a list of parameters to the constructor; with a different replacement you get the field declarations in the test; with another replacement you get assert statements.
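To make that concrete, here’s a reconstruction – the field names are invented, and the regex is the flavour IntelliJ and Eclipse use in find & replace:

    // Invented field names. Type the raw list exactly once:
    firstName
    lastName
    dateOfBirth

    // Paste the list, then Search: ^(\w+)$  Replace: private final String $1;
    private final String firstName;
    private final String lastName;
    private final String dateOfBirth;

    // Paste again, Search: ^(\w+)$  Replace: this.$1 = $1;
    this.firstName = firstName;
    this.lastName = lastName;
    this.dateOfBirth = dateOfBirth;

    // Paste again, Search: ^(\w+)$  Replace: assertEquals(expected.$1, actual.$1);
    assertEquals(expected.firstName, actual.firstName);
    assertEquals(expected.lastName, actual.lastName);
    assertEquals(expected.dateOfBirth, actual.dateOfBirth);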

I found this a very interesting way to write software. It certainly optimized the number of key presses required to type the code in: the long, boring list of field names only had to be typed once; after that, a carefully crafted search & replace regex did the heavy lifting for us. But it demonstrates an underlying truth: we had five separate changes to make, each parameterised by the same list of fields. If we could have actually meta-programmed this, we would have passed the list of fields to some meta-programming code which would output the desired lines of code for each of the five changes.

While it’s very clever that we could use search & replace meta-programming for this, it feels like the tools are lacking somehow.

The bad

Everyone copies code from stackoverflow from time to time. Hopefully you do it sensibly, using it as a starting point for your own production code. Working through what you’ve just pasted in to understand it, experimenting with it, modifying it, making it fit for your purpose. Rather than just blindly copy & pasting random code from the internet. I mean, who’d just randomly run code from the internet?

Stackoverflow is great. It’s an amazing resource for programmers. While learning WPF I got the sense that, as a technology, it couldn’t have taken off without something like stackoverflow: it is so opaque, so hard to learn. It took me many, many months of copy & pasting XAML from stackoverflow before I started to really understand how it worked. Without being able to just copy & paste code from the internet, the technology would take a lot longer to learn.

But often, we’re lazy, we try it. It works. Woohoo, next problem. I’ve written before about voodoo programming, but the problem is that it’s easy to think you’ve understood what code is doing. If you didn’t actually have to invent those lines, to reason through to them – maybe you don’t understand. Maybe you only think you know what the code’s doing? Maybe there’s some horrific bug you’re not aware of yet?

The ugly

Almost every time I’ve seen TDD done at any scale, when it comes to writing a new test, the first question to ask is: “which existing test is this most like?” Yup, where can I go copy & paste. I’m so lazy, I don’t want to have to invent a whole test setup on my own. I’ll just borrow somebody else’s homework. We’ve all done it. It seems to be an unwritten rule of TDD. By the time you’re on to the third or fourth test in a given file, I guarantee the temptation to just copy & paste to make the fifth test is incredibly strong.

The trouble with this is all sorts of weird test artefacts get copied forwards. You start off with a simple test, with some simple setup. Say an empty bank account. Then to test non-zero balance you need an account with a transaction. Then to test balance summing you need an account with two transactions. Each test so far is building on the last, so you’ve just copy & pasted the previous one as a starting point. The fourth test is inserting a new transaction, so you just copy the third test (with two existing transactions, unnecessary for this test). Test five is that a withdrawal can’t go below zero, so you copy & paste the previous test and set the amount to be a large negative value. Test six is an overdraft test, so you copy & paste the previous test but change the account setup to include an overdraft so the balance can go negative. By the time you copy & paste test seven, your starting point is an account with an overdraft, with two transactions where a third transaction is added. Test seven is about adding a standing order. None of this noise is necessary.
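To see how this plays out, here’s a sketch – invented domain classes, JUnit-style – of what test seven can end up looking like:

    // Test seven, as copy & paste evolution produced it. Only the last two
    // lines have anything to do with standing orders; the overdraft and the
    // three transactions are fossils inherited from tests three to six.
    @Test
    public void standingOrderCanBeAdded() {
        Account account = new Account(new Overdraft(500)); // noise from test six
        account.add(new Transaction(100));                 // noise from test three
        account.add(new Transaction(250));                 // noise from test three
        account.add(new Transaction(-400));                // noise from test five

        account.add(new StandingOrder(25, Frequency.MONTHLY));

        assertEquals(1, account.standingOrders().size());
    }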

This might seem a ridiculous example, but I see this time and again in real code. Sometimes it’s not even me that’s done it. The noise accumulates as you read down a test file: the tests at the bottom have all sorts of weird artefacts that were only relevant to one test halfway up the file. It means fixing tests and changing behaviours becomes a real problem. If I change, say, how overdrafts are defined in my example above, I might have half a dozen tests to change, only one of which even mentions overdrafts – but they’ve all inherited that setup because the tests were copy & pasted around. Not only does this make tests hard to read and understand, it makes them hard to maintain.

We all do it. It seems a pretty accepted part of how TDD happens in the wild. And yet, it clearly isn’t right. With discipline, we can keep our tests clean. Yes, when we’re all being conscientious developers we start writing our tests from scratch each time. But most of the time we’re busy or lazy or whatever.

Conclusion

What do these three things have in common, besides copy & paste? In each example we’re using copy & paste to save time. To find the most efficient path through the work we’re doing. Nobody is doing it out of malice or stupidity. Laziness? Almost certainly – but the good kind. The kind of laziness that encourages elegant solutions.

But copy & paste isn’t an elegant solution. It’s a crappy solution to a more fundamental problem: our tools are deficient. Really what we’re working around is the fact that our tools don’t let us express what we really want.

What if we could write our tests in a higher-level language? “A test with a bank account with two transactions”. Sure, there are internal & external DSLs you can use to do this. But typically the cost of setting up the DSL isn’t worth the hassle for unit tests: it would completely ruin the flow of TDD. Does that just mean we haven’t found a good way of doing it yet? Is there a way we could more fluidly express the intent of the test, filling in the gaps as we go?
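For what it’s worth, the kind of thing I mean is a builder-style internal DSL; a sketch with invented names:

    // Sketch of an internal DSL: the setup reads as intent, and each test
    // declares only the state it actually cares about - nothing to copy forwards.
    @Test
    public void balanceSumsTransactions() {
        Account account = anAccount()
                .withTransactions(100, 250)
                .build();

        assertEquals(350, account.balance());
    }

The trouble, as ever, is that writing anAccount() and friends for every little class feels like more ceremony than a unit test deserves.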

Instead of copy & pasting code from the internet, could our tools get smarter? Could we take some of the amazing machine learning stuff that’s going on and apply it to software development? I tried playing with an IntelliJ plugin recently that promises just that. Unfortunately it’s pretty buggy at the minute and doesn’t really work. But the idea is incredibly attractive. I like the idea of being able to express intent instead of mindlessly typing in the nuts & bolts.

Finally, instead of doing search & replace coding, wouldn’t it be great if we could actually meta-program? If we could write code that would write code for us? Not just code generators, but something that can generate small sections of code. I had a very limited go at this some time ago with rescripter – but it turns out it’s very hard to write a decent meta-programming tool that anyone except the author can understand. I think the idea still has merit, though: too often I find cases where I can describe the intent of my change very succinctly, but implementing it will involve far more typing than I’d like.
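Even with today’s tools there’s a crude version of this available: instead of search & replace, a throwaway program that writes the boilerplate for you. A sketch (the field names are invented again):

    import java.util.List;

    // A throwaway meta-program: feed it the field list once, get each
    // flavour of boilerplate out. Crude and string-based, but it is
    // code writing code.
    public class BoilerplateGenerator {
        public static void main(String[] args) {
            List<String> fields = List.of("firstName", "lastName", "dateOfBirth");

            fields.forEach(f -> System.out.printf("private final String %s;%n", f));
            fields.forEach(f -> System.out.printf("this.%s = %s;%n", f, f));
            fields.forEach(f -> System.out.printf(
                "assertEquals(expected.%s, actual.%s);%n", f, f));
        }
    }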

Never trust a passing test

One of the lessons when practising TDD is to never trust a passing test. If you haven’t seen the test fail, are you sure it can fail?

Red Green Refactor

Getting used to the red-green-refactor cycle can be difficult. It’s very natural for a developer new to TDD to immediately jump into writing the production code. Even if you’ve written the test first, the natural instinct is to keep typing until the production code is finished, too. But running the test is vital: if you don’t see the test fail, how do you know the test is valid? If you only see it pass, is it passing because of your changes or for some other reason?

For example, maybe the test itself is not correct. A mistake in the test setup could mean we’re actually exercising a different branch – one that has already been implemented. In that case the test would already pass, without writing any new code. Only by running the test and seeing it unexpectedly pass can we know the test itself is wrong.

Or alternatively there could be an error in the assertions. Ever written assertTrue() instead of assertFalse() by mistake? These kinds of logic errors in tests are very easy to make, and the easiest way to defend against them is to ensure the test fails before you try and make it pass.
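As a trivial, invented illustration: the test below goes green before isOverdrawn() is even implemented, because the assertion is inverted. Only running it first, expecting red, exposes the mistake:

    // Invented example: isOverdrawn() is not implemented yet and always
    // returns false, so this test passes immediately - for the wrong reason.
    @Test
    public void accountWithNegativeBalanceIsOverdrawn() {
        Account account = new Account();
        account.add(new Transaction(-100));

        assertFalse(account.isOverdrawn());   // oops: meant assertTrue
    }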

Failing for the Right Reason

It’s not enough to see a test fail. This is another common beginner mistake with TDD: run the test, see a red bar, jump into writing production code. But is the test failing for the right reason? Or is the test failing because there’s an error in the test setup? For example, a NullReferenceException may not be a valid failure – it may suggest that you need to enhance the test setup, maybe there’s a missing collaborator. However, if you currently have a function returning null and your intention with this increment is to not return null, then maybe a NullReferenceException is a perfectly valid failure.
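In Java terms (a NullPointerException, with invented names), a failure that is valid might look like this:

    // The intent of this TDD cycle is to stop findOrder() returning null,
    // so the NullPointerException on the last line is exactly the failure
    // we expected to see. If instead the NPE came from a collaborator we
    // forgot to wire into the setup, the test would be failing for the
    // wrong reason - and the setup, not the production code, needs fixing.
    @Test
    public void findsPreviouslyStoredOrder() {
        OrderRepository repository = new InMemoryOrderRepository();
        repository.store(new Order("A123"));

        Order order = repository.findOrder("A123");  // currently returns null

        assertEquals("A123", order.id());
    }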

This is why determining whether a test is failing for the right reason can be hard: it depends on the production code change you’re intending to make. That requires not only knowledge of the code, but also enough experience of doing TDD to have an instinct for the kind of change each cycle is intended to make.

When Good Tests Go Bad

A tragically common occurrence is that we see the test fail, we write the production code, the test still fails. We’re pretty sure the production code is right. But we were pretty sure the test was right, too. Eventually we realise the test was wrong. What to do now? The obvious thing is to go fix the test. Woohoo! A green bar. Commit. Push.

But wait, did we just trust a passing test? After changing the test, we never actually saw the test fail. At this point, it’s vital to undo your production code changes and re-run the test. Either git stash them or comment them out. Make sure you run the modified test against the unmodified production code: that way you know the test can fail. If the test still passes, your test is still wrong.
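Concretely, the dance is short – a sketch using git:

    # shelve the production code changes, leaving only the corrected test
    git stash
    # re-run the test: it must go red; if it still passes, the test is wrong
    git stash pop
    # restore the production code and re-run: now it should go green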

TDD done well is a highly disciplined process. This can be hard for developers just learning it to appreciate. You’ll only internalise these steps once you’ve seen why they are vital (and not just read about it on the internets). And only by regularly practising TDD will this discipline become second nature.

How many builds?

I’m always amazed at the seemingly high pain threshold .net developers have when it comes to tooling. I’ve written before about the poor state of tooling in .net, but just recently I hit another example that infuriates me: I have too many builds, and they don’t agree on whether my code compiles.

One of the first things that struck me when starting to develop on .net was that compiling code was still a thing. An actual step that had to be thought about. Incremental compilers in Eclipse and the like have been around for ages – to the point where, generally, Java developers don’t actually have to instruct their IDE to compile code for them. But in Visual Studio? Oh it’s definitely necessary. And time consuming. Oh my god is it slow. Ok, maybe not as slow as Scala. But still unbelievably slow when you’re used to working at the speed of thought.

Another consequence of the closed, Microsoft way of doing things is that tools can’t share the compiler’s work. So ReSharper have, effectively, implemented their own compiler. It incrementally parses source code and finds compiler errors. Sometimes it even agrees with the Visual Studio build. But all too often it doesn’t: from the spurious not-actually-an-error that I have to continually instruct ReSharper to ignore; to the warnings-as-errors build failures that ReSharper doesn’t warn me about; to the random why-does-ReSharper-not-know-about-that-NuGet-package errors.

This can be infuriating when refactoring. E.g. if an automated refactor leaves a variable unused, I will now have a compiler warning; since all my projects run with warnings-as-errors switched on, this will fail the build. But ReSharper doesn’t know that. So I apply the refactoring, code & tests are green: commit. Push. Boom! CI is red. But it was an automated refactor for chrissakes, how’ve I broken the build?!
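(For reference, warnings-as-errors is the standard MSBuild property, switched on per project in the .csproj:)

    <!-- in each .csproj: promote all compiler warnings to build failures -->
    <PropertyGroup>
      <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    </PropertyGroup>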

I also use NCrunch, an automated test runner for Visual Studio (like Infinitest in the Java world). NCrunch is awesome, by the way; better even than the continuous test runner in ReSharper 10. If you’ve never used a continuous test runner and think you’re doing TDD, sort your life out and set up Infinitest or NCrunch. It doesn’t just automate pressing the shortcut key to run all your tests. Well, actually, that is exactly what it does – but the impact it has on your workflow is so much more than that. When you can type a few characters then look at the test output and see what happened, you get instant feedback. This difference in degree changes the way you write code, and makes it so much easier to do TDD.

Anyway, I digress. NCrunch, because Microsoft, can’t use the result of the compile that Visual Studio does. So it does its own: it launches MSBuild in the background, continually re-compiling your code. This is not exactly kind on your CPU, and it introduces inconsistencies. Because NCrunch runs a slightly different MSBuild on each project from the build Visual Studio does, you sometimes get subtly different results – which are different again from ReSharper, with its own compiler that isn’t even using MSBuild. I now have three builds. Three compilers. It is honestly a miracle when they all agree that my code compiles.

An all-too-typical dev cycle becomes:

  • Write test
  • ReSharper is happy
  • NCrunch build is failing, force reload NCrunch project
  • NCrunch builds, test fails
  • Make test pass
  • Try to run app
  • Visual Studio build fails
  • Fix NuGet problems
  • NCrunch build is now failing
  • Force NCrunch to reload at least one project again
  • Force Visual Studio to rebuild the project
  • Then the solution
  • Run app to sanity check change
  • ReSharper now shows error
  • Re-ignore perennial ReSharper non-error
  • All three compilers agree, quick: commit!

Normally then the build fails in CI because I still screwed up the NuGet packages.

Then recently, as if this wasn’t already one of the outer circles of hell, the CI build started failing for a bizarre reason. We have a command line script which applies the same build steps that CI runs, so I thought I’d run that to replicate the problem. Unfortunately, the command line build failed on my machine for a spectacularly spurious reason that was different again from the failure in CI. Great: I now have five builds which don’t all agree on whether my code compiles.

Do you really hate computers? Do you wish you had more reasons to defenestrate every last one of them? Have you considered a career in software development?

Cutting Corners

The pressure to deliver yesterday is strong. If it’s not customers nagging you, it’s project managers breathing down your neck or your own self-doubt that this should have been simpler: the desire to get the task done quicker can often be irresistible. How do you strike the right balance between cutting corners and polishing the turd?

While working through a feature I maintain a “navigator pad” of things I want to come back to. These are refactorings I’ve spotted, tests that need cleaning up, design smells to look at, or just plain questions I’m curious to know the answer to (can foo ever actually be null? is this method really used?). This list ebbs and flows as I’m working through a feature: some days I seem to do nothing but add new things to it, other days I manage to cross half the list off as some much-needed refactoring becomes critical to complete the next change. But the one constant throughout a feature is the nav pad.

Recently I was nearing the end of a feature and my nav pad didn’t seem to be getting any shorter. I’d spent a good bit of time refactoring things, but new problems kept appearing – it didn’t seem like I’d ever be “Done”. The feature was way behind schedule and my self-doubt was growing: I’m trying to do a good job, I don’t want this to take any longer, but I keep spotting things I got wrong before or simply missed. Then suddenly one morning, within the space of a couple of hours, I crossed 20 items off the nav pad, sat back and realised: it’s empty! I was Done.

The next thing that struck me was what a strange occurrence this was: I couldn’t remember the last time I’d actually crossed everything off the nav pad. There would always be some last refactorings on the list that on balance could wait until another time; some tidy up that could wait until another day; some question that I no longer cared to know the answer to. But for the first time in a long time, I’d crossed everything off!

Then the doubt sets in: have I over-engineered this? Could I have been done quicker? The pressure to cut corners is really strong: we’re always pushed to be done faster, to do the absolute minimum we can get away with. Yet I know what needs to be done, I know what the problems are with this code: I’ve written them all down in the nav pad. If I don’t fix them now, then when?

A pattern I see all too frequently when I come up against a design smell: I can see the design is wrong, the tests are a mess, the production code is a mess; there’s definitely a better way, I just can’t see it at the minute. I park the refactoring on the nav pad. I come back to it later, after ticking off a few more parts of the feature, but I still can’t see a way to resolve the design smell. I spend a couple of hours refactoring back and forth – in the end I declare bankruptcy and raise an issue in the issue tracker. If I’m lucky I’ll pick up the issue again in a couple of months, have a half-hearted look at it, but realise I can’t remember what I was really thinking at the time and close the issue. More likely, after a few months with nobody picking up the issue, I’ll quietly close it. My code guilt has been neatly dealt with. But the crap code still remains.

The pressure to cut corners is incredibly strong, and it is strongest when you’re facing a particularly difficult design change. You’ve identified a problem in the design, probably made obvious by other changes you’ve made. You’re struggling to correct it, which means it isn’t easy to resolve; but it’s obviously a problem, because you’ve already spent time trying to resolve it. That means the next time you come through here you’re going to spot the same problem and hate the you of today for not fixing it. And yet this moment, right now, is the clearest you’ve ever understood the problem. If you give up now, you’ll have to reload into memory all the context you’ve got right now – and what makes you think you’ll be in less of a rush in six months’ time? That you’ll have time to re-learn this code? Time to do what should have been done today?

The pressure to be done yesterday is strong, but today is the best you’ve ever understood this code: so use that understanding to leave it better than you found it. Remove the sharp edges you saw on your way through, and tomorrow when you pass this way you’ll pass through a little quicker, with fewer of them to distract you. But today? Today you have code gardening to do.