Git stash driven development

I’ve found myself using a pattern quite often recently, which I’ve been calling “git stash driven development” – that is, relying heavily on the magic of git stash as part of my development workflow.

Normally I follow what I think of as a fairly typical TDD workflow:

  • Write next test, watch it fail
  • Write code to make it pass
  • Commit
  • Refactor
  • Commit
  • Push

This cycle can repeat very frequently – as often as every couple of minutes. Sometimes this cycle gets slowed down when the next test to write isn’t obvious or the refactoring needs more thought. But generally this is the process I try and follow.

Quite often, having written the next test that takes me forwards on my feature, I hit a problem: I can’t actually make the test pass (easily) – first I need to refactor to make the problem easy. In that situation I can mark the test as ignored, commit and come back to it later. I refactor as required, commit, push; then finally unignore my test and get back to where I was before. This is a fairly neat process.

However there are a couple of times when this process doesn’t work: what if I’m part way through writing my test and I realise I can’t finish without refactoring the test infrastructure? I can’t ignore my test – it probably isn’t even compiling. I certainly don’t want to commit it in its current state. I could just bin my test and re-write it; if I’m following the 15 minute rule, I’m not going to lose much work. But, with the magic of git stash, I can stash my changes and come back once I’ve refactored the test code.

The more annoying time this happens is when I’m part way through a refactor step. This happens more commonly when I’m going through a genuine design change – this isn’t really refactoring, as it often happens outside of the normal TDD loop. I’m trying to evolve the design to somewhere different; sometimes this is driven by tests, sometimes it’s a non-feature-changing refactor. But often there are non-trivial changes happening across numerous source files. At this point it is very easy to get part way through a refactor and realise that something else needed to have happened first. I could bin my change, I only stand to lose 15 minutes’ work – but why throw it away when I have git stash?

So I git stash my changes, go and make the change I needed to have happened first. Then, all too commonly, I get part way through this second change and realise something else needs to happen first. Well, git stash again! This stack of git stashes can get quite deep, if you’re not careful. But once I’ve bottomed the stack out – once I’ve managed to commit a refactor that frees up the step above – I can git stash pop, complete the next refactor, commit, git stash pop; and so on up the stack until I’m done.
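
In raw git terms a session like that looks something like this (a sketch – the commit messages are made up, and in practice the stack can go deeper):

# part way through refactor A, realise refactor B must happen first
git stash

# part way through refactor B, realise refactor C must happen first
git stash

# refactor C is small enough to complete – commit it
git commit -am "Refactor C: the change that unblocks everything else"

# pop refactor B, now unblocked, finish and commit it
git stash pop
git commit -am "Refactor B"

# pop refactor A and finish the change I originally set out to make
git stash pop
git commit -am "Refactor A"

# sanity check: the stash stack should now be empty
git stash list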

Now, arguably, I’m discovering the refactor in reverse order, but this often seems to be how I find it. I could have spent more time analysing the change in detail, of course – spent time planning out my change on paper before embarking on it in the correct order. However, this is always time-consuming, and there’s still the risk that I miss something and come at a change “backwards”. I find that using git stash in this way lets me discover the refactor I need to make one step at a time. Each commit is kept small; I try and stick to the 15 minute rule, so that no single step risks more than 15 minutes’ work. Ultimately the design change is completed as a sequence of small commits, each of which builds logically on the one before. They’ve been discovered by exploration; the commits were just discovered in reverse order.

The danger is always that I find a refactor step I can’t complete the way I’d imagined – now I can’t unwind the stack and potentially all the previous git stashes aren’t committable. Whenever this happens I normally find going one or two levels up the stack will present a different approach, from where I can continue as before.

VW’s rogue software developers

So Michael Horn has thrown a couple of software developers under the proverbial bus by blaming them for the defeat device at the centre of the emissions scandal. Now, it is clearly ridiculous to suggest that a couple of rogue individuals single-handedly saved VW’s clean diesel engine program and nobody else had any idea what was going on. However, I think it is fair to say that a couple of software developers did know what was going on and did nothing.

Unless VW is unlike every other organisation I’ve ever known, it is inconceivable that nobody outside the dev team would have known what was going on. It’s a pretty rare organisation that leaves software developers to just bash away at the keyboard and dream up some cool stuff. Almost everywhere, programmers are managed, project managed and product managed to make sure they keep churning out the good stuff. Developers aren’t given free rein to just make up emissions-test-defeating software for fun. What was this, VW’s equivalent of 10% time?

Let’s cut the developers some slack then – they were probably just doing what they had been told to. I’ve worked in large organisations that sailed pretty close to regulatory lines and I can well imagine that this was just one change in amongst hundreds that were in what might generously be called a “grey area”. However, did they know that the software was going to be used to cheat emissions tests? Did they know this would leave their product in breach of the law in some countries? Or were they just ignorant fools?

Maybe they didn’t know what they were doing. Maybe the exact details of the goals of the software were kept secret from them – this is entirely possible. If we assume some people in management were aware of what was being done, and of the legal implications, then every effort would have been made not to commit any details to paper and to limit the number of people who had the full picture. Where possible, the developers would be given a very specific set of requirements which would lead them to implement the right thing, without necessarily understanding the eventual impact. With an unquestioning workforce an amazing amount can be achieved while only a handful of people understand the full story.

However, this is not to excuse the developers: we are not mindless automatons, we are intelligent creatures. We are capable of questioning why. In fact, as a professional developer, I think it is my duty to ask why. If I don’t understand how a requirement fits into the environment, how can I possibly be sure I’m building it right? I think it is up to each of us to ensure we know how our software will be used. This is not to make us the world’s social conscience – but to make us better developers.

Now if they did know what the software was to be used for, they are complicit in this law-breaking. They understood what they were doing, understood it would be against the law. And yet they did it anyway. It is not sufficient to argue that they were just “following orders”. Many people throughout history were “just following orders” and through their hands great evils were perpetrated. Now a breach of the Clean Air Act is no Holocaust, but the individuals involved must bear some of the responsibility for what they have done.

But we take no responsibility in this industry. We happily churn out rubbish code that is full of bugs “because management told us to”. We will happily churn out law-breaking software “because management told us to”. When will we start taking some responsibility for our actions? When will we show some professional standards? This doesn’t mean that we should be held accountable for every single defect in every line of code. But if I’ve not followed best practices and my code has an issue which costs my customer money, am I liable? If I’d done a better job would the code have had the same issue? Maybe, maybe not. Who takes responsibility for standards in software? Is it the customer’s responsibility? Or is it about time we took responsibility? About time we showed some pride in our work. About time we showed some professionalism.

Your DI framework is killing your code

I read a really interesting post recently looking at the difference between typical OO code and a more functional style. There’s a lot to be said for the functional style of coding, even in OO languages like Java and C#. The biggest downside I find is always one of code organisation: OO gives you a discoverable way of organising large amounts of code, while in a functional style you might have less code but it’s harder to organise it clearly.

It’s not Mike’s conclusion that I want to challenge, it’s his starting point: he starts out with what he describes as “typical object oriented C# code”. Unfortunately I think he’s bang on: even in this toy example, it is exactly like almost all so-called “object oriented” code I see in the wild. My issue with code like this is: it isn’t in the least bit object oriented. It’s procedural code haphazardly organised into classes. Just ’cos you’ve got classes doesn’t make it OO.

What do I mean by procedural code? The typical pattern I see is a codebase made up of two types of classes (sketched below):

  1. Value objects, holding all the data with no business logic
    Extra credit here if it’s an object from your persistence layer, NHibernate or the like.
  2. Classes with one or two public functions – they act on the value objects with no state of their own
    These are almost always “noun-verbers”
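
To make that concrete, here’s a minimal sketch of the shape I mean (my own made-up example, not Mike’s actual code):

// 1. A value object: all state, no behaviour
public class Customer
{
  public string EmailAddress { get; set; }
}

public class Report
{
  private readonly string recipient;

  public Report(string recipient)
  {
    this.recipient = recipient;
  }
}

// 2. A noun-verber: all behaviour, no state of its own
public class ReportBuilder
{
  public Report BuildReport(Customer customer)
  {
    // reaching into Customer's data to do Customer's job: feature envy
    return new Report(customer.EmailAddress);
  }
}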

A noun-verber is the sure-fire smell of poor OO code: OrderProcessor, AccountManager, PriceCalculator. No, calling it an OrderService doesn’t make it any better – it’s still a noun-verber you’re hiding behind the meaningless word “service”. A noun-verber means it’s all function and no state; it’s acting on somebody else’s state. It’s almost certainly a sign of feature envy.

The other design smell with these noun-verbers is they’re almost always singletons. Oh, you might not realise they’re singletons, because you’ve cleverly hidden that behind your dependency injection framework: but it’s still a singleton. If there’s no state on it, it might as well be a singleton. It’s a static method in all but name. Oh sure, it’s more testable than if you’d actually used the word “static”. But it’s still a static method. If you’d not lied to yourself with your DI framework and had written this as a static method, you’d recoil in horror. But because you’ve tarted it up and called it a “dependency”, you think it’s ok. Well it isn’t. It’s still crap code. What you’ve got is procedures, arbitrarily grouped into classes you laughably call “dependencies”. It sure as hell isn’t OO.

One of the properties of good OO design is that code that operates on data is located close to the data. The way the data is actually represented can be hidden behind behaviours. You focus on the behaviour of objects, not the representation of the data. This allows you to model the domain in terms of nouns that have behaviours. A good OO design should include classes that correspond to nouns in the domain, with behaviours that are verbs that act on those nouns: Order.SubmitForPicking(), UserAccount.UpdatePostalAddress(), Basket.CalculatePriceIncludingTaxes().

These are words that someone familiar with the domain but not software would still understand. Does your customer know what an OrderPriceStrategyFactory is for? No? Then it’s not a real thing. It’s some bullshit you made up.

The unloved value objects are, ironically, where the real answer to the design problem lies. These are almost always actual nouns in the domain; they just lack any behaviours which would make them useful. Back to Mike’s example: he has a Customer class – its public interface is just an email address property. A classic value object: all state and no behaviour [if you want to play along at home, I’ve copied Mike’s code into a git repo, as well as my refactored version].

Customer sounds like a good noun in this domain. I bet the customer would know what a Customer was. If only there were some behaviours this domain object could have. What do Customers do in Mike’s world? Well, this example is all about creating and sending a report. A report is made for a single customer, so I think we could ask the Customer to create its own report. By moving the method from ReportBuilder onto Customer, it is right next to the data it operates on. Suddenly the public email property can be hidden – and this seems a useful change: if we change how we contact customers, then only Customer needs to change, not the ReportBuilder as well. It’s almost like a single change in the design should be contained within a single class. Maybe someone should write a principle about this single responsibility malarkey…

By following a pattern like this, moving methods from noun-verbers onto the nouns on which they operate, we end up with a clearer design. A Customer can CreateReport(), a Report can RenderAsEmail(), and an Email can Send(). In a domain like this, these seem like obvious nouns and obvious behaviours for these nouns to have. They are all meaningful outside of the made up world of software. The design models the domain and it should be clear how the domain model must change in response to each new requirement – since each will represent a change in our understanding of the domain.
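
Sketched out, the refactored design might look something like this (the method names come from the text above; everything else is my own illustration):

public class Customer
{
  private readonly string emailAddress; // no longer public

  public Customer(string emailAddress)
  {
    this.emailAddress = emailAddress;
  }

  // behaviour lives right next to the data it operates on
  public Report CreateReport()
  {
    return new Report(emailAddress);
  }
}

public class Report
{
  private readonly string recipient;

  public Report(string recipient)
  {
    this.recipient = recipient;
  }

  public Email RenderAsEmail()
  {
    return new Email(recipient, "Dear customer, here is your report...");
  }
}

public class Email
{
  private readonly string to;
  private readonly string body;

  public Email(string to, string body)
  {
    this.to = to;
    this.body = body;
  }

  public void Send()
  {
    // actually delivering the email is a real dependency – which is
    // exactly where the DI frameworks start causing trouble, below
  }
}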

So why is this pattern so uncommon? I blame the IoC frameworks. No seriously, they’re completely evil. The first thing I hit when refactoring Mike’s example, even using poor man’s DI, was that my domain objects needed dependencies. Because I’ve now moved the functionality to email a report from ReportingService onto the Report domain object, my domain object now needs to know about Emailer. In the original design it could be injected in, but with an object that’s new’d up, how can I inject the dependency? I either need to pass Emailer into my Report at construction time or at the point of sending as email. When refactoring this I opted to pass in the dependency when it was used, but only because passing it in at construction time is cumbersome without support.

Almost all DI frameworks make this a royal pain. If I want to inject dependencies into a class that also has state (like the details of the customer’s report), it’s basically impossible, so nobody does it. It’s better, simpler, quicker to just pull report creation onto a ReportBuilder and leave Customer alone. But it’s wrong. Customer deserves to have some functionality. He wants to be useful. He’s tired of just being a repository for values. If only there was a way of injecting dependencies into your nouns that wasn’t completely bullshit.

For example, using Ninject – pretty typical of DI frameworks – you can create a domain object that requires both injected dependencies and state, but only by string typing. Seriously? In the 21st century, you want me to abandon type safety and put the names of parameters into strings? No. Just say no.
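
From memory it looks something like this (a sketch – the exact API differs between Ninject versions, and the types are from my example above):

using Ninject;
using Ninject.Parameters;

public interface IEmailer
{
  void Send(string to, string body);
}

public class SmtpEmailer : IEmailer
{
  public void Send(string to, string body) { /* SMTP goes here */ }
}

// A domain object mixing state (recipient) with a dependency (emailer)
public class Report
{
  private readonly string recipient;
  private readonly IEmailer emailer;

  public Report(string recipient, IEmailer emailer)
  {
    this.recipient = recipient;
    this.emailer = emailer;
  }

  public void SendAsEmail()
  {
    emailer.Send(recipient, "Your report");
  }
}

public static class Example
{
  public static void Main()
  {
    var kernel = new StandardKernel();
    kernel.Bind<IEmailer>().To<SmtpEmailer>();

    // the state has to be identified by the *name* of the constructor
    // parameter, as a string – rename the parameter and this breaks at runtime
    var report = kernel.Get<Report>(new ConstructorArgument("recipient", "bob@example.com"));
    report.SendAsEmail();
  }
}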

No wonder, when faced with these compromises, people settle for noun-verbers. The alternatives are absolutely horrific. The only DI framework I’ve seen which lets you do this properly is Guice’s assisted injection. Everybody else, as far as I can see, is just plain wrong.

Is your DI framework killing your code? Almost certainly: if you’ve got value objects and noun-verbers, your design is rubbish. And it’s your DI framework’s fault for making a better design too hard.

Two people coding is twice as productive, right?

Stands to reason, doesn’t it? If one person can make 5 widgets an hour, then two people can make 10 widgets an hour. It’s just the natural way of things. You can’t argue with science.

The same is obviously true of software, isn’t it? If one developer can write 10 lines of code an hour, then clearly two can write 20 lines of code an hour. If you want more code written, just hire more developers. There’s nothing mythical about my man-months.

And yet… somehow… software persists in being weird stuff.

This week I had an interesting experience. One other developer and I have been working on a new, greenfield project. We’ve been ploughing through the work, ticking off stories at a decent rate. Only now it’s getting to that difficult stage where the original design ideas are rapidly giving way to new problems and new ideas; substantial refactoring is going on as we discuss better ways of representing our problem. This seems good and healthy.

Then I had one of those days where everywhere I turned there was a design problem. Not a single line of code could be written without me getting grumpy about the design. Worst of all, it was the code my co-worker had just finished that was showing the flaws in the original design. Cue much discussion. At one point he lamented that he could finish the task (that was blocking me from making progress) “if he could just get a 30 minute run at his computer”. It was nearly 5pm.

A day where a 30 minute spell of productive coding is hard to find is not a day where much code has been written. Oh we were productive, the design was much improved by day’s end. The code? Nothing to see here, move along, please. Were we really twice as productive that day? Hell no. I spent the entire day distracting him from completing his tasks to discuss design problems; he spent the entire day trying to merge a branch that my design refactoring had made difficult. We spent the entire day working against each other.

What could we have done differently? Well the first problem was trying to maintain two streams of development activity through the same (small) code base. We were tripping up over each other like crazy. Unwinding a few days, we probably would have got more done with just one person working. That way there would only be one narrative thread through the code, one sequence of refactoring steps at a time.

Wait, what – say that again: we would have got more done last week if only one person had been working on it. Well that’s just crazy talk, let me tell you about making widgets, boy…

I think we massively underestimate the cost of coordination and communication when building software. From the outside it’s very easy to miss: a quick 5 minute conversation laden with jargon. And yet… this is where the magic happens: this is where the design comes from. But if that 5 minute conversation interrupted someone’s work, the next 45 minutes could be lost while they try and reload into memory what they were working on. Pile up a few of these interruptions in your day, and no wonder it feels like you’re swimming upstream.

Clearly, what we should have been doing but weren’t was pairing. That way there would only have been one narrative thread. One sequence of ideas being applied at a time. Changes neatly serialised by there only being one keyboard. Of course, by pairing we could still have had the design discussions – but they would arise at the point where we were both already stuck. There is no cost of interruption when you’re both already there, immersed in the problem. By pairing we would have stopped working against each other and created an interruption-free space for design discussions.

So in fact: two people can be more productive than one. Two people pairing is definitely better than one person working on their own. It’s made me realise that we’ve been explaining pairing all wrong: we try and justify the “cost” of pairing, as though we somehow have to explain why having two people working at the same machine really isn’t halving productivity. It’s all based on a false assumption: that two people working on different machines are twice as productive as one person working alone. Once you realise that this assumption is fundamentally flawed, the “cost” of pairing evaporates. Instead pairing removes the cost of coordination between two developers: no interruptions, no divergent ideas, no merge conflicts.

Pair programming is actually a cost-saving exercise.

Dependency injection with PostSharp

I don’t really like IoC containers. Or rather, I don’t like the crappy code people write when they’re given an IoC container. Before you know it you have NounVerbers everywhere, a million dependencies and no decent domain model. Dependencies should really be external to your application: everything outside of the core domain model that your application represents.

  • A web service? That’s a dependency
  • A database? Yup
  • A message queue? Definitely a dependency
  • A scheduler or thread pool? Yup
  • Any NounVerber (PriceCalculator, StockFetcher, BasketFactory, VatCalculator)? No! Not a dependency. Stop it. They’re all part of your core business domain and are really methods on a domain class. If you can’t write Price.Calculate() or Stock.Fetch() or new Basket() or Vat.Calculate(), then fix your domain model first before you go hurting yourself with an IoC container

A while back I described a very simple, hand-rolled approach to dependency injection. But if we wave a little PostSharp magic we can improve on that basic idea. All the source code for this is available on GitHub.

It works like this: if we have a dependency, say an AuthService, we declare an interface that business objects can implement to request that they have the dependency injected into them. In this case, IRequireAuthService.

class User : IRequireAuthService
{
  public IAuthService AuthService { set; private get; }
}

We create a DependencyInjector that can set these properties:

public void InjectDependencies(object instance)
{
  if (instance is IRequireAuthService)
    ((IRequireAuthService)instance).AuthService = AuthService;
  ...
}

This might not be the prettiest method – you’ll end up with an if…is IRequire… line for each dependency you can inject. But this provides a certain amount of friction. While it is easy to add new dependencies, developers are discouraged from doing it. This small amount of friction massively limits the unchecked growth of dependencies so prevalent with IoC containers. This friction is why I prefer the hand-rolled approach to off-the-shelf IoC containers.

So how do we trigger the dependency injector to do what it has to do? This is where some PostSharp magic comes in. We declare an attribute to use on the constructor:

  [InjectDependencies]
  public User(string id)

Via the magic of PostSharp aspect weaving this attribute causes some code to be executed before the constructor. This attribute is simply defined as:

// the attribute is a PostSharp OnMethodBoundaryAspect (aspects must be serializable)
[Serializable]
public class InjectDependenciesAttribute : OnMethodBoundaryAspect
{
  public sealed override void OnEntry(MethodExecutionArgs args)
  {
    DependencyInjector.CurrentInjector.InjectDependencies(args.Instance);
    base.OnEntry(args);
  }
}

And that’s it – PostSharp weaves this method in before each constructor with the [InjectDependencies] attribute. We get the current dependency injector and pass in the object instance (i.e. the newly created User instance) to have dependencies injected into it. Just like that we have a very simple dependency injector. Even better, all this aspect-weaving magic is available with the Express (free!) edition of PostSharp.

Taking it Further

There are a couple of obvious extensions to this. You can create a TestDependencyInjector so that your unit tests can provide their own (mock) implementations of dependencies. This can also include standard (stub) implementations of some dependencies – e.g. a dependency that manages cross-thread scheduling can be replaced by an immediate (synchronous) implementation, so that unit tests are single-threaded and repeatable.

Secondly, the DependencyInjector uses a ThreadLocal to store the current dependency injector. If you use background threads and want dependency injection to work there, you need a way of pushing the dependency injector onto the background thread. This generally means wrapping thread spawning code (which will itself be a dependency). You’ll want to wrap any threading code anyway to make it unit-testable.
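
Sketched out (my own illustration of the idea rather than the code from the repo, reusing the IRequireAuthService interface from above), the injector and the thread hand-off might look like this:

using System;
using System.Threading;

public class DependencyInjector
{
  // each thread sees its own "current" injector
  private static readonly ThreadLocal<DependencyInjector> current =
    new ThreadLocal<DependencyInjector>();

  public static DependencyInjector CurrentInjector
  {
    get { return current.Value; }
    set { current.Value = value; }
  }

  public IAuthService AuthService { get; set; }

  public virtual void InjectDependencies(object instance)
  {
    if (instance is IRequireAuthService)
      ((IRequireAuthService)instance).AuthService = AuthService;
    // ...one clause per dependency, as above
  }
}

// Thread-spawning wrapper (itself a dependency): it captures the current
// injector and pushes it onto the background thread before the work runs
public class BackgroundRunner
{
  public void Run(Action work)
  {
    var injector = DependencyInjector.CurrentInjector;
    new Thread(() =>
    {
      DependencyInjector.CurrentInjector = injector;
      work();
    }).Start();
  }
}

A TestDependencyInjector is then just a subclass (or an alternative instance) that supplies mock and stub implementations before each test.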

Compile Time Checks

Finally, the most common failure mode we encountered with this was people forgetting to put [InjectDependencies] on the constructor. This means you get nulls at runtime instead of dependencies. With a bit more PostSharp magic (this brand of magic requires the paid-for version, though) we can stop that, too. First, we mark each IRequire… interface with a new attribute indicating that it manages injection of a dependency:

[Dependency]
public interface IRequireAuthService
{
  IAuthService AuthService { set; }
}
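
The attribute itself might be declared something like this (a sketch – the multicast settings shown are my assumption about how the inheritance described next could be configured; the real code may differ):

[MulticastAttributeUsage(MulticastTargets.InstanceConstructor, Inheritance = MulticastInheritance.Multicast)]
[Serializable]
public class DependencyAttribute : MethodLevelAspect
{
  // CompileTimeValidate (below) then runs at build time for every
  // constructor of every class implementing a [Dependency] interface
}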

We configure this attribute to be inherited to all implementation classes – so all business objects that require auth service get the behaviour – then we define a compile time check to verify that the constructors have [InjectDependencies] defined:

public override bool CompileTimeValidate(System.Reflection.MethodBase method)
{
  if (!method.CustomAttributes.Any(a => a.AttributeType == typeof(InjectDependenciesAttribute)))
  {
    Message.Write(SeverityType.Error, "InjectDependencies", "No [InjectDependencies] declared on " + method.DeclaringType.FullName + "." + method.Name, method);
    return false;
  }
  return base.CompileTimeValidate(method);
}

This compile time check now makes the build fail if I ever declare a class implementing IRequireAuthService without adding [InjectDependencies] to the class’s constructor.

Simple, hand-rolled dependency injection with compile time validation thanks to PostSharp!