How much architecture is enough?

Software architecture is hard. Creating a simple, consistent, flexible environment in which we can solve the customer’s ever-changing problems is no easy task. Keeping it that way is harder still. Striking the right balance between all the competing demands takes real skill – so what does it take to create a good architecture? How much architecture is enough?

Software Architecture

First, I’m drawing a distinction between software architecture and enterprise architecture. By software architecture I mean the largest patterns and structures in the code you write – the highest level of design detail. I do not mean what is often called enterprise architecture: which messaging middleware to use, how services are clustered, which database platforms to support. Software architecture is the stuff we write that forms the building blocks of our solution.

The Over Architect

I’m sure we’ve all worked with him: the guy who could overthink hello world. When faced with a customer requirement, his immediate response is:

we need to build a framework

Obviously the customer’s problem is too simple for this genius. Instead, we can solve this whole class of problems. Once we’ve built the framework, we just need to plug the right values into the simple 400-line XML configuration file and hey presto! customer problem solved.

Sure, we’ve only been asked to do a one-time CSV import of some customer data. But think of the long term: what will they ask for next? What about the next customer? We should write a generic data import framework that could take data in CSV, XML or JSON; communicating over HTTP, FTP or email. We can build rich, configurable validation logic. We can write the data to any number of database platforms using a variety of standard ORM frameworks. Man, this is awesome, we could be busy for months with this!

Whatever. You Ain’t Gonna Need It!
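All the customer actually asked for is something like this – a minimal sketch, with a hypothetical Customer record, an assumed unquoted comma-separated file, and println standing in for the real database insert:

```scala
import scala.io.Source

// Hypothetical customer record - the real fields do not matter here
case class Customer(id: String, name: String, email: String)

object CustomerImport {
  // Assumes a simple, unquoted comma-separated export
  def parse(line: String): Customer = line.split(",") match {
    case Array(id, name, email) => Customer(id.trim, name.trim, email.trim)
    case _                      => sys.error("malformed line: " + line)
  }

  def main(args: Array[String]): Unit = {
    val customers = Source.fromFile(args(0)).getLines().map(parse).toList
    customers.foreach(println) // stand-in for the real database insert
  }
}
```

A dozen lines and an afternoon’s work; everything the framework adds beyond this is pure speculation.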

But sometimes the lure of solving a problem where you’re the customer proves irresistible: it’s far more intellectually stimulating than solving the boring old customer’s problems. You know, the guy who’s paying the bills.

The Over Architect generalises from a sample size of one. Every problem is an opportunity to build a more general solution, despite having no evidence for what other cases might need to be solved. Every problem is an opportunity to bring in the latest and greatest technology – whether or not it’s a good fit, whether or not the company’s going to be left supporting some byzantine third-party library that’s overkill for their simple use. An architect fully versed in CV++.

The Under Architect

On the other hand, the Under Architect looks at every customer problem and thinks:

we could re-use what we did for feature X

Where by “re-use” he means copy &amp; paste, change as necessary. There’s no real architecture, just patterns repeated ad infinitum. Every new requirement is an opportunity to write more new code. Heaven forbid we go back and change any of that crufty old shit. No, we’ll just build shiny, brand new legacy code.

We’re building a web application: so we’ll need some Controllers, some Views and some Models. There we go, MVC – that counts as an architecture, right? Oh, we need a bit more. Well, we’ve got some DAOs here for interacting with the database. And the business logic? Well, the stuff that’s not wrapped up in the controllers we can put in FooManager classes. Sure, these things look like monolithic god classes – but it’s the best way to aggregate all the related functionality together.
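If that sounds abstract, the shape of the problem looks something like this – hypothetical names, a sketch of the pattern rather than anyone’s real code:

```scala
// Stand-in for the data access layer
trait CustomerDao

// The "architecture": everything vaguely customer-shaped lands here,
// because a naming convention is all we have
class CustomerManager(dao: CustomerDao) {
  def create(name: String, email: String): Unit = ()            // validation + persistence
  def sendWelcomeEmail(id: String): Unit = ()                   // notification logic
  def calculateDiscount(id: String): BigDecimal = BigDecimal(0) // pricing rules
  def exportToCsv(ids: Seq[String]): String = ""                // reporting
  def mergeDuplicates(a: String, b: String): Unit = ()          // data cleanup
  // ...and on it grows, one unrelated responsibility at a time
}
```

Five unrelated responsibilities and counting – and every new feature adds another method, because where else would it go?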

Lather, rinse, repeat and before you know it you have a massive application with minimal structure. The trouble is, these patterns become self-perpetuating. It’s hard to start pulling out an architecture when all you have is a naming convention.

The Many Architects

The challenge in many software teams is that everyone thinks it’s their job to come up with a new architecture or start building a new framework. The code ends up littered with half-finished, half-forgotten frameworks. Changing anything becomes a nightmare: is all this functionality actually used? We have three different ways of importing data, via three different hand-rolled frameworks – which ones are used? How much of each one is used? Can I refactor them down into one? Two? What about the incompatibilities and subtle differences?

Without a clear vision, changing the code becomes like archaeology. As you delve down through the layers you uncover increasingly crufty old code that nobody dares touch any more. It becomes less of a software architecture and more of a taxonomy problem – like Darwin trying to identify a million different species by their class structure.

The Answer

What’s the answer? Well, I’m sorry, but I just don’t buy this agile bullshit about “emergent architecture”. Architecture doesn’t emerge; it has to be imposed, often onto unwilling code.

Architecture requires a vision: somebody needs to have a clear idea about where the software is headed. Architecture needs patience: as we learn more about the problem and the solution, the architecture will have to adapt. Architecture needs consistency: if the guy calling the shots changes every year or two, you’ll be back to the Many Architects problem.

Above all, I think good architecture needs a dictator: one single person taking responsibility for the architecture. They don’t need to be right, they just need a clear vision of where the architecture should head. If the team are on board with that vision, then everyone is pulling in the same direction, guided by one individual taking the long view.

Central Architecture Group

Does this sound like I’m advocating a central architecture group? Hell no. The architect needs to be involved in the code, hands-on, day-to-day, so he can see the consequences of his decisions. He needs the feedback from how the product evolves and how our understanding of the problem evolves. The last thing you need is a group of ivory tower architects pontificating about whether an Enterprise Service Bus is going to solve all our problems. Hint: it won’t, but firing the central architecture group might.

Conclusion

Getting software architecture right is a hard problem. If you keep your code DRY and SOLID, you’re heading in the right direction. If someone has the vision for where the code should head and the team work towards that, relentlessly cleaning up old code – then maybe, just maybe, you’ve got a chance.

 


Ability or methodology?

There’s been a lot of chatter recently on the intertubes about whether some developers are 10x more productive than others (e.g. here, here and here). I’m not going to argue whether this or that study is valid; I Am Not A Scientist and I don’t play one on TV, so I’ll stay out of that argument.

However, I do think these kinds of studies are exactly what we need more of. The biggest challenges in software development are people problems – individual ability and how we work together – not computer science or technical detail. Software development has more in common with psychology and sociology than with engineering or maths. We should be studying software development as a social science.

Recently I got to wondering: where are the studies that prove that, say, TDD works, or that pair programming works? Where are the studies that conclusively prove Scrum increases project success or customer satisfaction? Ok, there are some studies – especially around TDD and some around Scrum (hyper-performing teams, anyone?) – but a lazy google turns up very little. I would assume that if there were credible studies into these things they’d be widely known, because they would provide a great argument for introducing these practices. Of course, it’s possible that I’m an ignorant arse and these studies do exist… if so, I’m happy to be educated 🙂

But before I get too distracted, Steve’s post got me thinking: if the variation between individuals really can be 10x, no methodology is suddenly going to introduce an across-the-board 20x difference on top of that. Individual variation will always significantly dwarf the difference due to methodology.

Perhaps this is why there are so few studies that conclusively show productivity improvements? Controlling for individual variation is hard – and once you have, any methodological improvement looks tiny by comparison. If “hire better developers” will be 5x more effective than your shiny new methodology, why bother developing and proving the methodology? Ok, the consultants will bother – they have books to sell, conferences to speak at and gullible customers to bill for explaining their methodology – but why would the non-crooked ones?

Methodologies and practices in software development are like fashion. The cool kid down the hall is doing XP. He gets his friends hooked. Before you know it, all the kids are doing XP – even the old fogies who say they were doing XP before you were born. Then the kids are talking about Scrum or Software Craftsmanship, and suddenly the fashion has changed. But really, nothing fundamental changed – just the window dressing. Bright developers will always figure out the best, fastest way to build software. They’ll use whatever fads make sense and ignore those that don’t (DDD, I’m looking at you).

The real challenge, then, is the people. If simply having the right people on the team is a better predictor of productivity than the choice of methodology, then surely recruitment and retention should be our focus. Rather than worrying about Scrum or XP, or trying to enforce code reviews or pair programming, perhaps we should ensure we’ve got the best people on the team, that we can keep them, and that any new hires are of the same high calibre.

And yet… recruitment is a horrible process. Anyone who’s ever been involved in interviewing candidates will have horror stories about the morons they’ve had to interview or the piles of inappropriate CVs they’ve had to wade through. Candidates don’t get an easier time either: dealing with recruiters who don’t understand technology, and trying to decide if you really want to spend 8 hours a day with a team you know little about. It almost universally becomes a soul-destroying exercise.

But how many companies bring candidates in for half a day’s pairing? How else are candidate and employer supposed to figure out if they want to work together? Once you’ve solved the gnarly problem of getting great developers and great companies together, we’ll probably discover the sad truth of the industry: there aren’t enough great developers to go round.

So rather than worrying about this technology or that; about Scrum or XP. Perhaps we should study why some developers are 10x more productive than others. Are great developers born or made? If they’re made, why aren’t we making more of them? University is obviously poor preparation for commercial software development, so should there be more vocational education – a system of turning enthusiastic hackers into great developers? You could even call it apprenticeship.

That way there’d be enough great developers to go round and maybe we can finally start having a grown up conversation about methodologies instead of slavishly following fashion.

First company coding dojo

Last month we ran our first company coding dojo – this was only open to company staff, but attendance was good (around a dozen people).

For those who have never heard of it, a coding dojo – based on the idea of a martial arts dojo – is an opportunity for programmers to improve their skills. This means getting a group of developers together, round a big screen, to work through a problem. Everything is pair-programmed, with one “driver” and one “co-pilot”. Every so often the pair is changed: the driver returns to the audience, the co-pilot becomes the driver and a new co-pilot steps up. That way everyone gets a turn writing code, while the rest of the group provide advice (no matter how unwelcome).

For the first dojo we tackled a problem in Scala – this was the first time using Scala for most people, so a lot of time was spent learning the language. But thanks to Daniel Korzekwa, Kingsley Davies & DJ everyone got to grips with the language and we eventually got a solution! The session was a lot of fun, with a lot of heated discussion – but everyone felt they learned something.


Afterwards, in true agile style, we ran a quick retrospective. The lessons learned showed the dojo had been an interesting microcosm of development – we made the same mistakes we so often see in the day job! For example, we knew we should start with a design, and went as far as getting a whiteboard, but failed to actually do any design. This led to repeated rework as the final design emerged, slowly, from numerous rewrites. One improvement for next time: do just-in-time design instead.

We also set out to do proper test-first TDD. However, as so often happens, this degenerated into code-first development with tests run occasionally and passing rarely. It was interesting to see how quickly a group of experienced developers fall out of doing TDD. Our retrospective highlighted that next time we should always write the tests first, and take “baby steps” – doing the simplest thing that could possibly make the test pass.
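For anyone who wasn’t there, “baby steps” looks something like this – a minimal sketch of the rhythm, assuming a recent ScalaTest on the classpath (older versions spell the import org.scalatest.FunSuite):

```scala
import org.scalatest.funsuite.AnyFunSuite

// Each test was written first, watched fail, then made to pass
class LeapYearTest extends AnyFunSuite {
  test("a year divisible by 4 is a leap year") {
    assert(LeapYear.isLeap(1996))
  }
  test("a century year is not a leap year") {
    assert(!LeapYear.isLeap(1900))
  }
  test("a year divisible by 400 is a leap year") {
    assert(LeapYear.isLeap(2000))
  }
}

// The simplest thing that could possibly make the tests pass:
// each clause was only added once a failing test demanded it
object LeapYear {
  def isLeap(year: Int): Boolean =
    year % 400 == 0 || (year % 4 == 0 && year % 100 != 0)
}
```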

Overall it was a great session and very enjoyable – it was fascinating to see the impact of ignoring “best practices” on something small where the results are so much more immediate.

Is agile about developers (any more)?

I spent last week at the Agile 2010 Conference. It was my first time at a conference this size; I definitely found it interesting and there were some thought provoking sessions – but there weren’t many deeply technical talks. As others have asked, what happened to the programmers?

Bob Martin wrote that

Programmers started the agile movement to get closer to customers not project managers

He also commented on how few talks were about programming

< 10% of the talks at #agile2010 are about programming. Is programming really < 10% of Agile?

People have already commented on how cost is a factor in attending a conference like this – especially for those of us outside the US, who have expensive flights to contend with too. This is certainly a factor, but I wonder if it’s the real problem.

Do developers attend a conference like Agile 2010 to improve their craft? How much can you cover in a 90-minute session? Sure, you can get an introduction to a new topic – but how much detail can you get into? Isn’t learning the craft fundamentally a practical task? You need hands-on experience and feedback to really learn. In a short session with 100+ people, are you actually gonna improve your craft?

Take TDD as an arbitrary example. The basic idea can be explained fairly quickly. A 90-minute session can give you a good introduction and some hands-on experience – but to really grok the idea, to really see the benefit, you need to see it applied to the real world and take it back to the day job. I think the same applies to any technical talk – if it’s interesting enough to be challenging, 90 minutes isn’t going to do it justice.

This is exacerbated by agile being such a broad church; there were developers specialising in Java, C#, Ruby and a host of other languages. It’s difficult to pitch a technical talk that’s challenging and interesting but doesn’t turn off the majority of developers who don’t use your chosen language.

That’s not to say a conference like Agile 2010 isn’t valuable, and I’m intrigued to see where XP Universe 2011 gets to. However, I think the work that Jason Gorman is doing on Software Craftsmanship, for example, is a more successful format for technical learning – but it is focused squarely on the technical, rather than on improving our software delivery process.

Isn’t the problem that Agile isn’t about programming? It is – or at least has become – management science. Agile is a way of managing software projects, of structuring organisations, of engaging with customers – aiming to deliver incremental value as quickly as possible. Nothing in this dictates technical practices or technologies. Sure, XP has some things to say about practices; but Scrum, lean, kanban et al are much more about processes and principles than specific technical approaches.

Aren’t the biggest problems with making our workplaces more agile – and in fact the biggest problems in software engineering in general – management ones, not development ones? It’s pretty rare to find a developer who tells you TDD is bad, that refactoring makes code worse, that continuous integration is a waste of time, that OOD leads to worse software. But it’s pretty common to find customers who want the moon on a stick, and want it yesterday; managers who value individual efficiency over team effectiveness, who create distinct functional teams and hinder communication.

There is always more for us to learn; we’re improving our craft all the time. But I don’t believe the biggest problems in software are the developers. It’s more common for a developer to complain about the number of meetings they’re asked to attend than about the standard of code written by their peers.

Peers can be educated; crap management abides.

Risk free software


Nobody wants to make mistakes, do they? If you can see something’s gonna go wrong, it’s only natural to do what you can to prevent it. If you’ve made a mistake once, what kind of idiot wants to repeat it? But what if the cure is worse than the disease? What if the effort of avoiding mistakes costs more than the mistakes themselves?

Preventative Measures

So you’ve found a bug in production that really should have been caught during QA; you’ve had a load-related outage; you’ve found a security hole. What’s the natural thing to do? Once you’ve fixed the immediate problem, you probably put in place a process to stop similar mistakes happening next time.

Five whys is a great technique for understanding the causes and making appropriate changes. But if you find yourself adding more bureaucracy – a sign-off to prevent this happening in future – you’re probably doing it wrong!

Unfortunately this is a natural instinct: in response to finding bugs in production, you introduce a sign-off to confirm that everyone is happy the product is bug-free, whatever that might mean; you introduce a final performance test phase, with a sign-off to confirm production won’t crash under load; you introduce a final security test, with a sign-off to confirm production is secure.

Each step and each reaction to a problem is perfectly logical; each answer is clear, simple and wrong.

Risk Free Software

Let’s be clear: there’s no such thing as risk-free software. You can’t do anything without taking some risk. But what’s easy to overlook is that not doing something is a risk, too.

Not fixing a bug runs the risk that it’s more serious than you thought, or more prevalent than you thought; that it could hit an important customer, or someone in the press – with real revenue at risk. You run the risk that it collides with another, as yet unknown, bug, potentially multiplying the pain.

Sometimes not releasing feels like the safest thing to do – but you’re only releasing because you know something is already wrong. How can not changing it ever be better?

The Alternative

So what you gonna do? No business wants to accept risk; you have to mitigate it somehow. The simple, easy and wrong thing to do is to add more process. The braver decision, the right decision, is to make it easy to undo your mistakes.

Any release process, no matter how convoluted, will normally have some kind of rollback – some way of getting back to how things used to be. At its simplest, this is a way of mitigating the risk of making a mistake: if it really is a pretty shit release, you can roll it back. It’s not great, but it gives you a way of recovering when the inevitable happens.

But often people want to avoid this kind of scenario. People want to avoid rolling back, to avoid the risk of a rollback – totally missing the point that the rollback is your way of managing risk. Instead, you’re forced to mitigate the risk up front with bureaucracy.

If you’re using rollback as a way of managing risk (and why wouldn’t you?), then you’d expect to rollback from time to time. If you’re not rolling back, then you’re clearly removing all risk earlier in the process. This means you have a great process for removing risk; but could you have less process and still release product sometime this year?

Get There Quicker

Being able to roll back is about being able to recover from mistakes quickly and reliably. Another way to do that is simply to release fixes quickly. Instead of rolling back and scheduling a fix sometime later, why not get the fix coded, tested and deployed as quickly as possible?

Some companies rely on being able to release quickly and easily every day. Continuous deployment might not improve quality by itself, but it improves your ability to react to problems. You no longer need to spend an age before each release trying to catch absolutely everything; instead, by decreasing the time between releases – by increasing your velocity – you create a higher quality product, because you fix the issues that do slip through so much faster.

Continuous deployment lets you streamline your process – you don’t need quite so many checks and balances, because if something bad happens you can react to it and fix it. Obviously, you need tests to ensure your builds are sound – but it encourages you to automate your checks, rather than relying on humans and manual sign-offs. Instead of introducing process, why not write code to check you’ve not made the same mistake twice?
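As a sketch of what that might look like (again assuming ScalaTest; the bug and the names are hypothetical): suppose a release once failed because an order total overflowed an Int. Rather than adding a sign-off, pin the lesson into every build:

```scala
import org.scalatest.funsuite.AnyFunSuite

object OrderTotal {
  // The fix: totals accumulate in a Long where the buggy code used Int
  def totalPence(unitPricePence: Long, quantity: Int): Long =
    unitPricePence * quantity
}

// The regression test runs on every build - the lesson is automated,
// not written into a release checklist
class OrderTotalRegressionTest extends AnyFunSuite {
  test("large orders no longer overflow the total") {
    assert(OrderTotal.totalPence(2000000000L, quantity = 2) == 4000000000L)
  }
}
```

The check costs milliseconds per build, rather than a meeting per release.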

Of course, the real irony in all this is that the thing that often stops you doing continuous deployment is a long and tortuous release process. The release process encapsulates the lessons from all your previous mistakes. But with a lightweight process you could react so much faster – patching within minutes, not days – that you wouldn’t need the tortuous process at all.

Your process has become its own enemy!