Measuring Software

A while back I read Making Software – it made me disappointed at the state of academic research into the practice of developing software. I just read Leprechauns of Software Engineering which made me angry about the state of academic research.

Laurent provides a great critique of some of the leprechauns of our industry and why we believe in them. But it just highlighted to me how little we really know about what works in software development. Our industry is driven by fashion because nobody has any objective measure on what works and what doesn’t. Some people love comparing software to the medical profession, to torture the analogy a bit – I think today we’re in the blood letting and leeches phase of medical science. We don’t really know what works, but we have all sorts of strange rituals to do it better.

Rituals

Rituals? What rituals? We’re professional software developers!

But of course, we all get together every morning for a quick status update. Naturally we answer the three questions strictly, any deviations are off-topic. We stand up to keep the meeting short, obviously. And we all stand round the hallowed scrum kanban board.

But rituals? What rituals?

Do we know if stand ups are effective? Do we know if scrum is effective? Do we even know if TDD works?

Measuring is Hard

Not everything that can be measured matters; not everything that matters can be measured

I think we have a fundamental problem when it comes to analysing what works and what doesn’t. As a developer there are two things I ultimately need to know about any practice/tool/methodology:

  1. Does it get the job done faster?
  2. Does it result in less bugs / lower maintenance cost?

This boils down to measuring productivity and defects.

Productivity

Does TDD make developers more productive? Are developers more productive in Ruby or Java? Is pairing productive?

These are some fascinating questions that, if we had objective, repeatable answers to would have a massive impact on our industry. Imagine being able to dismiss all the non-TDD doing, non-pairing Java developers as being an unproductive waste of space! Imagine if there was scientific proof! We could finally end the language wars once and for all.

But how can you measure productivity? By lines of code? Don’t make me laugh. By story points? Not likely. Function points? Now I know you’re smoking crack. As Ben argues, there’s no such thing as productivity.

The trouble is, if we can’t measure productivity – it’s impossible to compare whether doing something has an impact on whether you get the job done faster or not. This isn’t just an idle problem – I think it fundamentally makes research into software engineering practices impossible.

It makes it impossible to answer these basic questions. It leaves us open to fashion, to whimsy and to consultants.

Quality

Does TDD help increase quality? What about code reviews? Just how much should you invest in quality?

Again, there are some fundamental questions that we cannot answer without measuring quality. But how can you measure quality? Is it a bug or a feature? Is it a user error or a requirements error? How many bugs? Is an error in a third party library that breaks several pages of your website one bug or dozens? If we can’t agree what a defect is or even how to count them how can we ever measure quality?

Subjective Measures

Maybe there are some subjective measures we could adopt. For example, perhaps I could monitor the number of emails to support. That’s a measure of software quality. It’s rather broad, but if software quality increases, the number of emails should decrease. However, social factors could so easily dwarf any actual improvement. For example, if users keep reporting software crashes and are told by the developers “yeah, we know”. Do you keep reporting it? Or just accept it and get on with your life? The trouble is, the lack of customer complaints doesn’t indicate the presence of quality.

What To Do?

What do we do? Do we just give up and adopt the latest fashion hoping that this time it will solve all our problems?

I think we need to gather data. We need to gather lots of data. I’d like to see thousands of dev teams across the world gathering statistics on their development process. Maybe out of a mass of data we can start to see some general patterns and begin to have some scientific basis for what we do.

What to measure? Everything! Anything and everything. The only constraint is we have to agree on how to measure it. Since everything in life is fundamentally a problem of lack of code, maybe we need a tool to help measure our development process? E.g. a tool to measure how long I spend in my IDE, how long I spend testing. How many tests I write; how often I run them; how often I commit to version control etc etc. These all provide detailed telemetry on our development process – perhaps out of this mass of data we can find some interesting patterns to help guide us all towards building software better.

One thought on “Measuring Software

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.