Choosing a Programming Language: Recruitment

How do you choose the right language to use for your next project? Use the right tool for the job? Sure, but what does that mean? And how do I know what the right tool is? How do I get enough experience in a new language to know whether or not it is the right tool for the job?

Your choice of language will have several impacts on your project:

  • Finding & hiring new developers to join your team
  • Whether there is a thriving community: both for providing Q&A support and the maturity of libraries & frameworks to help you get more done
  • Training your existing team; discovering and learning new tools; as well as the possible motivation impact of using new technologies
  • Finally your choice of “the right tool for the job” will make the development effort easier or harder, both in the short-term and for long-term maintenance

This is the first article in a series on choosing the right language – we will start by addressing recruitment.

Recruitment

For many teams, recruiting new staff is the hardest problem they face. While choice of programming language has many obvious technical impacts on the development process, it also has a huge impact on your recruitment efforts. If your choice of language has the ability to make your hardest problem easier then it has to be considered.

Filtering

The truth is that most developers will naturally filter themselves by language, first and foremost. Most Java developers, when looking for jobs, will probably start by asking their Java developer friends and searching job sites for Java developer jobs. Recruiters naturally further this – they need a simple, keyword friendly way to try and match candidates to jobs, and filtering by language is an easy way to start.

Of course, this is completely unnecessary – most developers, if the right opportunity arose, would happily switch language. Most companies, given the right candidate, would be more than happy to hire someone without the requisite language background.

Yet almost every time I’ve been involved in recruitment we’ve always filtered candidates by language. No matter how much we say “we just want the best candidates”, we don’t want to take a risk hiring someone that needs a couple of months to get up to speed with our language.

My last job switch involved a change of language, the first time in 10 years I’ve changed language. Sure, it took me a month to become even competent in C#; and arguably three or four months to become productive. But of course it is possible, and yet it doesn’t seem to be that common.

This is purely anecdotal: I’d love to find a way to try and measure how often developers switch languages – just how common is it for companies to take on developers with zero commercial experience in their primary language? How could we measure this? Could we gather statistics from lots of companies on their recruitment process? Email me or post a comment below with your ideas.

Size of Talent Pool

Ultimately your choice of language has one main impact on recruitment: which language would make it easiest to find good developers? I think we can break this down into three factors:

  1. Overall number of developers using that language
  2. Competition for these developers from other companies
  3. The density of “good” developers within that pool

Total Number of Developers

Some languages have many more developers using them than others. For example, there are many times more Java developers than there are Smalltalk developers. At a simple level, by looking for Java developers I will be looking for developers from a much larger pool than if I search for Smalltalk developers. But how we can we quantify this?

We can estimate the number of developers skilled in or actively using a particular language by approximating the size of the community around that language. Languages with large communities indicate more developers using and discussing them. But how can we measure the size of the community?

GitHub and StackOverflow are two large community sites – representing open source code created by the community and questions asked by the community. Interestingly they seem to have a strong correlation – languages with lots of questions asked on StackOverflow have lots of code published on GitHub. I’m certainly not the first to spot this, I found a good post from Drew & John from a couple of years back looking at the correlation: http://www.dataists.com/2010/12/ranking-the-popularity-of-programming-langauges/

GitHub & StackOverflow probably give a good indication of the size of the community around each language. This means if we look for C# developers, there are many times more developers to choose from than if we choose Ada. However, sheer numbers isn’t the full story…

Competition for Developers

We’re not hiring in a vacuum – there is competition for developers from all the other companies hiring in our area (and those remote companies that don’t believe geography should be a barrier). This competition will have two impacts: first if there is more competition for developers using a certain language, that makes it harder to attract them to our company. Assuming market forces work, it also suggests that average salaries for developers with experience in an in-demand language will naturally be higher – as companies use salary to differentiate themselves from the competition. Just because there are more C# developers than Ada developers, the ease with which you can hire developers from either group is partly dependent on the amount of competition.

so-questions-adsI looked at the total number of questions tagged on StackOverflow.com and the number of UK job ads requiring the language on JobServe.com. Interestingly there is a very strong correlation (0.9) – this suggests, perhaps not surprisingly, that most developers primarily use the language they use in their day job: there are large communities around languages that lots of companies hire for; and only small communities around languages that few companies hire for. Of course – it’s not clear which comes first, do companies adopt a language because it has a large community, or does a large community emerge because lots of developers are using it in their job?

There are some interesting outliers: there seems to be greater job demand than there is developer supply for some languages such as Perl and VisualBasic. Does this suggest these are languages that have fallen out of favour, that companies still have investment in and now find it hard to recruit for? Similarly Scala and Node.js seem to have more demand than developers can supply. On the other side of the line, languages such as PHP and Ruby have much more active communities than there is job demand – suggesting they are more hobbyist languages, languages amateurs and professionals alike use in their spare-time. Then there is Haskell, another language with a lot of questions and not many jobs – perhaps this is purely its prevalence in academia that hasn’t (yet) translated to commercial success?

In general the more popular languages have more competition for developers. Perl & VisualBasic will be harder to find developers skilled in; whereas PHP, Ruby & Haskell should be easier to find developers. Besides these outliers, whichever language you hire for the overall size of the pool of developers (i.e. those within commuting distance or willing to relocate) is the only deciding factor. How many companies continue to use the mainstream languages (Java and C#) because its “easier” to hire developers? On the basis of this evidence, I don’t think its true. In fact, quite the opposite: to get the least competition for developers (lower salaries and less time spent interviewing) you should hire developers with Ruby or Haskell experience.

Density of Good Developers

One of the more contentious ideas is that filtering developers by language allows you to quickly filter out less capable developers. For example, by only looking at Python developers, maybe on average they are “better” than C# developers. My doubt with this is if it were found to be true, developers would quickly learn that language so as to increase their chances of getting a job: quickly eliminating any benefit of using language as a filter. It also starts to sound a bit “my language is better than your language”.

But it would be interesting if it were true, if we could use language as a quick filter to get higher quality candidates; it would be quite possible to gather data on this. For example, I use an online interview that could easily be used to provide an A/B test of developers self-selecting based on language. Given the same, standardized recruitment test – do developers using language A do better or worse than those using language B? Of course, the correlation between a recruitment test and on the job performance is very much debatable – but it could give us a good indication of whether there is correlation between language and (some measure of) technical ability. Would this work, is there a better way to measure it? Email me or post a comment below.

Conclusion

Popular languages have more developers to hire from but more companies trying to hire them. These two factors seem to cancel each other out – the ratio of developers:jobs is pretty much constant. As far as recruitment is concerned then, maybe for your next project break away from the norm and use a non-mainstream language? Is it time to hire you some Haskell?

The future’s software: the future’s shit

Software is taking over the world. Arguably software has already taken over the world. The trouble is: software is all shit – and it’s all your fault.

I had one of those seconds-from-flinging-something-heavy-through-my-tv days yesterday: I really have had it with crappy software. First, I decided to relax and catch up with last week’s Have I Got News For You on BBC iPlayer. Unfortunately, for some unfathomable reason, the XBox iPlayer app has become shit in recent months. Last night, it played 0.5 seconds of the show and hung. Network traffic was still flying by, but none of it was appearing on my TV. Fine, I switched to using iPlayer on my laptop – at least that normally works.

Afterwards I decided to watch another of the (excellent, by the way) American remake of House of Cards on Netflix. First, my router had got its knickers in a twist and switched the VPN so Netflix thought I was in the US, so all of my recently watched had disappeared. Boot up laptop (again), login to admin page on the router, fiddle with settings, reboot, reboot XBox, login again: there we go, UK Netflix is back, I can finally watch House of Cards. Except now the TV has crashed and refuses to decode the HDCP signal from the XBox. Reboot TV: no. Reboot xbox, again. Yes! We have signal!

I sat back and started watching. I have to get up, three times, because the volume is quieter than a quiet thing on quiet day in quiet land. Eventually it was so damned quiet I manage to miss some dialog, so I get to use the, occasionally awesome, XBox kinect voice control feature that saves me scrabbling to find the controller: “xbox rewind”. Excellent, it starts rewinding. “xbox stop”. “XBOX, STOP!” “Oh for fuck’s fucking sake xbox, fucking stop you fucking retarded demon, STOP!”. Find the controller. Wait for it to boot up. Now the Netflix app has resumed its entirely random bug where I lose all the on screen navigation controls, so I have to guess which combination of up, down, A, A, B, up, down stops the bloody rewind and not the one that invokes the super Netflix boss level. Eventually the button mashing exits Netflix, not what I intended but at least it’s stopped rewinding. I launch Netflix again. This time the on screen controls work. Woohoo! Now volume has switched back to slightly louder than a jet engine piped directly against your ear drum, so I jump up and turn the volume down before it wakes my little boy up.

By now, I’m in no mood for political intrigue – killing every developer that was involved in any of the pieces of software that have ruined my mood and my evening, quite gladly: pass me the knife, I’m happy to relieve each them of their useless lives and hopelessly misapplied careers.

The trouble is: it’s not their fault. All software is shit. As software takes over more and more of my life I realise: it’s not getting any better. My phone crashes, my TV crashes, my hi-fi crashes. My sat nav gets lost and my car needs a BIOS upgrade. This is all before I get to work and get to use any actual computers. Actual computers that I actually hate.

Every step of every day, software is pissing me off. And you know what? It’s only going to get worse. As software invades more and more of our lives our complete inability to create decent, stable software is becoming a plague. The future will be dominated by people swearing at every little computer they come into contact with. In our bright software future where everything is digital and we’re constantly plugged into the metaverse I will spend my entire life swearing at it for being so unutterably, unnecessarily shit.

You know what else? As software developers: it’s all our fault.

Visualizing Code

When writing software we’re working at two levels:

  1. Creating an executable specification of exactly what we want the machine to do
  2. Creating a living document that describes the intent of what we want the machine to do, to be read by humans

The first part is the easy part, the second part takes a lifetime to master. I read a really great post today pointing out signs that you’re a bad programmer. Whether you’re a bad programmer or just inexperienced, I think the biggest barrier is being able to quickly and accurately visualize code. You need to visualize what the application actually does, what happens at runtime; but all your IDE shows you is the raw, static, source code. From this static view of the world you have to infer the runtime behaviour, you have to infer the actual shape of the application and the patterns of interaction; while closely related, the two are separate. Given just source code, you need to be able to visualize what code does.

What does it mean to visualize code? At the simplest level, it’s understanding what individual statements do.

string a = "Hello":
string b = "world";
a = b;

It might sound trivial, but the first necessary step is being able to quickly parse code and mentally step through what will happen. First for basic statements, then for code that iterates:

while (stack.Count() > 1)
{
    var operation = stack.Pop() as IOperation;
    var result = operation.Execute(stack.Pop(), stack.Pop());
    stack.Push(result);
}

Where you need to understand looping mechanics and mentally model what happens overall not just each iteration. Then you need to be able to parse recursive code:

int Depth(ITreeNode node)
{
    if (node == null)
        return 0;
    return 1 + Math.Max(Depth(node.Left), Depth(node.Right));
}

Which is generally harder for inexperienced programmers to understand and reason about; even though once you’ve learned the pattern it can be clearer and more concise.

Once you’ve mastered how to understand what a single method does, you have to understand how methods become composed together. In the OO world, this means understanding classes and interfaces; it means understanding design patterns; you need to understand how code is grouped together into cohesive, loosely coupled units with clear responsibilities.

For example, the visitor pattern has a certain mental structure – it’s about implementing something akin to a virtual method outside of the class hierarchy; in my mind it factors a set of classes into a set of methods.

public interface IAnimal
{
    void Accept(IAnimalVisitor visitor);
}

public class Dog : IAnimal { ... }
public class Cat : IAnimal { ... }
public class Fox : IAnimal { ... }

public interface IAnimalVisitor
{
    void VisitDog(Dog dog);
    void VisitCat(Cat cat);
    void VisitFox(Fox fox);
}

The first step in reading code is being able to read something like a visitor pattern (assuming you’d never heard of it before) and understand what it does and develop a mental model of what that shape looks like. Then, you can use the term “visitor” in your code and in discussions with colleagues. This shared language is critical when talking about code: it’s not practical to design a system by looking at individual lines of code, we need to be able to draw boxes on a whiteboard and discuss shapes and patterns. This shared mental model is a key part of team design work; we need a shared language that maps to a shared mental model, both of the existing system and of changes we’d like to make.

In large systems this is where a common language is important: if the implementation uses the same terms the domain uses, it becomes simpler to understand how the parts of the system interact. By giving things proper names, the interactions between classes become logical analogues of real-world things – we don’t need to use technical words or made up words that subsequent readers will need to work to understand or learn, someone familiar with the domain will know what the expected interactions are. This makes it easier to build a mental model of the system.

For example, in an online book store, I might have concepts (classes) such as book, customer, shopping basket, postal address. These are all logical real world things with obvious interactions. If I instead named them PurchasableItem, RegisteredUser, OrderItemList and DeliveryIndicationStringList you’d probably struggle to understand how one related to the other. Naming things is hard, but incredibly important – poor naming will make it that much harder to develop a good mental model of how your system works.

Reading code, the necessary precursor to writing code, is all about building a mental model. We’re trying to visualize something that doesn’t exist, by reading lines of text.