Continuum of Design

How best to organise your code? At one end of the scale there is a god class – a single, massive entity that stores all possible functions; at the other end of the scale are hundreds of static methods, each in their own class. In general, these two extremes are both terrible designs. But there is a continuum of alternatives between these two extremes – choosing where along this scale your code should be is often difficult to judge and often changes over time.

Why structure code at all?

It’s a fair enough question – what is so bad about the two extremes? They are, really, more similar to each other than any of the points between. In one there is a single class with every method in it; in the other, I have static methods one-per class or maybe I’m using a language which doesn’t require me to even group by classes. In both cases, there is no organisation. No structure larger than methods with which to help organise or group related code.

Organising code is about making it easier for human beings to understand. The compiler or runtime doesn’t care how your code is organised, it will run it just the same. It will work the same and look the same – from the outside there is no difference. But for a developer making changes, the difference is critical. How do I find what I need to change? How can I be sure what my change will impact? How can I find other similar things that might be affected? These are all questions we have to answer when making changes to code and they require us to be able to reason about the code.

Bounded contexts help contain understanding – by limiting the size of the problem I need to think about at any one time I can focus on a smaller portion of it, but even within that bounded context organisation helps. It is much easier for me to understand how 100 methods work when they are grouped into 40 classes, with relationships between the classes that make sense in the domain – than it is for me to understand a flat list of 100 methods.

An Example

Let’s imagine we’re writing software to manage a small library (for you kids too young to remember: a library is a place where you can borrow books and return them when you’re done reading them; a book is like a physically printed blog but without the comments). We can imagine the kinds of things (methods) this system might support:

  • Add new title to the catalog
  • Add new copy of title
  • Remove copy of a title
  • User borrows a copy
  • User returns a copy
  • Register a new user
  • Print barcode for new copy
  • Fine a user for a late return
  • Pay outstanding fines for a user
  • Find title by ISBN
  • Find title by name, author
  • Find copy by scanned id
  • Change user’s address
  • List copies user has already borrowed
  • Print overdue book letter for user

This toy example is small enough that written in one-class it would probably be manageable; pretty horrible, but manageable. How else could we go about organising this?

Horizontal Split

We could split functionality horizontally: by technical concern. For example, by the database that data is stored in; or by the messaging system that’s used. This can work well in some cases, but can often lead to more god classes because your class will be as large as the surface area of the boundary layer. If this is small and likely to remain small it can be a good choice, but all too often the boundary is large or grows over time and you have another god class.

For example, functionality related to payments might be grouped simply into a PaymentsProvider – with only one or two methods we might decide it is unlikely to grow larger. Less obviously, we might group printing related functionality into a PrinterManager – while it might only have two methods now, if we later start printing marketing material or management reports the methods become less closely related and we have the beginnings of a god class.

Vertical Split

The other obvious way to organise functionality is vertically – group methods that relate to the same domain concept together. For example, we could group some of our methods into a LendingManager:

  • User borrows a copy
  • User returns a copy
  • Register a new user
  • Fine a user for a late return
  • Find copy by scanned id
  • List copies user has already borrowed

Even in our toy example this class already has six public methods. A coarse grained grouping like this often ends up being called a SomethingManager or TheOtherService. While this is sometimes a good way to group methods, the lack of clear boundary means new functionality is easily added over time and we grow ourselves a new god class.

A more fine-grained vertical grouping would organise methods into the domain objects they relate to – the recognisable nouns in the domain, where the methods are operations on those nouns. The nouns in our library example are obvious: Catalog, Title, Copy, User. Each of these has two or three public methods – but to understand the system at the outer level I only need to understand the four main domain objects and how they relate to each other, not all the individual methods they contain.

The fine-grained structure allows us to reason about a larger system than if it was unstructured. The extents of the domain objects should be relatively clear, if we add new methods to Title it is because there is some new behaviour in the domain that relates to Title – functionality should only grow slowly, we should avoid new god classes; instead new functionality will tend to add new classes, with new relationships to the existing ones.

What’s the Right Way?

Obviously there’s no right answer in all situations. Even in our toy example it’s clear to see that different structures make sense for different areas of functionality. There are a range of design choices and no right or wrong answers. Different designs will ultimately all solve the same problem, but humans will find some designs easier to understand and change than others. This is what makes design such a hard problem: the right answer isn’t always obvious and might not be known for some time. Even worse, the right design today can look like the wrong design tomorrow.

Visualizing Code

When writing software we’re working at two levels:

  1. Creating an executable specification of exactly what we want the machine to do
  2. Creating a living document that describes the intent of what we want the machine to do, to be read by humans

The first part is the easy part, the second part takes a lifetime to master. I read a really great post today pointing out signs that you’re a bad programmer. Whether you’re a bad programmer or just inexperienced, I think the biggest barrier is being able to quickly and accurately visualize code. You need to visualize what the application actually does, what happens at runtime; but all your IDE shows you is the raw, static, source code. From this static view of the world you have to infer the runtime behaviour, you have to infer the actual shape of the application and the patterns of interaction; while closely related, the two are separate. Given just source code, you need to be able to visualize what code does.

What does it mean to visualize code? At the simplest level, it’s understanding what individual statements do.

string a = "Hello":
string b = "world";
a = b;

It might sound trivial, but the first necessary step is being able to quickly parse code and mentally step through what will happen. First for basic statements, then for code that iterates:

while (stack.Count() > 1)
{
    var operation = stack.Pop() as IOperation;
    var result = operation.Execute(stack.Pop(), stack.Pop());
    stack.Push(result);
}

Where you need to understand looping mechanics and mentally model what happens overall not just each iteration. Then you need to be able to parse recursive code:

int Depth(ITreeNode node)
{
    if (node == null)
        return 0;
    return 1 + Math.Max(Depth(node.Left), Depth(node.Right));
}

Which is generally harder for inexperienced programmers to understand and reason about; even though once you’ve learned the pattern it can be clearer and more concise.

Once you’ve mastered how to understand what a single method does, you have to understand how methods become composed together. In the OO world, this means understanding classes and interfaces; it means understanding design patterns; you need to understand how code is grouped together into cohesive, loosely coupled units with clear responsibilities.

For example, the visitor pattern has a certain mental structure – it’s about implementing something akin to a virtual method outside of the class hierarchy; in my mind it factors a set of classes into a set of methods.

public interface IAnimal
{
    void Accept(IAnimalVisitor visitor);
}

public class Dog : IAnimal { ... }
public class Cat : IAnimal { ... }
public class Fox : IAnimal { ... }

public interface IAnimalVisitor
{
    void VisitDog(Dog dog);
    void VisitCat(Cat cat);
    void VisitFox(Fox fox);
}

The first step in reading code is being able to read something like a visitor pattern (assuming you’d never heard of it before) and understand what it does and develop a mental model of what that shape looks like. Then, you can use the term “visitor” in your code and in discussions with colleagues. This shared language is critical when talking about code: it’s not practical to design a system by looking at individual lines of code, we need to be able to draw boxes on a whiteboard and discuss shapes and patterns. This shared mental model is a key part of team design work; we need a shared language that maps to a shared mental model, both of the existing system and of changes we’d like to make.

In large systems this is where a common language is important: if the implementation uses the same terms the domain uses, it becomes simpler to understand how the parts of the system interact. By giving things proper names, the interactions between classes become logical analogues of real-world things – we don’t need to use technical words or made up words that subsequent readers will need to work to understand or learn, someone familiar with the domain will know what the expected interactions are. This makes it easier to build a mental model of the system.

For example, in an online book store, I might have concepts (classes) such as book, customer, shopping basket, postal address. These are all logical real world things with obvious interactions. If I instead named them PurchasableItem, RegisteredUser, OrderItemList and DeliveryIndicationStringList you’d probably struggle to understand how one related to the other. Naming things is hard, but incredibly important – poor naming will make it that much harder to develop a good mental model of how your system works.

Reading code, the necessary precursor to writing code, is all about building a mental model. We’re trying to visualize something that doesn’t exist, by reading lines of text.

Implementing a rich domain model with Guice

The anaemic domain model is a really common anti-pattern. In the world of ORM & DI frameworks we naturally seem to find ourselves with an ORM-managed “domain” that is all data and no behaviour; coupled with helper classes that are all behaviour and no data, helpfully injected in by our DI framework. In this article I’ll look at one possible approach for implementing a rich domain model – this example uses Guice, although I’m sure Spring etc would have different ways of achieving the same thing.

The problem

All the source code can be found on github. The “master” branch shows the original, badly factored code. The “rich-domain” branch shows the solution I describe.

Anaemic domain model

First, our anaemic domain model – TradeOrder.java. This class, as is traditional with Hibernate, has a load of annotations describing the data model, fields for all the data, accessors and mutators to get at the data and nothing else of any interest. I assume, in this domain, that TradeOrders do things. Maybe we place the order or cancel the order. Somewhere along the line, the key objects in our domain should probably have some behaviour.

@Entity
@Table(name="TRADE_ORDER")
public class TradeOrder {
    @Id
    @Column(name="ID", length=32)
    @GeneratedValue
    private String id;

    @ManyToOne
    @JoinColumn(name="CURRENCY_ID", nullable=false)
    @ForeignKey(name="FK_ORDER_CURRENCY")
    @AccessType("field")
    private Currency currency;

    @Column(name="AMOUNT", nullable=true)
    private BigDecimal amount;

    public TradeOrder() { }

    public String getId() { return id; }

    public Currency getCurrency() { return currency; }
    public void setCurrency(Currency currency) { this.currency = currency; }

    public BigDecimal getAmount() { return amount; }
    public void setAmount(BigDecimal amount) { this.amount = amount; }
}

Helper class

In this really simple example, we need to figure out the value of the order – i.e. the number of shares we want to buy (or sell) and the price per share we’re paying. Unfortunately, because this involves dependencies the behaviour doesn’t reside in the class it relates to, instead its been moved into an oh-so-helpful helper class.

Take a look at FiguresFactory.java. This class only has one public method – buildFrom. The goal of this method is to create a Figures from a TradeOrder.

    public Figures buildFrom(TradeOrder order, Date effectiveDate) throws OrderProcessingException {
        Date tradeDate = order.getTradeDate();
        HedgeFundAsset asset = order.getAsset();

        BigDecimal bestPrice = bestPriceFor(asset, tradeDate);

        return order.getType() == TradeOrderType.REDEMPTION
            ? figuresFromPosition(
                  order,
                  lookupPosition(asset, order.getFohf(), tradeDate),
                  lookupPosition(asset, order.getFohf(), effectiveDate),
                  bestPrice)
            : getFigures(order, bestPrice, null);
    }

Besides the “effective date” (whatever that might be), the only input this method takes is the TradeOrder. Using the copious number of getters on TradeOrder it asks for data to operate on, instead of telling the TradeOrder what it needs. In an ideal, object-oriented system, this would have been a method on TradeOrder called createFigures.

Why did we end up here? It’s all the dependency injection framework’s fault! Because the process of creating a Figures object requires us to resolve prices and currencies, we need to go and lookup this data – using injectable dependencies. Our anaemic domain can’t have dependencies injected, so instead we inject them into this little helper class.

What we end up with is a classic anaemic domain model. The TradeOrder has the data; while numerous helper classes, like FiguresFactory, contain the behaviour that operate on that data. It’s all very un-OO.

A better way

Data record

The first step is to create a simple value object to map rows from the database – I’ve called this TradeOrderRecord.java. This looks much like the original domain object, except I’ve removed the accessors & mutators to make it clear that this is a value object with no behaviour.

To make constructing these record objects easier, I’ve used karg, a library written by a colleague of mine – this requires us to declare the set of arguments we can use to construct the record with, and a constructor that accepts a list of arguments. This greatly simplifies construction and avoids us having a constructor which takes 27 strings (for example).

@Entity
@Table(name="TRADE_ORDER")
public class TradeOrderRecord {
    @Id
    @Column(name="ID", length=32)
    @GeneratedValue
    public String id;

    @Column(name="CURRENCY_ID")
    public String currencyId;

    @Column(name="AMOUNT", nullable=true)
    public BigDecimal amount;

    public static class Arguments {
    	public static final Keyword<String> CURRENCY_ID = newKeyword();
    	public static final Keyword<BigDecimal> AMOUNT = newKeyword();
    }

    protected TradeOrderRecord() { }

    public TradeOrderRecord(KeywordArguments arguments) {
    	this.currencyId = Arguments.CURRENCY_ID.from(arguments);
    	this.amount = Arguments.AMOUNT.from(arguments);
    }
}

The rich domain

Our goal is to make TradeOrder a rich domain object – this should have all the behaviour and data associated with the domain concept of a “TradeOrder”.

Data

The first thing TradeOrder will need is, internally, to store all the data associated with a TradeOrder (at least as a starting point, the unused fields hint that we might be able to simplify this further).

public class TradeOrder {
    private final String id;
    private final String currencyId;
    private final BigDecimal amount;

We make the data immutable. Immutable state is generally a good thing – and here it forces us to be clear that this is a fully populated TradeOrder, and since it has an id, it is always associated with a row in the database. By making TradeOrder immutable the obvious question is – how do I update an order? Well, there are numerous ways we could choose to do that – but that is a different story for a different time.

We also do not need accessors. Since we plan on putting all the behaviour that relates to TradeOrder on the TradeOrder class itself, other classes should not need to ask for information, they should only need to tell us what they want to achieve.

Note: there is one (now deprecated) accessor – that hints at a further behaviour that ought to be moved.

Dependencies

Besides the fields to store data, TradeOrder will also have fields representing injectable dependencies.

    private final CurrencyCache currencyCache;
    private final PriceFetcher bestPriceFetcher;
    private final PositionFetcher hedgeFundAssetPositionsFetcher;
    private final FXService fxService;

Some people will find this offensive – mixing dependencies with data. However, personally, I think the trade-off is worth it – the benefit of being able to define our behaviours on the class they relate to is worth it.

Behaviour

Now we have the data and the dependencies all in one place, it is relatively easy to move across the methods from FiguresFactory:

    public Figures createFigures(Date effectiveDate) throws OrderProcessingException {
        BigDecimal bestPrice = bestPriceFor(this.asset, this.tradeDate);

        return this.type == TradeOrderType.REDEMPTION
            ? figuresFromPosition(
                  fohf,
                  lookupPosition(this.asset, fohf, this.tradeDate),
                  lookupPosition(this.asset, fohf, effectiveDate), bestPrice)
            : getFigures(fohf, bestPrice, null);
    }

Construction

The last thing we need to tackle is how to create instances of TradeOrder. Since the fields for data and dependencies are all marked as final, the constructor must initialise them all. This means we need a constructor that takes the dependencies and a TradeOrderRecord (i.e. the value object we read from the database):

    @Inject
    protected TradeOrder(CurrencyCache currencyCache,
                         PriceFetcher bestPriceFetcher,
                         PositionFetcher hedgeFundAssetPositionsFetcher,
                         FXService fxService,
                         @Assisted TradeOrderRecord record) {
        ...
    }

This isn’t particularly pretty, but the key thing to note is the @Assisted annotation. This allows us to tell Guice that the other dependencies are injected normally, whereas the TradeOrderRecord should be passed through from a factory method. The factory interface itself looks like this:

    public static interface Factory {
    	TradeOrder create(TradeOrderRecord record);
    }

We don’t need to implement this interface, Guice provides it automatically. TradeOrder.Factory becomes an injectable dependency we can use from elsewhere when we need to create an instance of TradeOrder. Guice will initialise the injectable dependencies as normal, and the assisted dependency – TradeOrderRecord – is passed through from the factory. So our calling code doesn’t need to worry that our rich domain needs injectable dependencies.

    @Inject private TradeOrder.Factory tradeOrderFactory;
    ...
    TradeOrderRecord record = tradeOrderDAO.loadById(id);
    TradeOrder order = tradeOrderFactory.create(record);

Conclusion

By mixing dependencies and data together into a rich domain model we are able to define a class with the right behaviours. The obvious code smell in TradeOrder now is that the detailed mechanics of creating a Figures is probably a separate concern and should be broken out. That’s ok, we can introduce a new dependency to manage that – as long as the TradeOrder is still the starting point for constructing the Figures object.

By having the behaviours in a single place our domain model is easier to work with, easier to reason about and it’s easier to spot cases of duplication or similarity. We can then refactor sensibly, using a good object model, instead of introducing arbitrary distinctions into helper classes that are function libraries, not participants in the domain.

 

Software craftsmanship roundtable

Last night was the second London Software Craftsmanship round table. This was a great session with lots of lively discussion. We seemed to cover a wide variety of topics and I came away with loads of things I need to follow up on.

It’s amazing, as soon as you start discussing things with other developers you pick up on lots of different ideas, tools and technologies you’d not previously explored. These sessions are proving to be a great way to find out some of the cool stuff people are doing. So what did we discuss?

*phew* I think that covers it. If I missed anything, let me know below.

So, are you coming to the next one? What should we discuss?