Want to learn more about WebDriver? What do you want to know?

If you’re testing a web application, you can’t go far wrong with Selenium WebDriver. But in this web 2.0 world of ajax-y goodness, it can be a pain dealing with the asynchronous nature of modern sites. Back when all we had was web 1.0 you clicked a button and eventually you got a new page, or if you were unlucky: an error message. But now when you click links all sorts of funky things happen – some of which happen faster than others. From the user’s perspective this creates a great UI. But if you’re trying to automate testing this you can get all sorts of horrible race conditions.

Thread.sleep

The naive approach is to write your tests the same way you did before: you click buttons and assert that what you expected to happen actually happened. For the most part, this works. Sites are normally fast enough, even in a continuous integration environment, that by the time the test harness looks for a change it’s already happened.

But then… things slow down a little and you start getting flickers – tests that sometimes pass and sometimes fail. So you add a little delay. Just 500 milliseconds should do it, while you wait for the server to respond and update the page. Then a month later it’s flickering again, so you make it 1 second. Then two… then twenty.

The trouble is, each test runs at the pace that it runs at its slowest. If login normally takes 0.1 seconds, but sometimes takes 10 seconds when the environment’s overloaded – the test has to wait for 10 seconds so as not to flicker. This means even though the app often runs faster, the test has to wait just in case.

Before you know it, your tests are crawling and take hours to run – you’ve lost your fast feedback loop and developers no longer trust the tests.

An Example

Thankfully WebDriver has a solution to this. It allows you to wait for some condition to pass, so you can use it to control the pace of your tests. To demonstrate this, I’ve created a simple web application with a login form – the source is available on github. The login takes a stupid amount of time, so the tests need to react to this so as not to introduce arbitrary waits.

The application is very simple – a username and password field with an authenticate button that makes an ajax request to log the user in. If the login is successful, we update the screen to let the user know.

The first thing is to write our test (obviously in the real world we’d have written the test before our production code, but its the test that’s interesting here not what we’re testing – so we’ll do it in the wrong order just this once):

@Test
public void authenticatesUser()
{
    driver.get("http://localhost:8080/");

    LoginPage loginPage = LoginPage.open(driver);
    loginPage.setUsername("admin");
    loginPage.setPassword("password");
    loginPage.clickAuthenticate();
    Assert.assertEquals("Logged in as admin", loginPage.welcomeMessage());
}

We have a page object that encapsulates the login functionality. We provide the username & password then click authenticate. Finally we check that the page has updated with the user message. But how have we dealt with the asynchronous nature of this application?

WebDriverWait

Through the magic of WebDriverWait we can wait for a function to return true before we continue:

public void clickAuthenticate() {
    this.authenticateButton.click();
    new WebDriverWait(driver, 30).until(accountPanelIsVisible());
}

private Predicate<WebDriver> accountPanelIsVisible() {
    return new Predicate<WebDriver>() {
        @Override public boolean apply(WebDriver driver) {
            return isAccountPanelVisible();
        }
    };
}
private boolean isAccountPanelVisible() {
    return accountPanel.isDisplayed();
}

Our clickAuthenticate method clicks the button then instructs WebDriver to wait for our condition to pass. The condition is defined via a predicate (c’mon Java where’s the closures?). The predicate is simply a method that will run to determine whether or not the condition is true yet. In this case, we delegate to the isAccountPanelVisible method on the page object. This does exactly what it says on the tin, it uses the page element to check whether it’s visible yet. Simple, no?

In this way we can define a condition we want to be true before we continue. In this case, the exit condition of the clickAuthenticate method is that the asynchronous authentication process has completed. This means that tests don’t need to worry about the internal mechanics of the page – about whether the operation is asynchronous or not. The test merely specifies what to test, the page object encapsulates how to do it.

Javascript

It’s all well and good waiting for elements to be visible or certain text to be present, but sometimes we might want more subtle control. A good approach is to update Javascript state when an action has finished. This means that tests can inspect javascript variables to determine whether something has completed or not – allowing very clear and simple coordination between production code and test.

Continuing with our login example, instead of relying on a <div> becoming visible, we could instead have set a Javascript variable. The code in fact does both, so we can have two tests. The second looks as follows:

public void authenticate() {
    this.authenticateButton.click();
    new WebDriverWait(driver, 30).until(authenticated());
}

private Predicate<WebDriver> authenticated() {
    return new Predicate<WebDriver>() {
        @Override public boolean apply(WebDriver driver) {
            return isAuthenticated();
        }
    };
}

private boolean isAuthenticated() {
    return (Boolean) executor().executeScript("return authenticated;");
}
private JavascriptExecutor executor() {
    return (JavascriptExecutor) driver;
}

This example follows the same basic pattern as the test before, but we use a different predicate. Instead of checking whether an element is visible or not, we instead get the status of a Javascript variable. We can do this because each WebDriver also implements the JavascriptExecutor allowing us to run Javascript inside the browser within the context of the test. I.e. the script “return authenticated” runs within the browser, but the result is returned to our test. We simply inspect the state of a variable, which is false initially and set to true once the authentication process has finished.

This allows us to closely coordinate our production and test code without the risk of flickering tests because of race conditions.

In most sensible languages the compiler ignores whitespace; it’s only there to help humans understand the code. The trouble is, without automated checking of whitespace it’s very hard to have consistent style. Without a compiler telling you when you get it wrong, it’s hard to enforce a standard. Sometimes it leads to bugs and, ironically, hard-to-understand code. Maybe it’s time to realise that formatting of code is different from its semantics. Maybe in the 21st century, programmers should be using more than just a text editor?

No Whitespace

For example, the following is perfectly legal Java:

public String shareWishList() {
    User user = sessionProcessor.getUser();  WishList wishList =
    loadWishList.forUserByName(user, selectedWishListName); for(
    String friend : friends)  { if ( !friend.equals("") )  { new
    ShareWishListMail(  friend,  user,  wishList,  emailFactory,
    this.serverHostName ) . send();   }   }   return  "success";
}

But most normal human beings would much prefer to see this written the “traditional” way, with line breaks in appropriate places and sensible use of the space character:

public String shareWishList() {
    User user = sessionProcessor.getUser();
    WishList wishList = loadWishList.forUserByName(user,
                            selectedWishListName);
    for (String friend : friends) {
        if (!friend.equals("") ) {
            new ShareWishListMail(friend, user, wishList,
                                  emailFactory,
                                  this.serverHostName)
                .send();
        }
    }
    return "success";
}

Builders

Of course, most IDEs can reformat code for you. And with careful tweaking of the settings you can get something useful. But there are always cases where you want to do something different, to help readability.

For example, I often format uses of builders differently to other code. While I might normally prefer long lines to be wrapped and continue to the line break on the following line, when calling a builder it makes it hard to read:

Trade trade = tradeBuilder.addFund(theFund).addAsset(someAsset).atSharePrice(2)
    .forShareCount(10).withAmount(20).tradingOn(someDate)
    .effectiveOn(someOtherDate)
    .withComment("This is a test trade").forUser(myUser)
    .withStatus(ACTIVE).build();

This is about a billion times easier to read as:

Trade trade =
    tradeBuilder.addFund(theFund)
                .addAsset(someAsset)
                .atSharePrice(2)
                .forShareCount(10)
                .withAmount(20)
                .tradingOn(someDate)
                .effectiveOn(someOtherDate)
                .withComment("This is a test trade")
                .forUser(myUser)
                .withStatus(ACTIVE)
                .build();

Test Setup

Tests seem to end up particularly dependent on whitespace. For example, in my current job I recently had cause to try and write something like the following:

var sharePosition = CreateAnnualSharePosition
    .InAsset(someAsset)
    .Quarter1Shares(0)        .WithQ1Price(1.250, GBP)
    .Quarter2Shares(10000)    .WithQ2Price(1.15,  GBP)
    .Quarter3Shares(1000000)  .WithQ3Price(1.35,  GBP)
    .Quarter4Shares(1000000)  .WithQ4Price(1.45,  GBP)
    .Build();

Because I had several similar tests, with subtle variations of share count & price quarter-to-quarter it massively helped me while writing it, to have it clearly laid out – so I could see when share count had changed and when price had changed. Unfortunately, VisualStudio had other ideas and removed some of my whitespace when copy & pasting this, which led to me reformatting every damned time.

Internal DSL

Effectively these builders are a form of internal DSL. You’re (ab)using the source language to provide a domain specific language to let you describe what you want to actually do. My previous company used a home grown Java based BDD framework – Narrative. This leads to very fluent acceptance tests, but lots of method chaining. As with the builder pattern this would be totally unreadable without liberal use of whitespace. Take this not particularly unusual example:

Then.the(user)
    .expects_that(the_orders(),
         Contains.only
         (an_order()
                  .of_type(SELL)
                  .for_asset(someAsset)
                  .for_amount(150, gbp())
                  .with_trades(
                      allOf(
                          Contains.inAnyOrder(
                              a_trade_selling("Sub asset 1"),
                              a_trade_selling("Sub asset 2")),
                          trades_totalling("150")
                      ))));

Now, there’s a very particular “style” to how this is laid out. Whoever thought this looked clean and readable (probably me) was clearly on something that day. But it’s difficult to lay out a “sentence” like this because of the massive amount of nesting inherent in what it’s saying.

How can we possibly define rules for whitespace to layout such statements in a clear, readable, machine-enforceable way? After months of seeing different people spend endless hours tweaking whitespace to their own personal preference you know what I realise? You can’t automate it.

So maybe we’re trying to solve the wrong problem? Is the fact that we’re so reliant on whitespace to make code not look like total shit, a hint that maybe our text editor isn’t giving us the expressive power we need? Maybe, just maybe, in the 21st century, programmers could be using something richer than a text editor? Maybe there’s some structure in our source code that we could show better?

Builders – Revisited

Let’s go back to a simple example, my trade builder. Wouldn’t this be easier to read if it was written in the table it so obviously is?

Shock! Horror! Fancy that! Source code that’s not just plain old text but – OMG – a table. The humanity of it! Sacrilege! Giving programmers powerful, expressive, rich text instead of a dumb old text editor.

My share position builder is helped even more by being expressed in a table:

But what about the complex set of nested matchers?

Here the nested structure is clearly visible, there’s no need to rely on whitespace – the nested structure is clear and unambiguous. The editor understands the structure and renders the correct layout making it easy to write and much easier to read. It actually becomes impossible to indent things wrong accidentally: a nested table is either nested or it isn’t.

But… how would it work?

Well, obviously our editors would need to be more powerful to support this. The indentation can be derived from the source code: the editor can parse brackets much more easily than the human eye and translate that to a more structured form that’s easier to read.

The critical part is defining the presentation style when declaring a method/class – e.g. identifying builder methods so we can present more naturally.

Would it work?

There are numerous implementation details – but I’m sure it could be made to work easily enough. But, is the complexity worth it? Am I stark raving mad or is it about time our editors were dragged kicking and screaming into the 20th century?

Actively Lazy

Month: January 2012

Testing asynchronous applications with WebDriverWait

Thread.sleep

An Example

WebDriverWait

Javascript

Enough whitespace already

No Whitespace

Builders

Test Setup

Internal DSL

Builders – Revisited

But… how would it work?

Would it work?

Thread.sleep

An Example

WebDriverWait

Javascript

Share this:

No Whitespace

Builders

Test Setup

Internal DSL

Builders – Revisited

But… how would it work?

Would it work?

Share this: