Demystifying Fixtures and Test Doubles: Spies, Stubs, Mocks and London vs. Detroit

Writing software tests - unit tests, integration tests, etc. - is an important part of software development. The tests developers write serve as documentation for the systems they test - known as the System Under Test (SUT). Tests also allow us to catch any regressions that occur in the SUT. As software tests are pieces of code, we also need to maintain them and therefore they need to be readable.

Many projects I have worked on have used the term "mock" to mean any test double or test fixture. This makes the tests more difficult to read, and consequently more difficult to maintain. Using the same terminology consistently within a team does help cohesion, but it would be better still if we all used the correct terms in the correct way.

Fixtures and test doubles are distinct. In this post I am going to define test fixtures and the different types of test doubles. I will walk through different styles of tests in the context of the so-called Detroit vs. London styles of testing. I will then walk through some gotchas with utilising mocks.

Test Fixtures

First, let us begin with test fixtures. A test fixture, also known as a test context, is a way for software developers to automatically set up the system state needed for tests. Fixtures provide everything a test needs to get started.

Test fixtures themselves can come in different forms: they could be a script to get a database into a certain state or they could be pre-created objects generated before the running of a test.

For pre-created objects, Martin Fowler describes a factory class called the "Object Mother", which creates similar fixtures that can be reused across tests. [1]

Fixtures are important as they allow us to get the SUT into the state we want using reusable code. They can simplify the "Arrange" step of the "Arrange-Act-Assert" pattern [2].
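As a rough illustration of an Object Mother feeding the "Arrange" step, here is a minimal Python sketch. The `Customer`, `Order` and `OrderMother` names, and the loyalty discount rule, are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    name: str
    loyalty_points: int = 0


@dataclass
class Order:
    customer: Customer
    item_prices_pence: list = field(default_factory=list)

    def total(self) -> int:
        # Hypothetical rule: customers with 100+ points get 10% off.
        subtotal = sum(self.item_prices_pence)
        if self.customer.loyalty_points >= 100:
            return subtotal * 90 // 100
        return subtotal


class OrderMother:
    """Object Mother: named factory methods for commonly needed fixtures."""

    @staticmethod
    def order_for_new_customer() -> Order:
        return Order(Customer("Ada"), [1000, 2000])

    @staticmethod
    def order_for_loyal_customer() -> Order:
        return Order(Customer("Grace", loyalty_points=150), [1000, 2000])


def test_loyal_customer_gets_discount():
    # Arrange: the fixture does the set-up work
    order = OrderMother.order_for_loyal_customer()
    # Act
    total = order.total()
    # Assert
    assert total == 2700
```

The test body stays short because the details of building a valid order live in one reusable place.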

Different test libraries will have different ways of setting up test fixtures, and you should check the documentation of your testing library for how to set up fixtures.
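As one example of a library-specific mechanism, Python's built-in `unittest` runs `setUp` before and `tearDown` after each test method. The sketch below uses that to put an in-memory SQLite database into a known state. The `users` table and its rows are made up for illustration.

```python
import sqlite3
import unittest


class UserQueriesTest(unittest.TestCase):
    def setUp(self):
        # Fixture: runs before every test and builds a known database state.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (name TEXT, active INTEGER)")
        self.conn.executemany(
            "INSERT INTO users VALUES (?, ?)",
            [("ada", 1), ("bob", 0)],
        )

    def tearDown(self):
        # Runs after every test so that state never leaks between tests.
        self.conn.close()

    def test_counts_only_active_users(self):
        (count,) = self.conn.execute(
            "SELECT COUNT(*) FROM users WHERE active = 1"
        ).fetchone()
        self.assertEqual(count, 1)
```

Other libraries expose the same idea differently, so treat this as one possible shape rather than the canonical one.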

Test Doubles

"Test double" is a generic name for any object that replaces a component of our system during a software test. It is important to note that the types of test double should be viewed as a continuum rather than as strictly distinct groups [3]. If you feel that the lines between these definitions get blurred, that is ok: the lines between the categories are blurry. The nomenclature is an attempt to create distinct groups to aid developers' understanding.

| Double Type | Usage |
| --- | --- |
| Dummy | Contains no implementation and is used only for parameter fulfilment. Dummies are not actually used in the test. Null parameters can be classed as dummies. |
| Stub | Provides canned return values to calls made during tests. Stubs are minimal implementations of interfaces or base classes. Stubbed void methods typically contain no implementation at all; stubbed non-void methods typically return hard-coded values. |
| Fake | Contains a more complex implementation than a stub. Fakes are working objects, but they take shortcuts in order to function and are not suitable for production usage. An example is an in-memory database. |
| Spy | Similar to a stub, but also records how its members were invoked so that tests can later verify that the spy was invoked as expected. |
| Mock | Usually created by a mocking library. Mocks provide pre-programmed responses and are used to assert that particular members have been invoked with particular arguments. Mocks can behave like dummies, stubs, fakes or spies. |

Sources: [3] [4]
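To make the double types listed above more concrete, here is a hand-rolled Python sketch of a dummy, a stub, a fake and a spy, all standing in for the dependencies of one small SUT. Every class name here (`SignupService`, `DummyLogger` and so on) is hypothetical, and real projects would often reach for a library rather than write these by hand.

```python
class DummyLogger:
    """Dummy: fills a parameter slot and fails loudly if it is ever used."""

    def log(self, message):
        raise AssertionError("the dummy should never be called")


class StubClock:
    """Stub: returns a canned, hard-coded value."""

    def now(self):
        return "2024-01-01T00:00:00"


class FakeUserRepository:
    """Fake: a working but simplified implementation (an in-memory dict)."""

    def __init__(self):
        self._users = {}

    def save(self, email, created_at):
        self._users[email] = created_at

    def exists(self, email):
        return email in self._users


class SpyMailer:
    """Spy: behaves like a stub, but records how it was called."""

    def __init__(self):
        self.sent = []

    def send_welcome(self, email):
        self.sent.append(email)


class SignupService:
    """The SUT: depends on a repository, a mailer, a clock and a logger."""

    def __init__(self, repo, mailer, clock, logger):
        self.repo = repo
        self.mailer = mailer
        self.clock = clock
        self.logger = logger

    def sign_up(self, email):
        if self.repo.exists(email):
            self.logger.log(f"duplicate sign-up: {email}")
            return False
        self.repo.save(email, self.clock.now())
        self.mailer.send_welcome(email)
        return True


def test_new_user_receives_welcome_email():
    repo, mailer = FakeUserRepository(), SpyMailer()
    service = SignupService(repo, mailer, StubClock(), DummyLogger())

    assert service.sign_up("ada@example.com") is True

    assert repo.exists("ada@example.com")      # state held by the fake
    assert mailer.sent == ["ada@example.com"]  # calls recorded by the spy
```

The logger is only a dummy in this particular test because the happy path never logs; a test for the duplicate sign-up path would need a different double in that slot.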

There are multiple types of test doubles and they all serve a slightly different purpose. What they have in common is that they alter how a dependency of the SUT responds. This allows us to ensure that we are only testing what we want to test, in the way we want to test it.

We could use fakes or stubs to provide canned responses where our system calls an external API, or where one of our classes calls a function of a child class. If we want to track how a child class is called, and how many times, we would use spies or mocks.

Mocks run verification functions to check that the calls and arguments received during the test match what the mock was expecting to receive. When using mocks we would need to know beforehand what arguments we would expect the test double to be called with.

In contrast, a spy records the arguments passed to it and tracks which of its methods are called. Spies are more passive in that you make assertions about the calls afterwards. With a mock, expectations can be set on the mock object to verify the calls as they are happening. Spies capture calls for later verification; mocks capture calls for instant verification. [5]
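The difference in timing can be sketched in Python. With the standard library's `unittest.mock`, a `Mock` object records calls and is asserted against afterwards, so in this post's terminology it behaves much like a spy. The `StrictMailerMock` class below is a hand-written, hypothetical example of the expect-first style, and `notify` is a made-up SUT.

```python
from unittest.mock import Mock


def notify(mailer, email):
    """Hypothetical SUT: tells its mailer dependency to send a welcome email."""
    mailer.send_welcome(email)


def test_spy_style_verifies_afterwards():
    mailer = Mock()
    notify(mailer, "ada@example.com")
    # The call has already happened; we inspect the recording afterwards.
    mailer.send_welcome.assert_called_once_with("ada@example.com")


class StrictMailerMock:
    """Hand-rolled mock: expectations are declared before the SUT runs."""

    def __init__(self):
        self._expected = []

    def expect_send_welcome(self, email):
        self._expected.append(email)

    def send_welcome(self, email):
        # Verification happens during the call itself.
        if not self._expected or self._expected[0] != email:
            raise AssertionError(f"unexpected call: send_welcome({email!r})")
        self._expected.pop(0)

    def verify(self):
        # Catches expected calls that never arrived.
        if self._expected:
            raise AssertionError(f"missing calls: {self._expected}")


def test_mock_style_verifies_during_the_call():
    mailer = StrictMailerMock()
    mailer.expect_send_welcome("ada@example.com")  # expectation set up front
    notify(mailer, "ada@example.com")
    mailer.verify()
```

Note that the mock-style test has to know the expected arguments before the SUT is exercised, whereas the spy-style test can decide what to assert after the fact.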

As with fixtures, each testing library will have its own way of setting up most of these test doubles. Mocks are usually set up with a dedicated mocking library, although some testing libraries have mocking built in. Dummy test doubles are simple to use and need no special methods or libraries.

Detroit vs. London Tests

When doing research into writing software tests using test doubles you will eventually come across the terms Detroit tests and London tests [4].

Detroit tests favour not using mocks and are sometimes called "Classical" tests; London tests favour using mocks to replace dependencies where possible and are sometimes called "Mockist" tests.

The Detroit and London terms originate from the classical style of unit tests being developed in Detroit and the more mockist style being developed in London. In practice, the two styles differ in what a test verifies.

Classical tests are more black-box: you pass a function some arguments, it calls its real dependencies, and it returns an output which you then check against your assertion. This is known as state-based verification.

Mockist tests are more white-box: you mock out all the dependencies and verify that they are called with the expected arguments. This is known as behaviour-based verification.
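The two styles can be compared by testing the same SUT twice. The sketch below is illustrative only: `PriceCalculator`, `TaxPolicy` and the flat 20% rate are invented for the example, and the mockist test uses Python's `unittest.mock`.

```python
from unittest.mock import Mock


class TaxPolicy:
    def tax_for(self, net_pence):
        # Hypothetical flat 20% tax, in integer pence.
        return net_pence * 20 // 100


class PriceCalculator:
    def __init__(self, tax_policy):
        self.tax_policy = tax_policy

    def gross(self, net_pence):
        return net_pence + self.tax_policy.tax_for(net_pence)


def test_detroit_style_checks_state():
    # Real dependency; only the returned value is asserted on.
    calculator = PriceCalculator(TaxPolicy())
    assert calculator.gross(1000) == 1200


def test_london_style_checks_behaviour():
    # Dependency replaced by a mock; the interaction is asserted on.
    policy = Mock(spec=TaxPolicy)
    policy.tax_for.return_value = 5
    calculator = PriceCalculator(policy)

    assert calculator.gross(1000) == 1005
    policy.tax_for.assert_called_once_with(1000)
```

If the tax rate changes, only the Detroit-style test fails; if `PriceCalculator` stops calling `tax_for` or passes different arguments, the London-style test reports it directly.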

One should not rigidly subscribe to a particular side. Context is important. What type of test are you writing? Do you want to test your unit of code together with its dependencies, or just this small unit of code in isolation? Is the collaboration between the classes simple?

Gotchas with Mocks

There are a few gotchas when using test doubles. [6]

Partial Mocking

We do not want to partially mock dependencies.

If we have fully mocked all of a SUT's dependencies (i.e. a London-style test) then the unit test should fail when contracts with the dependencies change, as the call will have changed.

If we aren't using mocks (i.e. a Detroit-style test) then the test should fail when the behaviour of the SUT changes.

If we have partially mocked the SUT's dependencies, it can be difficult to work out why a test breaks: is it because of a contract changing or because of a behavioural change?

Mocking Arbitrarily

When writing unit tests for a SUT we either want to use test doubles on the nearest dependency or have the system behaving as close to reality as possible.

We certainly do not want to use test doubles on an arbitrary class further down the class hierarchy or on an arbitrary layer far away from the SUT. If we were to use test doubles arbitrarily, it would become difficult to detect where issues are occurring and difficult to debug them.

Difficult to Mock Code

This applies to writing software tests as a whole rather than just to test doubles. It is easy to blame the tests for being painful to write, but code that is difficult to test is usually difficult to use, and that difficulty should be treated as a code smell.

If code is difficult to use test doubles with and we own it, then we should refactor it. If we do not own it, then we should create wrapper modules around that third-party code so that we can control how it is used and can use test doubles on the wrapper with ease.
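A wrapper of this kind might look like the following Python sketch. Here the standard library's `datetime` module stands in for code we do not own, and `SystemClock`, `FixedClock` and `greeting` are hypothetical names chosen for the example.

```python
from datetime import datetime, timezone


class SystemClock:
    """Thin wrapper that we own around code that we do not own."""

    def now(self):
        return datetime.now(timezone.utc)


class FixedClock:
    """Test double for the wrapper: always returns the same instant."""

    def __init__(self, fixed):
        self._fixed = fixed

    def now(self):
        return self._fixed


def greeting(clock):
    """Hypothetical SUT: depends on the wrapper, not on datetime directly."""
    return "Good morning" if clock.now().hour < 12 else "Good afternoon"


def test_greeting_in_the_morning():
    clock = FixedClock(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc))
    assert greeting(clock) == "Good morning"
```

Production code would pass in `SystemClock()`, while tests pass in `FixedClock(...)`, so the rest of the codebase never needs to patch the underlying library.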

There are many other gotchas with writing test doubles, and I would really recommend watching Justin Searls' talk "Please don't mock me", as it highlights many other issues as well as those listed above.

Conclusion

Tests are important for software development, but there is a lot of terminology that comes with them. Test fixtures allow us to get the system into a pre-defined state before running our automated tests; test doubles provide a way of changing how a dependency works.

London-style testing focusses on using mocks where possible and on testing contracts, meaning we think about dependencies and contracts, and isolate modules, as we build our tests. Detroit-style testing focusses on testing state and follows a more black-box approach, meaning we focus on the outputs produced for given inputs.

There are gotchas with using test doubles and their usage should be deliberate and thoughtful, not arbitrary.

The most important thing to remember with testing is that tests serve as documentation, as a method to verify that the SUT is working as expected, and as a method to inform design. Code that is difficult to use test doubles with usually signals that there is a problem with the code's design, and suggests that a refactor should be undertaken.


Sources

[1] Object Mother - Martin Fowler

[2] Test-Driven Development by Example - Kent Beck

[3] Unit Testing: Exploring The Continuum Of Test Doubles - Mark Seemann

[4] Mocks Aren't Stubs - Martin Fowler

[5] Mocks, Fakes, Stubs and Dummies - xUnitPatterns

[6] Please don't mock me - Justin Searls