Property based testing

The following is a rough transcript of a talk I gave to my colleagues at PA on property based testing. You can find more materials at the GitHub repo.

Introduction

As software engineers and programmers, we all know the value of writing automated tests for our code. Many of us appreciate the advantages of Test Driven Development. Today I want to talk about another technique that can improve the usefulness of our tests.

Property based testing involves running a single test many times with multiple randomly generated inputs. This allows you to test more with less code. It makes it easier for you to write better tests, and reduces the need for you to think up examples.

If we consider our tests as documentation, PBT improves the breadth and generality of that documentation.

Example based testing

So, how is property based testing different from tests we've written before? I can't speak for everyone, but when I have written tests in the past, they have been largely example-based. I find myself having to think up example scenarios and manually code them. For example, testing a sorting algorithm:

def test_sort_list_of_ints():
    ints = [3, 1, 2]
    result = sort(ints)
    assert result == [1, 2, 3]

I want to ensure my function sort actually sorts a list of integers. My test, however, only tests for one specific input. It's possible that even if it passes, other values could cause problems. It also doesn't document the desired behaviour of my code fully. Examples are helpful in understanding how something works, but they aren't the whole story. This test is just an example of how the code should work, rather than a statement defining a more general property.
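To make that risk concrete, here is a minimal sketch. `buggy_sort` is a hypothetical implementation, not something from the talk, that silently drops duplicates. It passes a test built around one hand-picked list and still gets other inputs wrong.

```python
def buggy_sort(items):
    """Hypothetical broken sort: converting to a set discards duplicates."""
    return sorted(set(items))


def test_sort_single_example():
    # One hand-picked input with no repeated values.
    assert buggy_sort([3, 1, 2]) == [1, 2, 3]


# The example-based test passes...
test_sort_single_example()

# ...but an input with a repeated value exposes the bug.
print(buggy_sort([2, 1, 2]))  # prints [1, 2], one element short
```

The single example never exercised repeated values, so the test had no way to notice the missing element.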

It's inherently difficult or lengthy to demonstrate generic properties with example based testing.

With this in mind, here is a modified test:

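import random


def test_sort_list_of_ints():
    # Try every list length from 2 to 100 inclusive.
    for i in range(2, 101):
        ints = [random.randint(-1000, 1000) for _ in range(i)]
        result = sort(ints)
        # Each element must be <= the one that follows it.
        assert all(x <= y for x, y in zip(result, result[1:]))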

This is better in some ways: we are now testing lists of between 2 and 100 elements, containing random integers in the range ±1000. However, there are some difficulties here:

A property based testing framework can help resolve these issues.

Property based testing

So, what can a property based testing framework add to this?

Here is the above example test modified to use the Hypothesis framework for Python, which I learnt about at EuroPython:

import hypothesis
from hypothesis import strategies as st


@hypothesis.given(st.lists(st.integers(), min_size=2))
def test_sort_list_of_ints(ints):
    result = sort(ints)
    assert all(x <= y for x, y in zip(result, result[1:]))

The key differences:

Use cases

I think these kinds of tests are useful in any software project, but here are some particularly motivating examples:

These are invariant properties that are poorly demonstrated with examples alone. Property based testing allows your tests to function better both as documentation and as evidence of the robustness of your code. Each test is more concise, and each test goes further. For these reasons, I hope you'll all consider giving PBT a try and using it in your work!
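As a rough illustration of what an invariant looks like in code, the sketch below checks three properties of a sort over random inputs using only the standard library: the output is ordered, it is a rearrangement of the input, and sorting it again changes nothing. The names `check_sort_properties` and `random_int_list` are made up for this sketch, and Python's built-in `sorted` stands in for the function under test. A framework such as Hypothesis would replace the hand-written loop and generator.

```python
import random
from collections import Counter


def random_int_list(rng, max_len=50):
    """Hypothetical generator: a list of 0 to max_len random integers."""
    return [rng.randint(-1000, 1000) for _ in range(rng.randint(0, max_len))]


def check_sort_properties(sort, runs=200, seed=0):
    """Check invariants of `sort` on `runs` random lists; return the number checked."""
    rng = random.Random(seed)  # fixed seed so a failure can be replayed
    for _ in range(runs):
        ints = random_int_list(rng)
        result = sort(ints)
        # Ordering: each element is <= its successor.
        assert all(x <= y for x, y in zip(result, result[1:]))
        # Permutation: same elements with the same multiplicities.
        assert Counter(result) == Counter(ints)
        # Idempotence: sorting an already sorted list changes nothing.
        assert sort(result) == result
    return runs


check_sort_properties(sorted)  # built-in sorted stands in for the code under test
```

The permutation check matters: without it, a function that always returns an empty list would pass, since an empty list is trivially ordered.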

Some frameworks to read up on are:

Enjoy!