Code your own QuickCheck (Shrink)

In the previous posts, we started to implement our own QuickCheck in Haskell, which we named RapidCheck, based on the original publication on QuickCheck.

The first post went over the basic concepts needed to build such a module, while our last post focused on how we can write generator for arbitrary functions. You might want to read them before continuing:

In today’s post, we will complete the implementation of our RapidCheck module, by adding the ability to shrink counter examples to make them more manageable.

This post will present two different strategies to implement this shrinking process. The first attempt will show the simplest solution that might come into mind. Although it will achieve the desired shrinking, we will explain in what sense it is badly designed. Identifying these defects will lead us to a second, much better solution to the problem.

 

Motivation


In our first post, we went over the process of building the basic blocks of our RapidCheck implementation. One of our success criteria was for our implementation to successfully find a counter example to following invalid property:

 
While this counter example is perfectly valid, it is unnecessarily complex. This property could have failed with much smaller numbers:

 
The goal for today is to modify the implementation of our RapidCheck implementation to implement shrinking, one of the most amazing feature of QuickCheck.

At the end of this post, our implementation should be able to come up with much smaller counter examples for prop_bad_gcd:

 

Meaning of shrinking


Our goal is to provide simpler counter examples to the user of the RapidCheck module, ideally a minimal test case. But what do we mean by “simpler” or “minimal”?

The general notion has to do with information. We want to remove noise from the test case. We want to get rid of artefacts that do not participate to the error. We want our arguments to contain only the information needed to make our test case fail.

We can therefore understand shrinking as the process of removing some information from our arguments until we get to the point where all the information contained in these arguments is necessary for the test case to fail.

 

Enhancing Arbitrary


The first step toward being able to provide simpler counter examples is to figure out a way to reduce the amount of information of each of the arguments that lead to the property to fail.

To know where to start, we will enumerate some of the things we know and want about this shrinking process:

  • Quantifying information, thus reducing it, highly depends on the considered type
  • It involves search: there is no known universal solution to find a minimal test case
  • Shrinking might not make sense for all values (zero) or types (functions)
  • We would like equal values to shrink in a similar deterministic way

To sum up, we need a way to express an optional process that is neither random nor generic, and involves non-determinism. This is exactly what the shrink function of the Arbitrary type class has to offer:

  • The list allows non-determinism: each output represents a different “path”
  • If the test case is already minimal, we can return an empty list
  • The default implementation helps for types that do not support shrinking

 

The Shrink Tree


To find the simplest counter arguments, several sequential calls to shrink might be needed. These recursive applications of shrink build a tree, whose root is the initial value that led to a counter example.

We will assume that this tree is built such that the children are ordered in such a way that it maximize the chances to find the smallest counter example. This assumption means we must be very careful in the ordering of the elements returned by shrink.

If this assumption holds, we can therefore explore the tree by finding the first children that makes the property fail, and dive deeper into this branch. If no children makes the property fail, the visit stops and we return our smaller counter example.

 

Arbitrary example


Now that we know what is expected from the shrink function, we can enrich the Arbitrary Integer instance to provide an implementation for it. Our implementation will stay very close to the one provided in Test.QuickCheck.Arbitrary:

  • We first try to remove the sign of negative integers
  • Then we try zero, the simplest possible integer value
  • Finally, we proceed with a right recursive dichotomy

The behavior of this Arbitrary Integer instance is better explained through examples:

 

Feeding shrink in Testable


We know our Arbitrary type class has a new member shrink. We would like this ability to shrink to be automatically used inside the Testable properties that are made of shrinkable arguments.

From our first post, we know that the Testable properties are defined in terms of an induction on the number of arguments. We will need to enrich this induction to include shrinking.

To do so, we add a shrinker argument to forAll, the function that implements the induction step. For now, we keep the implementation unchanged:

We can then adapt the Testable induction step to call forAll function one more argument:

 
We now reached the point where we need to implement the forAll function to plug all the pieces together. The next sections will present two different implementations:

  • We will start by the simplest implementation (and make it work)
  • We will then go for a better implementation (and make it work faster)

 

First try: visiting the shrink tree inside forAll


Our first implementation will consist in slightly adapting our forAll function to handle the shrinking. For reference, the current implementation of forAll is listed below:

We will enrich the lambda given to overFailure and inside this lambda, we will shrink the value arg of the initial counter example.

 

Shrinking implementation

Our shrinking process will take a function (a -> Result) to test the property against an argument of type a. This a will take different value as we visit the shrink tree.

Our shrinking function will also need the root of the shrink tree and the shrink function, to get the children of a given node. We get the following prototype:

From a list of potential counter-example, we can write a search function that will return the first one that makes the property fail:

Applied to the output of shrink, the first match will provide us with the next branch of the shrink tree to explore. With that in mind, we can finish the implementation of the shrinking function:

 

Shrinking in forAll

We want the shrunk arguments to be used in place of the original randomly generated argument, and not trigger a full random test case again. So we must use the same random generator used to initially run the sub-properties for the shrinking.

Inside forAll, we can therefore add the shrinking inside the lambda given to overFailure. The shrinking uses the same runSub function (bound to the same seed) to search for a smaller counter example:

This implementation makes use of the evalSubProp function, to get the (a -> Result) function required to explore the shrink tree:

 

Resulting behavior

This implementation works in the sense that it will shrinking the counter example as we expect it would:

 
But our implementation of forAll is deceptively simple. It looks like forAll tries to shrink the arguments sequentially, starting from the last argument of the property. But it is not what happens.

Our forAll is part of the induction process that builds a Property from a list of arguments from the last to the first of a Testable property. Since our forAll implementation now includes visiting the shrink tree, finding a counter example for an argument (this includes shrinking this same argument) also involves visiting the shrink tree of all subsequent arguments.

To illustrate this, let us take the following invalid property:

  • Finding an initial counter example for a involves evaluating the sub-properties prop_gcd_bad is built from. This includes shrinking argument b.
  • Shrinking a also involves re-evaluating the sub-properties prop_gcd_bad is built from. This also automatically includes shrinking the argument b.

So we shrunk argument b twice, and this will only grow with the number of arguments participating in the property.

So this is quite a waste. And there is nothing we can do about it: it is a natural consequence of a design that integrates visiting the shrink tree inside forAll.

 

Second try: visiting the shrink tree outside forAll


We know from our previous design that we need to search for better counter example outside of the forAll function. In this second design, the forAll function will only be responsible to build a search tree, for the rapidCheck function to explore it.

 

Result tree

To achieve this, we will have to modify our Property type to return a Result tree instead of a single Result.

In this design, our rapidCheck function is responsible for navigating the tree and seeking a better counter example (in case of failure).

The only modification needed in rapidCheckImpl is a call to visitResultTree:

Implementing the visitResultTree function is quite straightforward. We find the first children that preserves the failure, and dive deeper into it:

 

Building the Result tree

Since a Property now returns a result tree, we need to adapt our forAll function accordingly to return a result tree instead of a single result.

Each argument inductively added by forAll to the Property will return a tree to the “father” forAll. For forAll to output a tree, we need a way to collapse a tree of tree of results into a tree of results.

This collapse must be very carefully designed to prioritize the shrinking of the outer argument over the shrinking of the inner arguments. Otherwise, the outer arguments would almost never be visited and thus shrunk. This is the job of joinTree:

 
The creation of the tree of tree of results involves a bit of code:

  • We first need to create a shrink tree of all outer argument values
  • For each outer argument value, we evaluate the sub-property result tree
  • We enrich the sub-property result tree with the outer argument value
  • At the end, we join our tree of result tree into a result tree

This code makes use of the following helper functions:

  • buildTree creates a potentially infinite shrink tree from a root value
  • addCounterExample add a counter example across a whole result tree

 

Wrapping up

To finish up the implementation, we only need to adapt our forAll to do the necessary plumbing:

  • Split the pseudo random generator in two: rand1 and rand2
  • Use rand1 to generate the root value of the shrink tree of the outer argument
  • Use rand2 to evaluate the next result tree of the chain

This is it: we know have an implementation that will shrink the outer arguments of a given property first, before proceeding with the inner arguments.

 

Enjoying shrinking


Now that our implementation works, we can play a bit with it. We will run it on our invalid property, and check that the results are satisfying:

 
We can add traces in our property to check how it behaves. The following is an excerpt of log traces of the a and b values used to find a counter example. The full file is only 63 lines long:

 

Conclusion


In this post, we went over how we could add shrinking to provide better counter examples to invalid properties.

We completed the code of our RapidCheck module with a shrinking mechanism and tried two different implementations of it. The mistakes done in the first implementation allowed us to understand the second implementation better. The full code of our RapidCheck module is available in this GitHub Gist.

This second implementation is not quite the same as the one used in QuickCheck, due to the numerous additional features it supports. But this journey should still guide (the most adventurous of) you quite a bit in understanding the code behind the fabulous QuickCheck.

I hope you enjoyed the journey as much as I did, and that you learned something reading these last few posts.

Code your own QuickCheck (HOF)

In the previous post, we started to build our own QuickCheck implementation in Haskell, which we named RapidCheck. We went over the basic concepts needed to build such a module, based on the original publication on QuickCheck.

In particular, we explained in details and implemented the following concepts from QuickCheck:

  • Gen a: a generator of value of type a
  • Arbitrary: the class of types with a default Gen associated to it
  • Property: a testable predicate, from which we can derive a Gen Result
  • Testable: the class of types we can convert to a Property

Our two next posts will build on top of this previous post and complete the implementation of RapidCheck:

  • In this post, we will add the support for generators of functions, which is critical to test properties expressed as higher order functions
  • The next post will focus on adding the support for argument shrinking, which is critical to provide better counter examples for failing properties

 

Motivating example


In a functional language such as Haskell, we often modularize and factorize our code through the use of Higher Order Functions. Testing these functions with generative testing requires QuickCheck to support the generation of functions.

Our goal for today will be to enrich our RapidCheck implementation with everything needed to test the following properties.

 
The prop_partition is a valid property. It verifies that partition correctly does it jobs:

  • It checks that all elements on the left match the predicate
  • It checks that no elements on the right matches the predicate
  • It checks that no elements were lost in the process

We expect this property to pass:

 
The prop_distributive is an invalid property. It asserts that every single functions from integer to integer is distributive over the operator plus.

We expect this property to fail, and RapidCheck to provide us with a counter example:

 
You might feel quite disappointed by this counter example. The “function” string counter example is not that helpful. While this is true, understanding how to circumvent this issue goes beyond the scope of this post.

There is a brilliant presentation you can have a look at to better understand how the real QuickCheck manages to provide useful outputs for counter examples on functions.

 

Toward generating functions


How can we even generate functions? Functions can operate on a potentially infinite number of values, and it seems hard to conceive a way to generate such a complex structure.

Fortunately, the original publication on QuickCheck gives us a nice trick to understand where to start. The trick is based on playing with types. We will do a quick reminder of it goes below.

If you already know the trick, you can skip this section.

 
Our goal is to be able to create a generator of functions from a to b: Gen (a -> b). We know this Gen (a -> b) type is just a wrapper around a function StdGen -> (a -> b). By getting rid of the parentheses and reordering the parameters of this function, we get the following prototype: a -> StdGen -> b. We recognize in this last expression a generator of value of type b.

All of this means we can rewrite Gen (a -> b) as a -> Gen b.

This play with types makes use realize that we can build a generator of functions from a function returning a generator, by merely moving the application of the function inside the generator. We name this transformation promote:

 

Generators perturbation


The promote helper requires us to give it a function: a -> Gen b. Finding a way to instantiate such a function is the focus of this section.

We can reason that coming with a function that creates a generator from a value of a complete different type is not conceivable unless this function is the result of a partially applied function. So let us add an argument in front of it to see how it could help us.

By adding Gen b, we get the following prototype: Gen b -> a -> Gen b. One way to see this is as a perturbation on a random generator driven by a value of type a. We capture this requirement on a with the CoArbitrary type class:

 

The coarbitrary function is very close to what the function promote needs to create a generator of function. We just need to provide it a generator as first parameter.

To sum this up, it means we can make an Arbitrary (a -> b) instances from the two following requirements:

  • Arbitrary b: the existence of a standard generator for the type b
  • CoArbitrary a: the ability for the type a to perturb another generator

Then we only need to provide the generator to coarbitrary to feed it to promote:

 

Implementing perturbations


We saw that generating a function requires each of the argument of the function to implement the CoArbitrary type class.

We know that for a type a to be an instance of CoArbitrary it needs to be able to perturb generators of any other types. Let us now look at how we can implement such a perturbation.

 

General pattern

To influence a generator Gen, we can act on its StdGen parameter. This impact on the random generation process needs to be linked with the value of the type implementing CoArbitrary.

Applied to the integer data type, this translates into the following code, where perturb is yet to be defined:

This is the general pattern. We can use it to define any CoArbitrary instances: the only difference will be the implementation of the perturb function.

 

Implementing perturb

The implementation of perturb is critical to have good statistically properties. The prototype alone is not enough.

As an extreme example, let us imagine an implementation of perturb that would return its input generator unmodified. As a result, the resulting function generator would produce functions that systematically ignore all their parameters and return the same result for any inputs. Applied to our partition property, it would generate predicates that either return True on all inputs, or False on all inputs. Not that great.

On the contrary, and to get good statistical properties, the implementation of perturb should try to map each possible value of a to a different perturbation on the generator Gen b.

This is the approach we have chosen for our CoArbitrary Integer instance, for which we provide different chains of split calls for each different number we get:

  • The sign of the number is taken into account
  • The value zero is not forgotten and does impact the generator
  • The decomposition of the number into digits further drives the perturbation

Although we must be careful, we are not limited: there are in practice many different valid implementations possible.

 

Leveraging

We can use the pattern we saw above in combination with our implementation of perturb to easily build more instances of CoArbitrary. Since we have a way to build a perturbation from an integer, we can use it as an intermediary step.

We can map the values of a type to an integer value to quickly get a CoArbitrary instance for it. We can also use perturb several times in a row:

The bottom line is that we do not need to re-invent the wheel each time we want to define a CoArbitrary instance.

In particular, QuickCheck offers a lot of helpers for building both Arbitrary and CoArbitrary in its Test.QuickCheck.Arbitrary module. This is a good place to look before starting implementing a new instance of either of those typeclasses.

 

Showable functions


We now have everything we need to run our tests cases… except for one small but important detail. Our code will not compile, due to the requirements associated with the Testable typeclass:

 
We need functions to implement Show for higher order functions to be valid instances of Testable. We will take a shortcut here and import the Text.Show.Functions module which defines it for us.

The problem with this approach is that we get “function” inside our counter example, when the property does not pass. This is clearly not very informative.

QuickCheck offers the Test.QuickCheck.Function to deal with this issue. It requires to transform the declaration of properties to use Fun a b instead of a -> b. In exchange for this small change in syntax, it is both able to show and shrink functions, improving our user experience by that much.

 

Conclusion and what’s next


In this post, we went over how we could generate functions to test properties expressed as higher order functions.

We completed the code of our RapidCheck library with all that was needed to make our original test cases pass. You can find the current state of the code of our RapidCheck available in this GitHub Gist.

Adding this feature allowed us to test properties on an algorithm such as partition by generating predicates for it.

Our next post will aim at finishing the work and add one last important features of Property Based Testing, the ability to shrink counter examples.