Meetup Report: Kata night

Two days ago, I went to a Crafting Software Meetup at Arolla. The goal was to join other developers and improve our skills as a group by practicing katas.

A large majority of us were beginners. So after a difficult start on an exercise that was definitely out of our league, we decided to play the simple Fizz Buzz kata.

My pair and I developed it in Haskell, following a TDD approach. We then went on to improve our test coverage using Property Based Testing. The night ended with Romeu Moura giving a quite amazing demonstration of how he approaches tests and TDD.

This post summarises the most important lessons I learned from the night, and is divided into three parts:

  1. The initial solution our pair produced for Fizz Buzz
  2. The great feedback from Romeu on how to write tests
  3. An attempt to use this knowledge to improve the initial solution

 

Initial solution to the Kata


Fizz Buzz is a simple program that takes a positive integer as input, and outputs a string. It should:

  • Return “0” if the input is 0
  • Return “Fizz” if the input is a multiple of 3
  • Return “Buzz” if the input is a multiple of 5
  • Return “FizzBuzz” if the input is a multiple of 3 and 5
  • Echo its input as a string otherwise (for example, “7” for 7)

Here is the implementation of Fizz Buzz we ended with:
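What follows is a hedged reconstruction rather than the original snippet: a minimal Haskell implementation consistent with the rules above (the actual solution may have differed).

```haskell
fizzBuzz :: Int -> String
fizzBuzz 0 = "0"                  -- special case required by the kata
fizzBuzz n
  | n `mod` 15 == 0 = "FizzBuzz"  -- multiple of both 3 and 5
  | n `mod` 3  == 0 = "Fizz"
  | n `mod` 5  == 0 = "Buzz"
  | otherwise       = show n      -- echo the input as a string
```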

Here are the tests we ended up with to test this implementation:
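Again, a hedged reconstruction: as discussed later in this post, our suite ran a single generic check over a list of (input, output) pairs, something like the following sketch.

```haskell
-- The fizzBuzz under test (repeated here so the snippet stands alone).
fizzBuzz :: Int -> String
fizzBuzz 0 = "0"
fizzBuzz n
  | n `mod` 15 == 0 = "FizzBuzz"
  | n `mod` 3  == 0 = "Fizz"
  | n `mod` 5  == 0 = "Buzz"
  | otherwise       = show n

-- One generic check over (input, expected) pairs:
-- exactly the style this post later argues against.
testCases :: [(Int, String)]
testCases =
  [(0, "0"), (1, "1"), (3, "Fizz"), (5, "Buzz"), (7, "7"), (15, "FizzBuzz"), (30, "FizzBuzz")]

runTests :: Bool
runTests = all (\(input, expected) -> fizzBuzz input == expected) testCases
```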

We went a bit crazy at the end, and did our best to leverage property based testing to check some properties of our algorithm. We managed not to replicate the implementation of Fizz Buzz in the properties (an error that is easy to fall into). You can check the result of this experiment at the following link if you are curious.
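To give a flavour of the kind of properties involved, here is a hedged sketch (not the actual experiment). Real property-based testing would use QuickCheck with random inputs; for self-containment, these properties are checked by enumeration over small values.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

fizzBuzz :: Int -> String
fizzBuzz 0 = "0"
fizzBuzz n
  | n `mod` 15 == 0 = "FizzBuzz"
  | n `mod` 3  == 0 = "Fizz"
  | n `mod` 5  == 0 = "Buzz"
  | otherwise       = show n

-- Any positive multiple of 3 starts with "Fizz" (this covers "FizzBuzz" too),
-- without restating how fizzBuzz computes its result.
prop_fizzPrefix :: Int -> Bool
prop_fizzPrefix n = "Fizz" `isPrefixOf` fizzBuzz (3 * n)

-- Any positive multiple of 5 ends with "Buzz".
prop_buzzSuffix :: Int -> Bool
prop_buzzSuffix n = "Buzz" `isSuffixOf` fizzBuzz (5 * n)

checkProps :: Bool
checkProps = all prop_fizzPrefix [1 .. 100] && all prop_buzzSuffix [1 .. 100]
```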

 

Great lessons on how to test


Writing tests is not easy. Deriving good practices on how to test is not easy either. This section relays some of the great advice from Romeu's amazing TDD demonstration.

 
Tell a story: Use naming to make your tests read like a story. Use the name of the Fixture, of the tests, and even of the parameters to help you. The ordering of the tests inside the Fixture might also be important.

For example, in Java, you could have something like this:

import static org.assertj.core.api.Assertions.assertThat;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

class FizzBuzzShould {

  @ParameterizedTest
  @ValueSource(ints = {15, 30, 45})
  void fizz_buzz_for(int multiple_of_3_and_5) {
    assertThat(fizzBuzz(multiple_of_3_and_5)).isEqualTo("FizzBuzz");
  }

  @ParameterizedTest
  @ValueSource(ints = {3, 6, 9})
  void otherwise_fizz_for(int multiple_of_3) {
    assertThat(fizzBuzz(multiple_of_3)).isEqualTo("Fizz");
  }

  @ParameterizedTest
  @ValueSource(ints = {5, 10, 20})
  void otherwise_buzz_for(int multiple_of_5) {
    assertThat(fizzBuzz(multiple_of_5)).isEqualTo("Buzz");
  }

  @ParameterizedTest
  @ValueSource(ints = {1, 2, 7})
  void otherwise_echo_its(int input) {
    assertThat(fizzBuzz(input)).isEqualTo(Integer.toString(input));
  }
}

 
Use Bad Faith: After adding a test, write the least amount of code needed to make it pass. This includes writing code that no one would write on purpose: special “if” cases.

Why would we do that? The goal is to show that the tests actually test something and are able to catch errors. If a new test never goes red, how can we be sure it is really a useful test?

Where does it stop? Writing special-case code can go on for quite a long time, so we need something to stop the cycle. One such rule is the rule of three, which states that upon the third repetition, the duplicated code must be factored out.

 
Make it work, then make it right: Refactor only when all tests are green. Mixing refactoring with making tests pass might make you lose time if something goes wrong.

For example, to end the “bad faith” cycle, start by writing the third repetition to make the test pass. Once green, you can get rid of this DRY violation.
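As an illustration (hypothetical Haskell, not code from the night), the bad-faith cycle for the “Fizz” rule might look like this before and after the rule of three kicks in:

```haskell
-- Bad faith: each new red test is made green with a hard-coded case.
fizzBuzzNaive :: Int -> String
fizzBuzzNaive 3 = "Fizz"
fizzBuzzNaive 6 = "Fizz"
fizzBuzzNaive 9 = "Fizz"   -- third repetition: the rule of three triggers
fizzBuzzNaive n = show n

-- Make it work, then make it right: once green, factor out the duplication.
fizzBuzzRefactored :: Int -> String
fizzBuzzRefactored n
  | n `mod` 3 == 0 = "Fizz"
  | otherwise      = show n
```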

 
Lazy naming: Delay the correct naming of your tests as late as you possibly can. Use inaccurate names at first, then revisit them at the end.

Naming your tests accurately at the beginning is quite hard. Naming early might lead you to use names too close to the implementation. It might also lead you to rely on badly conceived notions that will be very hard to get rid of later.

As a consequence, naming your test “fizz_buzz_0” is fine while building the test suite: most of these tests will disappear in the end.

 
Fix something: Resist the temptation to write a very generic test that takes a list of (input, output) pairs. The problem with this approach is that it does not convey the intent behind each test case. As a consequence, when a test fails, the reader will have a hard time figuring out why.

If you look at our tests in the first part, this is a mistake we made: our tests take a list of (input, output) pairs and run them all.

Instead, try to fix either a given input or output, as it helps give meaning to your test cases. A good approach is to divide by equivalence class. For example, all inputs that produce “Fizz” could end up in one parameterised test.
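A hedged Haskell sketch of this idea, with one check per equivalence class (the names are mine, not from the kata). Note how the expected output is fixed per class and the inputs spell out the rules:

```haskell
fizzBuzz :: Int -> String
fizzBuzz 0 = "0"
fizzBuzz n
  | n `mod` 15 == 0 = "FizzBuzz"
  | n `mod` 3  == 0 = "Fizz"
  | n `mod` 5  == 0 = "Buzz"
  | otherwise       = show n

-- One "parameterised test" per equivalence class: the expected output is
-- fixed, and the inputs show the rules (3 * 5 rather than 15).
fizzClass, buzzClass, fizzBuzzClass :: Bool
fizzClass     = all (\n -> fizzBuzz n == "Fizz")     [3, 2 * 3, 3 * 3]
buzzClass     = all (\n -> fizzBuzz n == "Buzz")     [5, 2 * 5, 4 * 5]
fizzBuzzClass = all (\n -> fizzBuzz n == "FizzBuzz") [3 * 5, 2 * 3 * 5, 3 * 3 * 5]
```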

 
Show the rules: Instead of writing a test that asserts that fizzBuzz of 30 is equal to “FizzBuzz”, make explicit what 30 is.

For example, 30 is a shortcut for 2 * 3 * 5. Using the shortcut in the test means that the next reader will spend time figuring out the rules. Providing the rules explicitly diminishes the mental load needed to understand the test.

 
Keep tests small: Avoid conditional statements and looping constructs in your tests. A test should be as simple as a single assertion.

Use parameterised testing to get rid of loops over the inputs. Rework your design to leverage value types and keep the number of assertions limited.

 

Putting the lessons into practice


The initial test suite my pair and I developed was clearly not up to the level of quality that Romeu described. We violated several of the rules:

  • Our test suite is very generic and does not show its intent.
  • We used shortcuts like 30 instead of expressions like 2 * 3 * 5 that show the rules.
  • Our test cases do not tell a story at all.

Here is an attempt (that might go a bit too far) to make the tests readable:

It requires some helpers to be defined for us (fortunately, these are quite generic functions we could put in a separate file):
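As a hedged sketch of the idea (helper names like multiplesOf and shouldAllReturn are hypothetical, not the actual code), the reworked suite and its generic helpers might look like:

```haskell
fizzBuzz :: Int -> String
fizzBuzz 0 = "0"
fizzBuzz n
  | n `mod` 15 == 0 = "FizzBuzz"
  | n `mod` 3  == 0 = "Fizz"
  | n `mod` 5  == 0 = "Buzz"
  | otherwise       = show n

-- Generic helpers (these could live in a separate file).
multiplesOf :: Int -> [Int]
multiplesOf k = map (k *) [1 .. 20]

withoutMultiplesOf :: Int -> [Int] -> [Int]
withoutMultiplesOf k = filter (\n -> n `mod` k /= 0)

shouldAllReturn :: [Int] -> String -> Bool
shouldAllReturn inputs expected = all (\n -> fizzBuzz n == expected) inputs

-- Tests that try to tell a story, ordered like the rules.
fizzBuzz_for_multiples_of_3_and_5 :: Bool
fizzBuzz_for_multiples_of_3_and_5 =
  multiplesOf (3 * 5) `shouldAllReturn` "FizzBuzz"

otherwise_fizz_for_multiples_of_3 :: Bool
otherwise_fizz_for_multiples_of_3 =
  withoutMultiplesOf 5 (multiplesOf 3) `shouldAllReturn` "Fizz"

otherwise_buzz_for_multiples_of_5 :: Bool
otherwise_buzz_for_multiples_of_5 =
  withoutMultiplesOf 3 (multiplesOf 5) `shouldAllReturn` "Buzz"
```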

 

Conclusion


I am not quite sure the reworked FizzBuzz test suite lives up to the great pieces of advice of the night.

But I sure did learn something: to avoid producing the very generic tests I regularly wrote and thought were good practice. And I hope some of this helped you as well.

Accumulating your merge sort

In the previous post, we described an elegant algorithm that stable-sorts a container based on fold (also known as std::accumulate or reduce). To keep things simple, we implemented it in Haskell and used it to merge lists.

The subject of today’s post is to:

  • Implement this algorithm in C++
  • Make it work on the input container directly
  • Make it work for more than just lists

In the process, we will build a std::stable_sort variant that works on ForwardIterators, whereas std::stable_sort requires RandomAccessIterators.

 

Implementing the binary counter


Following a quick YouTube search, I was able to find an episode of the A9 Stepanov lectures that dealt with the implementation of a binary counter in C++.

The following code is greatly inspired by (almost copy-pasted from, except for the std::find_if) these amazing lessons:

  • The add_to_counter function handles the carry propagation, and returns the carry if it ran out of bits
  • The reduce_counter function collapses all the remaining bits at the end of the accumulation
  • The binary_counter class maintains the vector of bits, and is responsible for gluing the algorithms together
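What follows is a hedged reconstruction in the spirit of Stepanov's lectures, matching the three pieces listed above (details may differ from the actual code):

```cpp
#include <algorithm>
#include <vector>

// Propagates a carry through the counter. Returns the neutral element if the
// carry was absorbed, or the remaining carry if we ran out of bits.
template <typename Iterator, typename Op, typename T>
T add_to_counter(Iterator first, Iterator last, Op op, T const& zero, T carry) {
    while (first != last) {
        if (*first == zero) {
            *first = carry;
            return zero;
        }
        carry = op(*first, carry);  // older operand on the left
        *first = zero;
        ++first;
    }
    return carry;
}

// Collapses all the remaining bits at the end of the accumulation.
template <typename Iterator, typename Op, typename T>
T reduce_counter(Iterator first, Iterator last, Op op, T const& zero) {
    first = std::find_if(first, last, [&](T const& x) { return !(x == zero); });
    if (first == last) return zero;
    T result = *first;  // the lowest non-zero bit is the most recent arrival
    while (++first != last)
        if (!(*first == zero)) result = op(*first, result);
    return result;
}

// Glues the algorithms to the vector of bits they operate on.
template <typename Op, typename T>
class binary_counter {
public:
    binary_counter(Op op, T const& zero) : m_op(op), m_zero(zero) {}

    void add(T x) {
        x = add_to_counter(m_counter.begin(), m_counter.end(), m_op, m_zero, x);
        if (!(x == m_zero)) m_counter.push_back(x);
    }

    T reduce() {
        return reduce_counter(m_counter.begin(), m_counter.end(), m_op, m_zero);
    }

private:
    Op m_op;
    T m_zero;
    std::vector<T> m_counter;
};
```

For instance, instantiated with addition and 0 as neutral element, adding 1, 2 and 3 and then reducing yields their sum.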

As with the Haskell implementation, the C++ implementation of the binary counter is able to deal with any Monoid. It only requires:

  • An associative binary operation
  • That admits a neutral element

This implementation also shows a great use of object orientation to structure a program: the class is used to combine some algorithms with the data they operate on (to guarantee invariants), while the algorithms are kept outside of the class for greater reuse.

 

The merge binary operation


Now that we have a binary_counter that works on any Monoid, we need to define the equivalent of the SortedList Monoid instance we defined last time in Haskell.

The neutral element will be the empty range, and the binary operation will be something close to std::merge. Why something close and not std::merge directly? Because we need an auxiliary buffer to perform the merge.

This is the implementation I came up with:
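Here is a hedged sketch of such a merge operation: ranges are represented as iterator pairs, and std::merge writes into an auxiliary buffer that is then copied back into the container (the range type and names are assumptions, not the post's actual code):

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// A half-open range [first, last) over the container being sorted.
template <typename Iterator>
struct range {
    Iterator first;
    Iterator last;
    bool empty() const { return first == last; }
};

// Merges two sorted ranges that are adjacent (lhs.last == rhs.first) and in
// order (lhs before rhs), through an auxiliary buffer, back into the container.
// std::merge is stable, so equal elements of lhs stay before those of rhs.
template <typename Iterator>
range<Iterator> merge_ranges(range<Iterator> lhs, range<Iterator> rhs) {
    if (lhs.empty()) return rhs;
    if (rhs.empty()) return lhs;
    using value_type = typename std::iterator_traits<Iterator>::value_type;
    std::vector<value_type> buffer;
    std::merge(lhs.first, lhs.last, rhs.first, rhs.last, std::back_inserter(buffer));
    std::copy(buffer.begin(), buffer.end(), lhs.first);  // copy back in place
    return { lhs.first, rhs.last };
}
```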

I found this implementation trickier than I first thought it would be. In particular, we need to ensure that the two ranges we merge are always:

  1. Next to each other (one range end is the beginning of the other)
  2. In the right order (left range is before the right range)

I claim these two propositions will always hold if we accumulate our container from left to right. The argument is based on the implementations of add_to_counter and reduce_counter and on the properties of Monoids.

Indeed, reduce_counter initializes its result from the lowest-level “bit” (which is the latest to arrive in the counter) and combines it on the right with op(*first, result). So the latest arrived range is always provided as the right argument of the merge operation.

The same goes for add_to_counter, which combines the carry on the right side of the binary operation. Again, the latest arrived range is provided as the right argument of the merge operation.

This property of the binary_counter algorithm is crucial. The Monoid property guarantees that our operation is associative, but not necessarily commutative. So the binary_counter must have this property to be correct, and as a result we can rely on it.

 

Accumulating with iterators


It is now time to combine our different elements to build our std::stable_sort based on a std::accumulate. But there is a catch…

As observed by Ben Deane in std::accumulate: Exploring an Algorithmic Empire, we cannot really use the std::accumulate of the STL here. It deals with values, while we need ranges here, which means iterators.

An easy way to get past this is to create our own accumulate_iter algorithm, which provides an iterator to the accumulating function (instead of a value):

Based on this algorithm, we are now able to assemble the different parts needed to build a std::stable_sort working on ForwardIterators:

Now we are truly done. You can convince yourself that it does stable-sort a container by trying it on some examples.

 

Wrapping it up


We managed to write a std::stable_sort using a variant of std::accumulate that sorts the container provided as parameter. I do not know if this is the same solution Ben Deane used to implement his own, but it is definitely doable.

One interesting consequence is that our stable sort algorithm also works on ForwardIterators. So you can sort a std::list and even a std::forward_list with it.

Now, I would not be offended if you told me you find the use of accumulate_iter neither beautiful, nor concise, nor simpler than just implementing a raw loop by hand:

Indeed, C++ may not be the best language to play with fold algorithms. While the Haskell solution was both elegant and concise, you might prefer the imperative style when programming in C++.

Ultimately, deciding which tool and which paradigm to use is a choice you (or your team) have to make.