Meetup Report: Kata night

Two days ago, I went at a Crafting Software Meetup at Arolla. The goal was to join other developers and improve our skills as a group by practicing kata.

A large majority of us were debutant. So after a difficult start on an exercise that was definitively out of our league, we decided to play the simple Fizz Buzz kata.

Together with my pair, we developed it using Haskell, following a TDD approach. Then went on improving our test coverage by using Property Based Testing. The night ended with Romeu Moura doing a quite amazing demonstration on how he did tests and TDD.

This post summarise the most important lessons I learned for the night, and is divided in three parts:

  1. Providing the initial solution our pair produced for Fizz Buzz
  2. Providing you with the great feedback of Romeu on how to do tests
  3. Using this knowledge to try to improve the initial solution

 

Initial solution to the Kata


Fizz Buzz is a simple program that takes a positive integer as input, and outputs a string. It should:

  • Return “0” if the input is 0
  • Return “Fizz” if the input is a multiple of 3
  • Return “Buzz” if the input is a multiple of 5
  • Return “FizzBuzz” if the input is a multiple of 3 and 5
  • Echo its input as a string otherwise (example “7” for 7)

Here is the implementation of Fizz Buzz we ended with:

Here are the tests we ended up with to test this implementation:

We went a bit crazy at the end, and did our best to leverage property based testing to check some properties of our algorithm. We managed not to replicate the implementation of the fizz buzz in the properties (which is an error we can easily fall into). You can check the result of this experiment at the following link if your are curious.

 

Great lessons on how to test


Writing test is not easy. Deriving good practices on how to test is not easy either. This section is about restituting some of the great advices from the amazing demonstration of TDD done by Romeu.

 
Tell a story: Use naming to make your tests such as they read like a story. Use the name of the Fixture, of the tests, and even the parameters to help you. The ordering of the tests inside the Fixture might also be important.

For example, in Java, you could have something like this:

class FizzBuzzShould {

  @Test
  void fizz_buzz_for(int multiple_of_3_and_5) {
    assertThat(fizzBuzz(multiple_of_3_and_5)).isEqualTo("FizzBuzz");
  }

  @Test
  void otherwise_fizz_for(int multiple_of_3) {
    assertThat(fizzBuzz(multiple_of_3)).isEqualTo("Fizz");
  }

  @Test
  void otherwise_buzz_for(int multiple_of_5) {
    assertThat(fizzBuzz(multiple_of_5)).isEqualTo("Buzz");
  }

  @Test
  void otherwise_echo_its(int input) {
    assertThat(fizzBuzz(input)).isEqualTo(Integer(input).toString());
  }
}

 
Use Bad Faith: After adding tests: write the least amount of work needed to make the test pass. This include writing code that no-one would on purpose: special “if” cases.

Why would we do that? The goal is to show that the tests actually test something and are able to catch errors. If a new test is never going red, how can we ensure it is really a useful tests?

Where does it stop? Writing special case code can go on for quite a long time, so we need something to stop the cycle. One such rule is the rule of three which states that upon 3 code repetition, the code must be factorized.

 
Make it work, then make it right: Refactor only if all tests are green. Mixing refactoring and fixing bugs might make you loose time if something bad happens.

For example, to end up the “bad faith cycle”, start by making the 3 repetitions to make the test pass. Once green, you can get rid of this DRY violation.

 
Lazy naming: Delay the correct naming of your tests as late as you possibly can. Use inaccurate name at first, then revisit them at the end.

Naming your test accurately at the beginning is quite hard. Naming early might lead you to use names too close from the implementation. It might also lead you to rely on badly conceived notions that will be very hard to get rid of later.

As a consequence, naming your test “fizz_buzz_0” is fine when building the test suite: most of the tests will disappear in the end.

 
Fix something: Resist the temptation of writing a very generic test that take a list of (input, output). The problem with this approach is that it does not provide much of the intent behind each the test cases. As a consequence, when a test will fail, the reader will have a hard time figuring out why.

If you look at our tests in the initial part, this is a mistake we did: our tests take a list of pairs of (input, output) and run them all.

Instead, try to fix either a given input or output, as it will help giving meaning to your test cases. A good approach is to divide by equivalence class. For example, all inputs that produces “Fizz” could end up in one parameterised test.

 
Show the rules: Instead of writing a test that asserts that fizzBuzz of 30 is equal to “FizzBuzz”, explicit what 30 is.

For example, 30 is a shortcut for 2 * 3 * 5. Using the shortcut in the test means that the next reader will spend time figuring out the rules. Providing the rules explicitly will diminish the mental load needed to understand the test.

 
Keep tests small: Avoid conditional statements and looping constructs in your tests. The test should be as simple as a single assertion.

Use parameterised testing to get rid of the loops on the inputs. Rework your design to leverage value types to keep the number of assertions limited.

 

Putting the lesson to practice


The initial test suite we developed with my pair was clearly not up to the level of quality that Romeu described. We violated several of the rules:

  • Our test suite is very generic and does not show its intent.
  • We used shortcuts like 30 instead of 2 * 3 * 5 that show the rules
  • Our test cases do not tell a story at all

Here is an attempt (that might maybe go a bit too far) to make the tests readable:

It requires some helpers to be defined for us (fortunately, these are quite generic function we could put in a separate file):

 

Conclusion


I am not quite sure the reworking of the test suite of the FizzBuzz lead to something that follows the great pieces of advice of the night.

But I sure did learn something: avoiding producing very generic tests as I regularly did and though was good practice. And I hope some of this did help you as well.

Meetup Report: The philosophy of FP

Yesterday, I attended a meetup at Arolla, where Arnaud LEMAIRE (@lilobase) did a wonderful talk on the philosophy of Functional Programming. You can find the slides of the presentation here.

The presentation was an introduction to the concepts behind Functional Programming. It went over the pillars of FP, one by one, before concluding with practical advices on how to use this knowledge in our (most probably object-oriented) code base.

Although it was designed as an introduction talk, and even as regular practitioner of Haskell and Clojure, I found quite a lot of things to learn.

What amazed me the most was the way Arnaud was able to summarize complex ideas into simple definitions. He managed to find analogies and make interesting parallels that made these concepts easily understandable.

I really enjoyed it. So I thought it would be a good idea to write down what stuck in my mind, and share it to you in this post.

 

On State


What is a state? The result of combining a value with time. This is how you get variables and overall, this is a good recipe to get bugs.

The functional paradigm language is about getting rid of this complexity by getting rid of time, and play with immutable values instead. The trick is to avoid variables, and create new values for each “modification” we would do in imperative programming style.

Contrary to what one could expect, creating these new values can be done very efficiently. Persistent data structures are based on structural sharing, the ability to re-use part of the old data value to construct the new one.

Persistent data structures do not make the cost of creating new values disappear entirely. They do consume more memory than their respective counterparts. They also add indirections that may cost in terms of computation time.

But this cost can be easily offset by the whole range of possible optimization they make possible in exchange. In particular:

  • Much easier use of parallel or distributed computations
  • No need to use defensive locking mechanisms

 

On (pure) Functions


Pure functions can be seen as delayed values: they just miss some arguments to be able to deliver their value. Give them the same arguments twice, and they output the same value. This has important consequences:

Memoization: Instead of recomputing the values of a pure functions called with the same argument, we can keep the result in store for the next time it is called. Pure functions make caching almost trivial.

Lazy evaluation: We can emulate the lazy evaluation of languages like Haskell by wrapping a computation inside a lambda that takes no arguments. Forcing the computation is just calling that closure with no arguments.

First class functions: If functions are like values, there is no reason you could not manipulate them as values. In particular, it means having functions returning functions, and functions taking functions as parameters, both of which are called higher-order functions.

 

Objects and Closures


OOP is based on smart data structures, classes that carry with them the functions that operate on them. Functional Programming is based on simpler data structures: functions that operate on the data structure are moved outside of it.

The FP view makes for easier composition of functions, especially if you see the functions as transformation between two sets of values. Chaining functions becomes function composition, and can be done without utilizing side effects.

Arnaud ended this sections with a wonderful comparison of Closure and Objects. He pointed out that Closures can be seen as the inverse of Objects:

  • Objects are data that carry some functions with them
  • Closures are functions that carry some data with them

 

On Type Systems


Note: this topic is not one I know a lot, so there might be re-transcription issues.

Functional programming is usually based on structural typing, whereas Object Oriented Programming is usually based on nominative typing.

Structural typing checks if types are equivalent by matching their structure. If two values have a similar structure, they can be considered of the same type.

Nominative typing checks if types are equivalent by matching their “name”. If two values have the name class name, they are considered of the same type.

As an example of structural typing, if a Coordinate is made of two doubles, and if a function translate expects a Coordinate, providing it two doubles will work fine. In the case of nominative typing, this would not type-check: you would need an explicit Coordinate data structure to be instantiated.

This is a trade-off. Structural typing is more loose and flexible than nominative typing. It makes it easier to build generic functions that can apply on different data structures. But it comes at a price: nominative type systems are better at ensuring correctness of your program.

In reality, this dichotomy is not as strict as it may sound. Some languages reinforce the structural typing by putting a name on the structure. This ensures the same correctness guaranty as nominative type systems, but provides a greater expressiveness. This kind of type system is called Algebraic typing.

 

Pattern matching and Polymorphism


Pattern matching in Functional Programming can be seen as the opposite of polymorphism in Object Oriented Programming:

  • Interfaces in OOP are closed under the addition of new functions, but not on a addition of new types. The dispatch is based on the name of the type.
  • Algebraic data types are closed under addition of new types, but open in terms of addition of new functions. The dispatch is based on the structure.

We can see there a reference to the Expression Problem coined by Philip Wadler. Both use cases exists: for example in Haskell, you can switch to Typeclasses when you need to be open to new types and closed to new functions.

I guess this is why Arnaud talked about not being a zealot and advised against the strict segregation of paradigms.

 

Practical Advices


The talk ended by a collection of practical advices for all of us that do not program into a “natively functional programming language”.

I think these pieces of advice are especially useful since most of developers do not have access to languages like Haskell or Clojure at their workplace:

  • Try to use immutability as much as you can: use const everywhere you possibly can. This will catch a lot of bugs.
  • Use algorithms like map, filter and reduce to avoid states in your looping constructs. It makes the code more declarative and less error prone.
  • Use Algebraic Data Types. They are available in most languages: JS, Python, std::variant in C++, etc. Use them.
  • To handle concurrency, try to rely on concepts like actors to avoid the synchronization nightmares (dead-lock and other performances issues).
  • Use concepts like Functional Core, Imperative Shell to make your business logic code as pure as possible.