Generalized Algebraic Data Types (GADTs) give us the power to develop type-safe Free Monads without having to rely on continuation passing style, as we must when using simple Algebraic Data Types. The resulting Abstract Syntax Tree acts as an easy-to-understand specification for the interpreters of the language it defines.
In today’s post, we will revisit the first Embedded Domain Specific Language (EDSL) example of our previous Free Monad tutorial post. We will look at its Abstract Syntax Tree (AST) again, and show its similarities to the continuation passing style (CPS) of programming.
We will then explore another way we can define an AST for the same language, only this time in direct style, the opposite of continuation passing style, using Generalized Algebraic Data Types.
We will talk about the advantages of defining an AST in direct style, and conclude by giving a small recipe we can use to almost automatically create a Free Monad from a set of instructions we want to support in a Domain Specific Language (DSL).
Quick summary of the last episode
This post is a follow-up to our previous Free Monad tutorial post, in which we first introduced the basics of Free Monads and interpreters before discussing more advanced examples where Free Monads are especially interesting.
We will start with a summary of the last episode, just enough for our purpose here. If you read the previous post or are familiar enough with Free Monads, you can skip this section.
Free Monads (in short)
Free Monads are a special kind of Monad which, upon evaluation, produce an Abstract Syntax Tree (AST): a data structure describing a recipe to execute. They do not perform side-effects directly; they only emit instructions. Nothing really happens until an interpreter reads these instructions to interact with the world.
As shown in our previous post, this gives us great flexibility in how the AST is evaluated. We can switch back-ends and write several interpreters to produce different effects out of the same AST (recipe).
We also illustrated the power this offers for inspecting, transforming or even combining recipes before evaluating them with an interpreter.
A Free Monad in 3 steps
We started our previous post with a first goal: showing in 3 steps how to decouple the recipe for an IO computation, asking the user for a password, from its execution.
Our first step was to identify and isolate the side-effects needed to express this computation. We identified two of them: reading a string from the standard input and writing a string to the standard output. We then created an AST containing the instructions needed to support the side-effects we identified.
Here is the AST we developed for our IOSpec Domain Specific Language (DSL):
- ReadLine represents the instruction of reading a line. It contains a continuation which represents the set of instructions to execute after getting the string from the console.
- WriteLine represents the instruction of writing a line. It contains the string to print and the set of instructions to execute after the printing is done.
- Pure is the marker for the end of the computation. It carries inside the result of the overall computation.
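The three constructors described above could be written as follows. This is a minimal sketch reconstructed from the bullet points, not necessarily the exact listing of the previous post; the `echoLength` value is a hypothetical recipe added only to show how the continuations nest.

```haskell
-- Continuation passing style AST for the IOSpec language.
data IOSpec a
  = ReadLine (String -> IOSpec a)  -- continuation receives the line that was read
  | WriteLine String (IOSpec a)    -- string to print, then the rest of the recipe
  | Pure a                         -- end of the computation, carrying its result

-- Hypothetical recipe built by hand: read a line, print it back,
-- and return its length. Nothing is executed here; this is only data.
echoLength :: IOSpec Int
echoLength = ReadLine (\s -> WriteLine s (Pure (length s)))
```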
The second and third steps were respectively to create some factory functions and to make our IOSpec AST a Monad. These two steps completed the construction of our Free Monad, making it easy to emit an IOSpec AST by writing imperative looking code that uses the do-notation.
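As a rough illustration of these two steps, here is one way the factory functions and the Monad instance might look. The names `readLine`, `writeLine` and `askPassword` are assumptions made for this sketch; the original post may have used different ones.

```haskell
data IOSpec a
  = ReadLine (String -> IOSpec a)
  | WriteLine String (IOSpec a)
  | Pure a

instance Functor IOSpec where
  fmap f (Pure a)        = Pure (f a)
  fmap f (ReadLine k)    = ReadLine (fmap f . k)
  fmap f (WriteLine s n) = WriteLine s (fmap f n)

instance Applicative IOSpec where
  pure = Pure
  mf <*> ma = mf >>= \f -> fmap f ma

-- Bind walks down the tree and grafts the next computation at the Pure leaf.
instance Monad IOSpec where
  Pure a        >>= f = f a
  ReadLine k    >>= f = ReadLine (\s -> k s >>= f)
  WriteLine s n >>= f = WriteLine s (n >>= f)

-- Factory functions (hypothetical names): each emits a single instruction.
readLine :: IOSpec String
readLine = ReadLine Pure

writeLine :: String -> IOSpec ()
writeLine s = WriteLine s (Pure ())

-- Imperative-looking code that only builds an AST.
askPassword :: IOSpec String
askPassword = do
  writeLine "Password:"
  readLine
```

Note how the Monad instance has to thread `>>=` through every continuation, one case per instruction.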
Interpreting the Free Monad
For a sequence of instructions emitted by a Free Monad to be useful, we need an interpreter to translate the AST instructions into a lower level language (such as IO).
We illustrated this by developing a simple interpreter which transforms a computation expressed in the IOSpec Free Monad into an IO computation in order to execute it:
This was a short summary of the first part of our previous post on Free Monads. You can have a look at the rest of the post if you are interested in transformations on Free Monad ASTs.
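A sketch of such an interpreter is given here, assuming the continuation passing style AST described earlier. The second function, `simulate`, is a hypothetical pure interpreter added to illustrate that the same AST can be given more than one meaning; its treatment of exhausted input (returning an empty line) is an arbitrary choice for this example.

```haskell
data IOSpec a
  = ReadLine (String -> IOSpec a)
  | WriteLine String (IOSpec a)
  | Pure a

-- Translate each instruction to IO, then continue with the rest of the recipe.
interpret :: IOSpec a -> IO a
interpret (Pure a)        = pure a
interpret (ReadLine k)    = getLine >>= interpret . k
interpret (WriteLine s n) = putStrLn s >> interpret n

-- Hypothetical pure interpreter: feeds canned inputs and collects the outputs.
simulate :: [String] -> IOSpec a -> ([String], a)
simulate _      (Pure a)        = ([], a)
simulate inputs (WriteLine s n) = let (out, a) = simulate inputs n in (s : out, a)
simulate (i:is) (ReadLine k)    = simulate is (k i)
simulate []     (ReadLine k)    = simulate [] (k "")  -- assumption: empty line when input runs out
```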
Continuation Passing Style AST
Let us now explore in what aspects the IOSpec Abstract Syntax Tree we just defined resembles the continuation passing style (CPS) of programming.
What is continuation passing style
Continuation passing style consists of explicitly passing each function a continuation: a function that represents the rest of the computation to perform.
In direct style, a function returns its result of type a directly. It just returns the value. This is the default style in most programming languages:
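For instance, a direct style `parens` function might look like this. The name comes from the text; the body is an assumption made for this sketch.

```haskell
-- Direct style: the result is simply returned to the caller.
parens :: String -> String
parens s = "(" ++ s ++ ")"
```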
In continuation passing style, a function takes an additional function – named a continuation – from a to an arbitrary type r as parameter, and calls it on its result of type a:
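A continuation passing style counterpart, `parensCps`, could be sketched as follows. Again, only the name is taken from the text; the body is assumed.

```haskell
-- Continuation passing style: the result is handed to the continuation k
-- instead of being returned.
parensCps :: String -> (String -> r) -> r
parensCps s k = k ("(" ++ s ++ ")")
```

The caller decides what happens next: `parensCps "x" id` recovers the direct style result, while `parensCps "x" length` continues by measuring the string.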
We will not go into detail about the advantages of switching to continuation passing style (which include, for example, turning stack recursion into heap recursion). Instead, we will look at the overall pattern, abstract it, and see how this abstraction relates to our IOSpec AST.
Abstracting continuation passing style
Looking back at the definition of our parensCps continuation passing style function, we can see a pattern. The return type of the function follows the pattern (a -> r) -> r, where:
- a is the result type the function would have in direct style
- a -> r is the type of the continuation
- r is the result type of the continuation
In the case of parensCps, a matches the String type (the return type of parens written in direct style), while our continuation has the type String -> r:
We can capture this pattern through the type (or type alias in our case) Cont r a, and adapt the type of parensCps accordingly:
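A possible rendering of this alias is sketched here. Note that this `Cont` is a plain local type alias, not the `Cont` newtype found in monad transformer libraries; `doubleParensCps` is a hypothetical helper showing how two such functions chain.

```haskell
-- A continuation passing style computation producing an a,
-- where r is the result type of the continuation.
type Cont r a = (a -> r) -> r

parensCps :: String -> Cont r String
parensCps s k = k ("(" ++ s ++ ")")

-- Chaining: the continuation of the first call performs the second call.
doubleParensCps :: String -> Cont r String
doubleParensCps s k = parensCps s (\once -> parensCps once k)
```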
In the code above, the type Cont r a should be understood as “the type of a continuation passing style function returning an a”, with r being the return type of the continuation. Alternatively, we could see Cont r a as the type of a function that leads us one step closer to a final result of type r, where:
- a is an intermediary step on the way to compute the final result of type r
- The continuation a -> r represents the rest of the computation to get to the final result
This continuation pattern is the one we will try to find in IOSpec.
IOSpec continuation passing style
Looking back at the AST of the IOSpec Free Monad, let us see in what aspects it resembles the continuation passing style of programming. We will start with the definition of the AST:
We can easily refactor this Algebraic Data Type (ADT) to use the more general syntax of Generalized Algebraic Data Types (GADTs), yielding the following equivalent definition for IOSpec:
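In GADT syntax, the same continuation passing style AST could be written as follows (a sketch assuming the three constructors described earlier). Every constructor still returns `IOSpec a`, so nothing changes semantically; only the constructor types are now spelled out in full.

```haskell
{-# LANGUAGE GADTs #-}

-- Same AST as before, with explicit constructor signatures.
data IOSpec a where
  ReadLine  :: (String -> IOSpec a) -> IOSpec a
  WriteLine :: String -> IOSpec a -> IOSpec a
  Pure      :: a -> IOSpec a
```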
Now, if you look at this carefully, you should start to see that:
- ReadLine resembles a function written in continuation passing style, returning a string
- WriteLine resembles a function written in continuation passing style, returning an empty tuple
In fact, we can rewrite our AST using our Cont type alias:
Note: We cheated a bit. For WriteLine, we replaced an IOSpec computation by a function taking an empty tuple and returning an IOSpec computation. The two are, however, semantically equivalent.
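The correspondence can be checked by the compiler. The sketch below states it on ordinary functions rather than inside the data declaration, so that it does not rely on how GHC treats a type synonym in a constructor signature. The names `readLineCps` and `writeLineCps` are hypothetical.

```haskell
type Cont r a = (a -> r) -> r

data IOSpec a
  = ReadLine (String -> IOSpec a)
  | WriteLine String (IOSpec a)
  | Pure a

-- ReadLine already has the shape of a CPS function returning a String.
readLineCps :: Cont (IOSpec a) String
readLineCps = ReadLine

-- WriteLine needs the small "cheat": the rest of the computation becomes
-- a continuation that receives the empty tuple.
writeLineCps :: String -> Cont (IOSpec a) ()
writeLineCps s k = WriteLine s (k ())
```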
Toward direct style AST using GADTs
Continuation passing style is arguably a bit complex to understand. Thankfully, we can very often transform a CPS function into a function written in direct style, which is usually much easier to understand. In the same way, we can often refactor an AST written in continuation passing style into an AST written in direct style.
Using GADTs – a direct style AST
Generalized Algebraic Data Types allow us to write down the type of the constructors. Each of our constructors is free to return a different type. We can use this to define an AST made of instructions that are not of the same type.
This allows us to do something rather nifty. We can turn any function into an instruction of our AST by lifting the name of the function into a data constructor that carries the same type:
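As an illustration, lifting `getLine` and `putStrLn` could give the following AST. This is a sketch; the constructor names follow the ones used in the rest of this post.

```haskell
{-# LANGUAGE GADTs #-}

-- Direct style AST: each constructor states the type its instruction returns.
data IOSpec a where
  ReadLine  :: IOSpec String        -- mirrors getLine  :: IO String
  PrintLine :: String -> IOSpec ()  -- mirrors putStrLn :: String -> IO ()
```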
As we did for our continuation passing style AST, we can hide the construction of the AST using factory functions. These functions are much easier to write than previously: they just map instructions back to their corresponding functions.
Starting from this base, we can build a new Free Monad that encodes the same language as our previous AST, by simply adding the support for the Monad interface.
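With a direct style AST, the factory functions reduce to renaming the constructors. The names `readLine` and `printLine` are assumptions made for this sketch.

```haskell
{-# LANGUAGE GADTs #-}

data IOSpec a where
  ReadLine  :: IOSpec String
  PrintLine :: String -> IOSpec ()

-- Factory functions: no continuation to supply, just the constructor itself.
readLine :: IOSpec String
readLine = ReadLine

printLine :: String -> IOSpec ()
printLine = PrintLine
```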
Making it a Monad
To transform our AST into a Free Monad, we can almost mechanically add two additional constructors for our AST, Pure and Bind, which mimic the Monad interface:
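The extended AST might look like this (a sketch). The type of `Bind` is the type of `>>=` with `IOSpec` substituted for the Monad.

```haskell
{-# LANGUAGE GADTs #-}

data IOSpec a where
  ReadLine  :: IOSpec String
  PrintLine :: String -> IOSpec ()
  -- The two constructors that mimic the Monad interface:
  Pure      :: a -> IOSpec a
  Bind      :: IOSpec a -> (a -> IOSpec b) -> IOSpec b
```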
Now, the Monad implementation of IOSpec almost writes itself. It only consists of calling the appropriate constructors, and is much simpler to write than the Monad instance needed for our continuation passing style AST. It is provided here for reference.
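One possible set of instances is sketched here. The Functor and Applicative instances are required by GHC and are derived from `Bind` and `Pure`; `printLine`, `readLine` and `askPassword` are hypothetical names. Since `Bind` nodes are kept as data, the Monad laws hold only up to interpretation, not as structural equality of trees.

```haskell
{-# LANGUAGE GADTs #-}

data IOSpec a where
  ReadLine  :: IOSpec String
  PrintLine :: String -> IOSpec ()
  Pure      :: a -> IOSpec a
  Bind      :: IOSpec a -> (a -> IOSpec b) -> IOSpec b

instance Functor IOSpec where
  fmap f m = Bind m (Pure . f)

instance Applicative IOSpec where
  pure = Pure
  mf <*> ma = Bind mf (\f -> Bind ma (Pure . f))

-- No traversal of the tree: bind is just a constructor.
instance Monad IOSpec where
  (>>=) = Bind

readLine :: IOSpec String
readLine = ReadLine

printLine :: String -> IOSpec ()
printLine = PrintLine

askPassword :: IOSpec String
askPassword = do
  printLine "Password:"
  readLine
```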
A specification for the interpreter
A value of type IOSpec a represents a computation that returns a result of type a. Our GADT-based AST therefore encodes (and enforces!) the following:
- Interpreting the ReadLine instruction should return a String
- Interpreting the PrintLine instruction should return an empty tuple
Somehow, this same information was also encoded, although a bit differently, in our initial continuation passing style AST:
However, one had to know a bit about continuation passing style in order to recognize the pattern, and notice it was effectively equivalent to:
Our direct style AST is therefore at the same time easier to derive (just lift functions to constructors) and easier to read as a specification for an interpreter to translate it into lower-level code.
Because our AST now encodes the specification of its interpretation in a more direct way, writing an interpreter is also much easier. The translation into a lower level language is almost straightforward:
- Pure translates to pure in the lower level language
- Bind translates into >>= between two interpretations
- The other instructions translate to their corresponding lower level instructions
Each instruction but Bind is translated into a non-compound expression. This is simpler than with our continuation passing style AST, whose interpretation involved both translating instructions and chaining their result with their corresponding continuation.
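The translation rules above can be sketched as follows. `interpret` targets IO; `simulate` is a hypothetical second interpreter, added to show that the same AST can also be run purely against canned inputs (its handling of exhausted input is an arbitrary choice for this example).

```haskell
{-# LANGUAGE GADTs #-}

data IOSpec a where
  ReadLine  :: IOSpec String
  PrintLine :: String -> IOSpec ()
  Pure      :: a -> IOSpec a
  Bind      :: IOSpec a -> (a -> IOSpec b) -> IOSpec b

-- Each constructor maps to its IO counterpart; only Bind is compound.
interpret :: IOSpec a -> IO a
interpret ReadLine      = getLine
interpret (PrintLine s) = putStrLn s
interpret (Pure a)      = pure a
interpret (Bind m f)    = interpret m >>= interpret . f

-- Pure interpreter: returns (result, remaining inputs, printed lines).
simulate :: IOSpec a -> [String] -> (a, [String], [String])
simulate ReadLine      (i:is) = (i, is, [])
simulate ReadLine      []     = ("", [], [])  -- assumption: empty line when input runs out
simulate (PrintLine s) is     = ((), is, [s])
simulate (Pure a)      is     = (a, is, [])
simulate (Bind m f)    is     =
  let (a, is', out1)  = simulate m is
      (b, is'', out2) = simulate (f a) is'
  in (b, is'', out1 ++ out2)
```

The type checker enforces the specification: returning anything other than a `String` in the `ReadLine` case of either interpreter would be rejected.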
Bind factors the continuation
The astute reader might have recognized some continuation passing style creeping in our GADT-based AST, through the Bind constructor.
This is no accident. We can see the same continuation in the very definition of a Monad (*):
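One way to make this visible is to restate the type of `>>=` using a `Cont` type alias like the one introduced earlier. The sketch below is only a change of notation: `bindAsCont` is a hypothetical name, and its definition is `>>=` itself.

```haskell
type Cont r a = (a -> r) -> r

-- (>>=) :: Monad m => m a -> (a -> m b) -> m b
-- Read with the alias: bind turns a monadic value into a continuation
-- passing style function that produces an a on the way to an m b.
bindAsCont :: Monad m => m a -> Cont (m b) a
bindAsCont = (>>=)
```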
What our GADT-based AST does is in fact to factor the continuation, which used to appear in the ReadLine and PrintLine constructors, into a single Bind constructor. This is partly what makes the AST simpler to understand and the interpreters easier to write.
(*) Basically, we could see the bind operator of a Monad as a mapping between a Monadic value and a continuation. In addition, continuations form a Monad too. This connection between Monads and continuations is expressed in more detail in The mother of all Monads.
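To see this factoring in action, the following sketch translates the direct style AST back into a continuation passing style one, by pushing the continuation held in each Bind into the instruction it follows. The type `IOSpecCps`, its `K`-suffixed constructors and the function `toCps` are hypothetical names chosen so both encodings can live in one module.

```haskell
{-# LANGUAGE GADTs #-}

-- Continuation passing style AST (plain ADT).
data IOSpecCps a
  = ReadLineK (String -> IOSpecCps a)
  | WriteLineK String (IOSpecCps a)
  | PureK a

-- Direct style AST (GADT).
data IOSpec a where
  ReadLine  :: IOSpec String
  PrintLine :: String -> IOSpec ()
  Pure      :: a -> IOSpec a
  Bind      :: IOSpec a -> (a -> IOSpec b) -> IOSpec b

-- Distribute the continuation of each Bind back into the instructions.
toCps :: IOSpec a -> IOSpecCps a
toCps m = go m PureK
  where
    go :: IOSpec x -> (x -> IOSpecCps r) -> IOSpecCps r
    go ReadLine      k = ReadLineK k
    go (PrintLine s) k = WriteLineK s (k ())
    go (Pure a)      k = k a
    go (Bind n f)    k = go n (\x -> go (f x) k)
```

For example, `toCps (Bind (PrintLine "hi") (\_ -> ReadLine))` yields a `WriteLineK "hi"` node whose rest is a `ReadLineK` ending in `PureK`.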
The end of our journey
There are many different ways to write the same code. Similarly, there is more than one way we can write our Free Monad. In particular, there are two broad categories of Abstract Syntax Trees: continuation passing style ASTs, and direct style ASTs.
Continuation passing style
While continuation passing style is rarely encountered in the wild world of programming, it is comparatively common in the context of Free Monads.
The reason is that constructors of an algebraic data type must all return the same type, such as AST a where a stands for the result type of the overall computation. Therefore, encoding the return type of each instruction is not possible unless:
- We rely on something like continuation passing style
- We enable Haskell type extensions such as GADTs
Switching to direct style
Turning on the GADTs extension gives us the choice not to dive into the tricky world of continuation passing style (which is arguably harder to understand), and brings some additional benefits such as:
- An almost automatic way to build an AST from the instructions we want it to support
- An AST that acts as an easy-to-understand specification for its interpreters
- Simpler interpreters, factory functions, and Monad instances
These benefits make it much easier to introduce the concept of Free Monads to newcomers to Haskell or Idris (refer to this article as an example), and make GADT-based ASTs a great alternative to continuation passing style ASTs.