The 1.0.0 of Idris has been released just a few months back. In previous posts, we went over 10 differences between Haskell and Idris and illustrated the dependent typing features of Idris through the implementation of the rules of the Bowling game at the type level.

We will continue this series on post on Idris, by implementing a small transducer library (the Idris package is available in this GitHub repo.).

The goal will be to provide Idris with a library efficient composable algorithmic transformations. This is something we quickly feel the need to when playing with Idris, due to the strict (non-lazy) nature of the language.

### Motivation for transducers in Idris

Let us first start by discussing why transducers are an interesting library to port in Idris.

###### Lazy vs Strict

As mentioned in the post listing 10 differences between Haskell and Idris, Idris is strict by default, while Haskell is lazy by default. Whether or not laziness is a good default is a common subject of discussion in the Haskell community.

On the bad side, laziness is often complained about for its tendency to introduce non obvious memory leaks or other performance defects in Haskell code. On the good side, laziness allows to separate processes that produce data, from the one transforming or consuming it (*).

*(*) John Hughes writes in lot more details about the benefits of laziness in terms of software modularity in its great paper Why functional programming matters.*

###### Composing algorithms

Thanks to laziness, we can easily compose *map*, *filter* and *folds* together in Haskell and not worry to much about intermediary list creation. But this is not the case in Idris.

The following two lines of Idris code will not execute at the same speed: *map (+1)* will need to first realize the whole intermediary list before *take* extracts only the 10 first elements:

take 10 $ map (+1) [1..20] -- Faster | |

take 10 $ map (+1) [1..1000] -- Slower |

In Idris, we need to implement ways to emulate laziness in order to get back the efficient composition of algorithm which is native in Haskell.

###### Why transducers?

Transducers are one way to implement efficient composition of algorithms. Transducers comes from the Clojure world and have been built with efficiency and reuse in mind:

- They eliminate intermediary sequence or containers when composing algorithms
- They decouple the transformation from the data source and destination

This makes transducers a great tool to build generic pipe-lines of algorithmic transformation, reusable in quite different contexts, and which can be composed very efficiently.

### Step functions & Reducing functions

We will start with some definitions and define their associated types. The vocabulary is directly inspired from Clojure, with some minor twists. Defining this vocabulary will guide us through the main concepts of pipelines of data transformation.

###### Step functions

A **stateless step function** is a function whose purpose is to be used in a left fold (also known as reduce in Clojure or std::accumulate in C++).

It combines an element coming from a data source (a container for instance) with some intermediary result, to compute the next intermediary result. Put differently, a step function represents a task to be performed in one iteration of a pure loop (without states and side-effects).

- We call
**accumulator**(or acc) the result being computed - We call
**element**(or elem) an element of the data source to read

We can define a *type alias* that will help us formalise this concept in the code:

StatelessStep : (acc, elem: Type) -> Type -- List of type parameters for the alias | |

StatelessStep acc elem = acc -> elem -> acc -- Define the alias to the step function |

For instance, the type alias **StatelessStep Int String** represents a step function that consumes strings to produce a result of type integer. It could be a function that sums the length of strings (in a fold):

sumLength : StatelessStep Int String | |

sumLength res str = res + fromNat (length str) | |

foldl sumLength 0 ["abc", "de", "", "fghij"] | |

> 10 |

###### Adding state

A **stateful step function** is a step function that needs some additional state to do its job properly. For instance, splitting a collection into chunks of equal size requires some kind of state: elements have to be kept in store until the completion of a chunk.

Because we are in a pure language, we will avoid relying on side-effects to track the state of our step function. Instead, we will rely on the State Monad:

Step : (state, acc, elem: Type) -> Type | |

Step state acc elem = acc -> elem -> State state acc |

For instance, the type alias **Step Bool Int String** represents a step function that consumes strings to produce a result of type integer, and maintains a state of type boolean. It could be a function that sums the length of strings, ignoring one of every two strings it encounters:

sumLengthOfEveryOddStrings : Step Bool Int String | |

sumLengthOfEveryOddStrings totalLength str = do | |

doSum <- get -- Get the state (should we sum the length?) | |

modify not -- Modify the state (alternate True and False) | |

pure $ if doSum | |

then sumLength totalLength str -- Sum the length in case of True | |

else totalLength -- Otherwise return the length unchanged |

###### Adding early termination

We will also want to express early termination for algorithms such as *take* that do not need to consume the whole data source. To do so, we will enhance our **stateful step function** to return a result decorated by a status:

data Status a = Done a | Continue a | |

Step : (state, acc, elem: Type) -> Type | |

Step state acc elem = acc -> elem -> State state (Status acc) |

- The step can return
*Done*to indicate early termination of the loop - The step can otherwise return
*Continue*to proceed with the rest of the loop

For instance, the type alias **Step () Int String** represents a step function that consumes strings to produce a result of type integer, and maintains no state. It could be a function that sums the length of strings, stopping when the sum exceeds a given value:

sumLengthUntil : Int -> Step () Int String | |

sumLengthUntil maxValue totalLength str = | |

pure $ if totalLength <= maxValue | |

then Continue (sumLength totalLength str) | |

else Done totalLength |

*Note: this ability to return early combines especially well with state (for instance to build functions such as take or drop). But we just saw that it also makes sense without state.*

###### Reducer (or reducing function)

A **reducer** (or reducing function) is what we get when we bundle a stateful step function with its state. The following Idris code creates a type *Reducer* that contains a field *state* and two functions *runStep* and *complete*:

record Reducer st acc elem where -- Defines a reducer in terms of | |

constructor MkReducer | |

state : st -- * A piece of state | |

runStep : Step st acc elem -- * A (stateful) step function | |

complete : st -> acc -> acc -- * A completion function |

The **runStep** function is doing the heavy lifting. It takes as input the current state, the current value of the accumulator, and an element. It returns the a new accumulator and the state to use for the next iteration. It represents the content of a loop.

The **complete** function is there to deal with the remaining piece of state. It represent the termination of the loop, the last thing to perform once the source of data is consumed. It allows to discharge remaining pieces of state at termination.

###### Example of reducer

Going back to our previous example, we can package our * sumLengthOfEveryOddStrings* step function (which sums the length of strings, ignoring one of every two strings it receives as input) with its state (a boolean).

Having packaged the stateful step with its state, we get a reducer that is self sufficient. It can be plugged into *reduce* to execute it on an input collection:

let reducer = MkReducer True sumLengthOfEveryOddStrings (const id) | |

reduce reducer 0 ["abc", "de", "", "fg", "hij"] | |

> 6 |

We have not introduced the *reduce* function yet, but you can see it as a generalised fold which manage both state and early termination.

###### Transducers

A **transducer** is a transformation of reducers. It takes as input a reducer, and returns another reducer. The new reducer includes additional functionality. In the process of adding new functionality, a transducer can both:

- Change the type of the state (s1 to s2), usually to track some more state
- Change the type of the element (elem1 to elem2) being accumulated

The type of the accumulator, on the other hand, cannot be modified (the result we expect from a loop is fixed). This leads to the following type alias for transducers:

Transducer : (acc, s1, s2, elem2, elem1: Type) -> Type | |

Transducer acc s1 s2 elem2 elem1 = | |

Reducer s1 acc elem1 -- Input reducing function | |

-> Reducer s2 acc elem2 -- Output reducing function |

For instance, the type alias **Transducer Int () Bool String Char** represents a transducer that transforms:

- A stateless reducer accumulating strings into a result of type integer
- To a reducer accumulating characters into result of type an integer
- And adding a boolean state in the output reducer

###### Transducer composition

Because transducers are just functions from one reducer to another reducer, they can be composed together just like normal function can. Composing transducers lets us:

- Gradually add features to pipe-line of data transformation
- Avoid realizing intermediary containers: we only compose recipes

For instance, we can compose a transducer than adds a filtering behaviour on a reducer to a transducer that maps each element to its square:

filtering odd -- Added behavior: keep only odd numbers | |

mapping square -- Added behavior: square each element | |

-- When composed: keep only odd numbers and square them afterwards | |

filtering odd . mapping square |

Because of this, we can see transducers as pipe-lines, each each step is responsible for its own transformation, and forwards resulting elements to the next transducer in the line.

*Note: You may have noticed that the type alias for transducer had elem1 and elem2 reversed. This is because transducers compose from right to left (the direction of composition), but the data flows from left to right (the direction of pipeline).*

### Running the loop

Reducing functions provides us with recipes to consume a stream of value and summarize it as a single result. Transducers allows us to build such recipes out of smaller recipes. But to make these recipes useful, we need to way to execute them.

In this section, we will dive into the implementation of **reduce** and **transduce**, two function that will allow us to execute the recipe on some data. You can skip this section if you are not interested in the implementation, and jump to the next one for example of transducers.

###### Reduce

The library provides a function named **reduce** that we used already in this post. It executes a provided reducing function (a recipe), given:

- An initial value for the accumulator
- A data source we can fold over to retrieve a stream of values

It will consume elements of the data source, executing the recipe at each iteration, until the stream is totally consumed or until the recipe asks for early termination. Then it will call the completion function on the result before returning it.

reduce : (Foldable t) => Reducer st acc elem -> acc -> t elem -> acc | |

reduce step acc = | |

uncurry (complete step) -- 4) Call the completion on the result | |

. (\(acc, s) => (s, unStatus acc)) -- 3) Remove the wrapping status | |

. (flip runState (state step)) -- 2) Remove the state monad | |

. runSteps (runStep step) acc -- 1) Consume as many value as needed |

This implementation relies on **runSteps** to run the loops until either the stream of values is totally consumed, or the reducing function asks for early termination (returning *Done*).

###### Running the steps

The implementation of **runSteps** relies on Continuation Passing Style to support the early termination.

runSteps : (Foldable t) => Step st acc elem -> acc -> t elem -> State st (Status acc) | |

runSteps step acc elems = | |

foldr stepImpl (pure . id) elems (Continue acc) -- Using continuation passing style | |

where | |

stepImpl _ nextIteration (Done acc) = pure (Done acc) | |

stepImpl e nextIteration (Continue acc) = step acc e >>= nextIteration |

- foldr builds a chain of functions, one for each element of the source
- Each of these functions represents one iteration of the loop
- Each iteration and is provided with the next one:
*nextIteration* - If
*nextIteration*is not called, the next iteration is not run and so the loop stops

###### Transduce

The library provides a second function named **transduce** that is only a thin but useful layer above reduce. It reduces (pun intended) verbosity for the most common use cases.

Indeed, a lot of pipe-lines of data transformation ends up with a stateless step function (like a sum, a concatenation, etc.). In such cases, using *transduce* instead of *reduce* removes a bit of noise. Here is an example in which we sum the square of odd numbers:

-- with transduce | |

transduce (filtering odd . mapping square) (+) 0 [1..10] | |

> 165 | |

-- with reduce | |

reduce (filtering odd . mapping square $ terminal (+)) 0 [1..10] | |

> 165 |

One other great advantage is that it allows to use the word *transduce* in our code, making your Idris code almost as cool as Clojure code.

### Building our own transducer

It is time to build our very own transducers. We will keep it simple in this post, and focus on stateless transducers. We will also use a helper function **statelessTransducer** (available in the library) to abstract away some of the details of the construction of a transducer.

The goal is to build an intuition for those who are foreign to the concept. The next post will deal with stateful transducer and unveil the details behind the helper functions.

###### Mapping

Mapping consists in using a function from **a** to **b** to transform an input reducer operating on element of type **b** to a reducer operating on elements of type **a**. It takes elements of type **a** and sends elements of type **b** to the next element of the pipe-line.

In the code below, **next** represents the step function of the next transducers in the pipe-line, while **fn** represents the mapping function from **a** to **b**.

- By applying the function
*fn*on an element of type*a*, we get an element of type*b* - We can give this element to
*next*, which consumes elements of type*b*

||| Given a mapper of type (a -> b), it transforms: | |

||| - a step function taking values of type `b` | |

||| - into one taking values of type `a` | |

mapping : (a -> b) -> Transducer acc s s a b | |

mapping fn = statelessTransducer $ | |

\next, acc, a => next acc (fn a) |

Here is an example in which we sum the length of a bunch of strings, multiplying each length by 2 along the way:

transduce (mapping length . mapping (*2)) (+) 0 ["abc", "", "de"] | |

> 10 |

*Note: Thanks to transducers and ability to compose in terms of recipe, we get the following equivalence by construction: mapping f . mapping g = mapping (g . f). Note that f and g are inverted compared to the usual Functor law because of the left to right flow of data.*

###### Filtering

Filtering consists in transforming an input reducer operating on element of type **a** to a reducer operating on only those elements **a** that satisfy the a given predicate.

In the code below, **next** represents the step function of the next transducers in the pipe-line, while **pf** represents the predicate function from **a** to **Bool**.

By calling *next* on only these elements that satisfy the predicate, we effectively filter these elements from the input source data: the rest of the pipeline will not see them.

||| Given a predicate of type (a -> Bool), it transforms: | |

||| - a step function taking values of type `a` | |

||| - into one taking the values of type `a` that fulfills the predicate | |

filtering : (a -> Bool) -> Transducer acc s s a a | |

filtering pf = statelessTransducer $ | |

\next, acc, a => | |

if pf a | |

then next acc a | |

else pure (Continue acc) |

Here is an example in which we sum the length of strings that do not start with the letter ‘a’:

transduce (filtering (\s => not (startsWith 'a' s)) . mapping length) (+) 0 ["abc", "de", "fgh"] | |

> 5 |

###### Concat mapping

Concat mapping is just like mapping, to the exception that the function provided to the **catMapping** function may return several **b** for one **a**.

In the code below, **next** represents the step function of the next transducers in the pipe-line, while **fn** represents the mapping function from **a** to a collection of **b**.

By calling *next* on all the the elements output by *fn*, *catMapping* takes elements of type **a** from and the left, and transmits elements of type **b** to pipe-line on the right. The main difference with *mapping* is that the number of **b**s is not necessarily the same as the number of **a**s.

||| Given a mapper of type (a -> Foldable b), it transforms: | |

||| - a step function taking values of type `b` | |

||| - into one taking values of type `a` | |

catMapping : (Foldable t) => (a -> t b) -> Transducer acc s s a b | |

catMapping fn = statelessTransducer $ | |

\next, acc, a => runSteps next acc (fn a) |

Here is an example in which we sum the length of strings, counting each of them twice:

transduce (catMapping (replicate 2) . mapping length) (+) 0 ["abc", "", "de"] | |

> 10 |

### Conclusion, and what’s next

In this post, we went over basics to define transducers in Idris. We defined the concepts and our main types, detailed the main function *reduce* and *transduce*, and built very simple transducers.

In the next post, we will see how to define more interesting transducers, such as *take*, *takeWhile*, *interspersing* or *groupingBy*. We will also look at some helper functions that helps building these transducers more easily.

The library is available in this GitHub repository. Any suggestions for improvements is obviously welcomed.

Follow @quduval