LISP Meta-Programming for C++ Developers: Introduction

The standardization of C++ 11, 14 and 17 each brought some features to help developers doing meta-programming in C++. With these, meta-programming in C++ has come from a technical curiosity to becoming part of the language, and a growing subject of research.

C++ is not the first language to invest in meta-programming. LISP dialects have a strong reputation in that regard. They also provides a perspective on meta-programming that differs from the one C++ developers are accustomed to.

In this series of post, we will explore the world of Clojure, a recent and quite popular LISP dialect, and learn from a language with a built-in and quite powerful support for meta-programming.

We will see how meta-programming works in Clojure, what power it offers and how it differs from C++ in terms of philosophy. We will discuss the place of meta-programming in LISPs and why it is so useful at all abstraction levels.

Our overall focus will be to offer C++ programmers with a boarder view of the use cases of meta-programming, and also to discuss some potential changes to C++ that could potentially benefit its meta-programming facilities.

 

Overall Plan


To keep everyone afloat, the basic assumption is that the reader does not have any previous experiences with Clojure or any LISPs.

The plan below describes how, through this entire series of post, we will get you from possibly not-knowing-Clojure to being proficient enough to understand how to do meta-programming in Clojure, and how useful it is.

 

Introducing Clojure

We will first have to go through the basics of Clojure and learn enough to understand its meta-programming facilities later.

The good thing with LISP dialects is how simple and fast to learn they are. A few sections should be enough, and by the end of this post, you should know enough of Clojure to feel comfortable following the rest.

 

Basics of macros

After the quick introduction on Clojure, we will go through an explanation of how macros, which are the main facility for meta-programming in Clojure, work.

We will go through some basic example to illustrate our explanations. This will give us a glimpse of what we can do with them. This will be the opportunity to discuss how they differ from C++ constructs such as constexpr.

 

Zero overhead meta-programming abstractions

Once familiar with Clojure and its macro, we will dive deep into a topic that C++ developers cherish: zero-overhead abstractions. We will see how we can benefits from macros to build powerful, high-level abstractions, without sacrificing performance.

This section will be oriented toward library developers that wish to understand how macros help in optimising code. In particular, we will see how they empower the developer to develop its own special purpose optimisations in complement to the general purpose optimisations built-in the compiler.

 

Empowering application developers

While in C++, meta-programming traditionally focuses on giving tools to low level library developers, meta-programming can also be used to build powerful high level software abstraction for application developers.

In this section, we will see how macros can help express high level concerns such as operational, architectural or business requirements.

We will argument that meta-programming should not be seen as something reserved for low level libraries and the bunch of elite developers that implement them, but should be seen as a tool to be used in the everyday’s developer job.

 

Clojure introduction


In this section, we will discuss the basics of the Clojure language, with no pre-requisites required for you to understand it.

Clojure is a very tiny language. In this section alone, we will cover quite a lot of ground. This introduction is not exhaustive though: we will explore enough for you to discuss meta-programming in Clojure in the next posts.

If you want a more complete introduction to Clojure, I recommend you to have a look at Brave Clojure: Do Things as a good starting point.

 

Calling a function

The code below shows the syntax for calling functions in Clojure. As in most LISPs, calling a function consists in:

  • Opening a first parenthesis
  • Adding the verb in first position, like the symbol of the function to call
  • Adding the list of all the arguments to pass to the function behind
  • Closing the initial parenthesis
Start expression
|
|  Arguments
|  | |
v  v v 
(+ 1 2)
 ^    ^
 |    |
Verb  |
      |
End of expression

You can find below two examples of function calls, one to sum two integers (with operator +) and one to compute the length of a string (with count):

(+ 1 2)
=> 3
(count "abc")
=> 3

 

Useful basic types

Clojure comes with a set of built-in types, such as the integers and the strings we already encountered in the previous example. We will see below some of the other useful built-in types.

 
Vectors starts and ends with square brackets. They are containers that are efficient to index into. Adding into a vector push the new element at the end:

[1 "a" 2] ;; Create a vector
=> [1 "a" 2]
(count [1 "a" 2]) ;; Get the size of the vector
=> 3
(get [1 "a" 2] 1) ;; Get the element at index 1
=> "a"
(conj [1 "a" 2] 3) ;; Push back an element at the end
=> [1 "a" 2 3]

 
Keywords look like symbols (name of function, types, etc.) that starts with a colon “:”. They are like the item of an enumeration in C++ and have basically the same purpose. Here are some examples:

:first-name
:last-name
:age

 
Maps starts and ends with curly braces. They are associative containers that are efficient to search by key into, or to add a new key value pair into:

;; Create a new map associating:
;; - "John" to :first-name
;; - "Doe" to :last-name
{:first-name "John"
:last-name "Doe"}
=> {:first-name "John", :last-name "Doe"}
;; Add the age of John Doe to the map:
;; Associate the value 31 to the key :age
(assoc
{:first-name "John"
:last-name "Doe"}
:age 31)
=> {:first-name "John", :last-name "Doe", :age 31}
;; Retrieve the first name of John Doe
;; Search for the key :first-name
(get
{:first-name "John"
:last-name "Doe"}
:first-name)
=> "John"

 

Defining functions

Let us define our first function, add-broken. This function takes two arguments, a and b, and returns their sum, plus one. You can see the syntax for the definition below, followed by an example of call on the arguments 1 and 2:

(defn add-broken
[a b]
(+ 1 a b))
(add-broken 1 2)
=> 4
  • defn indicates that a definition of a function follows
  • Immediately after defn follows the name of the function
  • Then a vector holding the names of the arguments, a and b
  • The rest is the definition of the function (a simple addition here)

Since our add is broken (it adds 1 more to the sum of a and b), we can either fix the function, or document the bug. We can document the bug by adding some comments after the name of the function, inside the definition:

(defn add-broken
"Broken - Could not fix it"
[a b]
(+ 1 a b))

Note: the arguments are inside a vector, the comments are simple strings, any expression is surrounded by parentheses (a list in Clojure). The language is defined in terms of its own data structures. We will see how it helps with meta-programming later.

 

Lambda functions

Lambda allow to define functions without names, and which can capture their environment (closure). Clojure has two syntax for them, one of them being a shortcut for the other. They are shown below:

;; Lambda function with:
;; - 2 arguments x and y
;; - capture of value z
(fn [x y] (+ x y z))
;; Short notation for the same function
#(+ %1 %2 z)

The C++ equivalent syntax would be:

[z](int x, int y)
{
return x + y + z;
}

 

Let bindings

We can use let to introduce bindings, which you can see as const variables in C++. The let asks for two arguments: a vector containing the bindings followed by the body of the let. The bindings are only visible inside the scope of the let body.

The bindings vector consists in an alternating sequence of const variable followed by the expression to compute them. The syntax is described below:

(let [variable-1 expression-1
      variable-2 expression-2]
  (The body of function using variable-1 and variable-2))

The result of the let expression is the result of the body of the let, where each variable expressed in the bindings has been replaced by its associated value:

(let [x (+ 1 2)]
(* x 2))
=> 6
(let [x (+ 1 2)
y (* 2 x)]
(+ 1 y)
=> 7

 

If statement

Clojure offers an if statement that takes three branches. The first branch is the condition to be evaluated. If the condition is true, the second branch (the consequence) is executed, or else the third branch (the alternative) is executed instead.

(defn absolute-value
[x]
(if (< x 0) ;; Condition
(* -1 x) ;; Consequence
x)) ;; Alternative
(absolute-value -1)
=> 1
(absolute-value 1)
=> 1

 

Loops? Recursion!

Clojure is a functional language that puts the emphasis on immutability and avoiding side-effects. As such, traditional looping constructs like the for-loop are not part of what Clojure offers.

Instead, Clojure relies on recursion, and provides an efficient low-level looping construct named recur that allows to recurse without growing the stack (the compiler ensures it is called in tail position).

Here we use it to implement a counter that decrements to zero and prints all numbers along the way before indicating it is done:

(defn counter
[n]
(println n)
(if (< 0 n)
(recur (- n 1))
"Done!"))
(counter 3)
3
2
1
0
=> "Done!"

Clojure offers an even lower level looping construct named loop-recur to implement a recursion without growing the stack and without requiring a function.

The loop construct accept a list of bindings that initialise the recursion, and the recur will replace the arguments by the new value.

(loop [n 5]
(println n)
(if (< 0 n)
(recur (- n 1))
"Done!"))

Running this loop will produce the same result as calling count with the argument 5.

 

This is it

Functions, control structures, and data structures are enough to build already quite complex features in a language such as Clojure. I encourage you to have a deeper look at Clojure if you are interested. But this tutorial should be enough for our purpose.

The next section will apply this newly acquired knowledge to build our first really useful function in Clojure.

 

First non trivial function


Let us write a function that compute the average on a collection of values. For instance, calling our average function on [1 2 4 5] should return 3.

We know that the average if the ratio of the sum of the values divided by the number of values, so we will build two functions: sum and average.

 

Imperative style

We can implement it in an imperative style code, using the very low level loop-recur construct. This code makes use of the sequence abstraction of Clojure that allows to extract the first element from a sequence as well as its rest. It allows to iterate generically over any type of collections, a bit like the iterator concept in C++.

(defn sum
[coll]
(loop [total 0
coll coll]
(if (not-empty coll)
(recur (+ total (first coll)) (rest coll))
total)))
(defn average
[coll]
(/ (sum coll) (count coll)))
(sum [1 2 4 5])
=> 12
(average [1 2 4 5])
=> 3

This is equivalent to this C++ code, which is at a similar abstraction level (slightly higher, the use of iterators being hidden in the range based for loop):

template<typename Container>
int sum_recur(Container const& coll)
{
int total = 0;
for (int val: coll)
total += val;
return total;
}
template<typename Container>
int average(Container const& coll)
{
return sum_recur(coll) / coll.size();
}

 

Functional style

Almost no Clojure developer would code this function like we just did. Functional programming encourages using higher order functions like reduce (the equivalent of std::accumulate of the STL) to express the sum as a one liner:

(defn average
"Average of numbers known at runtime"
[coll]
(/ (reduce + coll) (count coll)))

The equivalent code in C++ would be using std::accumulate to avoid the need to hand craft a for-loop:

template<typename Container>
int average(Container const& coll)
{
return std::accumulate(begin(coll), end(coll), 0, std::plus<int>{}) / coll.size();
}

 

Homoiconicity and macros


It would be a waste not to start talking about macros in the first post on a series dedicated to LISP meta-programming. In this section, we will discuss the basic idea behind macros.

 

Homoiconicity

As we noticed through the preceding paragraphs, the arguments of a functions have the same syntax as a vector of symbols. The let bindings are also following the syntax of a vector. The function call syntax is based on the list syntax (parentheses surround lists). The documentation of functions is written as a string, the annotations are have the same syntax as Clojure maps, and so on.

In truth, the whole Clojure programming language is built from a small set of the core Clojure data structures. The Clojure code is data. The AST of Clojure is made of the same the structure that participate in defining its syntax. We name this property Homoiconicity.

 

Code transformation

The language being defined in terms of its own data structures, we can easily write Clojure code that reads and transforms Clojure code. Even better, the Clojure core library already contains hundreds of functions that apply on its core data structures.

Especially since there are only but a few such data structure to consider, functions from AST to AST are relatively simple to implement. Meta-programming in Clojure (and any LISP) only consists in transforming nested data structures made of strings, lists, vectors, maps and a few other data structures.

From there, we only miss an entry point in the compiler to plug for these functions from Clojure code to Clojure code. This is where macros comes into play.

 

Macro as compiler extensions

Macros defines functions from Clojure code to Clojure code (AST data to AST data) that gets called during compilation by the Clojure compiler. Macros are effectively ways to plug into the compiler your own logic, to implement your own constructs, optimisations, code generation, and more.

The Clojure reader will first read the program, yielding a bunch of data structures. Upon encountering a macro in the verb position of a list, it will provide the macro with the Clojure code wrapped by the macro as parameter. From there, the macro is free to impact the code structure as much as it wants.

 

Values, not types

We will end up by stressing a first but very crucial difference between C++ and LISP meta-programming.

As underlined in the newly published Practical C++ Metaprogramming book, C++ meta programming is based on types as first class values (considering that integral constants are convertible to types). On the contrary, Clojure meta-programming instead deals with values. These values are fragments of the AST, built from the standard core container of the language.

This means that Clojure meta programming is based on the same stuff that makes its runtime world, while C++ has a different world to deal with meta-programming. We will see how it makes the reuse of code for both runtime and compile time different in both worlds (in particular, we will spend time discussing about constexpr).

 

Conclusion and what’s next


Inside this first post, we introduced the basics of Clojure, learned enough about the language to build our first function, and leaned about homoiconicity and macros in theoretical terms.

In the next post, we will start to play with macro and discuss how it compares to C++ constructs such as constexpr.


You can contact me or follow me on Twitter.

Leave a comment

Create a website or blog at WordPress.com

Up ↑