Lisp Meta-Programming for C++ Developers: Automatic traces (Part 2)

In the previous posts, we started a series dedicated to familiarize C++ developers to the world of LISP meta-programming, based on the Clojure programming language. The goal is to offer a different perspective on meta-programming, how we can approach it, and the kind of use cases it can address.

We started the series by an introduction on Clojure, homoiconicity and how meta-programming works in LISP dialects. You can find below our previous posts:

 
In our previous post, we started implementing a way to automatically trace function calls along with its arguments. We showed how Boost.Hana helped dealing with heterogeneous containers, and discussed the importance of supporting reflection as a language feature.

In today’s post, we will continue where we left our log API, and use meta-programming to automatically add calls to our log API in our functions. The absence of reflection in C++ will lead us to use the good old C-macros.

We will then have a look at how Clojure deals with the same problem. We will see that having access to AST manipulations as well as side-effects at compile time tremendously helps cleaning up the code and separate concerns.

Note: this post builds on the previous posts of the series. Unless you already know Clojure and its macros, you need to read the previous posts to be able to follow this one.

 

Summary of the last episode


Let us start with a quick summary of our goal and what we started to implement in the previous post of the series.

 

Our goal

Our goal is to provide a very simple way for a developer to add traces around some of its most critical function, logging each of their calls. These traces should contain the name of the function as well as the arguments names and values passed to it.

[TRACE] Call to "repeat-n" with arguments: {n 3, s "toto"}

We want to support two main ways to plug the traces: one that allows to dynamically switch traces on and off, and one that is enabled or disabled at compile time.

 

Where we are

In our previous post, we built an API to manually log traces in both Clojure and C++. This API features a function to compute the log prefix from a function name, and a function to log the arguments of a function.

We demonstrate below how we would use the Clojure API, if we add to manually traces the inputs of a function repeat-n:

(defn repeat-n
[n s]
(println ; Print
(log-symbol-values ; the log message with:
(function-log-prefix "repeat-n") ; - as prefix: the function name
'[n s] ; - the arguments' symbols
[n s])) ; - the arguments' values
...) ; Implementation
; Calling the function
(repeat-n 3 "hello")
; Produces the log trace:
"[TRACE] Call to \"repeat-n\" with arguments: {n 3, s \"hello\"}"

The equivalent C++ code is shown below. Since C++ does not have a dedicated type for its symbols (as it does not yet support reflection), we use string for the symbol names:

std::string repeat_n(int n, std::string const& s)
{
log_named_trace( // Print the log message
get_logger(), // inside the logger (like std::cout), with:
log_prefix("repeat_n"), // - the function name as prefix
hana::make_tuple("n", "s"), // - the arguments' names
n, s); // - the arguments' values
return ...; // Implementation
}

 

Our next steps

As shown by the example above, adding traces manually to our code is both verbose and error prone. We want to get rid of all this risky and noisy boilerplate.

We will provide a meta-programming facade to our API to make this kind of code as concise and error free as possible. This facade will need to support the two use cases we mentioned earlier: run-time and compile-time switches of the traces.

 

Automated tracing in C++


We will start by C++. Because C++ does not (yet) support reflection, we will make use of C-macros (and their # operator) to extract the argument names.

 

First without argument names

As usual, we will start with a simpler problem. We will trace a function call, logging the function name along with its argument values (but not their names).

To do so, we will use log_trace we developed in our previous post, and our first goal is to call it as safely and concisely as possible:

template <typename ...Args>
void log_trace(std::ostream& out,
std::string const& prefix,
Args const& ...args)
{
out << prefix << "(";
hana::for_each(
hana::intersperse(hana::make_tuple(args...), ", "),
[&out](auto const& arg) { out << arg; });
out << ")\n";
}

We can write a higher order variadic template function wrap_with_logs that will call the log_trace function before calling the function it wraps.

template <typename Fct, typename ...Args>
auto wrap_with_logs(std::string const& fct_name,
Fct fct, Args && ...args)
{
log_trace(get_logger(), log_prefix(fct_name), args...);
return fct(std::forward<Args>(args)...);
}

It helps, but the function name needs to be provided as a string and manually:

std::string repeat_n_with_logs(int n, std::string const& s)
{
return wrap_with_logs("repeat_n"s, repeat_n, n, s);
}

The need to provide the function name manually is a potential source of inconsistency if the actual function name happens to to change.

 

A bit of Preprocessor in the mix

We can automatically provide the function name to the wrap_with_logs function, using a C-macro and the # operator to string-ify its function argument.

#define LOGGED_CALL(fct, ...) wrap_with_logs(#fct, fct, __VA_ARGS__)

Using it, it gets easier to add logs around our repeat_n function. We merely wraps the function call with our LOGGED_CALL macro. Ugly, but it works:

std::string repeat_n_with_logs(int n, std::string const& s)
{
return LOGGED_CALL(repeat_n, n, s);
}
//Calling
repeat_n_with_logs(3, "tutu");
//Produces trace:
"[TRACE] Call to \"repeat_n_\" with arguments: (3, tutu)"

This avoids the inconsistency errors we discussed earlier: the function name appearing in the logs will automatically be the correct one.

 

Adding argument names

We solved the simpler problem, so we can add our arguments names back. We will adapt our wrap_with_logs function to call log_named_trace instead.

template <typename ...Names, typename Fct, typename ...Args>
auto wrap_with_logs(
std::string const& fct_name,
hana::tuple<Names...> const& names,
Fct fct, Args && ...args)
{
log_named_trace(get_logger(), log_prefix(fct_name), names, args...);
return fct(std::forward<Args>(args)...);
}

A variadic C-macro will not do this time: separating each argument name with a comma (to build the hana::tuple) cannot be easily done. So we will offer several C-macros, one by function arity. These macros can themselves be generated by a script.

We only show the first three of them:

#define LOGGED_CALL_1(fct, a1) \
wrap_with_logs(#fct, hana::make_tuple(#a1), fct, a1);
#define LOGGED_CALL_2(fct, a1, a2) \
wrap_with_logs(#fct, hana::make_tuple(#a1, #a2), fct, a1, a2);
#define LOGGED_CALL_3(fct, a1, a2, a3) \
wrap_with_logs(#fct, hana::make_tuple(#a1, #a2, #a3), fct, a1, a2, a3);

It is slightly less intuitive to work with: we have to care about the arity of the function. But it manages to eliminate much of the boilerplate:

std::string repeat_n_with_logs(int n, std::string const& s)
{
return LOGGED_CALL_2(repeat_n, n, s);
}
//Calling
repeat_n_with_logs(3, "tutu");
//Produces trace:
"[TRACE] Call to \"repeat_n_\" with arguments: {n 3, s tutu}"

Using the C-macro, there is no way to mess up with the argument names anymore. The trick is that these argument names are the ones of the wrapping function.

 

Further improvements

The wrapped function repeat_n can be hidden or not, depending on whether we want to constrain the user of the API to use the version with the traces:

// Could be made private inside a compilation unit
std::string repeat_n(int n, std::string const& s);
std::string repeat_n_with_logs(int n, std::string const& s)
{
return LOGGED_CALL_2(repeat_n, n, s);
}

Now, we are not completely done. The initial requirements mentioned the support of run-time and compile-time switch of the traces. This can be easily accommodated:

  • Run-time switches can be factored into the with_*_log functions
  • Compile-time switches can be offered through pre-processor flags

For instance, we can make the logs completely disappear from production build through the use of a preprocessor flag to modify the definition of the LOGGED_CALL C-macro. These enhancements are left to the curious reader to implement.

 

Drawbacks

We managed to deliver the feature, but we had to rely on several different mechanisms, some of which are not part of our language:

  • Template Meta Programming
  • The Boost.Hana library
  • The preprocessor (for C-macros)
  • Likely our build system to carry the preprocessor flag

This diversity has some serious drawbacks. We need to learn different things, each having their different way to work, which places a burden on the developer.

In addition, the “source of truth” is not at one single place: no place gives the whole story, especially if we have to use the build system to carry preprocessor flags. All the different places where the “truth” is spread are nevertheless strongly coupled.

 

Automated tracing in Clojure


It is time to deliver the same service in Clojure, and show how the ability to manipulate the AST directly and perform side-effect at compile time helps us avoiding the difficulties we encountered in our C++ implementation.

 

Bring your own language

You can find below the implementation of the repeat-n function, that creates a string made of n times the string s.

(defn repeat-n
[n s]
(apply str (repeat n s)))

Direct AST manipulations in Clojure allows us to build our own language inside the language. We can create a macro that defines a function with the additional log traces built-in. Here is how we can define repeat-n with automatic traces:

(defn-log repeat-n
[n s]
(apply str (repeat n s)))

The only difference is the use of the macro defn-log (which we will implement just after) instead of the defn macro used to define a function in Clojure. That’s it.

 

A first implementation

Our goal is to mimic defn so we first need to understand how it works. It is a macro, taking 3 (or more) arguments:

(defn ; Macro to create a function, taking:
repeat-n ; - Name of the function
[n s] ; - Bindings (the arguments)
(apply str (repeat n s)) ; - Body (the implementation)
)
  • The function name
  • A vector of arguments (named bindings)
  • The implementation (all remaining arguments)

We will take the exact same arguments to define our defn-log macro. It will return the same function definition as defn would, except it will intercalate a call to our trace API at the beginning of the function (*).

(defmacro defn-log
[name bindings & body]
`(defn ~name ; Defined a function with `defn`
~bindings ; With the same argument as `defn` would
(println ; But with an added trace API call
~(compile-log-message name bindings))
~@body)) ; Before the body of the function
view raw defn-log.clj hosted with ❤ by GitHub

The tilde allows to interpolate inside the expression the result of the compilation of the log message (which we will see in the next section). The tilde-arobase allows to interpolate a list of expressions (the whole body of the function in our case).

We can try our new macro and see how it macro-expands. An interesting thing is that we are able to precompute the prefix at compile time:

(defn-log repeat-n
[n s]
(apply str (repeat n s)))
; Is macro-expanded (at compile-time) to:
(defn repeat-n
[n s]
(println
(log-symbol-values
"[TRACE] Call to \"repeat-n\" with arguments: "
'[n s]
[n s]))
(apply str (repeat n s)))

 

Log message compilation

Our defn-log macro used compile-log-message to add the tracing code and compute at compile time the prefix of the log traces. Here is its implementation:

(defn compile-log-message
[fct-name bindings]
`(log-symbol-values
~(function-log-prefix fct-name)
'~bindings
~bindings))
  • It is a simple function returning an AST, not a macro (*), so we can test it
  • The returned code fragment contains the call to the trace logging API
  • It also contains the computed function prefix (interpolated through tilde)

We demonstrate its behaviour below:

(compile-log-message "my-function" '[a b])
; Output (code fragment):
(log-symbol-values
"[TRACE] Call to \"my-function\" with arguments: "
'[a b]
[a b])

If we call this function at compile time, the prefix computation will thus happens at compile time, reducing the amount of work to do at run-time.

(*) Functions can return code fragments too. Macros are only needed when we want to intercept an AST fragment before it reaches the compiler. This is very useful in practice: it allows to extract in testable function most of the heavy lifting of macros.

 

Going further: adding support for de-structuring

The previous implementation of compile-log-message does not work for function making use of de-structuring. De-structuring allows to pattern match directly on the arguments of a function.

As an example, the following function will extract the head and tail of the collection given as argument, and create a map holding the destructured parts:

(defn head-and-tail
[[x & xs]] ; Destructuring of the input container
{:head x ; `x` contains the first element
:tail xs}) ; `xs` contains the remaining elements
; Example call
(head-and-tail [1 2 3])
=> {:head 1, :tail (2 3)}

Our defn-log macro will not work with this head-and-tail function. The ampersand symbol does not have any value associated to it. It is a purely syntactic construct, but our macro will capture it nevertheless, leading to a compilation error.

So we have to filter out the symbols that are only syntactic constructs. This is the task of the function bindings->bound-vars (implementation available on GitHub). Here is the corrected version of our compile-log-message, which we will work for destructuring:

(defn compile-log-message
[fct-name binding-vec]
(let [bindings (bindings->bound-vars binding-vec)]
`(log-symbol-values
~(function-log-prefix fct-name)
(quote ~bindings)
~bindings)))

Now, we can trace de-structured argument values automatically:

(defn-log head-and-tail
[[x & xs]]
{:head x
:tail xs})
; Example call
(head-and-tail [1 2 3])
; Generates the log trace
"[TRACE] Call to \"head-and-tail\" with arguments: {x 1, xs (2 3)}"

 

The simpler the syntax, the better

Our previous example with destructuring shows how syntax impacts AST based meta-programming. Every specific syntax construct has to be dealt with. This represents a cost that has to paid inside most macros.

There is very little separation of concern when it comes to syntactic constructs. The compiler sees everything, and so the macros (which intervene just before the compiler) see everything as well. The more complex the syntax, the harder it gets.

This is why Clojure and Lisps in general deliberately keep the core of the language as small as possible. Irregularities and specific syntax cost dear (for tooling as well). There is a value in not adding features to a language.

 

Enabling and disabling traces


We will end this post by improving our defn-log macro to add the support a switch to enable/disable traces at run-time or at compile-time.

 

Run-time switch

Adding a run-time switch to enable/disable the traces is actually pretty straightforward. We can just add an if statement to the generated code by defn-log.

(defmacro defn-log
[name bindings body]
`(defn ~name
~bindings
(if (...) ; Write any condition you like there
(println ~(compile-log-message name bindings)))
~body))

As we can generate the code we wish inside the if condition, we can get fancy and even introduce some more complex logic: retrieving some configuration to chose which logger to us, and more.

 

Compile-time switch

An if statement did the job at run-time. To implement the compile-time switch, we can use an if statement as well. We merely have to execute it at compile time.

(defmacro defn-switch-log
[& args]
(if (logger-enabled?) ; Compile-time condition
`(defn-log ~@args) ; With traces: forwarding to `defn-log`
`(defn ~@args))) ; Without traces: forwarding to `defn`

Our if condition can do anything we wish, even trigger side effects. We can for instance read a resource file (shown below) to enable or disable the traces.

{:log
{:enabled? true}}

Here is the implementation of log-enabled? that reads the content of this resource file and extract the :enabled? flag value from it:

(def ^:const trace-config-file-path
"./src/clj/algorithms_clj/macro_initiation_resource.edn")
(defn logger-enabled? []
(->
(slurp trace-config-file-path) ; read the resource file (slurp)
(read-string) ; parse its content
:log :enabled? ; extract the value of :enabled? under :log
))

The code emitted by the compiler will depends of the content of the resource file. Unless :enabled? is true, the code will not contain any trace related code.

 

Side effects in the compiler

Triggering side effects in the compiler through meta-programming is not just a technical curiosity. It delivers separation of concern between the build system (used traditionally to convey flags influencing compilation) and the source code. The source code can do its own business without the build system having to know or care.

You might have encountered plenty of examples in which the build scripts contained flags that are needed by some pre-processor macro somewhere, used to disable or enable some things at compile time (like asserts, logs, etc.).

You might have encountered plenty of examples in which the build scripts contained configuration data needed for a particular Web-Framework, a library to generate nice GUI from a XML/JSON files, in any other language (C++, Java, JS, etc.).

These pieces of configuration are usually tricky, are coupled to all the software, and contains lots of repetitions (configuration files do not have the full power of a programming language). But it does not need to be so.

Allowing side effects inside the compiler (though meta-programming) allows to escape all these issues. Having a pure functional programming language at compile time is not enough: side effects are useful at compile time as much as they are at run-time.

 

Conclusion and what’s next


In this post, we finished the implementation in both Clojure and C++ of a small library that allows to automatically trace call to particular functions, logging their argument names and values.

We showed how direct AST manipulations inside the compiler allows us to build this feature in a very simple way. We demonstrated the value of not having too much syntactic constructs inside a language. And we talked about the value of triggering side effects in meta-programming.

In the next post of the series, we will see how meta-programming can help us express more declaratively high level concerns, by declaring the how with data and delegating to a macro the generation of the corresponding code.


You can follow me on Twitter at @quduval.

Leave a comment

Create a website or blog at WordPress.com

Up ↑