How to use tag dispatching in your code more effectively?

This post is an answer to the C++ blog post How to Use Tag Dispatching In Your Code Effectively. I read the article, its main example and the guideline it delivers, and I am not convinced it really works.

This blog post aims at expressing my thoughts, offer an alternative simpler way to solve the same problem than Jonathan uses as main example, and answer the open question he raised at the end of the post: why the STL does not follow the guideline of his post.

We will go through the different points one by one.

 

Naming constructors – The simple way


Jonathan first starts the article by making the observation that constructors in C++ have no names. Therefore, we cannot define two constructors with two different behaviors, if they take same arguments as input.

 

A convoluted solution?

To solve this particular of defining several constructors with different behaviors, but with the same data taken as input, Jonathan encourages the use of tag dispatching:


class MyClass
{
public:
static struct ConstructThisWay{} constructThisWay;
static struct ConstructThatWay{} constructThatWay;
explicit MyClass(ConstructThisWay);
explicit MyClass(ConstructThatWay);
// …
};
// Selecting the overload
MyClass x(MyClass::constructThatWay);

The idea is to create N different types for our N different behaviors. These types are completely empty. They do not bring any new data in the constructor. They only serve as a way to select the right constructor overload at compile time.

 

Standard & simpler solution

Letting aside whether or not it is a good idea to have constructors do different things provided the same data, there is a much simpler way to solve the problem.

We can simply give a name to our different constructors by making them functions instead, a design pattern known as Factory method:

class MyClass
{
public:
static MyClass constructThisWay();
static MyClass constructThatWay();
// ...
};
auto x = MyClass::constructThatWay();

The effect is globally the same – we have a way to select at compile time how we create an instance of MyClass – but it delivers the results in a much simpler and idiomatic way.

In terms of performance, RVO/NRVO should elide the copies.

 

Other advantages of factory functions

Aside from being more idiomatic, factory functions integrates more smoothly with the generic algorithms of the STL. Here is an example in which we generate n instances of MyClass:


std::vector<MyClass> xs;
std::generate_n(std::back_inserter(xs), n, MyClass::constructThisWay);

In fact, factory functions integrates more smoothly with factorization in general. If we need to pass as argument the “decision” to use “this” or “that” instantiation for the MyClass:

  • With factory functions, we can just pass a pointer to member function (*)
  • With tag types, we would have to resort on overloads or template parameters

(*) As long as they have the same parameters of course, which is the original use case presented in the original article.

 

The guide-line & my take on why the STL does not follow it


The article goes on and discusses strong typing, how it can be used to select the overload of a constructor, and then generalizes this example as a guideline:

    “I recommend you to use tag dispatching to provide additional information on behaviour, and strong types to provide additional information on the data”

The author then openly asks the question of why the STL does not seem to follow this very guideline.

Let us look at why it would not be such a good idea. And to better discuss strong types vs tag dispatching, we first need to look at how they work.

 

Tag dispatching – The old “if constexpr”

Tag dispatching is an old and quite clever way to implement the if constexpr that the lucky C++ developers using C++17 have the privilege to use nowadays.

It is mostly useful in the context of generic code, as a way to select at compile time the implementation of a function, based on the properties or traits supported by a type. A compile time dispatch mechanism, based on a clever use of overloading and types.

The Boost Generic Programming Technique guide gives a good illustration of this, featuring std::advance, the very same example the OP uses:


// O(n) complexity
std::advance(linkedList.begin(), n);
// O(1) complexity (taking advantage of random access iterator)
std::advance(vector.begin(), n);

Thanks to this, we can write generic code that still profits from the optimization made possible by exploiting information that our type publishes.

 

The two ingredients of tag dispatch

Tag dispatching relies on two main ingredients.

First, a set of known type tags (closed, which means not extensible without modifying the API) that the API (the STL here) understands and uses to dispatch to one implementation or another:


template<typename Iterator, typename Distance>
void advance(Iterator& it, Distance n, std::bidirectional_iterator_tag)
{
// Bidirectional implementation
}
template<typename Iterator, typename Distance>
void advance(Iterator& it, Distance n, std::random_access_iterator_tag)
{
// Random access implementation
}
template<typename Iterator, typename Distance>
void advance(Iterator& it, Distance n)
{
return advance(it, n, typename Iterator::iterator_category{});
}

Second, using these type tags on the client code side, to declare the functionalities supported by their types. These tags typically appears as typedefs in the structures, or in associated type traits:


struct LinkedListIterator
{
using iterator_category = std::bidirectional_iterator_tag;
};
struct VectorIterator
{
using iterator_category = std::random_access_iterator_tag;
};

Here, the iterator_category acts as a level of indirection between the iterator and the features it supports. Several iterators can pertain the same category.

The set of categories is closed. So we can see this as compile time if-statements, which is also a closed dispatch mechanism (*). Tags are like compile time enums. The tag dispatching is like a distributed if-else-statement on this enum value.

(*) If you are looking for open dispatch mechanism, policy based design would be the compile-time equivalent of the strategy design pattern.

 

The contender – strong typing

Jonathan exposes another proposal to solve the problem of selecting the best algorithm for an iterator. It consists in tagging the iterator with the feature it supports, by wrapping it in a strong type.

The advance algorithm is reworked to match on specific tag around the iterators:


template<typename Iterator, typename Distance>
void advance(BidirectionalIterator<Iterator>& it, Distance n)
{
// Bidirectional implementation
}
template<typename Iterator, typename Distance>
void advance(RandomAccessIterator<Iterator>& it, Distance n)
{
// Random access implementation
}

We wrap the iterator with another type that refines what the iterator supports:


struct LinkedListIterator
{
// No tag here
};
struct VectorIterator
{
// No tag here
};
struct LinkedList
{
using iterator_type = BidirectionalIterator<LinkedListIterator>;
};
struct Vector
{
using iterator_type = RandomAccessIterator<VectorIterator>;
};

The approach seems reasonable. The guideline of the article suggests we should prefer it to tag dispatching. But in truth, this approach suffers from several problems when compared with the tag dispatching approach.

 

Why tag dispatching wins the contest

The first problem has to do with cohesion. The strong type tag adds the information about the iterator category from the outside. But the information is already part of the implementation of the iterator.

Separating things that belong together greatly increases the likelihood of messing things up:


RandomAccessIterator<LinkedListIterator>
BidirectionalIterator<VectorIterator>

view raw

MessingTags.cpp

hosted with ❤ by GitHub

The second issue has to do with ordering constraints and combinatorial explosion. What if we need to add another tag to our iterator? There are two possibilities to plug these tags:


AdditionalTag<RandomAccessIterator<Iterator>>
RandomAccessIterator<AdditionalTag<Iterator>>

There are ways to solve this:

  • Imposing some conventions on the ordering (increases likelihood of errors)
  • Relying on meta programming techniques (scanning through the tags)
  • Dying of combinatorial explosion on the API side (N tags, N factorial overloads)

But the best way to solve a problem is still to avoid it entirely.

Finally, the solution based on strong typing is also much more intrusive. Tag dispatch allows us to add a tag to our iterator, without having to modify the iterator itself. We can just use some traits:


template<typename Iterator>
struct IteratorTraits {};
template<>
struct IteratorTraits<LinkedListIterator>
{
using iterator_category = std::bidirectional_iterator_tag;
};
template<>
struct IteratorTraits<VectorIterator>
{
using iterator_category = std::random_access_iterator_tag;
};

In the case of strong typing, adding the tag means modifying the vector iterator_type definition somehow, which means reworking the implementation (and the API) of the container delivering the iterator.

 

The guideline does not hold


Overall, it seems to me that the guideline provided by Jonathan Boccara in How to Use Tag Dispatching In Your Code Effectively is not holding against the check of reality:

    “in your own business code, when there is no generic code creating intricate design issues, I recommend you to use tag dispatching to provide additional information on behaviour, and strong types to provide additional information on the data”

First, there seems to be no real incentive in using tag dispatching outside of generic code. There are more idiomatic mechanisms available – such as simple functions – that solve the kind of issue that were presented in the original article.

Second, in the case of generic code – the typical use case for tag dispatching – the design of the STL seems quite superior in terms of reducing the chances of manual mistakes, and in terms of maintenance and evolution (for it is less intrusive).

Last but not least, this guideline does not account for whether the dispatch is done at compile-time or at runtime. Factory functions, tag dispatch and the like are compile-time decisions. Runtime dispatch (a real use case) requires runtime values (such as enumerations).

In any case, there are good chances that tag dispatching will become an even more esoteric feature, thanks to the apparition of “if constexpr”. Expressing a closed compile-time dispatch has became much easier in C++17 than it ever was. And that is awesome.


Leave a comment

Create a website or blog at WordPress.com

Up ↑