Traversals and lenses

Following the amazing keynote of Eric Niebler on ranges at the CppCon2015, there has a been a lot of discussion on ranges in the C++ community. Ranges are an amazing tool to add to the STL.

I am starting a series of post to introduce some useful higher order functions that go hand-in-hand with the STL and with the new ranges.

Iterator, ranges and algorithms are concerned about traversals: roughly speaking going from the beginning to the end of a range. But there are at least two additional interesting “dimensions” to explore:

Lenses allow to “zoom” on the elements being scanned
Traversal combinators allow to merge several scan into one single scan

Graphically, we could represent it that way:

Traversal combinators
         ^       
          \     ^ Lenses
           \   /
            \ /
             *---------> Ranges and Algorithms

This post is the first of a series of three posts on this subject:

Today’s post is about exploring the lenses higher order functions
Next post will be focused on functions to combine traversals
The last post will talk about their impact on design through an example

Motivation

Have you ever had to sort a container of articles by their article id? Have you had to find the maximum nominal of a trade in a container of trade? Have you ever had to retrieve the first elements of tuples stored in a container?

If so, and assuming you used STL to deal with these issues, you either had to define your own lambda, or you used what I named lenses to get short declarative code.

Let us consider the following example:

//Finding the smallest string
auto smallest = min_element(
   begin(strings), end(strings),
   [](auto& lhs, auto& rhs) {
      return lhs.size() < rhs.size();
   });

//Sorting the strings by length
sort(
   begin(strings), end(strings),
   [](auto& lhs, auto& rhs) {
      return lhs.size() < rhs.size();
   });

There is a lot of redundancy in this code. And the lambda having no name, understanding the intent of the code requires reading the body of the lambda, twice.

One possible way to fix it would be to factorize the lambda function and give it a name (or create a function by_length returning the lambda):

auto by_length =
   [](auto& lhs, auto& rhs) {
      return lhs.size() < rhs.size();
   };

auto smallest = min_element(begin(strings), end(strings), by_length);
sort(begin(strings), end(strings), by_length);

But somehow, we ought to do better. The factorized lambda contains the call to size(), which is specific to our problem, while the rest of the lambda is generic. This generic part is the one we should extract and factorize instead.

Introducing lenses

Our goal is to factorize the common parts of our lambda to get something much shorter and more declarative out.

The generic part consist of a projection from an origin universe (string in our case) to a target universe in which we perform the comparison (size_t in our case). So we can take the specific part as parameter (the projection) and return a function (a lambda) that will perform the comparison on the origin universe of the projection. Then we just need to give it a name:

template<typename Projection>
auto comparing(Projection proj) {
   return [proj](auto& lhs, auto& rhs) {
      return proj(lhs) < proj(rhs);
   };
}

Using this comparing function, and combining it with our specific projection, it leads to a much nicer and more declarative code:

size_t by_length(string const& s) {
   return s.size();
}

auto smallest = min_element(begin(strings), end(strings),
                            comparing(get_length));
sort(inputs, comparing(by_length));

But please be careful when using it with anything but trivial projections. Indeed, the projection will be called twice at each comparison. The transformation must thus be very simple, or even trivial like getting an attribute of an object (hence the lens name).

In practice, it is best used with the utility function on:

template<typename Member>
auto on(Member member) {
   return std::mem_fn(member);
}

Using this on helper function, we can spare the lambda by_length and get more declarative code:

auto smallest = min_element(begin(strings), end(strings),
                            comparing(on(&string::size)));
sort(begin(strings), end(strings), comparing(on(&string::size)));

In most cases, you should be perfectly fine performance-wise when you use the pattern comparing(on(member)).

Note: If the projection functions costs a lot of CPU, you could instead split the work in two: first transform the data in a vector V, then write a sort routine whose comparison is based on seeking the corresponding data in V (based on an index). Another option is to memoize the result of the projection function.

Generalizing

The comparing utilities can be further improved to deal with both a projection and a custom less functor. For example, you might want to sort a range of string, in the reverse order or their respective length:

sort(begin(strings), end(strings),
     comparing(on(&string::size),
               greater<size_t>()));

A minor enhancement to the comparing function will do it:

template<typename Projection, typename Less>
auto comparing(Projection proj, Less less) {
   return [proj, less] (auto& lhs, auto& rhs) {
      return less(proj(lhs), proj(rhs));
   };
}

You might even want to plug variadic templates in order to chain the projections, for example to zoom twice in a row. But I tried it and it was not pretty.

Too many functions combined at call site should be a warning that we should start creating correctly named wrapper functions. The lenses are mainly built to handle the trivial cases. A bit like lambdas, they should stay lightweight.

Language support

This kind of tooling capability is not specific to C++. Most language offer the possibility to define similar constructs. It does not require C++14 either.

In fact, I first implemented these tools back when I had only C++03. It is much more verbose to write, less flexible, but nevertheless doable:

template<typename projection_fct>
struct comparing_t
{
   explicit comparing_t(projection_fct projection)
      : m_projection(projection) {}

   template<typename target_t>
   bool operator() (target_t const& a, target_t const& b) const {
      return m_projection(a) < m_projection(b);
   }

private:
   projection_fct m_projection;
};

template<typename projection_fct>
comparing_t<projection_fct> comparing(projection_fct projection) {
   return comparing_t<projection_fct>{ projection };
}

template<typename out_t, class target_t>
const_mem_fun_ref_t<out_t, target_t>
   on(out_t (target_t::*member_ptr)() const) {
   return mem_fun_ref(member_ptr);
}

If you have not done so already, I highly recommend you to write similar utils for you or your company. It will help you get rid of some more verbose lambdas, and get shorter and more readable code.

More examples

The comparing function is not the only useful lens function. There are many others. If you look for patterns in your lambdas, you should see plenty. I listed below the ones I used the most regularly.

Equality with projection: It is a bit like comparing, but with equality rather than ordering. In practice, we might even factorize the commonalities of comparing and equality further.

template<typename Projection>
auto equaling(Projection proj)
{
   return [proj](auto& lhs, auto& rhs) {
      return proj(lhs) == proj(rhs);
   };
}

//Check if the strings of the two collections are equal in size
equal(begin(strings1), end(strings1),
      begin(strings2), end(strings2),
      equaling(&string::size));

Matching predicate: It is just a special case of function composition, but with a better, more declarative name.

template<typename Predicate, typename Projection>
auto matching(Predicate pred, Projection proj) {
   return [pred, proj](auto& e) { return pred(proj(e)); };
}

//Moving all the small strings at the beginning
partition(begin(strings), end(strings),
   matching(
     [](int i){ return i < 15; },
      on(&string::size))
   );

Equal to: This is a special case of predicate and projection composition. The goal is to look for specific value.

template<typename Value, typename Projection>
auto equal_to(Value const& val, Projection proj) {
	return [&val, proj](auto const& e) {
		return val == proj(e);
	};
}

find_if(begin(strings), end(strings),
        equal_to(2, on(&string::size)));

partition(begin(strings), end(strings),
          equal_to(2, on(&string::size)));

These are just a few examples: there are plenty more you can define to give a more declarative taste to lambda constructs that you use very often.

What’s next?

We went over some lenses functions that allows to write more declarative code in combination with algorithms. I hope I convinced you of their utility.

These functions are not hard to write but are very helpful in replacing lambdas by shorter and more declarative code.

In the next post, we will go over some other higher order functions that works pretty well with algorithms: traversal combinators.