Saturday 3 December 2016

N3599 proposal, typesafe printf() and some C++ explanations

Reading the C++ Tips, 2016 Week 46 I stumbled over following syntax :
  template <typename T, T... chars> 
  constexpr CharSeq<chars...> operator""_lift() { return { }; }
Only to be instructed in the comments section that this is a non-standard extension of user defined literals, that Clang and gcc nonetheless seem to support (still learning C++14 as you can see). I wondered if my not-so-new Visual C++ 2013 compiler would support it too, so I had a closer look at it.

I soon found out*, that there's a proposal for this, namely N3599. This proposal being quite old (March 2013), I thought that my chances aren't that bad... So I took the first example usage of the feature I found in N3599 and tried to compile it in Visual Studio. Result? Not supported, of course, the "rejuvenation" of Microsoft compiler is still underway**.

At this point I got quite interested in the code itself and in the possibility to got it running even without the templated user defined literals. So here it comes, my explanation how the code is working, because it completes a non-trivial feat - to generate a function of given type from a given string (aka. textual description!).

But wait, type safe printf? Didn't Bjarne explain it somewhere already***? I think I heard something like this, but didn't check it... Stop, let us stay with the original, humble goal of understanding a piece of template code, which wasn't entirely clear at the first sight.

1. Template code analysis and explanation

The original code was:
// A tuple of types.
template<typename ...Ts> struct types {
  template<typename T> using push_front = types<T, Ts...>;
  template<template<typename...> class F> using apply = F<Ts...>;

// Select a type from a format character.
template<char K> struct format_type_impl;
template<> struct format_type_impl<'d'> { using type = int; };
template<> struct format_type_impl<'f'> { using type = double; };
template<> struct format_type_impl<'s'> { using type = const char *; };
// ...
template<char K> using format_type = typename format_type_impl<K>::type;

// Build a tuple of types from a format string.
template<char ...String>
struct format_types;
struct format_types<> : types<> {};
template<char Char, char ...String>
struct format_types<Char, String...> : format_types<String...> {};
template<char ...String>
struct format_types<'%', '%', String...> : format_types<String...> {};
template<char Fmt, char ...String>
struct format_types<'%', Fmt, String...> :
  format_types<String...>::template push_front<format_type<Fmt>> {};

// Typed printf-style formatter.
template<typename ...Args> struct formatter {
  int operator()(Args ...a) {
    return std::printf(str, a...);
  const char *str;

template<typename CharT, CharT ...String>
typename format_types<String...>::template apply<formatter>
operator""_printf() {
  static_assert(std::is_same<CharT, char>(), "can only use printf on narrow strings");
  static const CharT data[] = { String..., 0 };
  return { data };

void log_bad_guess(const char *name, int guess, int actual) {
  "Hello %s, you guessed %d which is too %s\n"_printf(
    name, guess, guess < actual ? "low" : "high");
Not quite easy to read, isn't it? But let us go through that step by step, using the old trusty top-down method.

1. log_bad_guess() function uses the custom _printf  literal operator for the format string, and then.... seems to be calling itself with more parameters??? Whassat? Are we moving towards Haskell-like unreadability?

2. Maybe not. If literal operator applied to a string can be called then it must return a callable, i.e. something with operator (). Look at _printf's definition - it is a template returning something complicated. This something must then define the call operator, so lets look for it.

3. This something is created by "calling" the apply "method" of a format_types structure with the  formatter function. Of course all of it at metaprogramming, i.e. compile-time, i.e. types-only level, so we need a short explanation here:

The apply metaprogramming "method" is one of basic type manipulation primitives which can be written quite simply with the new C++11 parameter packs. It just sets parameters for a given template expecting some parameters. In our case, it creates correctly typed version of the formatter function. Correctly typed means that the input parameter types will match the types required by the format string. And these types will be provided by  format_types type list. A type list is implemented as a template parameter pack.

The another utility is the push_front "method" - it just extends a typelist with a new type. Cool.

BTW, there is a nice, short and free ebook (by @joel_f and @edouarda14) about modern C++ metaprogramming explaining such basic operations if you want to learn more. So have a look at it, with parameter packs the typelist manipulations got so much easier in C++!

4. So how format_types is generated? This is done with a series of template specialzations which descends the format string recursively, processing 2 character at one step, and if the pair starts with %, an appropriate type is added to the type list. This is done by appropraitely specialized format_type_impl template, which ist than aliased to format_type as to directly acces it's type "member".

5. Now we have all the elements: _printf is generating a callable with function signature derived from the parsed format string, and then we are calling it with matching parameters. Was it that difficult? Not really, right?

2. Getting it runing with VS 2013 compiler

Alas, VS 2013 doesn't like the template<...> operator "" _printf construct: .... What can we do? I tried following changes:
template<typename CharT, CharT ...String>
typename format_types<String...>::template apply<formatter>
 printf() {
    static_assert(std::is_same<CharT, char>::value, "can only use printf on narrow strings");
    static const CharT data[] = { String..., 0 };
    return{ data };
You see, now we have a function printf() returning a functional object (thus it's a HOF - higher order function of kinds) which is generated based on the format string, as we have seen above.

Note that I had to change std::is_same() to  std::is_same::value as VS2013 compiler does not support constexpr, and operator() should be const here!

Then I wanted use it like that:
    void log_bad_guess(const char *name, int guess, int actual) {
    auto p = printf<char, "Hello %s, you guessed %d which is too %s\n">();
    p(name, guess, guess < actual ? "low" : "high");
but... compiler error! Seems compiler cannot match char string with CharT ...String&, hmm, lets try this:
void log_bad_guess(const char *name, int guess, int actual) {
    auto p = printf<char, '%', 's', '%', 'd', '%', 's'>();
    p(name, guess, guess < actual ? "low" : "high");
Yesss, now it's compiling! So if we change the format string to "%d" like this:
    auto p = printf<char, '%', 's', '%', 'd', '%', 'd'>();
We should get a compiler error, preferably something like: "Cannot call XXX with YYY parameters". What VS 12013 reports is however:
  error C2664: 'int formatter<format_type_impl<115>::type,format_type_impl<100>::type,format_type_impl<100>::type>::operator ()
(format_type_impl<115>::type,format_type_impl<100>::type,format_type_impl<100>::type)' : cannot convert argument 3
from 'const char *' to 'format_type_impl<100>::type'
Ok, speak about compiler template error messages... This could be probably improved with some judicious usage of static_assert, but here we are only seeking understanding, not production code quality. But we know what is going on, right?

So there is a last final touch missing: automatic conversion from char string to a char parameter pack. Well, that's the problem! The proposed (missing) user defined literal operator would do that! As one of fellow bloggers said:
" But a similar template syntax is standardized for raw numeric literals, so the lack of raw string user defined literals in C++14 is a pretty big inconsistency that I hope C++1z will resolve."
So we must implement this ourselves, using a techniques similar to one of the integer_sequence's implementations here, then just pass the char string length via the T (&) [N] syntax, an than it should be working. Or use some of modern C++ metaprogramming libraries like Brigand or Hana...

I won't implement the character sequence generation here, because this post is very long already. The second option is out of question as well, because I want only use "naked" C++ standard library... Thus our goal won't be fully achieved in this blogpost. Sorry 😒!

3. Summary

You might be asking "why are you writing that"? It's nothing earth-shattering, only some regurgitated, already known stuff... My explanation: maybe somewhere there's a (young) programmer looking at code like that and thinking "OMG, I'm never gonna to understand that, it's too advanced, it's magic". And the entire industry is asserting this view. I am against such elitism, it's just a piece of code doing some work, humans wrote it so another human might understand it. It's the same I feel about the category theory cargo-cult (read here) - don't fear, don't let discourage yourself by the general sentiment in programming industry, don't believe 10x programmer fairy tales. It's only computing, and it boils down to if, then, else and loop 😊.

So what did we achieve? We just synthesized a function signature via a textual description, and then called a vararg function with parameters of that types. Higher order functions anyone?

* through Sumant Tambe's blogpost; "Dependently-typed Curried printf in C++"You may wonder here what that "dependently-typed" moniker might mean.... In course of demythologizing of functional programming ;) I dare say that a dependent type is a type depended on some non-type parameter, aka. tag. An I hope I'm right here, because I didn't google it, only restate what I remember about Idris...Wish me luck 😉.

** Why there is no expression SFINAE yet, etc, quite interesting piece of MS compiler histrory: Rejuvenating the Microsoft C/C++ Compiler

*** Python Style printf for C++ with pprintpp, this goes in similar vein, but the code is somehow very complex, maybe owing to the fact that pairs of curly braces have to be found (Python syntax). Additionally it needs macros - fancy if N3599 would solve that?

No comments: