Templated argument passing policy

It is well known that, in C++, pass-by-value semantics is often better implemented by passing a reference to const. This pill first introduces the basic technique of passing by reference-to-const and the criterion for using it, then shows how to turn it into a global, template-based policy rather than a local decision.

1.1.1.1 Regular C++ syntax

Pass-by-value versus pass-by-reference-to-const arguments

Passing an argument to a function or method by value is almost semantically equivalent to passing it by reference to const: in the code excerpt

struct ChessBoard {
  char square[8][8];
};
int count_pawns_value(ChessBoard board);
int count_pawns_reference(ChessBoard const &board);

both count_pawns_value() and count_pawns_reference() are guaranteed not to modify their argument, and both have access to the same information regarding their argument.

The difference between count_pawns_value() and count_pawns_reference() is that the first works on a copy of the argument passed to it, whereas the second works on an alias of the argument passed to it. This means that using count_pawns_value() requires a copy of a whole instance of ChessBoard, which takes at least as much time and space as copying an array of 64 char elements. In contrast, count_pawns_reference() is working directly on the ChessBoard instance passed to it, and is equivalent, in time and space consumption, to passing a regular pointer.
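
The copy-versus-alias difference can be observed directly. The following is a minimal sketch, not part of the original excerpt: CountedBoard is a hypothetical instrumented stand-in for ChessBoard that counts its copy constructions, and the two helper functions and demo() are illustrative names.

```cpp
#include <cassert>

// Hypothetical instrumented stand-in for ChessBoard: it counts copies.
struct CountedBoard {
  char square[8][8];
  static int copies;                   // copy constructions so far
  CountedBoard() : square() { }        // zero-initialise all squares
  CountedBoard(CountedBoard const &other) {
    for (int row = 0; row < 8; ++row)
      for (int col = 0; col < 8; ++col)
        square[row][col] = other.square[row][col];
    ++copies;
  }
};
int CountedBoard::copies = 0;

// Each function reports whether its parameter is the caller's own object.
bool is_callers_object_value(CountedBoard board,
                             CountedBoard const *caller) {
  return &board == caller;
}
bool is_callers_object_reference(CountedBoard const &board,
                                 CountedBoard const *caller) {
  return &board == caller;
}

void demo() {
  CountedBoard::copies = 0;
  CountedBoard board;
  bool const value_is_same = is_callers_object_value(board, &board);
  assert(!value_is_same);              // the function worked on a copy
  assert(CountedBoard::copies == 1);   // one copy construction was paid
  bool const reference_is_same = is_callers_object_reference(board, &board);
  assert(reference_is_same);           // the function worked on an alias
  assert(CountedBoard::copies == 1);   // no further copy was made
}
```

Calling demo() exercises both functions: only the by-value call triggers a copy construction.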

We said at the beginning of this section that passing by value is almost equivalent to passing by reference to const; the word almost becomes exactly when the copy constructor of the argument has strict copy semantics; if it doesn't, its non-conformance to strict copy semantics will be made visible by its being copied to the formal argument. Of course, such non-conformant copy constructors should be avoided, and it will be assumed in the rest of this C++ Pill that all copy constructors have strict copy semantics.
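
To see what a copy constructor without strict copy semantics looks like from the caller's side, consider the sketch below. The type Generation is hypothetical and deliberately badly behaved: its copy constructor produces an object that differs from the original, so the two ways of passing it are no longer equivalent.

```cpp
#include <cassert>

struct Generation {
  int value;
  explicit Generation(int v) : value(v) { }
  // Deliberately non-conformant: the copy is not equal to the original.
  Generation(Generation const &other) : value(other.value + 1) { }
};

int read_by_value(Generation g) { return g.value; }
int read_by_reference(Generation const &g) { return g.value; }

void demo() {
  Generation original(10);
  int const seen_by_reference = read_by_reference(original);
  int const seen_by_value = read_by_value(original);
  assert(seen_by_reference == 10);  // the alias shows the original state
  assert(seen_by_value == 11);      // the copy shows an altered state
}
```

With a conforming copy constructor both functions would return the same number, which is the assumption made from here on.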

When to pass by reference to const

To put it in a nutshell, pass by reference to const when it is more efficient than passing a copy of the whole object. This criterion, though, involves two sets of considerations that often conflict:

  • copying an object that occupies more memory than a pointer will generally be more time- and space-expensive than passing a reference to it; copying an object with a user-defined (non-trivial) copy constructor will very probably be more time-expensive than passing a reference to it;
  • accessing an object through a reference is typically more time-expensive than accessing it (or a copy of it) directly; the overhead incurred is one pointer dereference per access, unless the compiler manages to optimise it away.

Care must be exercised when judging the first consideration: the memory occupied by an object is not always the size of its type as reported by the sizeof operator. Indeed, it is very common for objects to contain other objects by means of dynamic memory management, i.e. by allocating and deallocating other objects dynamically as required; this involves allocating other objects at construction and at copy, and deallocating them at destruction. Clear examples are STL containers: the sizeof of a list is always the same, and very small, but the memory occupied by it, which is the memory that has to be copied when copying the list, grows roughly linearly with the number of elements held by the list. Another example is the pImpl idiom.
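
The gap between sizeof and the real cost of a copy can be made visible with a counting element type. This is a sketch under stated assumptions: Counted and element_copies_when_copying() are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical element type that counts its copy constructions.
struct Counted {
  static std::size_t copies;
  Counted() { }
  Counted(Counted const &) { ++copies; }
};
std::size_t Counted::copies = 0;

// Returns how many elements are copy-constructed when a list holding
// n elements is copied. sizeof(std::list<Counted>) is a compile-time
// constant and does not depend on n at all.
std::size_t element_copies_when_copying(std::size_t n) {
  std::list<Counted> original(n);          // list with n elements
  Counted::copies = 0;                     // count only the copy below
  std::list<Counted> duplicate(original);  // this is the expensive part
  assert(duplicate.size() == n);
  return Counted::copies;
}
```

Copying an empty list copies no elements, while copying a list of 1000 elements copies 1000 of them, although both lists have the same sizeof.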

All in all, the second consideration rarely outweighs the first one. What is more, when it does, you can pretend it doesn't anyway: even if you pass the argument by reference to const (as a consequence of an argument-passing policy, which is our final aim), you can always make a copy inside the function or method, which buys you back the direct access at the small cost of an additional pointer copy. Still, since copying an argument for the sake of efficiency should be regarded as a bad coding practice, you can compensate by making the cut-off size several times that of a pointer instead.
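
For completeness, this is roughly what the local-copy workaround looks like. Vec3 and repeated_sum() are hypothetical, and whether the local copy gains anything in practice depends on the compiler and the target; the sketch only shows the shape of the technique.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };  // hypothetical small aggregate

// The argument arrives by reference to const (as a policy might dictate);
// one explicit copy then gives the loop direct access to a local object.
double repeated_sum(Vec3 const &v_ref, int times) {
  Vec3 const v = v_ref;           // the single, deliberate copy
  double total = 0.0;
  for (int i = 0; i < times; ++i)
    total += v.x + v.y + v.z;     // no access through the reference here
  return total;
}

void demo() {
  Vec3 const v = {1.0, 2.0, 3.0};
  assert(repeated_sum(v, 3) == 18.0);
}
```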

So, in the end, you will end up deciding a set of rules such as:

  • an object bigger than four pointers will be passed by reference to const;
  • an object whose copy constructor is time-expensive will be passed by reference to const;
  • all the rest will be passed by value.

You could then apply these rules by hand to all your function and method arguments. This has at least three drawbacks:

  • it is cumbersome and the resulting syntax is non-homogeneous;
  • the decision about objects being bigger than pointers is only valid for a given compiler and target processor; if the code is ported to a different compiler or target processor, the argument specifications will have to be rewritten by hand, or the code will bear known inefficiencies;
  • last but not least, this approach does not mix well with templated code.

The last drawback deserves an example. Here is a specification for a templated abstract class that represents a predicate with one argument:

template <typename T>
class Predicate {
public:
  virtual ~Predicate() { }  // virtual, since the class is used polymorphically
  virtual bool operator()(T)=0;
};

Now, we have to choose between an operator argument passed by value (as in the example) or by reference to const, but the choice will have to apply to all values of the template type argument T. If we allow for some variability in this choice by partial or full specialisation of Predicate, we will have the first two drawbacks listed previously.

1.1.1.2 Template-based policy

The solution is to turn the rules for argument passing into a template-based policy. Since the rule about the copy constructor being time-expensive cannot be automated, we will have to allow for specification of exceptions to the size-based rules.

Thus, the policy will be implemented as an automatic rule based on object size alone that allows exceptions to be specified by partial or full specialisation. Some known exceptions (mainly STL containers: lists, for instance, may have a small sizeof but are expensive to copy) can be incorporated in the policy by default; users will have to judge and implement their own exceptions for the classes they have implemented themselves.

The size-based rule

We would like the policy to turn the choice between

void function(T x);

and

void function(T const &x);

into the single declaration

void function(arg<T> x);

But we know C++ does not allow this, since this would be tantamount to having templated typedefs (something I think is being considered for the next C++ standard, but which is forbidden as of today; I will have to make the appropriate modifications to this C++ Pill when the next C++ standard comes into existence).

Instead, we have to resign ourselves to the following syntax:

void function(typename arg<T>::pass x);

We can always resort to old-style syntactic sugar mixed with new-style template constructs:

#define arg_pass(type) typename arg<type>::pass

which allows us to write

void function(arg_pass(T) x);

The policy can be written

template <typename argT>
struct arg {
private:
  template <typename T, bool big=false>
  struct choice {
    typedef T pass;        // small object by value
  };
  template <typename T>
  struct choice<T, true> {
    typedef T const &pass; // large object by reference
  };
public:
  static unsigned int const cutoff_size=4*sizeof(void *);
  typedef typename
    choice<argT, (sizeof(argT)>cutoff_size)>::pass
    pass;
};

The default "bool big=false" in the template argument specification for the first struct choice is there only for the human reader (not only you, but also the code maintainer), so that it is clear that the first version of struct choice is the one for small objects.
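
The "templated typedefs" wished for above did arrive: C++11 introduced alias templates. As a sketch, assuming a C++11 (or later) compiler, the arg_pass macro can then be replaced by an alias. The name arg_t, the struct Big and the two small functions are hypothetical and serve only as illustration; the policy itself is repeated unchanged so that the example is self-contained.

```cpp
#include <cassert>

template <typename argT>
struct arg {
private:
  template <typename T, bool big=false>
  struct choice {
    typedef T pass;        // small object by value
  };
  template <typename T>
  struct choice<T, true> {
    typedef T const &pass; // large object by reference
  };
public:
  static unsigned int const cutoff_size=4*sizeof(void *);
  typedef typename
    choice<argT, (sizeof(argT)>cutoff_size)>::pass
    pass;
};

// Hypothetical alias template (C++11): plays the role of arg_pass.
template <typename T>
using arg_t = typename arg<T>::pass;

struct Big { char array[1000]; };

// Ordinary functions spell their parameters with the alias.
void const *address_seen(arg_t<Big> big) { return &big; }  // Big const &
char next_digit(arg_t<char> c) { return static_cast<char>(c + 1); }  // char

void demo() {
  Big big = Big();
  assert(address_seen(big) == &big);  // same object: passed by reference
  assert(next_digit('1') == '2');     // plain char, passed by value
}
```

Note that, exactly as with the macro, a function template declared as taking arg_t<T> cannot deduce T from its argument, so T has to be given explicitly or fixed by the enclosing class template.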

Exceptions to the size-based rule

An exception for a small class SmallButExpensive which is expensive to copy (perhaps because of a complex copy constructor) can be written

template <>
struct arg<SmallButExpensive> {
  typedef SmallButExpensive const &pass;
};

An exception for a big class BigButCopy that for some reason we would like to always pass by copy-constructed value can be written

template <>
struct arg<BigButCopy> {
  typedef BigButCopy pass;
};

And an exception for STL lists can be written

template <typename T>
struct arg<std::list<T> > {
  typedef std::list<T> const &pass;
};
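
With the policy in place, the templated Predicate class discussed earlier no longer has to commit to one way of passing its argument. The following is a sketch only: IsPositive, StartsWithZero and Big are hypothetical classes, and the policy is repeated so that the example stands alone.

```cpp
#include <cassert>

template <typename argT>
struct arg {
private:
  template <typename T, bool big=false>
  struct choice {
    typedef T pass;        // small object by value
  };
  template <typename T>
  struct choice<T, true> {
    typedef T const &pass; // large object by reference
  };
public:
  static unsigned int const cutoff_size=4*sizeof(void *);
  typedef typename
    choice<argT, (sizeof(argT)>cutoff_size)>::pass
    pass;
};

template <typename T>
class Predicate {
public:
  virtual ~Predicate() { }
  virtual bool operator()(typename arg<T>::pass)=0;  // policy decides
};

struct Big { char array[1000]; };

// Hypothetical concrete predicates. The parameter type is spelled through
// the policy too, so the overrides match whatever the policy selects.
struct IsPositive: Predicate<int> {
  virtual bool operator()(arg<int>::pass x) { return x>0; }
};
struct StartsWithZero: Predicate<Big> {
  virtual bool operator()(arg<Big>::pass big) { return big.array[0]==0; }
};

void demo() {
  IsPositive is_positive;
  StartsWithZero starts_with_zero;
  Predicate<int> &on_int=is_positive;
  Predicate<Big> &on_big=starts_with_zero;
  Big big=Big();                       // zero-initialised array
  assert(on_int(3) && !on_int(-3));
  assert(on_big(big));
}
```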

1.1.1.3 Example code

A file that tests all the concepts described in this C++ Pill is listed below.

#include <iostream>
#include <list>

template <typename argT>
struct arg {
private:
  template <typename T, bool big=false>
  struct choice {
    typedef T pass;        // small object by value
  };
  template <typename T>
  struct choice<T, true> {
    typedef T const &pass; // large object by reference
  };
public:
  static unsigned int const cutoff_size=4*sizeof(void *);
  typedef typename
    choice<argT, (sizeof(argT)>cutoff_size)>::pass
    pass;
};


#define arg_pass(type) typename arg<type>::pass

///////////////////////
// testing apparatus //
///////////////////////

// returns the address of the argument as seen from inside the function
template <typename T>
struct ArgAddress {
  static void const *get(arg_pass(T) t) {
    void const *address=static_cast<void const *>(&t);
    return address;
  }
};

#define var_address(var) static_cast<void const *>(&var)

// same address inside and outside: the argument was passed by reference
#define check_type(type)                                  \
do {                                                      \
  std::cout << "type \"" << #type << "\" ";               \
  type object=type(); /* value-initialised before use */  \
  if (var_address(object)==ArgAddress<type>::get(object)) \
    std::cout << "by reference";                          \
  else                                                    \
    std::cout << "by value";                              \
  std::cout << std::endl;                                 \
} while (false)

struct Small {
  char c;
};

struct Big {
  char array[1000];
};

struct FourPointers {
  typedef void *Pointer;
  Pointer array[4];
};

struct FivePointers {
  typedef void *Pointer;
  Pointer array[5];
};

struct SmallButExpensive {
  SmallButExpensive() { }
  SmallButExpensive(SmallButExpensive const &) {
    // expensive copy-construction
  }
  char c;
};

template <>
struct arg<SmallButExpensive> {
  typedef SmallButExpensive const &pass;
};

struct BigButCopy {
  char array[1000];
};

template <>
struct arg<BigButCopy> {
  typedef BigButCopy pass;
};


template <typename T>
struct arg<std::list<T> > {
  typedef std::list<T> const &pass;
};


int main() {
  check_type(char);
  check_type(double);
  check_type(Small);
  check_type(Big);
  check_type(FourPointers);
  check_type(FivePointers);
  check_type(SmallButExpensive);
  check_type(BigButCopy);
  check_type(std::list<char>);
}

Of course, in production code, you would clearly separate the policy from the type definitions and from the user code in different source files.

If you compile and run the code you should get something similar to

type "char" by value
type "double" by value
type "Small" by value
type "Big" by reference
type "FourPointers" by value
type "FivePointers" by reference
type "SmallButExpensive" by reference
type "BigButCopy" by value
type "std::list<char>" by reference

This output shows, for each type, whether the arguments are passed by value or by reference-to-const. Important points are:
  • unless you have a very unusual system, "char" and "Small" should be passed by value, and "Big" should be passed by reference;
  • unless your system produces more padding than it should, "FourPointers" should be passed by value; "FivePointers" should always be passed by reference;
  • "SmallButExpensive" and "std::list<char>" should always be passed by reference, and "BigButCopy" should always be passed by value;
  • "double" happens to be passed by value on my system, but your mileage may vary.
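
The platform-independent expectations listed above can also be checked at compile time instead of by comparing addresses at run time. This is a sketch assuming a C++11 compiler (for static_assert and std::is_same); passed_by_reference is a hypothetical helper, and the policy and test types are repeated from the listing so that the example is self-contained.

```cpp
#include <cassert>
#include <list>
#include <type_traits>  // std::is_same (C++11)

template <typename argT>
struct arg {
private:
  template <typename T, bool big=false>
  struct choice {
    typedef T pass;        // small object by value
  };
  template <typename T>
  struct choice<T, true> {
    typedef T const &pass; // large object by reference
  };
public:
  static unsigned int const cutoff_size=4*sizeof(void *);
  typedef typename
    choice<argT, (sizeof(argT)>cutoff_size)>::pass
    pass;
};

struct Big { char array[1000]; };
struct FourPointers { void *array[4]; };
struct FivePointers { void *array[5]; };
struct SmallButExpensive {
  SmallButExpensive() { }
  SmallButExpensive(SmallButExpensive const &) { }
  char c;
};
struct BigButCopy { char array[1000]; };

template <>
struct arg<SmallButExpensive> { typedef SmallButExpensive const &pass; };
template <>
struct arg<BigButCopy> { typedef BigButCopy pass; };
template <typename T>
struct arg<std::list<T> > { typedef std::list<T> const &pass; };

// Hypothetical helper: true when the policy selects "T const &" for T.
template <typename T>
struct passed_by_reference
  : std::is_same<typename arg<T>::pass, T const &> { };

static_assert(!passed_by_reference<char>::value, "char: by value");
static_assert(passed_by_reference<Big>::value, "Big: by reference");
static_assert(passed_by_reference<FivePointers>::value,
              "FivePointers: by reference");
static_assert(passed_by_reference<SmallButExpensive>::value,
              "SmallButExpensive: by reference (exception)");
static_assert(!passed_by_reference<BigButCopy>::value,
              "BigButCopy: by value (exception)");
static_assert(passed_by_reference<std::list<char> >::value,
              "std::list<char>: by reference (exception)");

// FourPointers depends on padding, so it is only checked against the rule.
void demo() {
  bool const over_cutoff=
    sizeof(FourPointers)>arg<FourPointers>::cutoff_size;
  assert(passed_by_reference<FourPointers>::value==over_cutoff);
}
```

The "double" case is left out on purpose, since its outcome legitimately varies from one system to another.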

Website produced by Skribe; last make 2009-10-29.

© 2007, 2008, 2009 Miguel González Cuadrado (mgcuadrado@gmail.com)