Hierarchy introspection

Introspection is the capability of a piece of software to get details about its own implementation. Typical details are

  • what the members of a given class are,
  • what classes derive from a given class,
  • what class a given class derives from,
  • what is the return type of a function or method.
None of these, nor many others, are available in C++.

1.1.2.1 Uses of introspection in C++

Introspection may seem an unnecessary luxury: after all, if you have got the code, you can determine all of this, instead of discovering it; what's more, there seems to be no way to use these details, since C++ is a statically-typed language, and it is difficult to hold an object whose type is unknown.

The truth is that introspection is more useful in more dynamic languages that are closer to being interpretable, since they can respond to the details with fewer preconceptions about their form, type, size, and so on. But still, there are two features of C++ that can yield great benefits if introspection is available: polymorphism and template-based programming.

Regarding polymorphism, if you have an abstract class A, the result of introspection may be an object of a concrete class derived from A that you can treat type-safely by a pointer-to-A.

Regarding template-based programming, relations between classes are at the core of meta-programming, and very often the introspection must be added by hand (turning it into something that should be called "discipline" rather than introspection); thus, e.g., a meta-programming library may require you to include in all your classes a typedef with an agreed name stating the class' base class. This is error-prone, since there is no language-supported means to check that information. And then, of course, there is the lack of a standard typeof() operator, which would allow us to retrieve the return type of a function or method (most compilers already have it anyway, since the compiler machinery to implement it is needed anyway for the sizeof() operator); the lack of a typeof() operator gives a lot of trouble to meta-programming library developers, who devote a lot of effort to getting portable pale imitations of it; let's hope the next C++ standard will include typeof(), at last.

1.1.2.2 What about hierarchy introspection

What I call "hierarchy introspection" is a polymorphism-related run-time (as opposed to "compile-time") kind of introspection, that must be partially supported by the programmer, but that can yield great benefits in modularity, scalability, code readability, and maintainability.

Often we want to decouple the set of implementations of an abstract class (i.e., the set of concrete, working derived classes of an abstract class) from their usage. We may, for instance, have an abstract class called Instrument for a synthesiser program, that supports some (abstract) methods for producing sound. Implementations of Instrument may be called GrandPiano, Glockenspiel, TwoSlightlyDistortedGuitars, TubularBells, et cetera. It is very natural that we interactively offer the user a list of the available instruments, structured in a logical way, or that we can read instrument names from a text script file in order to batch-produce music with them. In order to achieve this, we need to know at run-time what classes derive from Instrument.

1.1.2.3 The easy, not very good, solution

Let's assume that we give each concrete instrument class a normal, English name, such as "grand piano" or "two slightly distorted guitars". We want to be able to refer to the appropriate classes or instances by these names.

First, let's define the common interface for instruments:

// file "instrbad.h"
#ifndef INSTRUMENT_BAD_HEADER_
#define INSTRUMENT_BAD_HEADER_

class Instrument {
public:
  virtual ~Instrument() { }
  virtual void play()=0;
};

#endif

Now, let's define a grand piano:

// file "gpianbad.h"
#ifndef GRAND_PIANO_BAD_HEADER_
#define GRAND_PIANO_BAD_HEADER_

#include "instrbad.h"

class GrandPiano
  : public Instrument {
public:
  GrandPiano();
  void play();
private:
  int note_number;
};

#endif

// file "gpianbad.cpp"
#include "gpianbad.h"
#include <iostream>

GrandPiano::GrandPiano() : note_number(0) { }

void GrandPiano::play() {
  if (note_number==0)      std::cout << "planck ";
  else if (note_number==1) std::cout << "plonck ";
  else                     std::cout << "plunck ";
  note_number=(note_number+1)%3;
}

and tubular bells:

// file "tubblbad.h"
#ifndef TUBULAR_BELLS_BAD_HEADER_
#define TUBULAR_BELLS_BAD_HEADER_

#include "instrbad.h"

class TubularBells
  : public Instrument {
public:
  TubularBells();
  void play();
private:
  int note_number;
};

#endif

// file "tubblbad.cpp"
#include "tubblbad.h"
#include <iostream>

TubularBells::TubularBells() : note_number(0) { }

void TubularBells::play() {
  if (note_number==0) std::cout << "gaang ";
  else                std::cout << "goong ";
  note_number=(note_number+1)%2;
}

We could normally use these two concrete instruments by creating instances for them knowing their class names. Let's separate, in a suboptimal manner, the knowledge of their existence from their usage. This will serve as a starting point for the real solution.

// file "synthbad.cpp"
#include <iostream>
#include <map>
#include <string>
#include "instrbad.h"
#include "gpianbad.h"
#include "tubblbad.h"

std::map<std::string, Instrument *> instrument_map;

void instrument_setup() {
  // instrument setup: knowledge of the hierarchy is required
  instrument_map["grand piano"]=new GrandPiano;
  instrument_map["tubular bells"]=new TubularBells;
}

int main() {
  instrument_setup();

  // usage: no knowledge of the hierarchy is required
  instrument_map["grand piano"]->play();
  instrument_map["tubular bells"]->play();
  instrument_map["grand piano"]->play();
  instrument_map["tubular bells"]->play();
  instrument_map["grand piano"]->play();
  instrument_map["tubular bells"]->play();
  std::cout << std::endl;
}

Running this code should print the following to standard output:

planck gaang plonck goong plunck gaang

The function instrument_setup() sets up a dictionary of available concrete instruments, that is, instances of concrete classes derived from Instrument. After having set up the dictionary, we use the instruments only by their names, by accessing that same dictionary. This is just a very simple demonstration, but we could have presented the user interactively with a set of the known instruments, so that they can choose among them, or we could have read a text script that uses the instruments by name.

The problems with this solution are obvious:

  • we had to put the knowledge of the hierarchy somewhere where it did not belong; whether close to the main executable code or in a separate file does not matter: when we add a new instrument, we will have to (and often forget to) add a new entry to an existing and completely unrelated file, namely the file where the function instrument_setup() is defined;
  • we had to explicitly activate the function instrument_setup(), which should not be the duty of the user code, and is error-prone and easy to forget, especially if several hierarchies are to be "introspected".

The rest of this C++ Pill will be dedicated to solving these problems in a satisfactory manner. When we are done we will have a means of asking the class what its concrete subclasses are, and will be able to add new concrete subclasses by adding new files defining them, without changing a single line of the existing code.

1.1.2.4 The easy part: automatic activation

Let's solve first the easy part: the automatic activation of the derived class dictionary. We will see later that the hard part can be solved as an extension of this one, and we will have already solved some problems that we would have found anyway.

We would like not to have to do anything for the concrete instruments to be declared to the dictionary. Calling an external function from the main function is not an option, since it is easy to forget to call it. So, we would like to reduce the code of the main executable file to

// file "synthmed.cpp"
#include <iostream>
#include <map>
#include <string>
#include "instrmed.h"

int main() {
  // usage: no knowledge of the hierarchy is required
  instrument_map["grand piano"]->play();
  instrument_map["tubular bells"]->play();
  instrument_map["grand piano"]->play();
  instrument_map["tubular bells"]->play();
  instrument_map["grand piano"]->play();
  instrument_map["tubular bells"]->play();
  std::cout << std::endl;
}

Since we did not fill the dictionary, we need no knowledge about the concrete instruments, and the #include directives to the header files of the concrete instruments are not needed anymore. We need not worry about making the magic happen, that is, we do not have to call anything to prepare the dictionary; we just enter the main() function, and it is there waiting for us. And we can still use these concrete instruments through the instrument dictionary, retrieving them by name as abstract instruments.

So, how does the magic happen? Well, as you may imagine, this is not magic by any standard. We are using the "static variable initialisation trick" (is this a standard name, or just the way I name this in my head?). First of all, we need to declare the instrument dictionary, and define it somewhere. Here is the declaration

// file "instrmed.h"
#ifndef INSTRUMENT_MED_HEADER_
#define INSTRUMENT_MED_HEADER_

// the declaration below needs std::map and std::string
#include <map>
#include <string>
#include "instrbad.h"

extern std::map<std::string, Instrument *> instrument_map;

#endif

(Instead of producing a new header file for Instrument, we reused the interface by including it from "instrmed.h", and added the instrument dictionary; this is ok for simplicity of exposition, but in production code we should add this declaration to "instrbad.h".)

Then we must make sure that the declared dictionary gets filled automatically during the initialisation of the program, i.e. before we get to the opening brace of the main() function. The way to achieve it is as a side-effect of the initialisation of a global variable, since global variables are fully initialised before main() is entered. See it with your own eyes:

// file "instrmed.cpp"
#include <map>
#include <string>
#include "instrmed.h"
// the setup function must see the concrete classes it instantiates
#include "gpianbad.h"
#include "tubblbad.h"

std::map<std::string, Instrument *> instrument_map;

namespace {

  bool instrument_setup() {
    // instrument setup: knowledge of the hierarchy is required
    instrument_map["grand piano"]=new GrandPiano;
    instrument_map["tubular bells"]=new TubularBells;
    return true;
  }

  // initialised after instrument_map, which is defined earlier in this file
  bool const instrument_setup_done=instrument_setup();

}

(We pack it all up into an anonymous namespace so that the global namespace is not polluted with implementation details.) As you can see, the global variable instrument_setup_done gets an initial value, and it gets it from the function that really does the dictionary filling job, instrument_setup(). This function must return something so that the global variable initialisation makes sense; a not-so-bad something to return is true, marking that the function ran all right, which is also consistent with the name chosen for the global variable. In any case, this is only a clue for the human reader and has no functional significance, since it is invisible from outside.

With this automatic dictionary filling apparatus, the file "synthmed.cpp" makes sense, and its execution produces the following output:

planck gaang plonck goong plunck gaang

which is identical to what we obtained from the previous version "synthbad.cpp".

1.1.2.5 The whole shabang, almost

The solution explained in the previous section solves the automatic activation of hierarchy introspection, but does not buy us real hierarchy introspection, since the information is hand-coded in the file "instrmed.cpp", and every time we add a new concrete class, we have to modify that file (one bad point), and we may forget to do it (another bad point).

The whole solution consists simply of extending the "static variable initialisation trick" to every concrete class, and having each concrete class implementation declare itself to the dictionary, through the definition of a global variable with an initial value that results from running the function that really does the dictionary filling as a side-effect, with everything neatly packaged into an anonymous namespace in the concrete class implementation file.

Here we bump into one of those hairy C++ problems that you must be aware of as a C++ programmer: the "static initialisation order fiasco" (see a thorough explanation of the problem and its solution here). Briefly stated, it may (and will, in front of your boss or customer or whatever) happen that the dictionary is not yet initialised when your global variable initial value computation tries to access it. This did not happen in the previous version, since variables in the same compilation unit are guaranteed to be initialised in the textual order, but there is no way to define an initialisation order among different compilation units. So, we will apply the usual "anti static initialisation order fiasco implement", and turn the dictionary into a function that returns a reference to a dictionary. Since this dictionary is very tied to the abstract class Instrument (indeed, it exists only for that class), we will also make it a static method of that class. And since we are at it, we will put in place a whole interface for accessing it, so that only concrete implementations of Instrument get to fill it, but anyone can access it (I hope it's not too much for a single code iteration).

So, here we go:

// file "instrument.h"
#ifndef INSTRUMENT_HEADER_
#define INSTRUMENT_HEADER_

#include <string>
#include <list>

class Instrument {
public:
  virtual ~Instrument() { }
  virtual void play()=0;
  static Instrument *concrete(std::string name);
  static std::list<std::string> concrete_name_list();
protected:
  static bool declare(std::string name, Instrument *instrument);
};

#endif

The hierarchy introspection interface is made up of the static methods declare(), concrete(), and concrete_name_list(). The fact that the interface is static is appropriate since class-static elements are elements of the class, not of the instances, and hierarchy introspection is precisely telling us things about the class (although these things are instances; watch your meta-levels). The static method declare() declares a new concrete instrument instance with a given name. The static method concrete() retrieves an already declared concrete instrument by its name. The static method concrete_name_list() gives a list of all the known instrument names. Since users should not be able to declare new concrete instruments, but concrete instruments should be able to declare themselves, the method declare() is in the protected section of Instrument.

This interface allows us to use a convenient syntax in the user code, while having, as before, no knowledge whatsoever about the concrete instrument classes:

// file "synthesiser.cpp"
#include <iostream>
#include <list>
#include <string>
#include "instrument.h"

int main() {
  // demonstration of the known instruments
  std::list<std::string> instrument_list=Instrument::concrete_name_list();
  std::cout << "Known instruments: ";
  for (std::list<std::string>::const_iterator scan=instrument_list.begin();
       scan!=instrument_list.end();
       ) {
    std::cout << *scan;
    ++scan;
    // print a separator after every name except the last
    if (scan!=instrument_list.end())
      std::cout << ", ";
  }
  std::cout << std::endl;

  // play the music...
  Instrument::concrete("grand piano")->play();
  Instrument::concrete("tubular bells")->play();
  Instrument::concrete("grand piano")->play();
  Instrument::concrete("tubular bells")->play();
  Instrument::concrete("grand piano")->play();
  Instrument::concrete("tubular bells")->play();
  std::cout << std::endl;
}

The implementation of the hierarchy introspection for Instrument is

// file "instrument.cpp"
#include "instrument.h"
#include <map>

namespace {

  class Dictionary {
  public:
    typedef std::map<std::string, Instrument *> Map;
    bool declare(std::string name, Instrument *instrument)
      { dictionary_map[name]=instrument; return true; }
    Instrument *get(std::string name) const {
      // return a null pointer for an unknown name, instead of
      // dereferencing the end iterator
      Map::const_iterator found=dictionary_map.find(name);
      return found==dictionary_map.end() ? 0 : found->second;
    }
    std::list<std::string> get_list() const {
      std::list<std::string> result;
      for (Map::const_iterator scan=dictionary_map.begin();
           scan!=dictionary_map.end();
           ++scan)
        result.push_back(scan->first);
      return result;
    }
    static Dictionary &singleton() {
      // created on first use, and deliberately never destroyed
      static Dictionary *const one=new Dictionary;
      return *one;
    }
  private:
    Dictionary() { } // private constructor for the singleton
    Map dictionary_map;
  };

}

bool Instrument::declare(std::string name, Instrument *instrument)
  { return Dictionary::singleton().declare(name, instrument); }

Instrument *Instrument::concrete(std::string name)
  { return Dictionary::singleton().get(name); }

std::list<std::string> Instrument::concrete_name_list()
  { return Dictionary::singleton().get_list(); }

As before, all the implementation details are in an anonymous namespace. The class Dictionary wraps an STL map of strings to instruments, and provides a singleton of itself, by making its constructor private and having as the sole way to obtain an instance the "anti static initialisation order fiasco implement" method singleton(). The hierarchy introspection interface is then implemented in terms of the Dictionary singleton.

Finally, here are our traditional instrument implementations. Unlike in the previous versions, we can implement them entirely in implementation files, with no header files; what's more, we can put them in anonymous namespaces, so that their footprint is really naught.

// file "grand_piano.cpp"
#include "instrument.h"
#include <iostream>

namespace {

  class GrandPiano
    : public Instrument {
  public:
    GrandPiano() : note_number(0) { }
    void play() {
      if (note_number==0)      std::cout << "planck ";
      else if (note_number==1) std::cout << "plonck ";
      else                     std::cout << "plunck ";
      note_number=(note_number+1)%3;
    }
  private:
    int note_number;
    static bool const declared;
  };
  bool const GrandPiano::declared=
    Instrument::declare("grand piano", new GrandPiano);

}

plus...

// file "tubular_bells.cpp"
#include "instrument.h"
#include <iostream>

namespace {

  class TubularBells
    : public Instrument {
  public:
    TubularBells() : note_number(0) { }
    void play() {
      if (note_number==0) std::cout << "gaang ";
      else                std::cout << "goong ";
      note_number=(note_number+1)%2;
    }
  private:
    int note_number;
    static bool const declared;
  };
  bool const TubularBells::declared=
    Instrument::declare("tubular bells", new TubularBells);

}

The output is

Known instruments: grand piano, tubular bells
planck gaang plonck goong plunck gaang

(check it out). This time it includes the list of known instruments so that we verify that we have not lost it to the increased encapsulation (we can no longer use the raw map). The music produced is the same as before.

You can download the final version of the code here.

1.1.2.6 The real whole shabang with factories and plugins

There remains only one thing to be said: this kind of hierarchy introspection mixes very well with the "factory" pattern. Since the main point of this C++ Pill has been already explained, we will not produce any new code in this section, but I will give you the whys and the clues to do it; the implementation will be left as an exercise for the diligent reader...

The abstract method play() of Instrument is not declared const, which is sensible since the sound may depend on the state of the instrument, and the state may change by playing it (as it indeed does in our implementation). Thus, if you need two grand pianos, for instance, you have to use two separate instances, lest their output gets mixed up. But in our implementation we only have one instance of each concrete instrument. We would like to be able to make new instances by name, instead of retrieving single already-created instances by name.

Another reason for wanting to do that is parameterisation of classes. The instruments may come with parameters to define them; for example, we may want to have tubular bells in several keys (C major, G minor in a higher octave...), which may be specified as an argument to the concrete instrument constructor.

In both cases the solution is to base hierarchy introspection not on concrete class instances, but on concrete class factories. An abstract factory is declared as a class inside the Instrument; this factory features an abstract method that produces a pointer to a newly-allocated concrete Instrument implementation instance. For each concrete Instrument implementation we implement a concrete Instrument factory class that constructs instances of that Instrument implementation class, derived from the abstract factory. The centrepiece of this change in perspective is that instead of declaring and accessing concrete instances by name, we declare and access concrete factories by name. This would be the solution for the first problem, where we need several instances of the same concrete instrument.

As for having parameterisable instruments, the solution needs a little extra. We need to decide on a single signature for the constructors of all Instrument implementations (it may be based, for instance, on something as general as a string). The factory must accept that same signature for its instance-creating method, and pass down the arguments to the concrete Instrument class that it instantiates.

A final word about plugins: all this mixes perfectly well with them. Plugins are essentially dynamically loaded object code developed independently from the main program, which makes it utterly impossible to anticipate the concrete implementations that can be fitted. If the techniques described in this C++ Pill are used, whenever the main program loads a plugin, the implementation or implementations that it provides will declare themselves, and will thus be automatically available to the rest of the program.

Website produced by Skribe; last make 2009-10-29.

© 2007, 2008, 2009 Miguel González Cuadrado (mgcuadrado@gmail.com)