Sunday, 28 February 2010

Fooling the ODR (and the compiler)

I just discovered a technique that is prety convenient but rather not self-explaining, nonportable (?), and possibly a maintenance nightmare. So you should never use it, should you? Except, as it's sort of cool, if you'd like to go a little on the geeky side...

1. The need for speed

Imagine you are writing some quick classes in the Java style (i.e. no header files), locate them in some C++ source file, and forget about them. Then you discover that you need to refactor your program (in my case it was a custom test framework) and to move some classes out to a separate file (a.k.a. compilation unit). Now you shun to do the right thing (i.e. to refactor) because you'd have to cut up your classes in two: the .h file and the .cc file, and to duplicate some (non always trivial) amount of code. So yo aren't doing that...

But stop, following trick did it for me: I defined the classes in a separate .h file and kept the .cc file with the extracted classes unchanged. I included the new .h file in the rest of the old .cc file where the classes were removed from. Then I just linked the the whole thing, et voilá, everything worked like a breeze - a refactoring as we like it!

2. Why does it work?

But wait, we spared some editing work, but didn't we violate the One Definition Rule of C++? I'd say we did. So why didn't the compiler complain? Well, as to answer it, we've got to look at the example a little closer.

The extracted classes looked like that:

TestThreads.cpp:

  #include "WorkerThread.hpp"

// test thread A
class ThreadA : public WorkerThread
{
public:
...
private:
...
} g_workerA;
  // test thread B
class ThreadB : public WorkerThread
{
public:
...
private:
...
} g_workerB;
Note that there's no TestThreads.hpp definition file here, thus we don't have a direct ODR violation for the compiler to complain! The newly created definitions file TestThreads.hpp looked like this:
  #include "WorkerThread.hpp"
  class ThreadA : public WorkerThread
{
public:
virtual void addRequest(Request& r);
...
};
  class ThreadB : public WorkerThread
{
public:
virtual void addRequest(Request& r);
...
};
The WorkerThread.hpp definition file looks like this:
  class WorkerThread
{
public:
virtual void addRequest(Request& r);
...
};
And the rest of the TestMain.cpp file where the TestThread classes were previously placed:
  #include "TestThreads.hpp"
  extern TestThreadA g_workerA;

extern TestThreadB g_workerB;
  g_workerA.addRequest(req);
Can you see why we could fool the ODR? Well, in this special case we didn't used the constructors directly in the TestMain.cpp file, only the methods of the WorkerThread superclass which then call the appropriate virtuall methods defined in the ThreadA class.
The linker then generates a reference to the virtual ThreadA::addRequest() method. Well, guess what, a method with the very name can be found in TestThreads.cpp, and the linker can resolve it no probs!

Although we clearly have two definitions of the ThreadA class, we get away with it! The ultimate reason for that is that the C++ compiler doesn't have the global knowledge of the whole program - it works only on the compilation unit level, and the relinquishes the responsibility for the complete programm to the linker! I.e. there's no whole programm optimization possible (at least not in the standard model, although Visual C++ compiler can do it).

3. How to extend the "hidden" classes?

Suppose we'd like to call a new method in the TestThread.cpp file from the outside. You cannot call this method directly, as we do not include the real definitions of the TestThread classes, but only the "fake" ones. Alas, there is a price to pay for fooling the compiler and for beeing lazy! The price here is, that we have to modify the base class (TestWorker.hpp) by adding a new public virtual method, if we need a new TestThreadA method to be called in the TestMain.cpp file!

Not pretty, uh? You know, sometimes you can go away with breaking the rules, but in the end you'll have to face the consequences. In the art of software design, your skills are measured at your ability to estimate for how long you will go away with laziness and tricks, and when you have to resort to the known, sound and boring "best practices".

But nonetheless, the price I had to pay was OK for the deeper knowledge of the compiler-linker cooperation and the increased knowlede of my own code.

No comments: