Saturday, April 29, 2006

Inlines are Evil!

We just ran into an interesting problem with inline functions and shared libraries. Consider the following piece of code:

// inlineContainer.h
class inlineContainer {
     inline void setA(int a) {
         _a = a;      
     }

     int _a;
     int _b;
     void setB(int b);
};

Now, consider the following piece of code (test.cc):

#include "inlineContainer.h"

void testMe() {
   inlineContainer mine;

   mine.setB(1);
}

The file inlineContainer.cc contains the implementation for setB. It is compiled into a shared library. Additionally, nothing in your test.cc file calls setA. Most people would expect that when you compile test.o, it wouldn't have any symbols from inlineContainer defined.

They'd be wrong.

The C++ standard says that the compiler should include a non-inline version of a function, just in case it can't inline it all of the time. GCC doesn't wait for the function to be called before it creates a copy. It creates a copy as soon as the inline is seen. So, test.o ends up with a copy of setA, albeit a weak one.

Still, that wouldn't be too much of a problem in itself, however, the shared library will also have a weak copy. This leads us to the actual problem, what happens when we want to change the implementation of setA? Even though we have a copy of the new version in the shared library, unless we recompile test.cc, we will more than likely end up with an interestingly hard to find bug.

The problem is the linker. When there are multiple copies of the symbol that are of equal "weakness", the first one seen by the linker is used. Since the shared library will be seen after the old one in test.o, the linker will use that one.

So, we end up with a very strange bug. It won't happen every time the function is called, only in the situations where the compiler decides the function shoudln't (or can't) be inlined. That means that invocations separated by a couple of lines will have vastly different behaviours.

Of course, this all breaks the C++ "one definition rule", which requires an inline function to be defined with identical meaning in every translation unit. Additionally, compilers and linkers are not required to detect violations of the rule. GCC is obviously one of the ones that can't. ref: http://en.wikipedia.org/wiki/One_Definition_Rule

How do we fix it?

First off, don't expose inlines to code that isn't going to be part of the same shared library. It isn't part of your exposed interface, don't let them see it. Do this through private, friends, any favourite way you have. One way would be to put the inlines inside a #if that is only presented to code that is allowed to see them.

However, that won't solve the problem for code that has already been delivered. That is harder. In fact, I'm not sure it's even possible. Testing with G++ produces interesting results:

Consider two files:

// test.cc
#include 
#include 
#include 

class inlineContainer {
 public:
   void setA(int a);
   void setB(int b);
   int _a;
   int _b;
};

inline void inlineContainer::setA(int a) {
   _a = 1;
}

void inlineContainer::setB(int b) {
   _b = b;
}

void test_func(inlineContainer &a);

int main(int argc, char **argv){
   inlineContainer funky;

   test_func(funky);
   std::cerr <<>

class inlineContainer {
 public:
   void setA(int a);
   void setB(int b);
 private:
   int _a;
   int _b;
};

void inlineContainer::setA(int a) {
   _a = a;
}

void test_func(inlineContainer &a) {
 a.setA(15);
}

# compilation instructions
g++ -fPIC -c test2.cc
g++ -shared test2.o -o libtest2.so
g++ test.cc libtest2.so -o testA
g++ test.cc test2.o -o testB

Now, what happens when we run testA?

[jpollock@pollock ~]$ ./testA
1
1

When we run testB?

[jpollock@pollock ~]$ ./testB
15
5

As you can see, the shared library version ALWAYS uses the copy produced in the main executable, even though it is defined as weak. The staticly linked version uses the newer version in test2.cc, since it is a normal, strong symbol.

It becomes even more interesting if we change test.cc to remove the call to setA. When we recompile an re-run, we see that while setA is defined in testA, the version in the shared library is the one that is used.

That indicates that regardless of the weakness of the symbol, if it's in the main executable and called, it will always be used in preference to the one in the shared library.

Scary, huh? That means we can't override that symbol. Even if we ignore the problems presented by versions that are actually inlined, we still cannot override that symbol!

That indicates to me that the only way we can change setA is to either hack the name of setA to ensure that the new version of the library has a different name, or redeliver everything that contains setA as a weak (or otherwise) symbol.

Of course, you could be smart and not expose inline functions outside of your library.

No comments: