Here Be Dragons

Don't get me wrong, I like C++, but sometimes you find things that make you question the sanity of its designers. This is a long and crazy story, so I should probably start from the beginning. My game engine uses a message-passing system to marshal calls between threads. Basically, you define a component's interface and the messages it uses in an XML file:

<message name ="Viewport Axis Input" subsystem="Graphics" id="0x02F1">
  <field type="INPUT_CONTROLLER_CODE" name="Controller"/>
  <field type="uint32_t" name="ControllerIndex"/>
  <field type="AXIS_CODE" name="Axis"/>
  <field type="float" name="Value"/>
</message>
<!-- ...snip... -->
<component subsystem="Graphics" type="Window Viewport" id ="0x2">
  <sends name="Viewport Closed"/>
  <sends name="Viewport Resized"/>
  <sends name="Lost Focus"/>
  <sends name="Got Focus"/>
  <sends name="Viewport Button Input"/>
  <sends name="Viewport Axis Input"/>
</component>

Then, as a pre-build step, a tool I wrote processes the XML files and generates structs, serialization and deserialization code, and allocators and destructors for each message, along with abstract base classes for implementing the components (i.e. the skeletons, in normal RMI terminology) and client-side stubs. However, since this code is generated, there needs to be a generic way of allocating a message at run time based on its type ID. You could implement this as a giant switch statement, but that's pretty terrible, and then your code starts to look like this:

An unholy union of generated and hand-written code
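As a rough illustration of the switch approach (this is not the engine's actual code), a type-ID switch over generated message structs might look something like the sketch below. The `Message` base class, the placeholder enums, the `AllocateMessage` function, and the ID value for Viewport Resized are all assumptions made up for the example; only the Viewport Axis Input fields and its ID come from the XML above.

```cpp
#include <cstdint>

// Placeholder enums: the real INPUT_CONTROLLER_CODE and AXIS_CODE
// definitions are not shown in this post.
enum INPUT_CONTROLLER_CODE { CONTROLLER_UNKNOWN = 0 };
enum AXIS_CODE { AXIS_UNKNOWN = 0 };

const uint32_t GRAPHICS_VIEWPORT_AXIS_INPUT_MESSAGE_ID = 0x02F1; // id from the XML
const uint32_t GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID    = 0x02F0; // assumed value

// Hypothetical common base class for generated messages.
struct Message {
    virtual ~Message() {}
    virtual uint32_t TypeId() const = 0;
};

// Fields mirror the <field> elements of "Viewport Axis Input".
struct GraphicsViewportAxisInputMessage : Message {
    INPUT_CONTROLLER_CODE Controller;
    uint32_t ControllerIndex;
    AXIS_CODE Axis;
    float Value;

    GraphicsViewportAxisInputMessage()
        : Controller(CONTROLLER_UNKNOWN), ControllerIndex(0),
          Axis(AXIS_UNKNOWN), Value(0.0f) {}
    uint32_t TypeId() const { return GRAPHICS_VIEWPORT_AXIS_INPUT_MESSAGE_ID; }
};

// Fields omitted: the XML for this message is not shown above.
struct GraphicsViewportResizedMessage : Message {
    uint32_t TypeId() const { return GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID; }
};

// The "giant switch": every new message type needs another case here.
Message* AllocateMessage(uint32_t id) {
    switch (id) {
    case GRAPHICS_VIEWPORT_AXIS_INPUT_MESSAGE_ID:
        return new GraphicsViewportAxisInputMessage();
    case GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID:
        return new GraphicsViewportResizedMessage();
    default:
        return nullptr; // unknown type ID
    }
}
```

Every message type added to the XML means another hand-maintained (or generated-into-hand-written-code) case, which is where the ugliness comes from.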

This was getting way too ugly, and thread-unsafe too since the code resided in the component skeletons themselves, so I decided a new approach was in order. Since all the message allocators have the same signature, I can easily store a map of IDs to function pointers and allocate the messages that way. But how is the map populated? Well, one way would be to write a function that adds them based on the predictably named generated functions, but "real programmers" don't use hardcoded lists! The solution is to use (and/or abuse) C++'s support for static initialization. Enter the MessageAllocator class:

MessageAllocator g_GraphicsViewportResizedMessageAlloc(GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID, 
  (AllocateMessageFunc)Create_GraphicsViewportResizedMessage);

The code generator emits a global, statically initialized instance of the class for each message type. This allows the constructor to add the message allocator to the map, which is a static member of the class. However, at this point the so-called "static initialization order fiasco" rears its ugly head. How can we be sure that the map has even been constructed when the first allocator instance is initialized? The answer is to use a somewhat odd C++ vagary (initially I used another, more complex method, but thanks to Martin York's response to my post on Stack Overflow I'm just going to skip that):

std::map<uint32_t, AllocateMessageFunc>& MessageAllocator::get_s_map()
{
  static std::map<uint32_t, AllocateMessageFunc> s_map;
  return s_map;
}

This defers construction of the map until the function is first called, and subsequent calls return the same map. As a bonus, the map also gets correctly destroyed when your program exits. Of course, it would never be this easy.
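To show how the pieces fit together, here is a minimal sketch of a self-registering allocator built around that function-local static. It is an approximation, not the engine's real class: the `AllocateMessageFunc` signature (`void* (*)()`), the `Allocate` lookup helper, the message struct's fields, and the ID value are all assumptions. The create function is declared with the assumed signature directly so the sketch needs no function-pointer cast.

```cpp
#include <cstdint>
#include <map>

// Assumed signature: every generated allocator takes no arguments
// and returns an untyped pointer to a freshly allocated message.
typedef void* (*AllocateMessageFunc)();

class MessageAllocator {
public:
    // Runs during static initialization of each generated global
    // and records the allocator under its message type ID.
    MessageAllocator(uint32_t id, AllocateMessageFunc func) {
        get_s_map()[id] = func;
    }

    // Hypothetical lookup helper: allocate by type ID, or return
    // nullptr when no allocator was registered for that ID.
    static void* Allocate(uint32_t id) {
        std::map<uint32_t, AllocateMessageFunc>& m = get_s_map();
        std::map<uint32_t, AllocateMessageFunc>::const_iterator it = m.find(id);
        return it == m.end() ? nullptr : it->second();
    }

private:
    static std::map<uint32_t, AllocateMessageFunc>& get_s_map();
};

// The map is constructed on the first call, so it exists before any
// registering constructor touches it, whatever the init order is.
std::map<uint32_t, AllocateMessageFunc>& MessageAllocator::get_s_map() {
    static std::map<uint32_t, AllocateMessageFunc> s_map;
    return s_map;
}

// --- What the generator might emit for one message (hypothetical) ---
const uint32_t GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID = 0x02F0; // assumed value

struct GraphicsViewportResizedMessage {
    uint32_t Width;  // assumed field
    uint32_t Height; // assumed field
};

void* Create_GraphicsViewportResizedMessage() {
    return new GraphicsViewportResizedMessage();
}

// Self-registration: constructing this global adds the allocator to the map.
MessageAllocator g_GraphicsViewportResizedMessageAlloc(
    GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID,
    Create_GraphicsViewportResizedMessage);
```

With this in place, `MessageAllocator::Allocate(GRAPHICS_VIEWPORT_RESIZED_MESSAGE_ID)` returns a new message that the caller casts back to the concrete type and deletes. Note that a C++11 compiler guarantees thread-safe initialization of the function-local static; older compilers such as Visual Studio 2008 do not, which matters only if the first call can happen on more than one thread at once.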

When I actually ran the program, most messages showed up, but once in a while one would go missing. Why? Well, it turns out that Section 3.6.2 of the C++ Standard says: "It is implementation-defined whether the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope with static storage duration is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized." This means that the compiler is free to effectively ignore a static initializer if nothing explicitly references anything in its translation unit, since it can "defer" its initialization indefinitely. Normally this sounds reasonable: if nothing references the file, then what's the point of including it at all? But what if, as in my case, the file is only reachable through a reference that the static initializer itself populates?

To be fair, the version of the standard I have (a draft of the upcoming version of the language) includes a footnote that states "An object defined in namespace scope having initialization with side-effects must be initialized even if it is not used". However, as far as I can tell, my compiler of choice, Visual Studio 2008, does not implement this. Additionally, it's not clear that this would prevent entire object files from being discarded by the linker, as is the case here.

The answer: terrible hacks. The Microsoft compiler lets you embed extra linker options using a pragma. Using this, I was able to force a reference to the object file by having the code generator sprinkle the header file with pragmas:

#ifdef _MSC_VER
#ifdef _WIN64
#pragma comment(linker, "/include:Create_GraphicsViewportButtonInputMessage")
#else
#pragma comment(linker, "/include:_Create_GraphicsViewportButtonInputMessage")
#endif
#endif

But it gets better. The linker only deals with raw symbols, so I need the mangled name of the function, which the code generator doesn't have. To work around this, I had to specify C linkage on the allocator functions to disable name mangling so that their raw symbols would have predictable names. Finally, and puzzlingly, the 64-bit compiler omits the leading underscore that C linkage symbols get in 32-bit builds.
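Put together, the generated code for one message might look roughly like the sketch below. It is collapsed into a single file for illustration; in the setup described above the pragma lives in the generated header and the definition in a separate object file. The struct's field and the `void*` return type are assumptions.

```cpp
#include <cstdint>

// Hypothetical generated message type; the field is a placeholder.
struct GraphicsViewportButtonInputMessage {
    uint32_t ControllerIndex; // assumed field
};

// C linkage turns off C++ name mangling, so the symbol name is
// predictable: "Create_GraphicsViewportButtonInputMessage", with a
// leading underscore added on 32-bit x86 builds.
extern "C" void* Create_GraphicsViewportButtonInputMessage() {
    return new GraphicsViewportButtonInputMessage();
}

// MSVC only: ask the linker to treat the symbol as referenced so the
// object file that defines it is not discarded. Other compilers skip
// this block entirely.
#ifdef _MSC_VER
#ifdef _WIN64
#pragma comment(linker, "/include:Create_GraphicsViewportButtonInputMessage")
#else
#pragma comment(linker, "/include:_Create_GraphicsViewportButtonInputMessage")
#endif
#endif
```

Because the pragma is guarded by `_MSC_VER`, the same generated code still compiles elsewhere, though other toolchains would need their own way of keeping the object file (not covered here).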

It works and I'm probably going to programming hell, but at least I'm not eating quiche.
