Well, that’s a terrible idea!

A few weeks ago a coworker and I were complaining about how annoying it is to make self-contained tools and utilities. It seems like everything depends on 1000 DLLs these days. Our whole engine is statically linked because a) Is anybody really going to build off a binary-only version and then want to upgrade to a new version without recompiling? People tout this advantage a lot, but honestly I'd like to see somebody do that with non-trivial C++ code. And b) once you split things into DLLs you can't statically link the CRT anymore (since each DLL would get its own heap and you need to remember who allocated what pointer, etc.), which means you need a CRT redistributable to be installed on PCs that run your software. Microsoft has this local manifest thing for jamming all that crazy WinSxS goodness/badness into your folder, but you're still stuck with a folder of crap.

Even then, sometimes you have the source for your dependencies and sometimes you don't... and sometimes even when you do it's so annoying to build them that you would go to extreme and terrible lengths to use the binaries you are given. But I'm not naming any names here :). Also there can be sub-dependencies that are binaries and require a certain CRT type (debug/release/DLL/static). The Microsoft compiler won't let you combine two CRTs into one binary unfortunately, but if it did this would work. You already couldn't have malloc-ed in one and freed in the other anyway; the linker just doesn't know that you know you can't do that. Also there would be symbol clashes since there would be two "malloc" functions, so you would need some kind of pre-link symbol-binding-then-wiping pass:

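[Figure: diagram of the hypothetical pre-link symbol-binding-then-wiping pass, with the nonexistent steps shown in green]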

Unfortunately the green stuff doesn't exist so that leaves us with a few other options:

Pack the DLLs in a resource and extract them to a temp directory

This is the most straightforward and self-explanatory approach, but it means you can only use DLLs that are dynamically loaded, since your unpacking code hasn't run yet when Windows tries to find your load-time dependencies. (Yes, you can also delay-load, I'm getting to that!) Also this is lame because...well, writing stuff to disk when you shouldn't have to is lame, and also because it's basically impossible not to pollute the system with more copies of the DLLs each time you are run. How does your app delete the DLLs after it exits? There is the FILE_FLAG_DELETE_ON_CLOSE flag, but that requires you to use FILE_SHARE_DELETE and that's not allowed for a loaded module on Windows.

Use delay-loaded DLLs

There is a special linker option (/DELAYLOAD) that tells the linker to auto-generate stubs around all the call sites into a DLL. The stubs do a LoadLibrary/GetProcAddress sequence to find the address of the import just in time. Ok great! This gives us a chance to pre-load the DLLs from the temp folder before we do anything that could possibly call a function from the DLL. This works because Windows only uses the name of the DLL and not the path to check if the DLL is already loaded.

Still, it would be really nice to avoid having to leak those temporary files (and also writing to disk is for losers). Enter __pfnDliNotifyHook2. This is a user callback in Microsoft's delay-load helper library (delayimp.lib) that allows you to customize the code used to locate the delay-loaded functions. So if we could just call LoadLibraryFromMemory we would be all set! Unfortunately that function doesn't exist. There is no way of loading a module in Windows except from a file on disk. I think this is actually because of a weird 16-bit Windows legacy choice where Windows doesn't actually write pages from executables to the swap file and instead loads them from the file again directly when needed (or at least it used to do that back in the day). But for whatever reason you can't mess with the files of loaded executables on Windows, unlike other OSes.

If we want to keep going down this path we need to parse the PE file format ourselves and write our own implementation of the Windows DLL loading code. This sounds pretty sketchy but people out there have done it (see https://github.com/fancycode/MemoryModule). It has the weird effect that the DLL isn't actually "loaded" as far as Windows is concerned, since the code can't update any of the OS-internal structures. Also I would be worried about shipping this code. What if something inside the Windows PE loader changes and our code goes out of sync? But since we are parsing all the headers already this leads to option #3...

Somehow turn the compiled DLL into a static library and link against that

This sounds totally crazy at first glance but when I checked out the spec for the PE files (the format of DLL and EXE files) and COFF files (the format of .obj files and by extension the contents of .lib files) I noticed that the formats are actually pretty similar (they are even defined by a single specification document).
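To make the similarity a bit more concrete, here is a minimal C++ sketch (not taken from DLLMasher) that locates the 20-byte COFF file header in either kind of file. The names `findCoffHeader` and `parseCoffHeader` are made up for illustration; the field layout, the "PE\0\0" signature and the `e_lfanew` offset at 0x3C come from the PE/COFF spec. It assumes ordinary object files (not the bigobj or anonymous-object variants) and simply treats anything that doesn't start with "MZ" as an .obj.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Little-endian readers, so the sketch does not depend on host byte order.
inline std::uint16_t readU16(const Bytes& b, std::size_t at) {
    return static_cast<std::uint16_t>(b[at] | (b[at + 1] << 8));
}
inline std::uint32_t readU32(const Bytes& b, std::size_t at) {
    return static_cast<std::uint32_t>(b[at]) |
           (static_cast<std::uint32_t>(b[at + 1]) << 8) |
           (static_cast<std::uint32_t>(b[at + 2]) << 16) |
           (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

// The COFF file header. An .obj starts with it; a DLL/EXE has the same
// header right after its "PE\0\0" signature. (Field names are illustrative.)
struct CoffFileHeader {
    std::uint16_t machine;               // e.g. 0x8664 for x64
    std::uint16_t numberOfSections;
    std::uint32_t timeDateStamp;
    std::uint32_t pointerToSymbolTable;  // normally 0 in an image
    std::uint32_t numberOfSymbols;
    std::uint16_t sizeOfOptionalHeader;  // 0 in an object file
    std::uint16_t characteristics;
};

constexpr std::size_t kCoffHeaderSize = 20;

// Finds the file offset of the COFF header. Returns false if the file is
// too small or has an "MZ" stub without a valid PE signature.
inline bool findCoffHeader(const Bytes& file, std::size_t& offset) {
    if (file.size() >= 0x40 && file[0] == 'M' && file[1] == 'Z') {
        const std::uint64_t pe = readU32(file, 0x3C);  // e_lfanew
        if (pe + 4 + kCoffHeaderSize > file.size()) return false;
        if (std::memcmp(file.data() + pe, "PE\0\0", 4) != 0) return false;
        offset = static_cast<std::size_t>(pe + 4);
        return true;
    }
    if (file.size() >= kCoffHeaderSize) {  // assumption: it's a plain .obj
        offset = 0;
        return true;
    }
    return false;
}

inline CoffFileHeader parseCoffHeader(const Bytes& file, std::size_t at) {
    assert(at + kCoffHeaderSize <= file.size());
    CoffFileHeader h{};
    h.machine = readU16(file, at);
    h.numberOfSections = readU16(file, at + 2);
    h.timeDateStamp = readU32(file, at + 4);
    h.pointerToSymbolTable = readU32(file, at + 8);
    h.numberOfSymbols = readU32(file, at + 12);
    h.sizeOfOptionalHeader = readU16(file, at + 16);
    h.characteristics = readU16(file, at + 18);
    return h;
}
```

Once the offset is known, the same `parseCoffHeader` works on both a DLL and an .obj, which is the whole point: past the DOS stub and signature, the two formats start out identically.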

Theoretically all the information you need to run the program is in the DLL so it should be possible to extract the binary code and data from the DLL and repack it into a static library. There would be no information about symbols only used internally in the module but that's OK, in fact it's even good since those duplicate symbols are the reason you can't normally link together mismatched object files.
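As a rough illustration of the "extract the binary code and data" step, here is a small C++ sketch that walks the section table and copies out each section's raw bytes. This is not how DLLMasher is actually implemented, just the general idea; `readSections` and the `Section` struct are hypothetical names, while the 40-byte section header layout is from the PE/COFF spec. It assumes you already know the file offset of the COFF file header, and it ignores the "/nnn" long section names that object files can use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

inline std::uint16_t le16(const Bytes& b, std::size_t at) {
    return static_cast<std::uint16_t>(b[at] | (b[at + 1] << 8));
}
inline std::uint32_t le32(const Bytes& b, std::size_t at) {
    return static_cast<std::uint32_t>(b[at]) |
           (static_cast<std::uint32_t>(b[at + 1]) << 8) |
           (static_cast<std::uint32_t>(b[at + 2]) << 16) |
           (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

struct Section {
    std::string name;     // up to 8 characters, e.g. ".text"
    std::uint32_t rva;    // VirtualAddress: where the loader would map it
    std::uint32_t flags;  // Characteristics (code/data/read/write/execute...)
    Bytes raw;            // the bytes as stored in the file
};

// coffAt is the file offset of the 20-byte COFF file header.
inline std::vector<Section> readSections(const Bytes& file, std::size_t coffAt) {
    assert(coffAt + 20 <= file.size());
    std::vector<Section> out;
    const std::size_t count = le16(file, coffAt + 2);     // NumberOfSections
    const std::size_t optSize = le16(file, coffAt + 16);  // SizeOfOptionalHeader
    const std::size_t table = coffAt + 20 + optSize;      // section table start

    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t h = table + i * 40;  // each section header is 40 bytes
        if (h + 40 > file.size()) break;       // truncated file, stop here

        Section s;
        for (std::size_t c = 0; c < 8 && file[h + c] != 0; ++c)
            s.name.push_back(static_cast<char>(file[h + c]));
        s.rva = le32(file, h + 12);
        s.flags = le32(file, h + 36);

        const std::uint64_t rawSize = le32(file, h + 16);  // SizeOfRawData
        const std::uint64_t rawAt = le32(file, h + 20);    // PointerToRawData
        // Sections with no initialized data (e.g. .bss) have nothing to copy.
        if (rawSize != 0 && rawAt + rawSize <= file.size()) {
            const auto first = file.begin() + static_cast<std::ptrdiff_t>(rawAt);
            s.raw.assign(first, first + static_cast<std::ptrdiff_t>(rawSize));
        }
        out.push_back(std::move(s));
    }
    return out;
}
```

Getting the bytes out is the easy part. The code in those sections has already been linked against a base address and against the DLL's own imports, so anything that repacks it into a .lib still has to deal with relocations and symbols.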

With this information I decided to try and write a tool that does this, honestly mostly just for the hell of it. It's called "DLLMasher" and yes it actually does work (with a few caveats) and I learned a lot of interesting stuff about how Windows binaries work along the way.

So next time: How I went about writing DLLMasher and how the hell it even works at all. For the impatient, you can find the source here: https://github.com/lmagder/DLLMasher
