You have a face, I have a face, so check out this article about something cool I worked on!

fxguide just posted a pretty extensive article about the state of the art in face rendering: "The Art of Digital Faces at ICT – Digital Emily to Digital Ira", and it has a section about the NVIDIA Digital Ira/FaceWorks project I worked on.

We've actually made two different versions of Digital Ira so far. There's the "super-duper" version that runs on a GTX Titan (which you can even download here if you happen to own one):

If you scroll down you might notice that I also did part of the Dawn demo that gets skewered at the beginning. <shrug> New and pretty becomes old and busted pretty fast with this stuff. :)

We also made a mobile version using the same data set, but with a simplified shading model. Mobile chips have come a long way since back in the day. Plus handwriting NEON assembly is fun.

It's not downloadable yet since there are no devices in the wild that could run it, but it went over pretty well at SIGGRAPH so hopefully it makes it out there eventually.

Well, that’s a terrible idea!

A few weeks ago a coworker and I were complaining about how annoying it is to make self-contained tools and utilities. It seems like everything depends on 1000 DLLs these days. Our whole engine is statically linked because a) is anybody really going to build against a binary-only version and then want to upgrade to a new version without recompiling? People tout this advantage of DLLs a lot, but honestly I'd like to see somebody do that with non-trivial C++ code. And b) with DLLs you can't statically link the CRT anymore (since each DLL would get its own heap and you would need to remember who allocated what pointer, etc.), which means you need a CRT redistributable installed on every PC that runs your software. Microsoft has this local manifest thing for jamming all that crazy WinSxS goodness/badness into your folder, but you're still stuck with a folder of crap.

Even then, sometimes you have the source for your dependencies and sometimes you don't... and sometimes even when you do it's so annoying to build them that you would go to extreme and terrible lengths to use the binaries you are given. But I'm not naming any names here :). Also there can be sub-dependencies that are binaries and require a certain CRT type (debug/release/DLL/static). The Microsoft compiler won't let you combine two CRTs into one binary unfortunately, but if it did this would work. You couldn't have malloc-ed in one and freed in the other before anyway; it just doesn't know that you know you can't do that. Also there would be symbol clashes since there would be two "malloc" functions, so you would need some kind of pre-link symbol-binding-then-wiping pass:

[Image: clip_image002]

Unfortunately the green stuff doesn't exist so that leaves us with a few other options:

Pack the DLLs in a resource and extract them to a temp directory

This is the most straightforward and self-explanatory approach, but it means you can't use DLLs that aren't dynamically loaded, since your unpacking code hasn't run yet when Windows tries to find your dependencies. (Yes, you can also delay-load, I'm getting to that!) Also this is lame because... well, writing stuff to disk when you shouldn't have to is lame, and also because it's basically impossible not to pollute the system with more copies of the DLLs each time you are run. How does your app delete the DLLs after it exits? There is the FILE_FLAG_DELETE_ON_CLOSE flag, but that requires you to use FILE_SHARE_DELETE and that's not allowed for a loaded module on Windows.
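Here is a minimal sketch of the unpacking step, in Python just to keep it short and portable. The function name, the prefix, and the fake payload are all made up for illustration; a real tool would pull the bytes out of its own resource section. It mostly shows why the leak happens: every run gets a fresh directory and nothing ever removes the old ones.

```python
import tempfile
from pathlib import Path
from typing import Dict


def unpack_embedded_dlls(payloads: Dict[str, bytes], prefix: str = "myapp_dlls_") -> Path:
    """Write each embedded DLL image into a brand-new temp directory."""
    target = Path(tempfile.mkdtemp(prefix=prefix))
    for file_name, image in payloads.items():
        (target / file_name).write_bytes(image)
    return target


# Hypothetical payload; a real tool would read these bytes from its resources.
fake_payloads = {"foo.dll": b"MZ fake image bytes"}

# Two "runs" of the app: each one leaves its own copy behind on disk.
first_run = unpack_embedded_dlls(fake_payloads)
second_run = unpack_embedded_dlls(fake_payloads)
```

After this, `first_run` and `second_run` are two different directories that both contain `foo.dll`, which is exactly the pollution problem described above.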

Use delay-loaded DLLs

There is a special linker option that tells the linker to auto-generate stubs around all the call sites into a DLL. Each stub does a LoadLibrary/GetProcAddress sequence to find the address of the import just in time. Ok great! This gives us a chance to pre-load the DLLs from the temp folder before we do anything that could possibly call a function from the DLL. This works because Windows only uses the name of the DLL and not the path to check if the DLL is loaded.
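To make the idea concrete, here is a toy model of a delay-load stub in Python. It is not the real MSVC helper: `FakeLoader` and `DelayLoadStub` are hypothetical names, and the loader only imitates the behavior described above (modules tracked by base name, path ignored once loaded). It is a sketch of the lazy-binding trick, nothing more.

```python
import ntpath
from typing import Callable, Dict

Exports = Dict[str, Callable]


class FakeLoader:
    """Toy stand-in for the OS loader. Loaded modules are keyed by base name only."""

    def __init__(self, files_on_disk: Dict[str, Exports]):
        self.files_on_disk = files_on_disk    # full path -> exported functions
        self.loaded: Dict[str, Exports] = {}  # lowercased base name -> exports

    def load_library(self, path: str) -> Exports:
        key = ntpath.basename(path).lower()
        if key in self.loaded:                # already loaded: the path is ignored
            return self.loaded[key]
        if path not in self.files_on_disk:
            raise FileNotFoundError(path)
        self.loaded[key] = self.files_on_disk[path]
        return self.loaded[key]


class DelayLoadStub:
    """Resolves its target on the first call, then caches it."""

    def __init__(self, loader: FakeLoader, dll_name: str, symbol: str):
        self.loader, self.dll_name, self.symbol = loader, dll_name, symbol
        self.target = None

    def __call__(self, *args):
        if self.target is None:
            exports = self.loader.load_library(self.dll_name)  # plays "LoadLibrary"
            self.target = exports[self.symbol]                 # plays "GetProcAddress"
        return self.target(*args)


loader = FakeLoader({r"C:\Temp\unpacked\foo.dll": {"add": lambda a, b: a + b}})
add = DelayLoadStub(loader, "foo.dll", "add")     # nothing is loaded yet
loader.load_library(r"C:\Temp\unpacked\foo.dll")  # pre-load by full path
total = add(2, 3)                                 # stub finds "foo.dll" by name
```

Without the pre-load line the stub would ask for plain `foo.dll`, which is not on the fake disk, and the first call would fail. With it, the lookup by name hits the already-loaded module.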

Still, it would be really nice to avoid having to leak those temporary files (and also writing to disk is for losers). Enter __pfnDliNotifyHook2: a user callback in Microsoft's delay-load helper that lets you customize the code used to locate the delay-loaded DLLs and functions. So if we could just call LoadLibraryFromMemory from the hook we would be all set! Unfortunately that function doesn't exist. There is no built-in way of loading a module in Windows except from a file on disk. I think this is actually because of a weird 16-bit Windows legacy choice where Windows doesn't actually write pages from executables to the swap file and instead loads them from the file again directly when needed (or at least it used to do that back in the day). But for whatever reason, you can't mess with the files of loaded executables on Windows, unlike other OSes.

If we want to keep going down this path we need to parse the PE file format ourselves and write our own implementation of the Windows DLL loading code. This sounds pretty sketchy, but people out there have done it (see https://github.com/fancycode/MemoryModule). It has the weird effect that the DLL isn't actually "loaded" as far as Windows is concerned, since the code can't update any of the OS-internal structures. Also I would be worried about shipping this code. What if something inside the Windows PE loader changes and our code goes out of sync? But since we are parsing all the headers already this leads to option #3...
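For a feel of what "parsing the PE format ourselves" starts with, here is a small Python sketch that walks from the DOS header to the section table. The field layouts follow the published PE/COFF spec; the function names and the tiny synthetic image are my own illustration, and this is nowhere near a loader (no relocations, imports, or memory mapping).

```python
import struct
from typing import List, NamedTuple

# COFF file header: Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
# NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes).
COFF_HEADER = struct.Struct("<HHIIIHH")
# Section header: Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics (40 bytes).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


class Section(NamedTuple):
    name: str
    virtual_address: int
    raw_size: int
    raw_offset: int


def parse_pe_sections(image: bytes) -> List[Section]:
    if image[:2] != b"MZ":
        raise ValueError("missing MZ DOS signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff_offset = pe_offset + 4
    fields = COFF_HEADER.unpack_from(image, coff_offset)
    num_sections, optional_size = fields[1], fields[5]
    # The section table starts right after the optional header.
    table_offset = coff_offset + COFF_HEADER.size + optional_size
    sections = []
    for i in range(num_sections):
        s = SECTION_HEADER.unpack_from(image, table_offset + i * SECTION_HEADER.size)
        name = s[0].rstrip(b"\0").decode("ascii", "replace")
        sections.append(Section(name, s[2], s[3], s[4]))
    return sections


def build_fake_image() -> bytes:
    """Synthetic headers only (no optional header, no code), just for the demo."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = COFF_HEADER.pack(0x8664, 1, 0, 0, 0, 0, 0x2000)
    text = SECTION_HEADER.pack(b".text", 0x10, 0x1000, 0x200, 0x400,
                               0, 0, 0, 0, 0x60000020)
    return bytes(dos) + b"PE\0\0" + coff + text


sections = parse_pe_sections(build_fake_image())
```

On a real DLL you would pass in the file contents instead of `build_fake_image()`, and each `Section` tells you where that section's raw bytes sit in the file and where they expect to land in memory.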

Somehow turn the compiled DLL into a static library and link against that

This sounds totally crazy at first glance but when I checked out the spec for the PE files (the format of DLL and EXE files) and COFF files (the format of .obj files and by extension the contents of .lib files) I noticed that the formats are actually pretty similar (they are even defined by a single specification document).

Theoretically all the information you need to run the program is in the DLL so it should be possible to extract the binary code and data from the DLL and repack it into a static library. There would be no information about symbols only used internally in the module but that's OK, in fact it's even good since those duplicate symbols are the reason you can't normally link together mismatched object files.

With this information I decided to try and write a tool that does this, honestly mostly just for the hell of it. It's called "DLLMasher" and yes it actually does work (with a few caveats) and I learned a lot of interesting stuff about how Windows binaries work along the way.

So next time: how I went about writing DLLMasher and how the hell it even works at all. For the impatient, you can find the source here: https://github.com/lmagder/DLLMasher

A New Dawn Demo Released

I know there have been screenshots and stuff online for a while, but today the public build of the demo finally passed QA and all that other good stuff and got put on the web! The demo is our most recent attempt at a photorealistic(-ish :)) character and it's a sequel to everyone's favorite scantily clad fairy (and company mascot) from the original Dawn demo from way back in 2002.

Stuff always ends up on the cutting room floor and there's always stuff people want to tweak, but all in all I think the final result turned out to be pretty cool. Hats off to my teammate Zohir for the awesome skin shader stuff. The facial animation in this was a beast, and moving forward I'm definitely going to enjoy not spending my nights trying to figure out why the hell our custom Maya exporter broke again :)

Anyway, enough filler text by me! You can download the demo here: http://www.geforce.com/games-applications/pc-applications/a-new-dawn

 

More Demos

We've been busy at work!

NVIDIA just released the first video from one of the things I've been working on and it already has 600k views on YouTube! Crazy!

It was cool to mix it up from the standard PC demos and I think we made something pretty cool. (The videos have HD versions too if you fullscreen them.)

Also, (not so recently but since I last updated my site :) ) we released a few other demos:

An endless city built using the tessellation power of the GTX 580 and the new multi-threading features in DX11

I'm really proud of this one since I wrote the L-system-based city generator with Steve, our lead artist. There was also a lot of cool stuff I worked on that got packed in there in terms of fancy metal shaders, volumetric lights that work in stereoscopic 3D, and multi-threaded path-finding for all the little flying dudes.

Also, we released "Alien vs. Triangles":

This one has a dynamically tessellating skinned character with dynamic transform that can grow and spread. I didn't do much on this one since I was mostly working on the city at the time, but I did make the toasty laser :)

It’s Shiny

If you just happen to have a GeForce GTX 480 (or two) lying around you should check out the new demo I worked on at NVIDIA: Supersonic Sled. I'll let the pitch page do most of the talking but it's got some crazy features like PhysX destruction, CUDA-accelerated debris and fluids, Tessellation shaders, and NVIDIA 3D Vision Stereo glasses support.

Make sure you both kill the dude and upload your replays either as high scores on the NVIDIA site or to YouTube using the built-in video recorder ('cause I wrote that part :) )

It’s Happening Again!

Well, so much for regularly updating the new site. My excuse is that it's been a crazy summer. I moved to California, started work at Cryptic, and helped ship my first game: Champions Online.

Champions Online

If you're itching to fight crime in colorful spandex I would recommend you check it out. (NOTE: itching may be due to spandex suit. Champions Online only soothes metaphorical itching.)

Also, I heard if you buy ten or more copies the game runs faster....just saying :)