This is largely a copy of the Windows demo application with a few key
changes:
- heap_3 (use malloc()/free()) so tools like valgrind "just work".
- printf() wrapped in a mutex to prevent deadlocks on the internal
pthread mutexes inside printf().
SCons (https://scons.org/) is used as the build system.
This will be built as a 64-bit application, but note that the memory
allocation trace points only record the lower 32-bits of the address.