Over the course of creating these tutorials, I have repeatedly had to fight to keep the compiled binaries small. Usually, after entering a three-line program in C++, Visual Studio will assume I would like every DLL, API and function ever created by Microsoft to be included in my binary, and I end up with something close to a 6 meg file. (Don’t even get me started on the fact that you can open a new Word document, type one letter, and the file no longer fits on a 32 gig USB key!)

Because you don’t want the binary filled with a bunch of useless crap that detracts from the learning process, the binary should ONLY contain the instructions you want used, and nothing else. You would think this would be easy, perhaps a button somewhere that says “De-crapify” or something, but this is Microsoft, so you actually have to do quite a bit of experimenting in Visual Studio to get the binary size even close to what it should be.

Over the weekend I did some experimenting, attempting to get the binary as small as possible and trying to figure out what all this crap is that gets inserted into our binary, and this tutorial covers what I learned. A lot of this research was done by Zer0Flag, so many thanks (and kudos) go out to him for his hard work. If you would rather have the PDF of this tutorial, you can download it on the tutorials page. Otherwise, read on…

First off, here is the source code. It simply opens a message box with a string in it, then closes the app:
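For reference, a minimal sketch of a program matching that description might look like the following. The caption and message strings are made up for illustration, and the choice of MessageBoxA is an assumption. The _WIN32 guards are only there so the file still compiles on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical strings, chosen only for this sketch.
constexpr char kCaption[] = "Tiny binary";
constexpr char kMessage[] = "Hello from a very small program!";

#ifdef _WIN32
// Windows GUI entry point: show one message box, then let the app exit.
int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
    MessageBoxA(nullptr, kMessage, kCaption, MB_OK);
    return 0;
}
#endif
```

Built as a Windows GUI project, this is the kind of "two lines of code" the rest of the tutorial tries to shrink.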

Because this is basically two lines of code, you would guess that it should be like, oh, maybe 50 bytes? You guessed wrong:

Yes, try 30,000 bytes. For two lines of code? Oh, come on! Let’s see what’s going on in Olly:

This is where Olly first breaks, at address 71B00220. After a little digging, though, I found this is not the real EP. Looking in the PE header, the real entry point is at 107114A:

This jumps to our initialization code. Interestingly, after this code runs, the jump at 107114A is dynamically changed to point to WinMainCRTStartup. But for now, it jumps to the code in the picture above, starting at address 71B00220.

This initialization code looks for command line arguments and loads in DLLs for the application. At the end, we return to our original jump that is now changed to point to WinMainCRTStartup:

WinMainCRTStartup initializes the C runtime (CRT). The CRT provides the fundamental C++ runtime support, including:

  • setting up the C++ exception model
  • making sure the constructors of global variables are called before the main function is entered
  • parsing command-line arguments and calling the main function
  • initializing the heap
  • setting up the atexit chain

After the runtime is initialized, WinMainCRTStartup calls the __security_init_cookie function:

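This function initializes the global security cookie used by the compiler’s buffer security check (/GS). Functions compiled with that check use the cookie to detect some buffer overruns that overwrite a function’s return address, exception handler address, or certain types of parameters. Causing a buffer overrun is a technique used by hackers to exploit code that does not enforce buffer size restrictions.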
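To make the cookie idea concrete, here is a small model of the check, not the compiler’s real implementation. It assumes a made-up frame layout (an 8-byte buffer followed directly by a 4-byte cookie) and a hypothetical cookie pattern; real frames and cookie values are chosen by the compiler and runtime. It needs C++14 or later.

```cpp
#include <cstddef>

// Simulated frame: an 8-byte local buffer followed directly by a 4-byte cookie.
// This layout is an assumption for illustration only.
constexpr std::size_t kBufferSize = 8;
constexpr std::size_t kCookieSize = 4;
constexpr std::size_t kFrameSize = kBufferSize + kCookieSize;
constexpr unsigned char kCookieByte = 0xC5;  // hypothetical cookie pattern

// Writes `count` bytes into the buffer with no bounds check (the bug), then
// reports whether the cookie survived, mimicking the check made before return.
constexpr bool cookie_intact_after_write(std::size_t count)
{
    unsigned char frame[kFrameSize] = {};

    // "Prologue": place the cookie right after the buffer.
    for (std::size_t i = kBufferSize; i < kFrameSize; ++i)
        frame[i] = kCookieByte;

    // Clamp so the model itself never writes outside its own array.
    const std::size_t limit = count < kFrameSize ? count : kFrameSize;
    for (std::size_t i = 0; i < limit; ++i)
        frame[i] = 'A';  // unchecked copy into the buffer

    // "Epilogue": any changed cookie byte means the buffer was overrun.
    for (std::size_t i = kBufferSize; i < kFrameSize; ++i)
        if (frame[i] != kCookieByte)
            return false;
    return true;
}

static_assert(cookie_intact_after_write(8), "a write that fits leaves the cookie alone");
static_assert(!cookie_intact_after_write(9), "one byte too many corrupts the cookie");
```

The point of the model is only that the check happens after the damage is done: the cookie sits between the buffer and the data worth protecting, so an overrun has to trample it first.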

Once the security cookie is initialized, we finally get to our actual code:

Changing the Build

The first thing we should notice is that Visual Studio defaults to debug mode, so we should definitely change to Release:


Now when we check the size, we already see a big difference:

Wow, that debug information was almost 75% of the binary’s size! Loading this in CFF Explorer, we see that we lost the .textbss and .idata sections, and the other sections have been reduced drastically.

Debug version:

Release version:

Removing the C Runtime

Of course, 8,000 bytes is already pretty good, but who wants to stop at “pretty good”?

Next, we have to take a step backward in order to take a couple of steps forward. Right-click the project’s name in the Solution Explorer and select Properties to open the main properties window. Open the C/C++ tree and select the “Code Generation” item. We want to change the “Runtime Library” to “Multi-threaded (/MT)”. This links the C runtime statically into the executable instead of loading it from a DLL at run time. The reason we want to do this is so we can manually delete it later.


Changing this adds a significant amount back in, but will allow us to delete it (and more) later:

One thing you will notice in OllyDBG is that our jump table has all but disappeared:

This is because the runtime code that used to live in DLLs is now linked directly into our binary, so it is called directly instead of through imports.

Ignore Default Libraries

Clicking on the “Input” label under the “Linker” tree, we find the “Ignore All Default Libraries” setting (/NODEFAULTLIB), which forces Visual Studio to ignore all the default libraries it would normally link in automatically.

Changing this to “Yes” and trying to build the program gives us an error though:


To fix this we must change the entry point of our program. Visual Studio inserts several function calls before our program actually starts, namely WinMainCRTStartup and __security_init_cookie, so the entry point is set to that startup code instead of the true beginning of our app. Since we just told Visual Studio to ignore the libraries that contain this code, the entry point still refers to a function that no longer exists. Clicking on the “Advanced” label under “Linker”, we can change “Entry Point” to our actual entry point, WinMain:

*** You may also need to change the “Buffer Security Check” option to “No (/GS-)” under C/C++ in the Code Generation tab to make it build properly. ***

Now when we build it we get no errors and also a file size of 3,000 bytes:

Now we’re talking! Loading this in Olly, we start to see some improvements:

The setup code has also shrunk:

Removing the Manifest

Next we want to ditch the manifest as it’s never used (at least not in our case). Under Linker, click Manifest File and change “Generate Manifest” to “No”:

Doing this only saves about 200 bytes, but hey, that’s something :) Here we can see exactly what the manifest looks like (in CFF Explorer):

The next thing we may notice is that our binary has four sections:

One that we could potentially lose is the .reloc section…

Removing Randomized Base Addresses

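We don’t need a relocations section if we never relocate code, so let’s turn off address randomization (ASLR). Under Linker, in the Advanced tab, set “Randomized Base Address” to “No (/DYNAMICBASE:NO)”: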

Doing that and rebuilding automatically removes our .reloc section, shaving off another 1,000 bytes:

This also has the nice quality of loading our binary in at the usual address of 401000:
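The arithmetic behind those addresses can be sketched as follows. 0x00400000 is the Microsoft linker’s default preferred base for a 32-bit EXE; the "actual" base in the relocation example is a made-up value, just to show what the fixups in a .reloc section would have to do if the image loaded somewhere else.

```cpp
#include <cstdint>

// Default preferred image base for a 32-bit EXE built by the Microsoft linker.
constexpr std::uint32_t kPreferredBase = 0x00400000;

// A virtual address is the image base plus a relative virtual address (RVA).
constexpr std::uint32_t rva_to_va(std::uint32_t image_base, std::uint32_t rva)
{
    return image_base + rva;
}

// What a base relocation amounts to: shift an absolute address by the
// difference between the actual load address and the preferred one.
constexpr std::uint32_t relocate(std::uint32_t address,
                                 std::uint32_t preferred_base,
                                 std::uint32_t actual_base)
{
    return address + (actual_base - preferred_base);
}

// With no randomization, RVA 0x1000 lands at the familiar 401000.
static_assert(rva_to_va(kPreferredBase, 0x1000) == 0x00401000, "fixed base");

// Hypothetical randomized base of 0x01070000: every absolute address moves.
static_assert(relocate(0x00401000, kPreferredBase, 0x01070000) == 0x01071000,
              "relocated address");
```

When the image always loads at its preferred base, the delta is zero and there is nothing to fix up, which is why the linker can drop the .reloc section entirely.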

Combining Sections

Next, we don’t necessarily need both sections, especially since the second section only needs a couple dozen bytes but takes up 1,000. Here, we can fold the .rdata section into the .text section by merging them. Still in the Advanced tab, enter this for “Merge Sections”:
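The linker’s /MERGE option takes the form from=to, so folding .rdata into .text would presumably be entered as .rdata=.text. Treat that exact value as an assumption. The sketch below shows the same option as a source-level pragma (guarded so only the Microsoft compiler sees it), plus a small calculation of why merging saves space: each section’s raw data is rounded up to the file alignment. The section sizes and the 0x200 alignment used here are hypothetical.

```cpp
#include <cstdint>

#ifdef _MSC_VER
// Assumed source-level equivalent of entering ".rdata=.text" under "Merge Sections".
#pragma comment(linker, "/MERGE:.rdata=.text")
#endif

// Round a raw size up to the file alignment (alignment must be a power of two).
constexpr std::uint32_t align_up(std::uint32_t size, std::uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// File bytes used when two sections are stored separately.
constexpr std::uint32_t separate_size(std::uint32_t a, std::uint32_t b,
                                      std::uint32_t alignment)
{
    return align_up(a, alignment) + align_up(b, alignment);
}

// File bytes used when the two sections are merged into one.
constexpr std::uint32_t merged_size(std::uint32_t a, std::uint32_t b,
                                    std::uint32_t alignment)
{
    return align_up(a + b, alignment);
}

// Hypothetical sizes: 0x90 bytes of code and 0x18 bytes of read-only data.
static_assert(separate_size(0x90, 0x18, 0x200) == 0x400, "two padded sections");
static_assert(merged_size(0x90, 0x18, 0x200) == 0x200, "one padded section");
```

In other words, a section holding a couple dozen bytes still costs a full alignment unit on disk, and merging lets that data share the padding the code section already pays for.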

Here is our original .rdata section:

and after combining sections, we can see that this data was inserted into the beginning of our .text section in the binary:

and that our entry point has been changed to 4010D0:

Changing Optimizations

Lastly, we can change the optimizations that Visual Studio uses, telling it to optimize for size over speed. Under C/C++, in the Optimization tab, change these four settings:

One last look at our file size and we see we’ve done quite a nice job:

And this is the complete disassembly in Olly (the RETN instruction is a little cut off at the bottom):

From 31,000 bytes to less than 1,000 (620 bytes to be exact). I guess the real question we should be asking is “Why didn’t Microsoft just start here and then add things as we need them?” I’m sure they’re crying all the way to the bank.

R4ndom