Build your own executable crypter

This article will take you through the basic steps of building an executable crypter. All of the steps performed in this article require manual setup and integration to prepare the exe for the crypter stub. The focus of this article is to walk you through the theory and know-how of how crypters work and does not attempt to create the latest greatest point and click solution.

For a basic background, here is how executable crypters work:

1) The actual processor commands of a protected binary are
encrypted, obscured, or otherwise munged.

2) When the protected application first starts, a small decrypter
stub is first run that restores all of the original processor
commands for the executable in memory.

3) Finally, the decrypter stub ends and transfers execution to the
original entry point (OEP) and the program runs normally.

In the course of this paper, we are going to manually implement a very simple 'crypter' to show you all of the development techniques, design considerations, and debugging details required to make your own.

First, let me introduce you to our target executable. It is a 28kb hello world application written in C. This simple application merely prints out "Hello World" to the screen, waits for a keypress and then exits.

To get us started, let's examine the PE structure of the executable file. Below is an image of the PE section table. You will notice that the .text section (where the actual executable code is housed) has a raw size of 4000h and a virtual size of 3DCEh.

The discrepancy in the numbers indicates that at the end of the .text section there is a certain amount of unused space not currently mapped into memory when the file is loaded. This blank spot in the executable file is good because it means we have an empty pad where we can place our own executable code.
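
As a quick sanity check on the numbers above, here is a minimal sketch of the padding arithmetic. The two sizes are the ones quoted from the section table; the function name is just something I made up for illustration.

```python
# Values quoted from the .text entry of the section table.
RAW_SIZE = 0x4000      # bytes the section occupies in the file
VIRTUAL_SIZE = 0x3DCE  # bytes of the section actually in use once loaded


def slack_bytes(raw_size: int, virtual_size: int) -> int:
    """Return the unused padding at the end of a section's raw data.

    Hypothetical helper: a section whose virtual size is larger than
    its raw size has no padding on disk, so the result is clamped to 0.
    """
    return max(raw_size - virtual_size, 0)


slack = slack_bytes(RAW_SIZE, VIRTUAL_SIZE)
print(f"{slack:#x} ({slack}) bytes of padding at the end of .text")
```

For this sample that works out to 232h bytes, which is far more than a small stub needs.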

To visually verify this, you can open up the file in a hexeditor and look for a null pad. To know where to look, you have to be able to find the right file offset. In our sample exe this is simplified because all of our sections have a virtual size <= their raw size and each section's raw offset = its virtual offset. This is nice because it keeps all of the RVA values in the PE header equal to raw file offsets; however, this is not always the case. V.2 of the PE editor classes now takes this into account and can calculate file offsets from RVA values correctly. The assumption of RVA = file offset will be made throughout the remainder of this article because it holds true for the particular sample we are analyzing.

To see this null pad, open up the original exe file in a hexeditor and check out the area between 4DCEh (RawOffset + VirtualSize) and 5000h:

Offset    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
00004DC0  C0 74 06 0F B6 45 0B C9 C3 83 C8 FF C9 C3 00 00  Àt..¶E.ÉÃƒÈÿÉÃ..
00004DD0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004DE0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004DF0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E00  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E10  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E20  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E30  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E40  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E50  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E60  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E70  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00004E80  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

For our needs this will be more than enough space to place our simple decrypter stub.
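
For files where RVA and file offset do not line up, the conversion mentioned above can be sketched roughly as follows. This is not the code from the PE editor classes, just a minimal illustration. The Section tuple and function name are made up, the .text values are the ones from this sample (with the section start of 1000h inferred from 4DCEh - 3DCEh), and the SHIFTED entry is a purely hypothetical layout to show a case where the two differ.

```python
from typing import NamedTuple, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA where the section starts in memory
    virtual_size: int
    raw_offset: int       # where the section's data starts in the file
    raw_size: int


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> int:
    """Map an RVA to a file offset using the section table."""
    for s in sections:
        span = max(s.virtual_size, s.raw_size)
        if s.virtual_address <= rva < s.virtual_address + span:
            delta = rva - s.virtual_address
            if delta >= s.raw_size:
                raise ValueError("RVA has no backing bytes in the file")
            return s.raw_offset + delta
    raise ValueError("RVA is not inside any section")


# This article's sample: raw offset equals virtual address.
TEXT = Section(".text", 0x1000, 0x3DCE, 0x1000, 0x4000)
# Hypothetical layout where the raw data starts at 400h instead.
SHIFTED = Section(".text", 0x1000, 0x3DCE, 0x400, 0x4000)

print(hex(rva_to_file_offset(0x4E00, [TEXT])))
print(hex(rva_to_file_offset(0x4E00, [SHIFTED])))
```

With the sample's layout the RVA comes back unchanged, which is why the shortcut of treating RVAs as file offsets is safe for the rest of this article.
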

We do not necessarily need to squeeze our code into an existing section. Had we been short on space, we could have resorted to adding a new PE section and placing our code there.

Ok, we have found a home for our decrypter block, but first we have to make some adjustments to the PE section characteristics so that:

A) our decrypter code gets loaded into memory
B) once mapped into memory, we have write access to the main body of code
C) when the program is first loaded, execution begins with our decrypter code

As noted above, the virtual size of the section (the size loaded into memory) does not include this null pad we found in the file. Since we are going to be adding code to this area, we need to make sure that this area is loaded into memory as well. This is accomplished by increasing the virtual size for this PE section using a PE editor such as LordPE.

The second change we have to make is to make sure that the .text section is flagged as a writable area once mapped into memory. This is necessary because our decrypter stub needs to dynamically rewrite (decode) the actual processor codes to be executed. This too is easily done with LordPE from the "edit section header" dialog. Below is a graphic of the dialog sequence and field manipulations required in LordPE. Highlighted in yellow are the fields that we have altered.

Our next goal is to make sure that when the executable first loads, it is our decrypter stub that is run first. Since the real processor commands for the executable will not be present on disk, having the program start at the original entry point would have the machine trying to execute what is essentially a jumbled block of data.

The program entry point can be directly edited from LordPE's main interface. For our demonstration let's choose to set the entry point at 4E00h. This offset sits 32h (50) bytes past the end of our real application's code (4DCEh) and gives us a nice easy spot to find in the hexeditor.

With the PE structure modifications out of the way, now we can move on to the actual work. Here is what we have left:

D) build the decrypter stub
E) crypt the actual executable's opcodes
F) integrate our decrypter stub into the modified binary

Let's start with some design goals for our encoding mechanism. Since this is a demo and a trainer, the encoding mechanism is going to be kept as lightweight and simple as possible. For these reasons a simple XOR encoding will be used.
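
The reason XOR is so convenient here is that applying the same key twice gives back the original byte, so one routine serves as both encoder and decoder. Below is a minimal in-memory sketch of that property. The helper name is made up, the key 0Fh is the one used later in the article, and the sample bytes are the first few opcodes shown at the original entry point in the Olly listing further down.

```python
KEY = 0x0F  # single-byte XOR key used in this article


def xor_range(data: bytearray, start: int, length: int, key: int = KEY) -> None:
    """XOR `length` bytes of `data` in place, beginning at index `start`."""
    for i in range(start, start + length):
        data[i] ^= key


# PUSH EBP / MOV EBP,ESP / PUSH -1 as raw bytes.
sample = bytearray(b"\x55\x8b\xec\x6a\xff")
original = bytes(sample)

xor_range(sample, 0, len(sample))   # first pass: encode
encoded = bytes(sample)

xor_range(sample, 0, len(sample))   # second pass: decode
decoded = bytes(sample)

print(encoded.hex(), decoded.hex())
```

Keep in mind a fixed single-byte XOR is obfuscation rather than real encryption, which is fine for a trainer like this.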

The next design consideration is to enumerate what kind of variables a generic crypter stub is going to need. Basically any crypter stub is going to need three things:

1) the offset (in memory) at which to start decrypting data
2) the length of the data to decrypt
3) the entry point to transfer execution to once decryption is done

Since we are designing a really simple stub, I am going to take a shortcut and start the encryption routine right at the program's original entry point. While the EP is not at the very beginning of the code section, it is usually close enough that the majority of processor commands will be encrypted.

Before we get into the actual design and development of our decrypter stub, let's knock off the easy part of XORing the original opcodes first. This is a simple operation, and can be done in whatever way is the most convenient for the developer. The implementation I chose was to create a quick VB program that loops through the binary applying the XOR to the appropriate bytes representing the application's opcodes.

For a quick refresher:

Q) How do I know where the opcodes begin?
A) For our simple setup we are starting at the original program
entry point found in the PE header.

Q) How do I know how long of a block to encode?
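A) Since we want to encode the opcodes after the entry point, the
length of the data to encrypt is taken as Original Virtual Size - Entry Point
(3DCEh - 1048h = 2D86h). Strictly speaking, the section starts at 1000h, so
this range stops at 3DCEh and leaves the last 1000h bytes of code unencoded.
That is harmless as long as the stub decodes exactly the same length.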
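
The arithmetic behind those two answers can be checked with a few lines. This is only a sketch using the numbers from this sample. The section start of 1000h is inferred from the hex dump range given earlier (4DCEh - 3DCEh), not read from a header here.

```python
ENTRY_POINT = 0x1048    # original entry point, where encoding starts
VIRTUAL_SIZE = 0x3DCE   # virtual size of .text from the section table
SECTION_START = 0x1000  # inferred: 4DCEh - 3DCEh

length = VIRTUAL_SIZE - ENTRY_POINT      # length used by the encoder and the stub
range_end = ENTRY_POINT + length         # first byte NOT encoded
code_end = SECTION_START + VIRTUAL_SIZE  # where the section's code really ends

print(f"length    = {length:#x}")
print(f"range end = {range_end:#x}")
print(f"code end  = {code_end:#x}")
# Bytes between range_end and code_end stay in the clear with this formula.
print(f"unencoded tail = {code_end - range_end:#x}")
```

Whatever length you settle on, the only hard requirement is that the encoder and the decoder stub agree on it.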

Inline below is the VB source code used to encode the executable's opcodes:

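Dim f As Integer, i As Long, offset As Long
Dim StartAt As Long, length As Long
Dim b As Byte

StartAt = &H1048 'original entry point
length = &H2D86 '3DCE - 1048 (virtual size - entrypoint)

f = FreeFile
Open p2 For Binary As f 'p2 is the path of the exe to encode

For i = 1 To length
    offset = StartAt + i 'VB file positions are 1-based, so i starts at 1
    Get f, offset, b
    b = b Xor &HF
    Put f, offset, b
Next

Close f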

With that out of the way, we are now down to developing our decrypter stub. Basically what we need is a small block of ASM commands that we can paste into the encoded binary at our new entry point.

Below is the decoder block I came up with written in C:

void main(void){

    int i;
    char b;

    char *buffer = (char *)0x400000 ; // imagebase
    long length = 0xBEEF ;            // <- length of code (placeholder)
    buffer += 0xDEAD ;                // <- OEP offset (placeholder)

    for(i=0; i < length; i++){
        b = buffer[i] ;
        b = b ^ 0xF ;
        buffer[i] = b ;
    }

    _asm jmp buffer

}


Let me mention a couple points and design considerations about the above code.

* To make the stub generic you are going to have to edit the length and entry point offsets each time you use it. Make these some recognizable values in hex to make it easier to find them in the hexeditor.

* buffer initially points to the imagebase; remember, you are going to be working on memory addresses. The reason I add the entry point offset to buffer in a separate step is that I will have to edit this value independently in a hexeditor.

* To transfer execution to the original entry point we just use an inline asm command, jmp buffer. At this point buffer is already pointing directly to the program's original entry point.

All in all it is a very simple decoder stub. The trick comes in debugging and implementing it. Since the decoder is designed to work on data and offsets not found in this standalone application, we can really only use the compiler to generate the opcodes for the commands we need. Debugging takes place by integrating the actual stub byte codes into our crypted exe and running that through the debugger.

Now that we have our proposed C source, we need the assembler byte codes associated with it. The easiest way I have found to get the asm byte codes from the compiler is to set a break point at the top of the code and start up the VC debugger by pressing F5.

Once VC has compiled the code, it will launch the built-in debugger, which pauses execution at your preset breakpoint. Now you can right click on the main window and choose "goto disassembly" to see a mixed assortment of C and ASM commands.

Below is a stripped down ASM block generated by the compiler for us. On the left are the actual byte codes associated with the string assembler commands on the right.

C7 45 F4 00 00 40 00 mov dword ptr [ebp-0Ch],400000h
C7 45 F0 EF BE 00 00 mov dword ptr [ebp-10h],0BEEFh
8B 45 F4 mov eax,dword ptr [ebp-0Ch]
05 AD DE 00 00 add eax,0DEADh
89 45 F4 mov dword ptr [ebp-0Ch],eax
C7 45 FC 00 00 00 00 mov dword ptr [ebp-4],0
EB 09 jmp main+43h
8B 4D FC mov ecx,dword ptr [ebp-4]
83 C1 01 add ecx,1
89 4D FC mov dword ptr [ebp-4],ecx
8B 55 FC mov edx,dword ptr [ebp-4]
3B 55 F0 cmp edx,dword ptr [ebp-10h]
7D 22 jge main+6Dh
8B 45 F4 mov eax,dword ptr [ebp-0Ch]
03 45 FC add eax,dword ptr [ebp-4]
8A 08 mov cl,byte ptr [eax]
88 4D F8 mov byte ptr [ebp-8],cl
0F BE 55 F8 movsx edx,byte ptr [ebp-8]
83 F2 0F xor edx,0Fh
88 55 F8 mov byte ptr [ebp-8],dl
8B 45 F4 mov eax,dword ptr [ebp-0Ch]
03 45 FC add eax,dword ptr [ebp-4]
8A 4D F8 mov cl,byte ptr [ebp-8]
88 08 mov byte ptr [eax],cl
EB CD jmp main+3Ah
FF 65 F4 jmp dword ptr [ebp-0Ch]

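In order for us to insert this into our executable, we must further strip out just the byte codes and write the hex values into our executable file. A nice way to do this is to strip out the assembler commands, remove all of the spaces, and place them in a long string such as this:

C745F400004000C745F0EFBE00008B45F405ADDE00008945F4C745FC00000000EB098B4DFC83C101894DFC8B55FC3B55F07D228B45F40345FC8A08884DF80FBE55F883F20F8855F88B45F40345FC8A4DF88808EBCDFF65F4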

From here, you can copy the text string and write the associated hex values directly into the binary using the Winhex hexeditor by highlighting the start offset (4E00h), pressing Ctrl-B (write clipboard), and then choosing the "ASCII Hex" clipboard format.

Once that is done, all we have left is to edit the data length and start offset placeholders compiled into the stub and it will be configured for this binary. If you wrote the stub in starting at offset 4E00h, then you will find the BEEFh data length marker at offset 4E0Ah and the DEADh entry point marker at offset 4E12h.

Note that both of these values are in little endian format. When you go to modify them with the actual values, remember to also write the new values in little endian format.
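
If you would rather double check the marker offsets and the byte order than eyeball them, here is a rough sketch. The stub bytes are transcribed from the disassembly listing above, and patch_dword is a made-up helper name; the article itself does this step by hand in the hexeditor.

```python
import struct
from typing import Tuple

# Stub byte codes, one group per instruction from the listing above.
STUB = bytes.fromhex(
    "C745F400004000" "C745F0EFBE0000" "8B45F4" "05ADDE0000" "8945F4"
    "C745FC00000000" "EB09" "8B4DFC" "83C101" "894DFC" "8B55FC"
    "3B55F0" "7D22" "8B45F4" "0345FC" "8A08" "884DF8" "0FBE55F8"
    "83F20F" "8855F8" "8B45F4" "0345FC" "8A4DF8" "8808" "EBCD" "FF65F4"
)


def patch_dword(stub: bytes, placeholder: int, value: int) -> Tuple[bytes, int]:
    """Swap one little-endian 32-bit placeholder for `value`.

    Returns the patched bytes and the offset where the placeholder sat.
    """
    needle = struct.pack("<I", placeholder)  # e.g. 0xBEEF -> EF BE 00 00
    if stub.count(needle) != 1:
        raise ValueError("placeholder must appear exactly once")
    at = stub.index(needle)
    return stub[:at] + struct.pack("<I", value) + stub[at + 4:], at


patched, length_at = patch_dword(STUB, 0xBEEF, 0x2D86)
patched, oep_at = patch_dword(patched, 0xDEAD, 0x1048)

print(f"length marker at stub+{length_at:#x}, OEP marker at stub+{oep_at:#x}")
print(patched[length_at:length_at + 4].hex(), patched[oep_at:oep_at + 4].hex())
```

The offsets come back as 0Ah and 12h relative to the start of the stub, which matches 4E0Ah and 4E12h when the stub is written at 4E00h.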

Below are hexeditor views of the modifications made.

Offset 0 1 2 3
00004E10 .. .. AD DE (DEAD)
00004E10 .. .. 48 10 (1048)

Offset 0 1 2 3 4 5 6 7 8 9 A B
00004E00 .. .. .. .. .. .. .. .. .. .. EF BE (BEEF)
00004E00 .. .. .. .. .. .. .. .. .. .. 86 2D (2D86)

With our decrypter block in place, our main code crypted, and the entry point now aimed at the decrypter, everything should be set and ready to run !

Open it up in Olly, give it a shot and see what happens. Before you start stepping through code, look around the original entry point and see what the disassembly looks like.

004010DC 12 DB 12
004010DD 05 DB 05
004010DE 0F DB 0F
004010DF 0F DB 0F
004010E0 AE DB AE
004010E1 53 DB 53
004010E2 63 DB 63
004010E3 4F DB 4F
004010E4 0F DB 0F
004010E5 AC DB AC
004010E6 6F DB 6F
004010E7 63 DB 63
004010E8 4F DB 4F

Yup, that's a garbled mess characteristic of a data block or encrypted opcodes... Now go back to the end of the decrypter block and set a breakpoint on the final "jmp buffer" command:

00404E55 >^FF65 F4 JMP DWORD PTR SS:[EBP-C] ; final.00401048

After reaching this point, scroll back up again and take another look at the original entry point 401048. If you still see a junk block of commands such as the above mess, it is because Olly has not yet analyzed the new byte values for processor commands. To fix this, right click in the main disassembly window and choose 'analyze code'. Now you should see the actual decoded instructions:

00401048 /. 55 PUSH EBP
00401049 |. 8BEC MOV EBP,ESP
0040104B |. 6A FF PUSH -1
0040104D |. 68 B8504000 PUSH final.004050B8
00401052 |. 68 9C244000 PUSH final.0040249C ; SE handler installation
00401057 |. 64:A1 00000000 MOV EAX,DWORD PTR FS:[0]
0040105D |. 50 PUSH EAX
0040105E |. 64:8925 000000>MOV DWORD PTR FS:[0],ESP
00401065 |. 83EC 10 SUB ESP,10

Now you can hit the run button and voila! It should all function just as expected!

Looks like everything is in place and running just as it should be.

Note that using C to generate the Opcodes can make the decoder a bit bloated. If you wanted to write your decoder directly in asm you could use a stub similar to the following: (even this could be optimized further)

00404E3A B8 48104000 MOV EAX,401048 ;start offset
00404E3F B9 862D0000 MOV ECX,2D86 ;length
00404E44 8BD0 MOV EDX,EAX ;copy of start offset (OEP)
00404E46 8030 0F XOR BYTE PTR DS:[EAX],0F ;top_of_loop decode inst
00404E49 40 INC EAX ;next byte
00404E4A 49 DEC ECX ;dec counter
00404E4B ^75 F9 JNZ SHORT 00404E46 ;counter !=0 goto top_of_loop
00404E4D FFE2 JMP EDX ;jmp OEP

As one last little nugget, let me throw out a quick tip you can use to restore a crypted exe such as this to its former state.

Let's assume the decrypter stub did some actual encryption that we do not want to try to reverse engineer. If the crypter stub only operated on an uncompressed data block that was fully present in the exe and did not perform any other tricks or manipulations, the restoration of the executable can actually be very simple.

Give this a shot: load the exe in Olly and break on the last jmp buffer. Here the actual executable code is fully decrypted in memory and ready to be run. Now fire up LordPE and dump the 401000 - 405000 memory address range to grab the full .text section from memory. You now have all of the decrypted opcodes saved to disk.

Write down the address of the original entry point that the jmp command was going to take you to and exit olly. Open up the memory dump and the crypted exe in Winhex and write the entire dump of the .text section over the crypted .text section in the executable.

Save it, then change the entry point back to the original you wrote down and give it a click. Tadaahhh magic.....kinda..well not really...but you know. *shrugs*
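
The overwrite step above is just a splice of one byte range over another. Here is a small sketch of that idea on toy in-memory data. The function name and the filler bytes are made up for illustration, and the 1000h offset assumes this sample's layout, where the .text raw offset equals its RVA.

```python
def restore_section(crypted: bytes, dump: bytes, raw_offset: int) -> bytes:
    """Return a copy of `crypted` with `dump` written over it at `raw_offset`."""
    end = raw_offset + len(dump)
    if end > len(crypted):
        raise ValueError("dump does not fit inside the file")
    return crypted[:raw_offset] + dump + crypted[end:]


# Toy stand-ins: 1000h bytes of headers, a 4000h "crypted" section, a short tail.
crypted = bytes(0x1000) + bytes([0xAA]) * 0x4000 + bytes(0x100)
dump = bytes([0x55]) * 0x4000  # pretend memory dump of 401000 - 405000

restored = restore_section(crypted, dump, 0x1000)
print(len(restored) == len(crypted), restored[0x1000:0x1004].hex())
```

Note that this only puts the code bytes back. The entry point in the PE header still has to be reset separately, as described above.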

Anyway, this was a fun bit to design and figure out how to do. Hopefully this takes some of the "magic" out of how executable crypters work and should be enough to help someone else along.

I also caved in and wrote a quick point and click utility to integrate this crypter stub into arbitrary executables. You can snag the app plus VB source here. (It also has a nice set of classes for PE header manipulation.)


Article written by AUTHOR_NAME