Friday, January 15, 2010

Converting EXE or OBJ files to Unicode

When exploiting vulnerabilities in application space, we often want to encode our code in another form, either to avoid detection or to make reversing harder for a noob hacker! One such encoding is the unicode (%u) form.

There are numerous tools on the internet that will help us in achieving this, but let's quickly learn the manual way to do it ;)

Write your code in your favorite assembler and turn the resulting EXE or OBJ file into shellcode bytes using objdump (part of the binutils package) or any equivalent tool. We will follow a step-by-step approach to convert our code to unicode format. Consider the example program below:

mov eax, ebx
mov eax, 12345678H

It is a simple program that moves the contents of EBX into EAX and then moves the immediate value 12345678H into EAX. We assemble the program and dump the resulting object file using objdump. I'm using MASM as my assembler; you could use any other assembler of your choice. Take care of the syntax if you choose another assembler!

Save the above snippet as eg.asm and assemble it. This produces two files, eg.obj and eg.exe. Use the objdump tool to disassemble the object file like this:

objdump -d eg.obj

And the tool prints output like the following:

eg.obj: file format pe-i386


Disassembly of section .text:

00000000 <_start>:
0: 8b c3 mov %ebx,%eax
2: b8 78 56 34 12 mov $0x12345678,%eax

We can now see the disassembled view of our object file. Take the opcode bytes (the hex column of the disassembly) separately. These bytes are what we will convert to unicode format.
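Pulling the hex column out by hand is fine for two instructions, but a small script helps with longer listings. The following is only a sketch: the function name extract_opcode_bytes is made up for this post, and it assumes every instruction line starts with a hex offset and a colon, followed by two-digit hex byte tokens and then the mnemonic, as in the listing above.

```python
import re

def extract_opcode_bytes(objdump_text: str) -> bytes:
    """Collect the opcode bytes from `objdump -d` style disassembly lines."""
    out = bytearray()
    for line in objdump_text.splitlines():
        # Instruction lines look like "2: b8 78 56 34 12 mov ..."
        match = re.match(r"\s*[0-9a-fA-F]+:\s+(.*)", line)
        if not match:
            continue  # header lines, labels, blank lines
        for token in match.group(1).split():
            # Opcode bytes are two-hex-digit tokens; stop at the mnemonic.
            if re.fullmatch(r"[0-9a-fA-F]{2}", token):
                out.append(int(token, 16))
            else:
                break
    return bytes(out)

listing = """\
eg.obj: file format pe-i386

Disassembly of section .text:

00000000 <_start>:
0: 8b c3 mov %ebx,%eax
2: b8 78 56 34 12 mov $0x12345678,%eax
"""

print(extract_opcode_bytes(listing).hex())
# prints 8bc3b878563412
```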

To convert the program to unicode format, we take the opcode bytes printed by objdump two at a time. The first two bytes are 8b c3, which encode mov eax, ebx (MASM32 syntax). We swap their positions, so that the second byte is written first and the first byte is written second, giving c3 8b. The swap is needed because each %uXXXX unit is a 16-bit value that x86 stores in little-endian order, so the byte that comes first in memory is the low byte and is written last. We then remove the space between the bytes and add the %u prefix, which results in %uc38b.

Consider the next instruction, mov eax, 12345678H, which is disassembled into b8 78 56 34 12. We can perform the same steps as we did for the first instruction, but some care is needed here. We convert the first four bytes of the instruction to %u78b8%u3456. What happens to the last single byte, 12? A byte that has no partner to swap with is paired with 00, so the last byte becomes %u0012. Now we combine the unicode results of the first and second instructions, which gives %uc38b%u78b8%u3456%u0012
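The manual steps above (take two bytes, swap them, pad a leftover byte with 00, prefix with %u) are easy to script. Here is a minimal sketch in Python; the function name to_percent_u is my own choice for illustration and not part of any tool mentioned in this post.

```python
def to_percent_u(code: bytes) -> str:
    """Encode raw opcode bytes as %uXXXX units, two bytes per unit."""
    parts = []
    for i in range(0, len(code), 2):
        low = code[i]  # first byte in memory, written last
        # A trailing single byte has no partner, so pair it with 00.
        high = code[i + 1] if i + 1 < len(code) else 0x00
        parts.append(f"%u{high:02x}{low:02x}")
    return "".join(parts)

# Opcode bytes of the two instructions from the objdump listing.
print(to_percent_u(bytes.fromhex("8bc3b878563412")))
# prints %uc38b%u78b8%u3456%u0012
```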

How do we verify that we have done everything right? Here comes another handy tool called ConvertShellCode, which can be used to convert our unicode-formatted code back to its assembly equivalent.

ConvertShellCode.exe %uc38b%u78b8%u3456%u0012

should result in

Assembly language source code :
***************************************
00000000 mov eax,ebx
00000002 mov eax,0x12345678

which is what we had compiled earlier!
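If you don't have ConvertShellCode around, you can at least check the byte order yourself by reversing the encoding. The sketch below is not how ConvertShellCode works internally (it does no disassembly); from_percent_u is a hypothetical helper that only turns the %u string back into raw bytes so they can be compared with the objdump listing.

```python
import re

def from_percent_u(encoded: str) -> bytes:
    """Decode %uXXXX units back into raw bytes (low byte first)."""
    out = bytearray()
    for unit in re.findall(r"%u([0-9a-fA-F]{4})", encoded):
        value = int(unit, 16)
        out.append(value & 0xFF)  # low byte comes first in memory
        out.append(value >> 8)    # high byte follows
    return bytes(out)

print(from_percent_u("%uc38b%u78b8%u3456%u0012").hex())
# prints 8bc3b87856341200
```

Note the extra 00 at the end: it is the padding byte we added for the lone 12, so the decoded bytes are the original seven plus one.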
