Monday, July 24, 2017

Beginning x86 disassembly – Understanding the basics of “memcpy” with Visual Studio 2017

In this series of posts, I’m going through the Open Security Training material for beginning assembly language and putting my own spin on things to enhance my knowledge of x86 disassembly. However, to make the most of these tutorials, you may be better off reviewing the material from Open Security Training directly.

Let’s get started!

Here is the code to help us:

/*
* This file focuses on learning the basics of memcpy
* The objective is to get a better understanding of how these are disassembled in C
* Author Nik Alleyne
* Blog: securitynik.blogspot.com
* File: memcpy.c
*
*/
#include <stdio.h>
#include <memory.h>

int main()
  {
    char mySrc[] = "securitynik.com";
    char myDst[16];
   
    printf(" [Before Memcpy] | \n    -> mySrc @ [%p] contains '%s' \n    -> myDst @ [%p] contains '%s'  \n", &mySrc, mySrc, &myDst, myDst);
    printf(" -=-=-=-=-=-||-=-=-=-=-=-\n");
    memcpy(myDst, mySrc, sizeof(myDst));
    printf(" [After Memcpy] | \n    -> mySrc @ [%p] contains '%s' \n    -> myDst @ [%p] contains '%s'  \n", &mySrc, mySrc, &myDst, myDst);

    return 0;
  }

Here is what this looks like when executed:

D:\>.\Into_to_x86.exe
 [Before Memcpy] |
    -> mySrc @ [0019FEF4] contains 'securitynik.com'
    -> myDst @ [0019FEE4] contains 'É@'
 -=-=-=-=-=-||-=-=-=-=-=-
 [After Memcpy] |
    -> mySrc @ [0019FEF4] contains 'securitynik.com'
    -> myDst @ [0019FEE4] contains 'securitynik.com'

And here is the disassembly:

int main()
  {
00401000  push        ebp 
00401001  mov         ebp,esp 
00401003  sub         esp,20h 
    char mySrc[] = "securitynik.com";
00401006  mov         eax,dword ptr ds:[00404000h] 
0040100B  mov         dword ptr [ebp-10h],eax 
0040100E  mov         ecx,dword ptr ds:[00404004h] 
00401014  mov         dword ptr [ebp-0Ch],ecx 
00401017  mov         edx,dword ptr ds:[00404008h] 
0040101D  mov         dword ptr [ebp-8],edx 
00401020  mov         eax,dword ptr ds:[0040400Ch] 
00401025  mov         dword ptr [ebp-4],eax 
    char myDst[16];
   
    printf(" [Before Memcpy] | \n    -> mySrc @ [%p] contains '%s' \n    -> myDst @ [%p] contains '%s'  \n", &mySrc, mySrc, &myDst, myDst);
00401028  lea         ecx,[ebp-20h] 
0040102B  push        ecx 
0040102C  lea         edx,[ebp-20h] 
0040102F  push        edx 
00401030  lea         eax,[ebp-10h] 
00401033  push        eax 
00401034  lea         ecx,[ebp-10h] 
00401037  push        ecx 
00401038  push        404010h 
0040103D  call        004010D0 
00401042  add         esp,14h 
    printf(" -=-=-=-=-=-||-=-=-=-=-=-\n");
00401045  push        40406Ch 
0040104A  call        004010D0 
0040104F  add         esp,4 
    memcpy(myDst, mySrc, sizeof(myDst));
00401052  push        10h 
00401054  lea         edx,[ebp-10h] 
00401057  push        edx 
00401058  lea         eax,[ebp-20h] 
0040105B  push        eax 
0040105C  call        004027F6 
00401061  add         esp,0Ch 
    printf(" [After Memcpy] | \n    -> mySrc @ [%p] contains '%s' \n    -> myDst @ [%p] contains '%s'  \n", &mySrc, mySrc, &myDst, myDst);
00401064  lea         ecx,[ebp-20h] 
00401067  push        ecx 
00401068  lea         edx,[ebp-20h] 
0040106B  push        edx 
0040106C  lea         eax,[ebp-10h] 
0040106F  push        eax 
00401070  lea         ecx,[ebp-10h] 
00401073  push        ecx 
00401074  push        404088h 
00401079  call        004010D0 
0040107E  add         esp,14h 

    return 0;
00401081  xor         eax,eax 
  }
00401083  mov         esp,ebp 
00401085  pop         ebp 
00401086  ret


Now that we have both our code and disassembly, let’s get going!

As always, the disassembled code starts off with the function prologue “push ebp” and “mov ebp,esp”. To learn more about function prologue and epilogue, see this post.

The next instruction states “sub esp,20h”. This means 32 bytes are being allocated on the stack for our two variables, “char mySrc[]” and “char myDst[16]”, at 16 bytes each. Before executing this instruction, let’s take a look at our registers …

EAX = 0F1F1944 EBX = 00275000 ECX = 00000001 EDX = 0040447C ESI = 00401490 EDI = 00401490 EIP = 00401003 ESP = 0019FF04 EBP = 0019FF04 EFL = 00000202

… and now our stack:
0x0019FF04  0019ff18  .ÿ.. – ESP – top of the stack
0x0019FF08  0040147e  ~.@.
0x0019FF0C  00000001  ....
0x0019FF10  004ecff8  øÏN.
0x0019FF14  004ec940  @ÉN.
0x0019FF18  0019ff70  pÿ..

From above we see that ESP points to “0019FF04”, which contains “0019ff18”. This is the caller’s EBP, saved by “push ebp”. After execution, our registers contain …
EAX = 0F1F1944 EBX = 00275000 ECX = 00000001 EDX = 0040447C ESI = 00401490 EDI = 00401490 EIP = 00401006 ESP = 0019FEE4 EBP = 0019FF04 EFL = 00000206

… and our stack looks like …
0x0019FEE4  00401490  ..@. – Current ESP
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  004011f3  ó.@.
0x0019FEF8  00000000  ....
0x0019FEFC  0019ff10  .ÿ..
0x0019FF00  0f1456e5  åV..
0x0019FF04  0019ff18  .ÿ.. – EBP

If we subtract the current value of ESP, “0019FEE4”, from the value of EBP, “0019FF04”, we get “0x20”, which is 32 bytes. This represents the space allocated for the two variables.

Moving along to the next instruction, “mov eax,dword ptr ds:[00404000h]”. When we take a look at address “00404000h” in the data segment, we see the following:
0x00404000  75636573  secu
0x00404004  79746972  rity
0x00404008  2e6b696e  nik.
0x0040400C  006d6f63  com.

Basically our pointer address “00404000h” is the beginning of our string “securitynik.com”.

Going back to our instruction “mov eax,dword ptr ds:[00404000h]”, this means that we will move the first 4 bytes of our string, which starts at address “00404000h”, into the EAX register.

Executing the instruction and looking at the registers we see …

EAX = 75636573 EBX = 00275000 ECX = 00000001 EDX = 0040447C ESI = 00401490 EDI = 00401490 EIP = 0040100B ESP = 0019FEE4 EBP = 0019FF04 EFL = 00000206

As we can see from above, the value in the EAX register matches that of our memory printout above.

Our next instruction “mov dword ptr [ebp-10h],eax” now takes our EAX value and puts it into “[ebp-10h]”. Basically we are taking our 4-byte EAX value and placing it into the space allocated for “mySrc”.

From the above printout of the registers we know that “EBP = 0019FF04”. Hence as we look at our stack, we see that “[ebp-10h]”, which is address 0x0019FEF4, currently contains “004011f3”:

0x0019FEE4  00401490  ..@. – [EBP-32]
0x0019FEE8  00000001  .... – [EBP-28]
0x0019FEEC  00000001  .... – [EBP-24]
0x0019FEF0  0019fefc  üþ.. – [EBP-20]
0x0019FEF4  004011f3  ó.@. – [EBP-16]
0x0019FEF8  00000000  .... – [EBP-12]
0x0019FEFC  0019ff10  .ÿ.. – [EBP-8]
0x0019FF00  0f1456e5  åV.. – [EBP-4]
0x0019FF04  0019ff18  .ÿ.. – EBP

After execution of the instruction, we see that “[ebp-10h]” now contains “75636573”, which is the value in the EAX register.

0x0019FEE4  00401490  ..@. – [EBP-32]
0x0019FEE8  00000001  .... – [EBP-28]
0x0019FEEC  00000001  .... – [EBP-24]
0x0019FEF0  0019fefc  üþ.. – [EBP-20]
0x0019FEF4  75636573  secu – [EBP-16]
0x0019FEF8  00000000  .... – [EBP-12]
0x0019FEFC  0019ff10  .ÿ.. – [EBP-8]
0x0019FF00  0f1456e5  åV.. – [EBP-4]
0x0019FF04  0019ff18  .ÿ.. – EBP

Our next instruction “mov ecx,dword ptr ds:[00404004h]” then takes the next 4 bytes of our string and places them into the ECX register.

Looking at our registers above we see “ECX = 00000001”. After executing the instruction, we see the registers showing …

EAX = 75636573 EBX = 00275000 ECX = 79746972 EDX = 0040447C ESI = 00401490 EDI = 00401490 EIP = 00401014 ESP = 0019FEE4 EBP = 0019FF04 EFL = 00000206

The next instruction “mov dword ptr [ebp-0Ch],ecx” takes the value in ECX and places it at [ebp-0Ch], which is [EBP-12] in decimal. In the stack listing, [ebp-0Ch] is the slot immediately after [EBP-16], 4 bytes higher in memory.

0x0019FEE4  00401490  ..@. – [EBP-32]
0x0019FEE8  00000001  .... – [EBP-28]
0x0019FEEC  00000001  .... – [EBP-24]
0x0019FEF0  0019fefc  üþ.. – [EBP-20]
0x0019FEF4  75636573  secu – [EBP-16]
0x0019FEF8  79746972  rity – [EBP-12]
0x0019FEFC  0019ff10  .ÿ.. – [EBP-8]
0x0019FF00  0f1456e5  åV.. – [EBP-4]
0x0019FF04  0019ff18  .ÿ.. – EBP

Looking at the above, we can see our string is being chunked into 4-byte pieces and placed onto the stack.

Executing the next instruction “mov edx,dword ptr ds:[00404008h]” moves the value at memory address “00404008h” into the EDX register. Above we see “EDX = 0040447C”. After executing the instruction and printing out the registers, we see …

EAX = 75636573 EBX = 00275000 ECX = 79746972 EDX = 2E6B696E ESI = 00401490 EDI = 00401490 EIP = 0040101D ESP = 0019FEE4 EBP = 0019FF04 EFL = 00000206

Similar to the two previous instructions, the next instruction “mov dword ptr [ebp-8],edx” moves the value in EDX to [ebp-8]. After execution, we see the following …
0x0019FEE4  00401490  ..@. – [EBP-32]
0x0019FEE8  00000001  .... – [EBP-28]
0x0019FEEC  00000001  .... – [EBP-24]
0x0019FEF0  0019fefc  üþ.. – [EBP-20]
0x0019FEF4  75636573  secu – [EBP-16]
0x0019FEF8  79746972  rity – [EBP-12]
0x0019FEFC  2e6b696e  nik. – [EBP-8]
0x0019FF00  0f1456e5  åV.. – [EBP-4]
0x0019FF04  0019ff18  .ÿ.. – EBP

Yet again, another move instruction: “mov eax,dword ptr ds:[0040400Ch]”. This instruction takes the value at memory address “0040400Ch” and places it into the EAX register. From what we have done so far, we should know this is the last 4 bytes of our string “securitynik.com”, including its terminating null byte. From above we see “EAX = 75636573”. After execution, we see the registers containing …

EAX = 006D6F63 EBX = 00275000 ECX = 79746972 EDX = 2E6B696E ESI = 00401490 EDI = 00401490 EIP = 00401025 ESP = 0019FEE4 EBP = 0019FF04 EFL = 00000206

Now that the value has been moved to the EAX register, executing the next instruction “mov dword ptr [ebp-4],eax” moves the value in eax to our stack thus building out our string “securitynik.com”. Looking at the stack after execution we see …
0x0019FEE4  00401490  ..@. – [EBP-32]
0x0019FEE8  00000001  .... – [EBP-28]
0x0019FEEC  00000001  .... – [EBP-24]
0x0019FEF0  0019fefc  üþ.. – [EBP-20]
0x0019FEF4  75636573  secu – [EBP-16]
0x0019FEF8  79746972  rity – [EBP-12]
0x0019FEFC  2e6b696e  nik. – [EBP-8]
0x0019FF00  006d6f63  com. – [EBP-4]
0x0019FF04  0019ff18  .ÿ.. – EBP

At this point we have completed building out our string on our stack for the variable “mySrc”.

Since “mySrc” occupies the 16 bytes immediately below EBP ([ebp-10h] through [ebp-1]), the next 16 bytes ([ebp-20h] through [ebp-11h]) are for the “myDst” variable.

Looking at the first “printf()” statement, we see the first instruction “lea ecx,[ebp-20h]”. This loads the effective address of [ebp-20h], which is EBP-32 in decimal. As we saw above, [ebp-20h] corresponds to address “0x0019FEE4”, the start of “myDst”. Hence after execution, this address (not the contents stored there) is placed in the ECX register as shown below.

EAX = 006D6F63 EBX = 00275000 ECX = 0019FEE4 EDX = 2E6B696E ESI = 00401490 EDI = 00401490 EIP = 0040102B ESP = 0019FEE4 EBP = 0019FF04 EFL = 00000206

The next instruction “push ecx” pushes the value in the ECX register to the top of the stack upon execution as shown below.

0x0019FEE0  0019fee4  äþ.. – ESP now points here
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

The next instruction “lea edx,[ebp-20h]”, once again takes the address of [ebp-20h] but this time stores it in EDX register as shown below after execution.

EAX = 006D6F63 EBX = 00275000 ECX = 0019FEE4 EDX = 0019FEE4 ESI = 00401490 EDI = 00401490 EIP = 0040102F ESP = 0019FEE0 EBP = 0019FF04 EFL = 00000206

Once again, there is a push instruction, “push edx”, which now pushes the value of EDX onto the stack. As we know, this value is the same as our previous push. Looking at our stack after the push we see …

0x0019FEDC  0019fee4  äþ.. – ESP now points here
0x0019FEE0  0019fee4  äþ..
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

The next instruction “lea eax,[ebp-10h]” loads the address of “[ebp-10h]” (EBP-16 in decimal), which is “0x0019FEF4”. This is the address of the “mySrc” variable, which is the beginning of our string “securitynik.com” on the stack.

Looking at the registers after the instruction is executed we see …
EAX = 0019FEF4 EBX = 00275000 ECX = 0019FEE4 EDX = 0019FEE4 ESI = 00401490 EDI = 00401490 EIP = 00401033 ESP = 0019FEDC EBP = 0019FF04 EFL = 00000206

The next instruction “push eax” now pushes the value in the EAX register onto the stack. After execution, when we look at the stack, we see …

0x0019FED8  0019fef4  ôþ.. – ESP now points here
0x0019FEDC  0019fee4  äþ..
0x0019FEE0  0019fee4  äþ..
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

Next instruction states “lea ecx,[ebp-10h]”, which once again loads the address of “mySrc”. However, this time it stores the value in the ECX register. After execution, we see the registers look as follows …

EAX = 0019FEF4 EBX = 00275000 ECX = 0019FEF4 EDX = 0019FEE4 ESI = 00401490 EDI = 00401490 EIP = 00401037 ESP = 0019FED8 EBP = 0019FF04 EFL = 00000206

The next instruction “push ecx” now pushes the value in ECX onto the top of the stack. Looking at the stack after execution of the instruction we see …

0x0019FED4  0019fef4  ôþ.. – ESP now points here
0x0019FED8  0019fef4  ôþ..
0x0019FEDC  0019fee4  äþ..
0x0019FEE0  0019fee4  äþ..
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

At this point, we have pushed the arguments for “printf()” onto the stack in the following order:
“myDst”, “&myDst”, “mySrc”, “&mySrc”. Notice the values were pushed in reverse order, right to left, which is how the cdecl calling convention passes arguments.

Our next instruction “push 404010h” pushes the address of the format string for printf() onto the stack. If we take a look at the address “404010h”, we see our string …

0x00404010  65425b20   [Be
0x00404014  65726f66  fore
0x00404018  6d654d20   Mem
0x0040401C  5d797063  cpy]
0x00404020  0a207c20   | .

Now that we have pushed all the arguments and the pointer to our string onto the stack, the next instruction “call 004010D0” calls the printf() function, which reads the arguments and string from the stack.

Since the objective is not to analyze the printf() function, I will now “step over” the instruction “call 004010D0”

Our next instruction “add esp,14h” adds 20 (14h) to ESP. This cleans up the space used by the five 4-byte values pushed for the call: the four arguments and the pointer to the format string.

At this time I’m ignoring the following instructions:
00401045  push        40406Ch 
0040104A  call        004010D0 
0040104F  add         esp,4 

These are related to printing the line -=-=-=-=-=-||-=-=-=-=-=-\n

Time to take a quick, basic look at “memcpy” copying items from one location in memory to another.

The first instruction “push 10h” pushes the value 16 onto the stack. This is our third argument, “sizeof(myDst)”, which is pushed first.

After executing the instruction “push 10h”, the stack looks like this:
0x0019FEE0  00000010  .... – ESP now points here
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

Our next instruction “lea edx,[ebp-10h]” takes the address of [EBP-16], which is “0x0019FEF4”, and places it in the EDX register. After execution, when we look at the registers we see …

EAX = 0000001A EBX = 00275000 ECX = ADBD27E9 EDX = 0019FEF4 ESI = 00401490 EDI = 00401490 EIP = 00401057 ESP = 0019FEE0 EBP = 0019FF04 EFL = 00000206

The next instruction “push edx” now pushes the value in the EDX register onto the stack. This value is the beginning address of our string on the stack, which is our second argument for memcpy, “mySrc”. Looking at our stack after execution we see …

0x0019FEDC  0019fef4  ôþ.. – ESP now points here
0x0019FEE0  00000010  ....
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

The next instruction “lea eax,[ebp-20h]” takes the address of “myDst”, which is the first “memcpy” argument and the third and last one to be pushed, and loads it into the EAX register. After execution our registers show …

EAX = 0019FEE4 EBX = 00275000 ECX = ADBD27E9 EDX = 0019FEF4 ESI = 00401490 EDI = 00401490 EIP = 0040105B ESP = 0019FEDC EBP = 0019FF04 EFL = 00000206

Our next instruction “push eax” now pushes the address of our argument “myDst” onto the stack. Looking at the stack after execution we see …

0x0019FED8  0019fee4  äþ.. – ESP now points here
0x0019FEDC  0019fef4  ôþ..
0x0019FEE0  00000010  ....
0x0019FEE4  00401490  ..@.
0x0019FEE8  00000001  ....
0x0019FEEC  00000001  ....
0x0019FEF0  0019fefc  üþ..
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

At this point we now have all the arguments to “memcpy” on the stack. Thus our next instruction “call 004027F6” calls the “memcpy” function.

At this point, we will “step over” the “call 004027F6” instruction rather than trying to go deep into “memcpy” internal operations.

Once we step over and take a look at our stack we see …

0x0019FED8  0019fee4  äþ..
0x0019FEDC  0019fef4  ôþ..
0x0019FEE0  00000010  ....
0x0019FEE4  75636573  secu
0x0019FEE8  79746972  rity
0x0019FEEC  2e6b696e  nik.
0x0019FEF0  006d6f63  com.
0x0019FEF4  75636573  secu
0x0019FEF8  79746972  rity
0x0019FEFC  2e6b696e  nik.
0x0019FF00  006d6f63  com.
0x0019FF04  0019ff18  .ÿ..

From above we can see our “myDst” variable now has the value which is also in “mySrc”.

At this point, there are other instructions still to be executed, but there is not much analysis left for us to perform that would add to the learning.

The rest of the instructions are more around the “printf()” function.

