Monday, January 11, 2021

Understanding Linux x32 calling conventions with Ghidra and GDB - CDECL

This post and all others for this month are part of the series I used to help me prepare for my GIAC Reverse Engineering Malware (GREM) certification.

In this post, I am attempting to get a better understanding of the calling conventions used by Linux software, specifically software compiled with GCC on Kali.

┌──(kali㉿securitynik)-[~]
└─$ lsb_release --all       
No LSB modules are available.
Distributor ID: Kali
Description:    Kali GNU/Linux Rolling
Release:        2020.4
Codename:       kali-rolling


                                                                                                                                                                           
┌──(kali㉿securitynik)-[~]
└─$ gcc --version                               
gcc (Debian 10.2.0-16) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This is all part of my continued GREM journey. To make this learning somewhat easier, I am using a combination of Ghidra and GDB. Realistically, I could have done this with GDB alone, but I also wanted to understand things from Ghidra's perspective. In this first post, I am looking at CDECL in a 32-bit Linux application.

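According to Microsoft, CDECL is the default calling convention for C and C++ programs. GCC uses the same convention by default for 32-bit x86 Linux code, which is what the examples below rely on.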

Two key takeaways for CDECL are:
1. Arguments are pushed onto the stack from right to left.
2. The calling function is responsible for cleaning up, by removing the arguments from the stack after the call returns.

Let's look at these two in practice from GDB's perspective. Code first:

#include <stdio.h>


// cdecl calling
// Compile with: gcc -m32 cdecl-calling.c -o cdecl-calling.exe



// declare a function with 3 parameters

int myFunc(int first_parm, int second_parm, int third_parm)
  {
    // declare a local variable (int, to match the %d format and the return type)
    int sum;
    
    // add the parameters
    // store the value in the local variable sum
    sum = first_parm + second_parm + third_parm;

    // print the result of sum to the screen
    printf("[*] %d + %d + %d = %d", first_parm, second_parm, third_parm, sum);

    // return sum as return value
    return sum;
  }


int main()
  {
    // call myFunc with 3 arguments 
    myFunc(5, 3, 1);

    return 0;
  }

Compile and run:

┌──(kali㉿securitynik)-[~/CallingConventions]
└─$ gcc -m32 cdecl-calling.c -o cdecl-calling_32.exe
                                                                                                                               
┌──(kali㉿securitynik)-[~/CallingConventions]
└─$ ./cdecl-calling_32.exe                          
[*] 5 + 3 + 1 = 9  

Taking a first glance at Ghidra's output for the main function below, we see the function defined as __cdecl and the arguments pushed onto the stack in reverse order: PUSH 0x1, PUSH 0x3 and PUSH 0x5.

                      ************************************************
                      *                   FUNCTION                   *
                      ************************************************
                      undefined4 __cdecl main(undefined1 param_1)
          undefined4    EAX:4      <RETURN>
          undefined1    Stack[0x4] param_1                        XREF[1]:  000111e2(*)  
          undefined4    Stack[0x0] local_res0                     XREF[1]:  000111e9(R)  
          undefined4    Stack[-0xc local_c                        XREF[1]:  00011213(R)  
                      main                                  XREF[4]:  Entry Point(*), 
                                                                      _start:00011086(*), 
                                                                      0001204c, 00013ff8(*)  
     000111e2 8d 4c 24 04        LEA      ECX=>param_1,[ESP + 0x4]
     000111e6 83 e4 f0           AND      ESP,0xfffffff0
     000111e9 ff 71 fc           PUSH     dword ptr [ECX + local_res0]
     000111ec 55                 PUSH     EBP
     000111ed 89 e5              MOV      EBP,ESP
     000111ef 51                 PUSH     ECX
     000111f0 83 ec 04           SUB      ESP,0x4
     000111f3 e8 23 00 00 00     CALL     __x86.get_pc_thunk.ax                 undefined4 __x86.get_pc_t
     000111f8 05 08 2e 00 00     ADD      EAX,0x2e08
     000111fd 83 ec 04           SUB      ESP,0x4
     00011200 6a 01              PUSH     0x1
     00011202 6a 03              PUSH     0x3
     00011204 6a 05              PUSH     0x5
     00011206 e8 8e ff ff ff     CALL     myFunc                                int myFunc(int param_1, i
     0001120b 83 c4 10           ADD      ESP,0x10
     0001120e b8 00 00 00 00     MOV      EAX,0x0
     00011213 8b 4d fc           MOV      ECX,dword ptr [EBP + local_c]
     00011216 c9                 LEAVE
     00011217 8d 61 fc           LEA      ESP,[ECX + -0x4]
     0001121a c3                 RET


And now for Ghidra's output of the myFunc function:

                      ************************************************
                      *                   FUNCTION                   *
                      ************************************************
                      int __cdecl myFunc(int param_1, int param_2,
          int           EAX:4      <RETURN>
          int           Stack[0x4] param_1                        XREF[2]:  000111aa(R), 
                                                                             000111c6(R)  
          int           Stack[0x8] param_2                        XREF[2]:  000111ad(R), 
                                                                             000111c3(R)  
          int           Stack[0xc] param_3                        XREF[2]:  000111b2(R), 
                                                                             000111c0(R)  
          undefined4    Stack[-0x8 local_8                        XREF[1]:  000111dd(R)  
          undefined4    Stack[-0x1 local_10                       XREF[3]:  000111b7(W), 
                                                                             000111bd(R), 
                                                                             000111da(R)  
                      myFunc                                XREF[3]:  Entry Point(*), 
                                                                      main:00011206(c), 
                                                                      00012044  
     00011199 55                 PUSH     EBP
     0001119a 89 e5              MOV      EBP,ESP
     0001119c 53                 PUSH     EBX
     0001119d 83 ec 14           SUB      ESP,0x14
     000111a0 e8 76 00 00 00     CALL     __x86.get_pc_thunk.ax                 undefined4 __x86.get_pc_t
     000111a5 05 5b 2e 00 00     ADD      EAX,0x2e5b
     000111aa 8b 4d 08           MOV      ECX,dword ptr [EBP + param_1]
     000111ad 8b 55 0c           MOV      EDX,dword ptr [EBP + param_2]
     000111b0 01 d1              ADD      ECX,EDX
     000111b2 8b 55 10           MOV      EDX,dword ptr [EBP + param_3]
     000111b5 01 ca              ADD      EDX,ECX
     000111b7 89 55 f4           MOV      dword ptr [EBP + local_10],EDX
     000111ba 83 ec 0c           SUB      ESP,0xc
     000111bd ff 75 f4           PUSH     dword ptr [EBP + local_10]
     000111c0 ff 75 10           PUSH     dword ptr [EBP + param_3]
     000111c3 ff 75 0c           PUSH     dword ptr [EBP + param_2]
     000111c6 ff 75 08           PUSH     dword ptr [EBP + param_1]
     000111c9 8d 90 08 e0 ff     LEA      EDX,[EAX + 0xffffe008]
     000111cf 52                 PUSH     EDX
     000111d0 89 c3              MOV      EBX,EAX
     000111d2 e8 59 fe ff ff     CALL     printf                                int printf(char * __forma
     000111d7 83 c4 20           ADD      ESP,0x20
     000111da 8b 45 f4           MOV      EAX,dword ptr [EBP + local_10]
     000111dd 8b 5d fc           MOV      EBX,dword ptr [EBP + local_8]
     000111e0 c9                 LEAVE
     000111e1 c3                 RET


Switching now to GDB for a more dynamic view. First, we look at the disassembly of the main function.

└─$ sudo gdb cdecl-calling.exe -q
Reading symbols from cdecl-calling.exe...
(No debugging symbols found in cdecl-calling.exe)
(gdb) set disassembly-flavor intel
(gdb) break main
Breakpoint 1 at 0x11f0
(gdb) run
Starting program: /home/kali/CallingConventions/cdecl-calling.exe 

Breakpoint 1, 0x565561f0 in main ()
(gdb) disassemble main
Dump of assembler code for function main:
   0x565561e2 <+0>:     lea    ecx,[esp+0x4]
   0x565561e6 <+4>:     and    esp,0xfffffff0
   0x565561e9 <+7>:     push   DWORD PTR [ecx-0x4]
   0x565561ec <+10>:    push   ebp
   0x565561ed <+11>:    mov    ebp,esp
   0x565561ef <+13>:    push   ecx
=> 0x565561f0 <+14>:    sub    esp,0x4
   0x565561f3 <+17>:    call   0x5655621b <__x86.get_pc_thunk.ax>
   0x565561f8 <+22>:    add    eax,0x2e08
   0x565561fd <+27>:    sub    esp,0x4
   0x56556200 <+30>:    push   0x1
   0x56556202 <+32>:    push   0x3
   0x56556204 <+34>:    push   0x5
   0x56556206 <+36>:    call   0x56556199 <myFunc>
   0x5655620b <+41>:    add    esp,0x10
   0x5655620e <+44>:    mov    eax,0x0
   0x56556213 <+49>:    mov    ecx,DWORD PTR [ebp-0x4]
   0x56556216 <+52>:    leave  
   0x56556217 <+53>:    lea    esp,[ecx-0x4]
   0x5655621a <+56>:    ret    
End of assembler dump.

Before going into myFunc, let's see how these values are pushed onto the stack.

Currently the EIP is at 0x565561f0. This can be seen from the arrow above and can be confirmed by looking at the registers:

(gdb) info registers $eip
eip            0x565561f0          0x565561f0 <main+14>

After stepping forward to address 0x56556200, let's look at that instruction and the next 4.

(gdb) x/5i $eip
=> 0x56556200 <main+30>:        push   0x1
   0x56556202 <main+32>:        push   0x3
   0x56556204 <main+34>:        push   0x5
   0x56556206 <main+36>:        call   0x56556199 <myFunc>
   0x5655620b <main+41>:        add    esp,0x10

To understand how these are pushed onto the stack, let's look at the stack before the first argument is pushed. Specifically, let's start by looking at 8 words in hex, starting at the address held in the ESP register.

(gdb) x/8xw $esp
0xffffd5dc:     0x565561f8      0xf7fe4080      0xffffd600      0x00000000
0xffffd5ec:     0xf7de3df6      0xf7faa000      0xf7faa000      0x00000000

Now let's run the next instruction via ni, which pushes 1 onto the stack. If you remember, above, 1 is the last argument. Specifically, the arguments were written in the order 5, 3, 1.

(gdb) ni
0x56556202 in main ()
(gdb) x/8xw $esp
0xffffd5d8:     0x00000001      0x565561f8      0xf7fe4080      0xffffd600
0xffffd5e8:     0x00000000      0xf7de3df6      0xf7faa000      0xf7faa000

Running the next instruction, we see 3 pushed on top of 1.

(gdb) ni
0x56556204 in main ()
(gdb) x/8xw $esp
0xffffd5d4:     0x00000003      0x00000001      0x565561f8      0xf7fe4080
0xffffd5e4:     0xffffd600      0x00000000      0xf7de3df6      0xf7faa000

When we run the next instruction, we now execute the final PUSH before the call to myFunc.

(gdb) ni
0x56556206 in main ()
(gdb) x/8xw $esp
0xffffd5d0:     0x00000005      0x00000003      0x00000001      0x565561f8
0xffffd5e0:     0xf7fe4080      0xffffd600      0x00000000      0xf7de3df6

At this point, we confirm that the values are pushed onto the stack from right to left. Once again, our arguments were placed as follows: myFunc(5, 3, 1);

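Giving this a simpler view of the stack just before the call:

Address       Value   Argument
0xffffd5d0    5       first_parm  (pushed last, ESP points here)
0xffffd5d4    3       second_parm
0xffffd5d8    1       third_parm  (pushed first)

Above, we confirm the arguments are pushed from last to first, so the first argument ends up at the lowest address, on top of the stack.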

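As for the cleanup, after the call to myFunc we see add esp,0x10. This is my understanding of the caller cleaning up the stack: the 0x10 (16) bytes cover the three 4-byte arguments (12 bytes) plus the 4 bytes reserved by the sub esp,0x4 just before the pushes.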

(gdb) x/7i $eip
=> 0x56556206 <main+36>:        call   0x56556199 <myFunc>
   0x5655620b <main+41>:        add    esp,0x10
   0x5655620e <main+44>:        mov    eax,0x0
   0x56556213 <main+49>:        mov    ecx,DWORD PTR [ebp-0x4]
   0x56556216 <main+52>:        leave  
   0x56556217 <main+53>:        lea    esp,[ecx-0x4]
   0x5655621a <main+56>:        ret    


Let's switch to looking at stdcall in the next post.


References:

__cdecl
How to compile a 32-bit binary on a 64-bit linux machine with gcc/cmake
StackOverflow - stdcall and cdecl
The History of Calling Conventions Part 3
6.33.35 x86 Function Attributes
FOR610 - GREM





