##Exploiting Buffer Overflows in C
Project developed by Michael Hicks
This problem will give you some hands-on experience to understand buffer overflows and how to exploit them. You will carry out the project using a virtual machine, on your own computer. In carrying it out, you will have to answer specific questions, given at the bottom, to show that you have followed each of the necessary steps.
###Installing and running the Virtual Machine
The next step is to download the virtual machine image, in OVF format, that we will use for the project. It has extension
.ova meaning it is an archive with all of the relevant materials in it. The file can be found here. This virtual machine runs a version of Ubuntu Linux.
Finally, you must import this OVF file, which is called
mooc-vm.ova, and run it. To import it, it should be as simple as double-clicking the
.ova file. Doing so will start VirtualBox and ask you whether to import it the image. You should then click “import”. Alternatively, rather than double clicking the archive file, you can select “File” -> “Import appliance” from the Manager window and select the file. Further instructions are available here.
Having imported the VM, you should see it in your list of VMs. Select it and click “Start”. This will open a window running the virtual machine, starting up Ubuntu Linux. When you get to a login screen, use username “seed” and password is “dees” (but without quotes). Then start up a terminal window; there is an icon in the menu bar at the top for doing so (it looks like a computer monitor).
###The vulnerable program
We have placed a C program
wisdom-alt.c in the
projects/1 directory in the virtual machine. Type
cd projects/1 to change into this directory. If you type
ls you will see that also in this directory is a compiled version of the program, called
wisdom-alt. This executable was produced by invoking
gcc -fno-stack-protector -ggdb -m32 wisdom-alt.c -o wisdom-alt (in case you accidentally delete it and need to reproduce it).
####Running the program
The program reads data from the
stdin (i.e., the keyboard) and writes to
stdout (i.e., the terminal). You can run the program by typing
./wisdom-alt on the command prompt. When we do this, we see the following greeting:
seed@seed-desktop:~/projects/1$ ./wisdom-alt Hello there 1. Receive wisdom 2. Add wisdom Selection >
At this point, it is waiting for the user to type something in. Typing the number 1 allows you to “receive wisdom” and typing 2 allows you to “add wisdom”. Extending the interaction, suppose we type 1 (and a carriage return).
seed@seed-desktop:~/projects/1$ ./wisdom-alt Hello there 1. Receive wisdom 2. Add wisdom Selection >1 no wisdom Hello there 1. Receive wisdom 2. Add wisdom Selection >
Notice that it outputs no wisdom and then repeats the greeting. Now if we type 2 we can try to add some wisdom; here’s what happens:
Selection >2 Enter some wisdom
Now the program is waiting for the user to type something in. Suppose we type in
sleep is important and press return. Then we will get the standard greeting again. If we type 1 at that point we will get the following:
Selection >2 Enter some wisdom sleep is important Hello there 1. Receive wisdom 2. Add wisdom Selection >1 sleep is important Hello there 1. Receive wisdom 2. Add wisdom Selection >
We can continue to add wisdom, by typing 2. For example, if we did this sequence again, with the entry
exercise is useful, we would get:
Selection >2 Enter some wisdom exercise is useful Hello there 1. Receive wisdom 2. Add wisdom Selection >1 sleep is important exercise is useful Hello there 1. Receive wisdom 2. Add wisdom Selection >
We can keep doing this as long as we like. We can terminate interacting with the program by typing control-D.
This program is vulnerable to a buffer overflow. It is easy to see there is a problem, by typing in something other than 1 or 2. For example, type in 156.
seed@seed-desktop:~/projects/1$ ./wisdom-alt Hello there 1. Receive wisdom 2. Add wisdom Selection >156 Segmentation fault
In fact, the program has at least two vulnerabilities; the above is demonstrating one of them, but there is one other. Your job in this lab is to find and exploit both vulnerabilities. The lab will guide you through steps to do so, and you will answer questions as you go along.
###Exploiting the program
We are now going to show you some tools you’ll need to exploit this program.
####Entering binary data
To exploit the program, you will need to enter non-printable characters, i.e., binary data. To input binary data to the program, use the following command line instead:
Then we can type in binary-format strings (e.g., with hex escaping). For example:
seed@seed-desktop:~/projects/1$ ./runbin.sh Hello there 1. Receive wisdom 2. Add wisdom Selection >2 Enter some wisdom \x41\x41 Hello there 1. Receive wisdom 2. Add wisdom Selection >1 AA
In the above,
\x41\x41 represents two bytes, defined in hexadecial format. 41 in hex is 65 in decimal, which in ASCII is the character
A. As a result, when we ask for wisdom, the program prints
AA. Entering something like
\x07 would be a byte 7. This is not a printable character, but is the “bell”. So when it “prints,” you would actually hear a sound (if sound were enabled on this VM).
To exploit the program, you will have to enter sequences of binary bytes that contain addresses, which are 4-byte (i.e., 32-bit) words on the VM. The x86 architecture is “little-endian”, meaning that the bytes in a word are stored from least significant to most significant. That means that the hexadecimal address 0xabcdef00 would be entered as individual bytes in reverse order, i.e., \x00\xef\xbc\xab. Here is a refresher on endianness, if you need it.
runbin.sh is a shell script that is just a wrapper around the following code:
while read -r line; do echo -e $line; done | ./wisdom-alt
This is what is converting the hex digits into binary before passing them to the
wisdom-alt program. When carrying out the lab, please use the
runbin.sh program, and not the above code directly, or your answers may be slightly off, as discussed at the end.
To exploit the program, you will have to learn some information about how it is laid out in memory. You can find out this information using the
gdb program debugger. You can attach
gdb to your running program, and then use it to print information about the state of that program, and step through executions of that program.
wisdom-alt, you should first invoke the above command line, and then, in a separate terminal, from the
projects/1 directory invoke the following line:
gdb -p `pgrep wisdom-alt`
Should you encounter any errors running the above line, you can first
pgrep wisdom-alt to obtain the PID of
wisdom-alt, and then run
gdb -p <PID>.
-p option to
gdb tells it to attach to a running program with the PID given to the option. The command
pgrep wisdom-alt searches the process table to find the PID of the
wisdom-alt program; this PID is then fed as the argument to
-p. Be warned: If you have multiple
wisdom-alt programs running, you may not attach to the one you expect! Make sure they are all killed (perhaps by killing and restarting the terminals you started them in) if you run into trouble.
Once you have connected to the process, you can start using
gdb commands to start examining its state and controlling it. For example:
seed@seed-desktop:~/projects/1$ gdb -p `pgrep wisdom-alt` GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-linux-gnu". Attaching to process 29727 Reading symbols from /home/seed/projects/1/wisdom-alt...done. Reading symbols from /lib/tls/i686/cmov/libc.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libc.so.6 Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 0xb7fe1430 in __kernel_vsyscall () (gdb)
This shows starting
gdb and attaching it to a running
wisdom-alt process. Then the
gdb command prompt comes up. At this point, the execution of that program is paused, and we can start entering commands. For example:
(gdb) break wisdom-alt.c:100 Breakpoint 1 at 0x80487ea: file wisdom-alt.c, line 100. (gdb) cont Continuing.
Here we enter a command to set a breakpoint at line 100 of
wisdom-alt.c. Then we enter command
cont (which is short for
continue) to tell the program to resume its execution. In the other terminal, running
wisdom-alt we enter
2 and press return. This causes execution to reach line 100, so the breakpoint fires, and the
gdb command prompt comes up again, pausing the program in the process.
Breakpoint 1, main () at wisdom-alt.c:100 100 int s = atoi(buf); (gdb) next 101 fptr tmp = ptrs[s]; (gdb) print s $1 = 2 (gdb) print &r $2 = (int *) 0xbffff530 (gdb) cont Continuing.
Above, we control the program by stepping using “next”, which executes the current line of code, proceeding to the next. Then we print the contents of variable
s with “print”, and it displays the value we entered in the other terminal. Then we print the address of the variable
r. Finally, we continue execution by entering “cont”. In the other terminal we see the prompt to enter some wisdom.
When you are done working with gdb (perhaps when you’ve terminated the other program), just type
quit to exit.
The basic GDB commands you will want to use are those we have already demonstrated: setting break points, stepping through execution, and printing values. If you are not familiar with GDB already, a quick GDB reference is available here, and a more in depth GDB tutorial is available here. You would find it helpful to be familiar with the “print”, “break”, and “step” commands. A full GDB user’s manual is also available.
You are now ready to start your process of developing an exploit.
The first step is to identify where the buffer overflows are. To do that you will have to look through the code of
wisdom-alt.c. You can do this by using an editor on Linux virtual machine, like
emacs, both of which are installed. Alternatively you can look through the file on your own machine outside of the VM, in an editor of your choice — the file is available here.
After looking over the code to see how it works, answer the following four questions:
- There is a stack-based overflow in the program. What is the name of the stack-allocated variable that contains the overflowed buffer?
- Consider the buffer you just identified: Running what line of code will overflow the buffer? (We want the line number, not the code itself.)
- There is another overflow, not dependent at all on the first, of a non-stack-allocated buffer. What variable contains this buffer?
- Consider the buffer you just identified: Running what line of code overflows the buffer? (We want the number here, not the code itself.)
Now use GDB to examine the running the program and answer the following questions. These questions are basically going to walk you through constructing an exploit of the non-stack-based overflow vulnerability you just identified.
Write your answers to the questions on your PDF submission for this homework.
Important: When carrying out the lab, you must follow the instructions given above for running the program (using
runbin.sh) and using GDB (attaching to
wisdom-alt in a separate terminal) exactly or else the answers you get may not match the ones we are expecting. In particular, the addresses of stack variables may be different. These addresses might also be different if you have altered any environment variables in the Ubuntu terminal. To confirm that things are as they should be, recall the GDB interaction above, where we print the address
&r with the result being
0xbffff530 - if you are not getting that result when you reproduce that interaction then something is wrong. You should restart fresh terminals and begin from scratch, following the instructions exactly.
On to the exploit:
- What is the address of
buf(the local variable in the main function)? Enter the answer in either hexadecimal format (a
0xfollowed by 8 “digits” 0-9 or a-f, like
0xbfff0014) or decimal format. Note here that we want the address of
buf, not its contents.
- What is the address of
ptrs(the global variable) ? As with the previous question, use hex or decimal format.
- What is the address of
write_secret(the function) ? Use hex or decimal.
- What is the address of
p(the local variable in the
mainfunction) ? Use hex, or decimal format.
- What input do you provide to the program so that
ptrs[s]reads (and then tries to execute) the contents of local variable
pinstead of a function pointer stored in the buffer pointed to by
ptrs? You can determine the answer by performing a little arithmetic on the addresses you have already gathered above - be careful that you take into account the size of a pointer when doing pointer arithmetic. If successful, you will end up executing the
pat_on_backfunction. Enter your answer as an (unsigned) integer.
- What do you enter so that
ptrs[s]reads (and then tries to execute) starting from the 65th byte in
buf, i.e., the location at
buf? Enter your answer as an (unsigned) integer.
- What do you replace
\xEE\xEE\xEE\xEEwith in the following input to the program (which due to the overflow will be filling in the 65th-68th bytes of
buf) so that the
ptrs[s]operation executes the
write_secretfunction, thus dumping the secret? (Hint: Be sure to take endianness into account.)
Note: Anything not preceded by
\x is a 1 byte ASCII character. The
\x00 in ASCII is the NUL character, meaning it is a null terminator. So
771675175 is 9 bytes of ASCII,
\x00 is 1 byte of hex, and there are 54 bytes of the character
A in the above input.