We are excited about this topic. Exploiting a buffer overflow requires knowledge in several areas, so we’ll split this post into two parts. The first part discusses how the CPU and memory of a computer function and how a buffer overflow works, and then pulls it all together for the attack. The second part will be a practical walkthrough that uses a simple application to illustrate how a buffer overflow attack is performed.
How Is Memory Organized?
The memory model for an x86 processor is segmented and laid out from higher addresses to lower ones, as shown in the figure.
We won’t go into the purpose of each segment. It is enough to know that when a program runs, its instructions sit at the low end of memory and the stack sits at the high end.
What are Registers?
Registers are small, very fast storage locations inside the processor that hold the data and addresses it is currently working on. It’s important to understand that processor architectures differ: the registers on an x86 processor are not the same as those on a Motorola 6800. Register sizes even differ between x86 models, which is why we talk about 16-, 32- and 64-bit processors.
On the 32-bit x86 architecture there are 8 general-purpose registers used to store data and addresses that point to positions in memory: EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP.
The registers we’ll focus on are ESP, EBP and a specialized register EIP.
ESP (Extended Stack Pointer) is the register that always marks the top of the stack.
EBP (Extended Base Pointer) points to the base of the current function’s stack frame.
EIP (Extended Instruction Pointer) holds the address of the next instruction, telling the CPU where to go to execute the next command. A program cannot write to it directly; it changes only through control-flow instructions such as JMP, CALL and RET.
The stack is a section of memory used as temporary storage, giving a program quick access to the data it is working with. It can live at different memory positions, so the EBP register is used to point to the base of the current frame. The stack grows when we “push” data onto it and shrinks when we “pop” data off it. On x86 it grows toward lower addresses, so ESP, which always points to the top of the stack, changes with every push and pop.
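To make the push and pop behavior concrete, here is a minimal Python sketch of a toy 32-bit stack. It is only a model: the `Stack` class and the starting address are made up for illustration and do not correspond to any real process.

```python
class Stack:
    """Toy model of an x86-32 stack that grows toward lower addresses."""

    def __init__(self, base=0x0012FFF0):
        self.memory = {}   # address -> 32-bit value
        self.esp = base    # top of the stack (hypothetical starting address)

    def push(self, value):
        self.esp -= 4                           # ESP moves down on a push
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory.pop(self.esp)       # read the value at the top
        self.esp += 4                           # ESP moves back up on a pop
        return value


stack = Stack()
start = stack.esp
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.esp))    # 8 bytes below the starting address
print(hex(stack.pop()))  # the last value pushed comes out first
```

Note how ESP is the only thing that moves: the data stays where it was written, and the "top" is simply wherever ESP currently points.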
Why is EBP Important?
EBP gives us an anchor point in memory that we can reference. When a function is called with parameters, the positions of those parameters in memory are always expressed relative to EBP.
Putting it Together
Now that we understand some basics of how things work inside the CPU and memory of a computer, let’s talk about buffer overflows.
At a high level, when you call a function inside a program the following happens:
- The function’s stack frame is created: the caller’s EBP is pushed onto the stack and EBP is set to the current top of the stack, which sets the anchor.
- The parameters passed to the function are found at the memory addresses EBP+8, EBP+12, etc.
- The return address (RET), which is where execution resumes when the function finishes, is saved at EBP+4.
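The offsets above can be reproduced with a small model. The sketch below assumes the common cdecl-style sequence (parameters pushed right to left, then the return address, then the old EBP); the function name `build_frame` and all addresses and values are hypothetical.

```python
def build_frame(esp, old_ebp, return_address, params):
    """Return (memory, ebp) after a cdecl-style call on a toy 32-bit stack."""
    memory = {}
    for value in reversed(params):   # caller pushes parameters right to left
        esp -= 4
        memory[esp] = value
    esp -= 4
    memory[esp] = return_address     # CALL pushes the return address
    esp -= 4
    memory[esp] = old_ebp            # callee: push ebp
    ebp = esp                        # callee: mov ebp, esp
    return memory, ebp


memory, ebp = build_frame(0x0012FF80, 0x0012FFB0, 0x00401234, [0xAAAA, 0xBBBB])
print(hex(memory[ebp + 4]))    # return address
print(hex(memory[ebp + 8]))    # first parameter
print(hex(memory[ebp + 12]))   # second parameter
```

Whatever the absolute addresses are, the return address and the parameters always sit at the same distance from EBP, which is why EBP works as an anchor.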
Let’s focus on step 2. If we send a string of 12 A’s, the memory will look like the following figure:
Analyzing the figure, we see that PARAM1 points to the address where the data is saved in the stack, and we know that ESP points to the top of the stack. The string is therefore copied starting at ADDR1, 4 bytes at a time, toward higher memory, as this is the only way to remain inside the stack.
If the function does not check the length of the input before writing it to the stack and we send a large number of A’s, it will look like the next figure:
If this happens, the saved return address is overwritten by A’s, and when the function returns that value is loaded into EIP. You have altered the program by overwriting the address it would jump to for its next instruction. With EIP set to 0x41414141 (the bytes “AAAA”), which is normally not a valid code address, the program crashes and an exception is raised.
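The overwrite can be simulated without touching real memory. This is a sketch under simplifying assumptions: the frame holds only a 12-byte buffer, the saved EBP and the return address, laid out from low to high memory, and the helper names are invented for the example.

```python
import struct

BUFFER_SIZE = 12  # size of the local buffer, matching the 12 A's example


def make_frame(saved_ebp, return_address):
    """Lay out buffer, saved EBP and return address from low to high memory."""
    return bytearray(BUFFER_SIZE) + struct.pack("<II", saved_ebp, return_address)


def unchecked_copy(frame, data):
    """Mimic an unchecked copy: write every byte, ignoring the buffer size."""
    for i, byte in enumerate(data[:len(frame)]):  # the model ends at the frame
        frame[i] = byte


def read_return_address(frame):
    """Read the 4 bytes stored after the buffer and the saved EBP."""
    return struct.unpack_from("<I", frame, BUFFER_SIZE + 4)[0]


frame = make_frame(0x0012FFB0, 0x00401234)
unchecked_copy(frame, b"A" * 12)
print(hex(read_return_address(frame)))  # still 0x401234

unchecked_copy(frame, b"A" * 20)
print(hex(read_return_address(frame)))  # now 0x41414141
```

Twelve bytes fit in the buffer and leave the return address intact. Twenty bytes run past the buffer and the saved EBP and replace the return address with "AAAA".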
Exploiting the Buffer Overflow
Rather than just crashing the program, we can try to control the address of the program’s next jump and point it to code we’ve written. There are a few things needed to accomplish this:
- First, we need to find which position in our input ends up in the EIP register.
- Next, we need to locate where the ESP register points. This needs to be done when the buffer overflow exception occurs.
- Last, we need to craft an exploit that will work by ensuring there are no bad hexadecimal characters in our shellcode.
How can we resolve these problems?
Locating the EIP Address Offset
We can find the offset by sending a string of unique characters rather than all A’s to cause the buffer overflow. Once EIP is overwritten by a unique 4-byte pattern, we can search the string for it to find the offset, then replace that part of the string with the address of a JMP ESP instruction, which will lead execution to our payload.
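As an illustration of the unique-pattern idea, the sketch below builds a string of upper/lower/digit triples ("Aa0Aa1Aa2...") and looks up a 4-byte value in it. The function names are made up, and the "crash" is simulated by reading 4 bytes back out of the pattern rather than taken from a real debugger.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def make_pattern(length):
    """Build 'Aa0Aa1Aa2...' cut to length (at most 26*26*10*3 characters)."""
    pattern = ""
    for triple in product(ascii_uppercase, ascii_lowercase, digits):
        if len(pattern) >= length:
            break
        pattern += "".join(triple)
    return pattern[:length].encode("ascii")


def find_offset(pattern, eip_value):
    """Return the position in the pattern of the 4 bytes that landed in EIP."""
    needle = struct.pack("<I", eip_value)  # x86 stores values little-endian
    return pattern.find(needle)


pattern = make_pattern(200)
# Hypothetical crash: pretend the debugger showed these 4 bytes in EIP.
eip_value = struct.unpack("<I", pattern[112:116])[0]
print(find_offset(pattern, eip_value))  # recovers the offset 112
```

Because every 4-byte window in the pattern is different, the value seen in EIP identifies exactly one position in the input.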
Finding the ESP Value
This problem can be solved in a couple of ways. One is to attach a disassembler/debugger to the process manually and inspect the registers when the exception occurs. The other is to use Immunity Debugger and ‘pydbg’ (a Python debugger library) to analyze the exception and print the register values.
Identifying Bad Characters
Finding which characters are bad for a function is easy but can take a while if you do it manually. Here we’ll send a string made of every byte value in hex (from 0x00 to 0xFF) and watch in a debugger for the point where the string is truncated.
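The manual loop (send all bytes, see where the string breaks, remove that byte, repeat) can be sketched as follows. Everything here is a stand-in: `simulated_target` plays the role of "send the bytes and read them back in the debugger", and it assumes a target that cuts its input at 0x00 and 0x0A.

```python
def first_mismatch(sent, seen_in_memory):
    """Return the first byte value that is missing or altered, else None."""
    for i, byte in enumerate(sent):
        if i >= len(seen_in_memory) or seen_in_memory[i] != byte:
            return byte
    return None


def find_bad_chars(dump_for):
    """Drop one bad byte per round until the whole string survives."""
    bad = []
    candidate = bytes(range(256))            # 0x00 through 0xFF
    while True:
        mismatch = first_mismatch(candidate, dump_for(candidate))
        if mismatch is None:
            return bad
        bad.append(mismatch)
        candidate = bytes(b for b in candidate if b != mismatch)


def simulated_target(data):
    """Hypothetical target that truncates input at 0x00 and 0x0A."""
    for terminator in (0x00, 0x0A):
        cut = data.find(bytes([terminator]))
        if cut != -1:
            data = data[:cut]
    return data


print([hex(b) for b in find_bad_chars(simulated_target)])
```

With a real target, `dump_for` would be replaced by whatever you use to deliver the bytes and read the resulting memory, which is the slow part when done by hand.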
Now that we have all the bad characters identified, we check whether the ESP address contains any of them. If it does, we need to work around it:
- In assembly there is a mnemonic (command) called JMP, which “jumps” to the address given after it, so JMP ESP jumps to the address held in the ESP register. We need to find a JMP ESP instruction in the loaded code at a memory address that contains no bad characters, and use that address as the EIP value.
- We’ll use an encoder to generate our payload so that it excludes any bad characters.
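Checking an address against the bad characters is a simple byte comparison. In this sketch the candidate addresses and the bad-character set are invented for illustration; real candidates would come from searching the loaded modules in a debugger.

```python
import struct


def contains_bad_chars(address, bad_chars):
    """True if any byte of the little-endian address is a bad character."""
    packed = struct.pack("<I", address)
    return any(byte in bad_chars for byte in packed)


def pick_address(candidates, bad_chars):
    """Return the first candidate address free of bad characters, else None."""
    for address in candidates:
        if not contains_bad_chars(address, bad_chars):
            return address
    return None


bad_chars = {0x00, 0x0A}                            # assumed result of the test
candidates = [0x00401A2B, 0x7C86000A, 0x625011AF]   # hypothetical JMP ESP hits
chosen = pick_address(candidates, bad_chars)
print(hex(chosen) if chosen is not None else "no usable address")
```

The first two candidates are rejected because they contain 0x00 (and 0x0A in the second case), so the third one is selected.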
Let’s Return to the Basics
Now we know about memory, registers, the stack, and what a buffer overflow is, but how do we determine whether an application has a buffer overflow vulnerability?
There is no exact procedure for finding out whether an application is vulnerable. Normally you can disassemble the executable and see if it calls library functions (for example in DLLs) that are prone to exploitation. If calls to functions like strcpy, gets, scanf, and others are present, there is a chance the programmer did not implement proper bounds checks and there may be a vulnerability.
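Once you have a list of the functions an executable calls (from a disassembler, for example), flagging the risky ones is a set lookup. The sketch below assumes that list is already available as plain names. The set includes strcpy, gets and scanf from the text, plus strcat and sprintf as our own additions, and it is not exhaustive.

```python
# Functions that write to a buffer without checking its size.
RISKY_FUNCTIONS = {"strcpy", "strcat", "sprintf", "gets", "scanf"}


def flag_risky_calls(called_functions):
    """Return the sorted names from the list that appear in RISKY_FUNCTIONS."""
    return sorted(set(called_functions) & RISKY_FUNCTIONS)


# Hypothetical list of calls found in a disassembled program.
found = flag_risky_calls(["printf", "strcpy", "malloc", "gets", "strcpy"])
print(found)
```

A hit only means the call deserves a closer look, since the surrounding code may still check lengths correctly.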
You can also poke at the application’s user-facing inputs with scripts and fuzzers and attempt to generate an exception.
With that we have finished the first part of this post, and you have a basic understanding of a buffer overflow vulnerability, how to exploit it and the problems you could run into along the way. In the second post we are going to put this theory into practice with an example of exploitation.
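A very simple fuzzing script just sends longer and longer inputs until something breaks. The sketch below shows that loop against a stand-in function. `toy_target` and its 512-byte limit are hypothetical, and in practice the call would deliver the data to the application under test.

```python
def fuzz(target, start=100, step=100, limit=3000):
    """Send ever longer strings of A's; return the first length that raises."""
    for length in range(start, limit + 1, step):
        try:
            target(b"A" * length)
        except Exception:
            return length
    return None  # nothing failed within the limit


def toy_target(data, buffer_size=512):
    """Hypothetical stand-in for the application under test."""
    if len(data) > buffer_size:
        raise MemoryError("simulated crash")


print(fuzz(toy_target))  # first length above the simulated 512-byte buffer
```

The length returned is only a rough bound on the buffer size, and a unique pattern is still needed to find the exact offset.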