Knock Knock Who's There? – An NSA VM - 10 minutes read




Back in 2017 (feels like ages ago) I decided to take a peek into the ShadowBrokers leaks and reverse some of the tools.

I started on simply because it had a macOS version. I made local presentations at 0xOpoSec and BSidesLisbon but those slides were never published for obvious reasons (aka live implants all over the Internet).

Significant time has passed and everyone went crazy last week with the beautiful NSO exploit VM published by Project Zero, so why not ride the wave and present a simple NSA BPF VM. It is still an interesting work and you have to admire the great engineering that goes behind this code. It’s not everyday that you can take a peek at code developed by a well funded state actor.

This post is only going to focus on the BPF part of the implant so you will have to fill in the blanks about everything else.

After we start reversing the binary we reach a point where we understand that a libpcap sniffer is installed and we are dealing with a port knocking backdoor with multiple targets such as Solaris, Linux, FreeBSD, HP-UX, JunOS, OS X. If it’s connected to the Internet there is a backdoor version. SIGINT all the things!

FX created the first (AFAIK!) port knocking backdoor, cd00r.c. Knock is a another example. Port knocking removes the need for port listening backdoors and hiding those listeners from traditional network tools (netstat and friends). It’s the year 2000, rootkits are all the rage! Holy crap, 21 years already?

Port knocking is essentially implemented as a custom sniffer looking for a magic packet (or a group of magic packets) and do something when it matches such as open a port, make a callback to a remote host, etc. Without the right sequence and/or data you can’t get in and detect it from the network scans.

Libpcap has all the necessary features to implement this. We just need to install a listener and activate a BPF based filter since we don’t need to capture everything (it’s the 2000s, CPUs are slow, we don’t want to raise alarms). When the magic packet or sequence is triggered a callback is executed and we can do whatever we want.

The following is the observed backdoor process diagram:

The initial process just forks and exit as a debugging measure (gdb used to have problems following child processes - software breakpoints crash because no debugger installed in the child). The first fork is the daemon and watchdog process responsible for managing the rest of the code. It then forks a child worker where the port knocking sniffer is installed.

The watchdog process is also very simple. It redirects all output to , removes all signal handlers, forks a worker child and waits for it. Core files are disabled, file limits are increased, and a signal handler to kill the child is set, to avoid zombie processes if the watchdog is killed. Too much gore in Unix, uh?

Strings are XOR obfuscated. You can find a Unicorn Engine based deobfuscation utility here. At the time I was playing a lot with Unicorn so it was just faster to copy the shellcode into a quick Unicorn emulator. Recently I used a better approach with Lamberts obfuscation with the delambert IDA plugin.

The same obfuscation algorithm is reused in other tools. Ooops, code reusage sin!

After we understand the process diagram it is clear that we want to debug the worker child. My trick at the time was to patch a sleep (or just use Trammell’s simple and powerful infinite loop trick, which takes less bytes and doesn’t need offset computations) in the code that the child is going to run, attach the debugger, restore (or emulate) the original bytes, rewind EIP and have fun. Or, you can just skip the fork, and patch code or set EIP to the code that would be executed by the child.

Not sure why I originally used a sleep patch instead of the infinite loop. Maybe because it would look nicer on slides or to show how to call imported symbols. Go figure!

I can’t remember how but I understood that libpcap was used. All the library original strings are removed to avoid (easy) library identification. I think at the time I just compiled a libpcap version and manually identified the most interesting functions. Bindiff or Diaphora are the tools of the trade for this kind of task.

Usually the flow to install a libpcap based sniffer is:

Just read code and you will understand what’s going in the backdoor code.

In this sample the clear text BPF filter is missing and a call to to compile that filter. Instead of compiled on the fly, the (compiled) bytecode is embedded in the binary.

We can find the code that retrieves and installs the bytecode here:

The input to is a bpf program:

If we take a look at the referenced data it clearly looks like a potential :

We can define the following types in IDA to transform the data into an array:

Applying the new types to the data we can see the expected structure:

The header tells us the program has 57 instructions. We definitely want to disassemble and reverse it.

Before the BPF program is retrieved, the network frame header size is computed:

Quite a few cases are supported but the virtual machine executing the sample uses Ethernet so we will go through the case.

The frame header size is necessary because the bpf program is modified to support those different link types (old stuff, uh?). This is what the function that returns the bpf program does:

That function is just adapting the bpf program to the different types of data links and fixing offsets.

The default program is defined for Ethernet, so the values will be the same as the original, meaning that we can dump directly the program from the binary. The following table comparing the original and computed values to modify proves this:

To disassemble the bytecode we can use bpftools published by Cloudflare. This is based out of the available Linux kernel tools in kernel source code. This repo compiles easily while I think I had some issues compiling the kernel version. Just install the dependencies and compile (Linux only AFAIK). The bpf debugger and disassembler can be found at . If you feel brave (and lucky) you can always try radare2 (eh eh eh eh).

To load the bytecode we need to convert it to this tool format. Example:

The first field is the number of instructions, followed by each instruction in base 10 and space separated fields.

I created bpf_dbg_output to convert the instructions array to input format.

We can finally disassemble the bpf payload:

To understand what is going on here we need the bpf instruction set:

The instruction set consists of load, store, branch, alu, miscellaneous and return instructions. Documentation from Linux kernel.

The following registers are available:

Note the relevant sizes definitions used here:

The following valid ICMP packet (produced by their port knocking tool) is used to reverse the bpf program:

The reversed and commented version of the program (modified instructions are 0, 3, 4, 5, 28, 30, 35):

We can describe the format of the data payload:

Log from the tool that does the port knocking (different packet):

The port knocking tool is extremely flexible and can send all kinds of packets and payloads. It supports TCP, UDP, ICMP, and besides raw packets it can produce DNS, SMTP, SIP application payloads. Can set different flags in TCP packets, for example, send a RST packet with the port knocking payload. Even has a PIX firewall bypass (SYN only packet). Pretty much port knocking on steroids.

Next are some of the different packets it can build.

In the port knocking tool we can find references to . This appears to be the codename for the packet format (includes trigger and payload). Because the tool is to be used only by the attacker, it has a lot of debug messages and strings aren’t obfuscated. Guess they weren’t counting on losing all those tools.

The data contents are encrypted with RC5/6, and contain the callback address and port. In theory it would be possible to recover the callback hosts if we had network dumps of all the packets. The leaks contain traffic bouncer tools so these callback addresses should be just bouncer addresses.

Regarding the RC5/6 code, we can find the same constant identified by Kaspersky as specific to the Equation group:

Stephen Checkoway wrote that Kaspersky analysis is wrong (the original URL is gone since 2019 or something).

The funny thing is that the macOS binary analysed here was most probably compiled with Clang 4.x (and Xcode 4.x) so his GCC theory might be wrong. Who knows? :-)

The 159.1.0 version of can be found in 10.7 SDK and at least Xcode 4.6.3. It’s a Lion 10.7.2 to 10.7.5 system library, and so it should be available in older Xcode versions. Xcode 4 was released between 2011 (4.0) and 2013 (4.6).

There are some other hints that this codebase is quite old. For example, libpcap version is definitely between 0.9.6 and 0.9.8, all released in 2007. You can determine this via the definition. Version 0.9.5 was released in 2006, and version 1.0.0 in 2008. Most probably someone around 2007 or 2008 forked one of 0.9.6-0.9.8 versions and modified it to remove all strings and error messages build up. Need to verify the other operating systems versions to see if they match this information and use the same shared libpcap codebase.

You can definitely see that architecture and code are carefully thought and engineered. They have been doing this for a long time and definitely have more resources than most (nation state) attackers. This isn’t some random proof of concept code. Almost every operating system is a target so their catalog is impressive. The problems of SIGINT and wanting to collect all the things.

There is no heavily obfuscated code, there are no hardcore anti-debugging measures, and no packers and/or cryptors used. They try to blend in and not be too unique. The leaked Lamberts (CIA) sample has similar ideas behind it. A common cyber philosophy? :-)

But they aren’t perfect (nobody is)! They suffer from the code reusage sin. For example, the same string obfuscation algorithm can be seen in other tools. The RC5/6 constant might be another example, if Kaspersky theory is right. Maybe this was true in the past, and the post ShadowBrokers future is better segmented.

Hindsight is always 20/20. It is very easy to detect the hacked machines after we have access to these tools and spend some time reverse engineering them. Four years ago I asked my friends at BinaryEdge to scan the Internet since this was their thing. Why reinvent the wheel if you know people who do great wheels?

Asked them to send this type of packet and check for the answer. The result is that we managed to find a bunch of hacked machines all over the Internet. That’s the main reason why I never published this before and this post is just focused on the BPF program. I am just a curious reverse engineer ;-).

I believe that the tools from the ShadowBrokers leaks are way more interesting than the exploits (altough the leaked exploits sure created a ton more damage worldwide). Reversing the tools allows you to understand a bit of their mindset, their engineering, and operations. And more important, appreciate their work. Here we care about bits and bytes.

One last thing, and the real reason I wrote this blogpost.

I just opened a new startup, and I am looking for Linux developers. If hunting pandas, bears, kitties, chemical elements, or whatever weird naming out there is your thing then I have an interesting challenge for you.

Job description below, if you are in the US or Western Europe send me your CV (reverser at put.as). This is a fully remote position.

Senior Linux C developer to develop advanced next generation protection technology for Linux operating systems and to protect the Internet infrastructure.

Source: Reverse.put.as

Powered by NewsAPI.org