NFT: Nftables Fuzzing Target

BGGP3 Entry Write Up

Ahh, the Binary Golf Grand Prix, a time when people can bang their heads on the table as they suffer from brain drain. This year, BGGP is all about finding targets to crash. This is my first year joining in on the fun, and I'm taking it as a chance to finally fuzz a program.

The rules are simple: Find the smallest file that will crash a program.


As with anything, there is always a first step: I had to choose which fuzzer to use on the program I decided to poke at. I had no prior knowledge of what fuzzers were out there besides AFL, though the BGGP3 announcement offers Honggfuzz as one option.

I decided to use AFL++, the actively maintained fork of AFL. It provides feedback-based fuzzing and is incredibly easy to set up: you compile the program from source with AFL++'s custom compilers, which instrument the binary so the fuzzer gets feedback while it runs.

Choosing A Target

I had previously been reading up on the kernel network stack and had seen that iptables was being phased out in favor of nftables. So, I cloned the nftables repo. I did some digging, and the only CVEs [1] I could find at the time affected the kernel side of nftables. On top of that, the bugzilla was private. My goal ultimately was to find a new bug, so with no real way to check where bugs had already been found, I decided to just blindly fuzz the target.

NFTables - nft

The userland tool nft has few options available.
nft [ -nNscaeSupyjt ] [ -I directory ] [ -f filename | -i | cmd ...]

Luckily, one option lets us pass nft a file from which to load rulesets. Since AFL++ uses input files as its method of fuzzing a program, this meant we could use our fuzzer. We need to supply a seed file that is as bare as possible for AFL++ to mutate. I looked at nft's man page and picked list table more or less at random.

After compiling nft with AFL++, I let it run while I left for work. I was honestly not expecting to find anything, but when I got home, I saw this.


We had 2 unique crashes!

After minimizing the crash file, the following was the input used to crash nft:
list table 000000000000000000000........
list set 00000000000000000000000........

Welp, I'm not going to win BGGP3. The crash file was 4093 bytes. I didn't want to give up on this crash, though. Even if I wasn't going to win the grand prize, I decided to keep looking it over and began triaging the crash.


While messing with the compiler options in the ./configure script, I found that the build does not enable security flags when CC=clang, which makes for an easy target.


Triaging Our Crash

Initial Tests

To help understand why the crash is happening, I decided to fire up gdb, along with gef to help minimize headaches. In the following, I may not go into too much detail because there is just too much to write down.

Running the program against our test file, we are able to see where the crash occurs. Now, keep in mind that we get two different versions of the crash: one with a gcc-compiled binary and one with a clang-compiled binary.

Since there were no security flags enabled, I was able to get further into the program until it segfaulted on a bad address.

[#0] 0x7ffff7f366f7 → nft_mnl_talk(ctx=0x7fffffffcc00, data=0x7fffffffb960, len=0x101c, cb=0x7ffff7f39277 <table_cb>, cb_data=0x55555555e240)  
[#1] 0x7ffff7f3950a → mnl_nft_table_dump(ctx=0x7fffffffcc00, family=0x46346746, table=0x6746316746306746 <error: Cannot access memory at address 0x6746316746306746>)  
[#2] 0x7ffff7f1e184 → netlink_list_tables(ctx=0x7fffffffccd0, h=0x7fffffffcae0, filter=0x7ffff7959010)  
[#3] 0x7ffff7ef810a → cache_init_tables(ctx=0x7fffffffccd0, h=0x7fffffffcae0, cache=0x55555555d368, filter=0x7ffff7959010)  
[#4] 0x7ffff7ef8914 → nft_cache_init(ctx=0x7fffffffccd0, flags=0x2000007f, filter=0x7ffff7959010)  
[#5] 0x7ffff7ef8b86 → nft_cache_update(nft=0x55555555d2a0, flags=0x2000007f, msgs=0x7fffffffded0, filter=0x7ffff7959010)  
[#6] 0x7ffff7f44e95 → nft_evaluate(nft=0x55555555d2a0, msgs=0x7fffffffded0, cmds=0x7fffffffdee0)  
[#7] 0x7ffff7f45677 → __nft_run_cmd_from_filename(nft=0x55555555d2a0, filename=0x7fffffffe3ef "crash")  
[#8] 0x7ffff7f45a3f → nft_run_cmd_from_filename(nft=0x55555555d2a0, filename=0x7fffffffe3ef "crash")  
[#9] 0x555555557378 → main(argc=0x3, argv=0x7fffffffe0a8)

Now, this didn't give me enough detail to pin down where the crash really occurs. What I did learn was what I was able to overwrite. Looking at the function mnl_nft_table_dump, we can see that the addresses held in its local variables are overwritten. These variables come from members of the following struct.

type = const struct nft_cache_filter {
    struct {
        uint32_t family;    /* NULL: expects IPv4 or IPv6 */
        const char *table;  /* JUNK DATA */
        const char *chain;  /* NULL */
        const char *set;
        const char *ft;
    } list;
    struct {
        struct list_head head;
    } obj[8192];
} *

nft expects a table in the following format. Once the input file has been parsed, struct nft_cache_filter holds each piece of our data in its respective member.

table ip inet-table {
	chain output-filter-chain {
		type filter hook output priority 0;
		policy accept;
		ip daddr counter packets 0 bytes 0
	}
}

In our crash file, by comparison, we only provide the table name.

list table <string>

nft allows family to be empty, so our data string is pushed into the next struct member, table, where there are no checks on its size. Due to the size of the string we pass, the stack smashing check aborts the program to prevent any stack overwrites. Our overwrite begins within the stack frame of mnl_nft_table_dump.

netfilter uses the Netlink interface to transfer information between the kernel and userspace processes. In order to do this, it must build a datagram. Netlink supports various types, but we are only dealing with a Netlink message.

Netlink messages are a byte stream with one or more nlmsghdr headers and their associated payloads. These should only be accessed with the NLMSG_* macros [3].

The following diagram is found within the codebase and shows how a Netlink message datagram is structured.
The payload section starts with an attribute header, which is followed by the attribute's value.

	             Netlink Message Header
	|<----------------- 4 bytes ------------------->|
	|<----- 2 bytes ------>|<------- 2 bytes ------>|
	|      Message length (including header)        |
	|     Message type     |     Message flags      |
	|           Message sequence number             |
	|                 Netlink PortID                |
	|                                               | <== Attribute Header
	.                   Payload                     .

	                Attribute Header
	|<-- 2 bytes -->|<-- 2 bytes -->|<-- variable -->|
	|     length    |      type     |      value     |
	|<--------- header ------------>|<-- payload --->|

While parsing and caching the data we pass to nftables, nft also copies that same data into the nlmsghdr struct to create the datagram. This is where our crash happens: during serialization of our datagram.


In order for us to get to where the crash occurs, we first go through the following functions.

mnl_nft_table_dump does as its name implies: it dumps the table, and it passes our data to the function that begins building our nlmsghdr.

struct nftnl_table_list *mnl_nft_table_dump(struct netlink_ctx *ctx,
					    int family, const char *table)
	nlh = nftnl_nlmsg_build_hdr(buf, NFT_MSG_GETTABLE, family,
				    flags, ctx->seqnum);
	if (nlt) {
		nftnl_table_nlmsg_build_payload(nlh, nlt); <== step into

void nftnl_table_nlmsg_build_payload(struct nlmsghdr *nlh, const struct nftnl_table *t)
	if (t->flags & (1 << NFTNL_TABLE_NAME))
		mnl_attr_put_strz(nlh, NFTA_TABLE_NAME, t->name); <= step into

The next function, mnl_attr_put_strz, adds one to the string length to account for the null terminator, then passes everything on to mnl_attr_put.


EXPORT_SYMBOL void mnl_attr_put_strz(struct nlmsghdr *nlh, uint16_t type,
				     const char *data)
	mnl_attr_put(nlh, type, strlen(data)+1, data);

So the culprit is mnl_attr_put, which contains a memcpy. There are no checks in place to make sure that the destination can hold the data we pass it. Our data is written out of bounds, overflowing the stack frame and tripping the canary check.


EXPORT_SYMBOL void mnl_attr_put(struct nlmsghdr *nlh, uint16_t type,
				size_t len, const void *data)
	struct nlattr *attr = mnl_nlmsg_get_payload_tail(nlh);
	uint16_t payload_len = MNL_ALIGN(sizeof(struct nlattr)) + len;
	int pad;

	attr->nla_type = type; 
	attr->nla_len = payload_len;
	memcpy(mnl_attr_get_payload(attr), data, len); <==
	pad = MNL_ALIGN(len) - len;
	if (pad > 0)
		memset(mnl_attr_get_payload(attr) + len, 0, pad);

	nlh->nlmsg_len += MNL_ALIGN(payload_len);

Within the same file, the following function is also available. It does the same thing, appending a null character, but this time it also checks that the data fits within the buffer size we pass it.


EXPORT_SYMBOL bool mnl_attr_put_strz_check(struct nlmsghdr *nlh, size_t buflen,
					   uint16_t type, const char *data)
	return mnl_attr_put_check(nlh, buflen, type, strlen(data)+1, data);

Proposed Patch

Reading over the source code, I found that libmnl already provides a function that checks that the data fits. There is also a hardcoded value within libnftnl that we can use for this check: NFT_NAME_MAXLEN, which NFT_TABLE_MAXLEN is defined as.

My change was simple: pass the maximum name length so that any data handed over will not get pushed into a buffer that is too small, nullifying our overflow.

Within the function nftnl_table_nlmsg_build_payload, I went ahead and changed the call under the first if statement to the following:

mnl_attr_put_strz_check(nlh, NFT_TABLE_MAXLEN, NFTA_TABLE_NAME, t->name);

I recompiled the program and tested.

There was no crash!

Looking over the program in gdb showed that it exited cleanly.

The netfilter team quickly made a fix when I reported the bug, so I wasn't able to propose a patch myself.


I'm not going to win, but I learned a lot.

4096-4077 = 19 :(