243 changes: 243 additions & 0 deletions pocs/linux/kernelctf/CVE-2026-23272_cos/docs/exploit.md

Large diffs are not rendered by default.

104 changes: 104 additions & 0 deletions pocs/linux/kernelctf/CVE-2026-23272_cos/docs/vulnerability.md
@@ -0,0 +1,104 @@
# Vulnerability: Use-After-Free in nft_add_set_elem() err_set_full Path (CVE-2026-23272)

## Summary

A use-after-free exists in netfilter `nf_tables`, in the `err_set_full` path of `nft_add_set_elem()`. When a new element is inserted into a set that has already reached its configured `size`, the element is first published to the hash chain with `hlist_add_head_rcu()` (through `nft_setelem_insert()` -> `nft_hash_insert()`), and only afterwards does the code check the capacity with `atomic_add_unless()`. If the set is already full, the rollback path calls `nft_setelem_remove()` (`hlist_del_rcu()`) and then `nf_tables_set_elem_destroy()` (`kfree()`) without waiting for an RCU grace period.

That breaks the normal RCU ordering of publish, unlink, grace period, free. Packet-path readers in `nft_hash_lookup()` and control-plane readers in `nft_hash_get()` both walk the hash chain under `rcu_read_lock()`, so a reader that found the element before `hlist_del_rcu()` unlinked it can still hold a pointer to it afterwards. Because the rollback path calls `kfree()` without waiting for a grace period, that reader then dereferences freed memory.
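
The ordering described above can be illustrated with a toy, single-threaded userspace model. This is only a sketch under stated assumptions: it is not kernel code, every `toy_*` name is hypothetical, and the reader counter is a crude stand-in for a real RCU grace period (which only waits for readers that started before the unlink).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model of RCU-style deferred free. NOT kernel code; all names are
 * hypothetical and exist only for this illustration. */
struct toy_elem {
    int key;
    struct toy_elem *next;         /* chain link, left intact on unlink */
    struct toy_elem *pending_next; /* link on the deferred-free list */
};

struct toy_chain {
    struct toy_elem *head;    /* chain visible to readers */
    struct toy_elem *pending; /* unlinked but not yet freed */
    int readers;              /* open read-side sections */
};

/* Publish: the element becomes reachable by readers. */
static void toy_publish(struct toy_chain *c, struct toy_elem *e)
{
    e->next = c->head;
    c->head = e;
}

/* Reader lookup; the caller must have incremented c->readers first. */
static struct toy_elem *toy_lookup(struct toy_chain *c, int key)
{
    for (struct toy_elem *e = c->head; e; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}

/* Unlink: new readers cannot find e, old readers may still hold it. */
static void toy_unlink(struct toy_chain *c, struct toy_elem *e)
{
    struct toy_elem **pp = &c->head;

    while (*pp && *pp != e)
        pp = &(*pp)->next;
    if (*pp) {
        *pp = e->next;
        e->pending_next = c->pending;
        c->pending = e;
    }
}

/* Reclaim: free pending elements only when no reader is active.
 * Returns the number of elements freed. */
static int toy_reclaim(struct toy_chain *c)
{
    int freed = 0;

    if (c->readers > 0)
        return 0; /* the "grace period" has not elapsed yet */
    while (c->pending) {
        struct toy_elem *e = c->pending;

        c->pending = e->pending_next;
        free(e);
        freed++;
    }
    return freed;
}

/* Scenario: a reader holds the element while the writer rolls back.
 * Returns the key the reader saw through its still-valid pointer. */
int toy_demo(void)
{
    struct toy_chain c = {0};
    struct toy_elem *e = calloc(1, sizeof(*e));

    assert(e != NULL);
    e->key = 42;
    toy_publish(&c, e);

    c.readers++;                              /* reader enters */
    struct toy_elem *seen = toy_lookup(&c, 42);
    assert(seen == e);

    toy_unlink(&c, e);                        /* writer rolls back */
    int freed_early = toy_reclaim(&c);        /* must be deferred */
    assert(freed_early == 0);

    int key = seen->key;                      /* safe: not freed yet */
    c.readers--;                              /* reader leaves */

    int freed_late = toy_reclaim(&c);         /* now the free may run */
    assert(freed_late == 1);
    return key;
}
```

In this model the `err_set_full` path corresponds to calling `free(e)` directly after `toy_unlink()`, while `seen` is still in use.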

## Affected Component

- **Subsystem**: netfilter / nf_tables
- **Source file**: `net/netfilter/nf_tables_api.c`
- **Function**: `nft_add_set_elem()`
- **Error path**: `err_set_full` label (approx. line 6680)

## Vulnerability Type

- **Cause**: Use-After-Free (UAF) due to freeing an RCU-published set element on the "set full" rollback path
- **Root cause**: Race condition between packet path (`nft_hash_lookup()`) or control path (`nft_hash_get()`) and the `err_set_full` rollback in `nft_add_set_elem()`

## Code Sequence

The vulnerable sequence in `nft_add_set_elem()` (`nf_tables_api.c`, around lines 6642-6686):

```c
    ext->genmask = nft_genmask_cur(ctx->net);

    err = nft_setelem_insert(ctx->net, set, &elem, &ext2, flags);
    // element is now in the hash chain -- visible to RCU readers

    if (!(flags & NFT_SET_ELEM_CATCHALL) && set->size &&
        !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) {
        err = -ENFILE;
        goto err_set_full;
    }
    ...

err_set_full:
    nft_setelem_remove(ctx->net, set, &elem);
    // hlist_del_rcu -- unlinks but doesn't wait
err_element_clash:
    kfree(trans);
err_elem_free:
    nf_tables_set_elem_destroy(ctx, set, elem.priv);
    // kfree(elem) -- no synchronize_rcu() first!
```

The regular abort path (`__nf_tables_abort()`) and the async destroy worker (`nft_trans_gc_work_done()`) both wait for RCU readers before freeing removed elements. `err_set_full` does not.

## Requirements to Trigger

- **User namespaces**: Not required for the bug itself. This exploit uses them so an unprivileged user can create a network namespace and gain `CAP_NET_ADMIN` inside it.
- **Capabilities**: `CAP_NET_ADMIN` (inside the user+network namespace for this exploit)
- **Kernel configuration**: `CONFIG_NF_TABLES`, `CONFIG_NF_TABLES_INET`
- **Other**: A hash set with `size=1`, so that once the set holds one element, the capacity check fails on every further insertion.
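
To see why `size=1` is enough, the capacity check from the Code Sequence section can be modelled in userspace with C11 atomics. This is a rough sketch, not the kernel implementation: `toy_add_unless()` only mimics the documented semantics of `atomic_add_unless()` (add unless the counter equals the limit), and `struct toy_counts` is a hypothetical reduction of the set to the three fields the check reads.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for atomic_add_unless(): add `a` to `*v` unless
 * `*v == u`. Returns true if the addition was performed. */
static bool toy_add_unless(atomic_int *v, int a, int u)
{
    int cur = atomic_load(v);

    do {
        if (cur == u)
            return false;
        /* On CAS failure, `cur` is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(v, &cur, cur + a));
    return true;
}

/* Hypothetical, reduced set state: only what the check needs. */
struct toy_counts {
    atomic_int nelems; /* elements currently accounted for */
    int size;          /* configured capacity, 0 means unlimited */
    int ndeact;        /* elements deactivated in this transaction */
};

/* Returns 0 if the element fits, -ENFILE if the set is full. */
int toy_capacity_check(struct toy_counts *set)
{
    if (set->size &&
        !toy_add_unless(&set->nelems, 1, set->size + set->ndeact))
        return -ENFILE;
    return 0;
}

/* With size = 1 the first insertion fits and the next one is rejected. */
void toy_capacity_demo(void)
{
    struct toy_counts set = { .size = 1 };

    int first = toy_capacity_check(&set);
    int second = toy_capacity_check(&set);

    assert(first == 0);
    assert(second == -ENFILE);
}
```

In the vulnerable code this check runs after the element is already on the hash chain, so each rejected insertion goes through the `err_set_full` rollback.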

## Commit Which Introduced the Vulnerability

- **Commit**: [`35d0ac9070ef`](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=35d0ac9070ef) ("netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCL")
- **Version**: v4.10 (January 2017)
- This moved the capacity check to after insertion, which created the case where an RCU-visible element can be freed on the rollback path without a grace period.

## Commit Which Fixed the Vulnerability

- **Fix approach**: Upstream did not take the simple `synchronize_rcu()` fix from the initial report. Instead it unconditionally bumps `set->nelems` before insertion and lets rollback go through the existing transaction-abort path, which already does the right RCU teardown.
- **Patch commit**: [`def602e498a4`](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=def602e498a4f951da95c95b1b8ce8ae68aa733a) ("netfilter: nf_tables: unconditionally bump set->nelems before insertion")

## Affected Kernel Versions

- **Introduced in**: 4.10
- **Affected stable ranges**: 4.10 - 6.1.164, 6.2 - 6.6.127, 6.7 - 6.18.16, and 6.19 - 6.19.6
- **Affected mainline range**: 7.0-rc1 - 7.0-rc2
- **Fixed in**: 6.1.165, 6.6.128, 6.18.17, 6.19.7, and mainline 7.0-rc3
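
As a convenience, the stable ranges listed above can be encoded in a small lookup. This is a hedged sketch: `toy_is_affected()` and `TOY_VER` are hypothetical names, the table mirrors the list above exactly, the 7.0 release candidates are not covered, and a plain version comparison cannot see distribution backports.

```c
#include <stdbool.h>
#include <stddef.h>

/* Pack major.minor.patch into one comparable number.
 * Assumes minor and patch are each below 1000. */
#define TOY_VER(a, b, c) ((a) * 1000000L + (b) * 1000L + (c))

struct toy_range {
    long lo; /* first affected release, inclusive */
    long hi; /* last affected release, inclusive */
};

/* Affected stable ranges, copied from the list in this document. */
static const struct toy_range toy_affected[] = {
    { TOY_VER(4, 10, 0), TOY_VER(6, 1, 164) },
    { TOY_VER(6, 2, 0),  TOY_VER(6, 6, 127) },
    { TOY_VER(6, 7, 0),  TOY_VER(6, 18, 16) },
    { TOY_VER(6, 19, 0), TOY_VER(6, 19, 6)  },
};

/* Returns true if the given upstream stable version falls in one of the
 * ranges above. Release candidates (7.0-rc1, 7.0-rc2) are out of scope. */
bool toy_is_affected(int major, int minor, int patch)
{
    long v = TOY_VER((long)major, (long)minor, (long)patch);

    for (size_t i = 0; i < sizeof(toy_affected) / sizeof(toy_affected[0]); i++)
        if (v >= toy_affected[i].lo && v <= toy_affected[i].hi)
            return true;
    return false;
}
```

For example, `toy_is_affected(6, 6, 127)` is true and `toy_is_affected(6, 6, 128)` is false, matching the 6.6.128 fix release.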

## Blocking the Vulnerability

Ways to block this path:

- **Disabling user namespaces** (`kernel.unprivileged_userns_clone=0` on kernels that carry that sysctl, `user.max_user_namespaces=0`, or equivalent) stops this exploit path from obtaining `CAP_NET_ADMIN` in a private network namespace.
- **Blocking `NETLINK_NETFILTER` socket creation** from unprivileged contexts.
- **Disabling nf_tables support** (`CONFIG_NF_TABLES=n`) or INET tables (`CONFIG_NF_TABLES_INET=n`).
- **Restricting `nf_tables` access** with LSM policy (for example SELinux or AppArmor rules denying netfilter configuration).

## KASAN Report

```
BUG: KASAN: use-after-free in nft_hash_get+0xf0/0x120
Read of size 8 at addr ffff888103f6db00 by task init/158

Call Trace:
nft_hash_get+0xf0/0x120
nft_get_set_elem+0x248/0x500
nf_tables_getsetelem+0x326/0x4f0
nfnetlink_rcv_msg+0x37a/0x4c0

Allocated by task 156:
nft_set_elem_init+0x71/0x270
nft_add_set_elem+0xfda/0x1aa0

Freed by task 156:
nft_add_set_elem+0x1378/0x1aa0
```

Both alloc and free happen in `nft_add_set_elem` on the same task, which matches the `err_set_full` rollback path.
@@ -0,0 +1,51 @@
# kernelCTF exploit build for cos-113-18244.521.98
#
# Default target builds the kernelXDK-integrated version (exploit_xdk.cpp).
# Use 'make exploit_original' for the original C version (exploit.c).

KERNELXDK_INCLUDE_DIR ?=
KERNELXDK_LIB_DIR ?=

CXX ?= g++
CC ?= gcc
CXXFLAGS := -std=c++20 -I. -static -O2 -Wall -pthread
CFLAGS := -static -O2 -Wall -pthread

ifneq ($(strip $(KERNELXDK_INCLUDE_DIR)),)
CXXFLAGS += -I$(KERNELXDK_INCLUDE_DIR)
endif

LDFLAGS := -lkernelXDK -pthread
ifneq ($(strip $(KERNELXDK_LIB_DIR)),)
LDFLAGS := -L$(KERNELXDK_LIB_DIR) $(LDFLAGS)
endif

.PHONY: all run clean prerequisites

all: exploit

# Optional hook used by CI. No extra prerequisites are needed here.
prerequisites:
	@true

# Download kernelXDK target database if not present
target_db.kxdb:
	wget -q -O $@ https://storage.googleapis.com/kernelxdk/db/kernelctf.kxdb

# kernelXDK-integrated version (C++)
exploit: exploit_xdk.cpp target_db.kxdb
	$(CXX) $(CXXFLAGS) -o $@ $< $(LDFLAGS)

# Debug build target used by CI verification.
exploit_debug: exploit_xdk.cpp target_db.kxdb
	$(CXX) $(CXXFLAGS) -g -o $@ $< $(LDFLAGS)

# Original C version (does not require kernelXDK)
exploit_original: exploit.c
	$(CC) $(CFLAGS) -o $@ $<

run: exploit
	./exploit

clean:
	rm -f exploit exploit_debug exploit_original target_db.kxdb
Binary file not shown.