[bpf-ci-bot] ASAN: token and test_bpffs tests fail due to exit() in forked children triggering LeakSanitizer #473

@kernel-patches-review-bot

Summary

The token and test_bpffs BPF selftests call exit() in forked child processes, which runs ASAN's LeakSanitizer atexit handler. When LeakSanitizer detects leaks in the child, it overrides the exit code to 1, so the parent sees an unexpected nonzero status from waitpid() even though all BPF operations succeeded. Replacing exit() with _exit() prevents atexit handlers from running in the child, eliminating the false failure.

Failure Details

  • Test / Component: token (all 11 subtests) and test_bpffs
  • Frequency: Rare/flaky — observed in ASAN runs on Apr 5, 2026; Apr 3-4 runs passed
  • Failure mode: Flaky — child exit code silently changed by LeakSanitizer
  • Affected architectures: x86_64 (ASAN builds only)
  • CI runs observed:

Root Cause Analysis

In token.c:410, the child() function calls exit(-err) after completing its work. In test_bpffs.c:146, the fn() function calls exit(err). Both are executed in forked child processes.

When compiled with ASAN (-fsanitize=address), exit() triggers atexit handlers, including LeakSanitizer's leak check. If LeakSanitizer detects any memory leak in the child process (which is common since the child is short-lived and doesn't free all allocations), it overrides the process exit code to 1.

The parent calls waitpid() and checks WEXITSTATUS(status). It expects 0 on success, but gets 1 from LeakSanitizer, producing:

waitpid_child unexpected error: 1 (errno 0)

No child-side FAIL messages appear because the BPF operations actually succeeded — the exit code was changed by LeakSanitizer after the child's test logic completed.
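The mechanism can be reproduced without ASAN. The sketch below is illustrative only and is not taken from the selftests: fake_leak_check() is a hypothetical stand-in for LeakSanitizer's exit-time check, and child_status() is a hypothetical helper that reports what the parent observes from waitpid().

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for LeakSanitizer's exit-time leak check:
 * it replaces whatever status the process was exiting with by 1. */
static void fake_leak_check(void)
{
	_exit(1);
}

/* Fork a child that intends to terminate with status 0, using either
 * exit() or _exit(), and return the status the parent observes
 * (-1 if fork() or waitpid() fails or the child did not exit normally). */
static int child_status(int use_underscore_exit)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Registered only in the child so the parent is unaffected. */
		atexit(fake_leak_check);
		if (use_underscore_exit)
			_exit(0);	/* atexit handlers are skipped */
		exit(0);		/* fake_leak_check() runs first */
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Under these assumptions, child_status(0) reports 1 although the child asked for 0, while child_status(1) reports 0, which matches the symptom seen by the parent in the failing tests.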

Other BPF selftests already correctly use _exit() in forked children:

  • cpumask.c:65, exhandler.c:37, task_kfunc.c:61, uprobe_syscall.c:313

Additional files with the same exit() issue in forked children (not patched here, but worth fixing):

  • bpf_iter.c:1585, cgroup_hierarchical_stats.c:177, cgrp_kfunc.c:124-139
  • send_signal.c:106, task_under_cgroup.c:46, task_local_storage.c:316
  • test_task_work.c:78, timer.c:111, uprobe_multi_test.c:96

Proposed Fix

Replace exit() with _exit() in the two most-affected files:

  1. token.c:410: exit(-err) → _exit(-err)
  2. test_bpffs.c:146: exit(err) → _exit(err)

_exit() terminates the process immediately without running atexit handlers, preventing LeakSanitizer from interfering with the exit code. This is safe because:

  • The child process is about to terminate; the kernel reclaims all resources
  • No critical cleanup depends on atexit handlers in these children
  • This is the established pattern in other BPF selftests
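A minimal sketch of the resulting child pattern, assuming a child function that returns 0 or a negative errno value. The names run_in_child(), ok_fn(), and eperm_fn() are hypothetical and do not appear in token.c or test_bpffs.c. Because _exit() also skips stdio flushing, the sketch flushes explicitly before terminating; whether the real tests need that depends on what the child prints.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run fn() in a forked child and return the child's
 * exit status as seen by the parent, or -1 if fork() or waitpid() fails. */
static int run_in_child(int (*fn)(void))
{
	int status, err;
	pid_t pid;

	/* Flush before forking so buffered output is not duplicated. */
	fflush(NULL);
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		err = fn();	/* 0 or a negative errno value */
		fflush(NULL);	/* _exit() does not flush stdio buffers */
		_exit(-err);	/* no atexit handlers, so no leak check */
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* Example child bodies (hypothetical). */
static int ok_fn(void) { return 0; }
static int eperm_fn(void) { return -EPERM; }
```

With this shape, the status the parent reads from WEXITSTATUS() is exactly the value the child passed to _exit(), so a successful child is reported as 0 regardless of what exit-time handlers are registered.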

Patch files: 0001-selftests-bpf-Use-_exit-in-forked-children-to-avoid-.patch and 0002-selftests-bpf-Use-_exit-in-test_bpffs-forked-child-t.patch

Impact

Without this fix, ASAN CI runs will continue to see intermittent false failures in token and test_bpffs tests. The failures are confusing because no test assertion fails — the exit code is silently changed. This wastes developer time investigating phantom failures and erodes trust in CI signals.

A broader cleanup of all exit() calls in forked children (listed above) would prevent similar issues in other tests as ASAN coverage expands.
