
fix(#146): fallocate memory.bin before BRANCH — flattens pause_ms across the chain#152

Merged
WaylandYang merged 1 commit into `main` from `fix/146-fallocate-memory-bin` on May 23, 2026

Conversation

@WaylandYang
Contributor

Summary

Closes #146. The multi-BRANCH pause_ms anomaly (~5× jump from BRANCH 3+) was traced over 5 probe rounds to ext4's delayed allocation + multi-block allocator + writeback throttle + block-bitmap checksumming compounding per BRANCH. This PR is the one-shot fix: `posix_fallocate` the destination memory.bin to its full size before either the diff-mode background `std::fs::copy` or FC's `/snapshot/create` writes to it.

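Before / after (live, ext4 SSD, `coding-agent-fork-prewarm-v1` source, 10 consecutive diff BRANCHes, 3 s gap; all values are `pause_ms` in milliseconds)

| BRANCH | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| before | 350 | 250 | 1300 | 1400 | 1500 | 2700 | 1500 | 1800 | 2700 | 1500 |
| after | 585 | 286 | 344 | 161 | 369 | 153 | 189 | 162 | 324 | 174 |
| tmpfs (control) | 728 | 196 | 138 | 114 | 168 | 138 | 111 | 124 | 259 | 110 |
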
  • BRANCH 6: 2700 ms → 153 ms = 17.6×
  • Median across BRANCH 3-10: ~1700 ms → ~200 ms = ~8.5×
  • After-curve matches the tmpfs control to within noise — the ext4 metadata overhead is gone
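
As a quick check of the arithmetic, the sketch below recomputes the speedups from the rounded table values. The names `BEFORE`, `AFTER`, `median`, and `speedup` are illustrative and do not come from the PR. On the rounded figures the BRANCH 3-10 medians come out at 1500 ms before and 181.5 ms after (about 8.3×), in the same range as the approximate figures quoted above, which were presumably taken from unrounded measurements.

```rust
// pause_ms per BRANCH (1..=10), copied from the before/after rows of the table.
const BEFORE: [f64; 10] = [
    350.0, 250.0, 1300.0, 1400.0, 1500.0, 2700.0, 1500.0, 1800.0, 2700.0, 1500.0,
];
const AFTER: [f64; 10] = [
    585.0, 286.0, 344.0, 161.0, 369.0, 153.0, 189.0, 162.0, 324.0, 174.0,
];

/// Median of a non-empty slice of samples.
fn median(samples: &[f64]) -> f64 {
    let mut v = samples.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mid = v.len() / 2;
    if v.len() % 2 == 0 {
        (v[mid - 1] + v[mid]) / 2.0
    } else {
        v[mid]
    }
}

/// Before/after ratio for a 1-based BRANCH index.
fn speedup(branch: usize) -> f64 {
    BEFORE[branch - 1] / AFTER[branch - 1]
}
```

For example, `speedup(6)` is 2700 / 153, and `median(&BEFORE[2..]) / median(&AFTER[2..])` gives the BRANCH 3-10 ratio.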

Implementation

`posix_fallocate(fd, 0, source_memory_bin_size)` on the destination right after `mkdir snap_dir`. The pre-allocated extents are claimed by ext4 immediately; subsequent writes don't run `ext4_mb_new_blocks` and don't update on-disk block bitmaps in-band.

Best-effort: failure (tmpfs, NFS, FAT, EOPNOTSUPP, etc.) is logged at WARN and the BRANCH continues with the old behavior. No behavior change on filesystems that don't benefit.
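
A minimal sketch of what such a best-effort preallocation could look like. This is not the PR's `preallocate_memory_file` helper, whose body is not shown here. The function names `preallocate` and `preallocate_best_effort` are hypothetical, `eprintln!` stands in for whatever WARN-level logging the controller uses, and the hand-written binding (used so the sketch builds without the `libc` crate) assumes a 64-bit Linux target where `off_t` is 64 bits wide.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Hand-declared binding (assumption: 64-bit Linux, off_t == i64). The PR uses
// the `libc` crate instead. `unsafe extern` needs Rust 1.82 or newer.
unsafe extern "C" {
    fn posix_fallocate(fd: i32, offset: i64, len: i64) -> i32;
}

/// Reserve `len` bytes for `dest`, creating the file if it does not exist.
/// posix_fallocate reports failure through its return value (an errno code)
/// rather than by setting errno, so the code is converted directly.
fn preallocate(dest: &Path, len: u64) -> io::Result<()> {
    let len = i64::try_from(len)
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "length exceeds off_t"))?;
    let file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(false)
        .open(dest)?;
    let rc = unsafe { posix_fallocate(file.as_raw_fd(), 0, len) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::from_raw_os_error(rc))
    }
}

/// Best-effort wrapper: a failure is logged and reported as `false`, and the
/// caller carries on with the normal (non-preallocated) write path.
fn preallocate_best_effort(dest: &Path, len: u64) -> bool {
    match preallocate(dest, len) {
        Ok(()) => true,
        Err(e) => {
            eprintln!("WARN: preallocating {} failed: {}", dest.display(), e);
            false
        }
    }
}
```

A caller would invoke `preallocate_best_effort(&snap_dir.join("memory.bin"), source_len)` once, before any writer touches the file, and ignore the boolean except for metrics. Returning a flag rather than an error keeps the BRANCH path unchanged on filesystems where the call fails.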

Files

  • `crates/forkd-controller/src/http.rs` — `preallocate_memory_file` helper (+ call in `branch_sandbox`'s spawn_blocking closure, before the diff-mode background copy and the FC `snapshot/create` write)
  • `crates/forkd-controller/Cargo.toml` — `libc = "0.2"` cfg(unix) dep for `posix_fallocate`

Refs

🤖 Generated with Claude Code

…balloc

5 rounds of probe (#128, #140, #143, #150, #151) traced the multi-BRANCH
pause_ms anomaly to ext4 delayed allocation + writeback throttle
+ multi-block allocator + block-bitmap CRC compounding per BRANCH.
tmpfs control confirmed: anomaly is 100% in the fs layer.

Fix: posix_fallocate the destination memory.bin to source full size
right after we create snap_dir, before either the diff-mode background
copy OR FC's snapshot/create write. ext4 reserves the extents up-front;
subsequent writes don't run mballoc or update block bitmap.

Best-effort: on filesystems where preallocation fails, posix_fallocate
returns an error code (e.g. EOPNOTSUPP); we log at WARN and continue
(no behavior change). On ext4 (the actual problem case), this should
flatten pause_ms across the BRANCH chain.

Adds libc 0.2 as a cfg(unix) dep for posix_fallocate.

Refs #146.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@WaylandYang WaylandYang merged commit e06285f into main May 23, 2026
2 checks passed
@WaylandYang WaylandYang deleted the fix/146-fallocate-memory-bin branch May 23, 2026 06:24
WaylandYang added a commit that referenced this pull request May 23, 2026
Bumps Cargo workspace + Python SDK 0.3.3 → 0.3.4.

After 5 probe rounds traced the multi-BRANCH pause_ms anomaly to
ext4's delayed allocation + writeback throttle + multi-block allocator
+ block-bitmap CRC, PR #152 fixed it with a single posix_fallocate
call. Measured impact:

  ext4 SSD, 10 consecutive diff BRANCHes:
    BRANCH 6 pause_ms: 2700 → 153 (17.6×)
    median BRANCH 3-10: ~1700 → ~200 ms (~8.5×)

This is the first release that should auto-publish to PyPI via the
new release.yml → publish-pypi.yml chain (PR #144). v0.3.1-0.3.3 all
required manual workflow_dispatch.

Full notes in CHANGELOG.md § 0.3.4.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

BRANCH 3+ pause_ms jumps 5× on same source (snapshot-worker single-thread loop)
