Bump llama.cpp to b9310 to fix Metal deadlock on macOS 26 by FuJacob · Pull Request #2 · FuJacob/cotabbyinference

FuJacob · 2026-05-28T08:36:51Z

Summary

Bumps the bundled llama-cpp binary target from b8665 to b9310 (URL + checksum update in Package.swift).
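For readers unfamiliar with SwiftPM binary targets, the kind of change described above might look roughly like the manifest sketch below. The tools version, the archive URL, and the checksum string are placeholders assumed for illustration, and the target name is taken from the wording of this description; none of these values are copied from the repository's actual Package.swift.

```swift
// swift-tools-version:5.9
// Sketch only: the tools version above is an assumption.
import PackageDescription

let package = Package(
    name: "CotabbyInference",
    targets: [
        .binaryTarget(
            // Name as referred to in this PR description (assumed spelling).
            name: "llama-cpp",
            // Placeholder URL: replace with the real b9310 archive location.
            url: "https://example.com/path/to/llama-b9310-xcframework.zip",
            // Placeholder checksum: SwiftPM expects the SHA-256 printed by
            // `swift package compute-checksum <archive.zip>`.
            checksum: "<sha256-of-the-b9310-archive>"
        )
    ]
)
```

Because SwiftPM verifies the downloaded archive against `checksum`, the URL and checksum have to change together. That is why the bump touches both values.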

Fixes the upstream deadlock reported as Cotabby issue #262: launching Cotabby with Qwen3.5-2B-Q4_K_M.gguf selected hangs the main thread before the menu bar icon appears. The hang reproduces deterministically on Apple Silicon under macOS 26 (Tahoe).

Root cause

The bundled b8665 Metal backend hits a Tahoe-era IOGPU command-buffer timeout regression. Sampled stack from the affected user:

Main thread blocked on a pthread mutex waiting for loadModel to finish.
com.apple.root.default-qos worker spinning forever inside __ggml_metal_rsets_init_block_invoke → usleep → nanosleep → __semwait_signal. This is ggml-metal's residency-set keep-alive heartbeat (introduced in ggml-org/llama.cpp#17766), which spins by design and is therefore not the fault itself. The actual hang is the command buffer the loader is waiting on, which never completes.

Upstream tracked the symptom in ggml-org/llama.cpp#20141 (identical hardware: M4 / macOS 26). The fix was to set AGX_RELAX_CDM_CTXSTORE_TIMEOUT=1 unconditionally inside ggml-metal, which landed around build b8882. b8665 predates that fix. The issue reporter validated b9310 (e2ef8fe42) working on their machine via standalone llama-cli, so that is the target chosen here.
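The upstream fix described above amounts to setting one environment variable inside ggml-metal. As a rough illustration of that mechanism only, the sketch below sets the same variable from Swift. The function name is hypothetical. It is an untested assumption that setting the variable from the host process before any Metal work begins would have the same effect as the upstream change. This PR does not do this; it relies on the bumped binary instead.

```swift
import Foundation

/// Hypothetical helper: sets the variable named in this PR for the current
/// process and reports whether it reads back as "1" afterwards.
@discardableResult
func relaxMetalContextStoreTimeout() -> Bool {
    let name = "AGX_RELAX_CDM_CTXSTORE_TIMEOUT"
    // Third argument 1 means: overwrite any existing value.
    guard setenv(name, "1", 1) == 0 else { return false }
    guard let raw = getenv(name) else { return false }
    return String(cString: raw) == "1"
}
```

If someone wanted to try this as a stopgap on an older build, the call would have to happen before the first model load.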

Validation

swift build (Xcode 26 toolchain): clean.
swift test: 9 passing, 1 skipped (testEndToEndWithModel, requires COTABBY_TEST_MODEL_PATH). The skipped test is the only one that exercises real model load; reviewers with a local GGUF can run it via COTABBY_TEST_MODEL_PATH=/path/to/model.gguf swift test.
The C++ wrapper (CotabbyInferenceEngine.cpp) compiles unchanged against b9310, so CotabbyInferenceEngine.h's ABI is unchanged. Downstream Cotabby picks this up on the next package resolve with no source-side edits.
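The skip behaviour mentioned for testEndToEndWithModel can be sketched as below. This is a minimal illustration of gating a test on COTABBY_TEST_MODEL_PATH, not the package's actual test code: the helper `modelPath(from:)` and the class name are hypothetical, and the real model-load call is left as a comment.

```swift
import Foundation
import XCTest

/// Hypothetical helper: returns the configured model path, or nil when the
/// variable is missing or empty.
func modelPath(from environment: [String: String]) -> String? {
    guard let path = environment["COTABBY_TEST_MODEL_PATH"], !path.isEmpty else {
        return nil
    }
    return path
}

final class ModelGateSketchTests: XCTestCase {
    func testEndToEndWithModel() throws {
        // Throwing XCTSkip makes the runner report "skipped" instead of "failed".
        guard let path = modelPath(from: ProcessInfo.processInfo.environment) else {
            throw XCTSkip("Set COTABBY_TEST_MODEL_PATH to run the model-load test")
        }
        XCTAssertTrue(FileManager.default.fileExists(atPath: path))
        // The real model load through the package API would go here.
    }
}
```

Keeping the environment lookup in a small pure function makes the gating logic checkable without a GGUF file on disk.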

Risk / rollout

Single-line URL + checksum bump. Reversion is trivial.
~645 upstream build numbers (~6 weeks) between b8665 and b9310. No public API that the wrapper touches was removed in that window (verified by a clean compile of CotabbyInferenceEngine.cpp and a full unit test pass).
Cotabby pins CotabbyInference to branch: main in its project.yml, so merging this PR ships the fix to all Cotabby users on the next build/release without a Cotabby-side change.
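The build-number reasoning in this PR (distance between the two tags, and which side of the approximate fix build each falls on) can be checked with a few lines of Swift. The helper name `buildNumber` is hypothetical, and 8882 is only the approximate landing point quoted above, not an exact boundary.

```swift
/// Hypothetical helper: parses a llama.cpp build tag such as "b9310"
/// into its numeric part, or returns nil for anything else.
func buildNumber(_ tag: String) -> Int? {
    guard tag.hasPrefix("b") else { return nil }
    return Int(tag.dropFirst())
}

let previous = buildNumber("b8665")!
let target = buildNumber("b9310")!
let approximateFixBuild = 8882 // "around b8882" per the PR text

print(target - previous)                // 645
print(previous < approximateFixBuild)   // true: the old build predates the fix
print(target >= approximateFixBuild)    // true: the new build includes it
```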

Linked

The bundled llama.framework at b8665 deadlocks at model load on Apple Silicon under macOS 26 (Tahoe): the main thread blocks on a pthread mutex while ggml-metal's residency-set keep-alive thread spins in __semwait_signal. Reproduces deterministically with Qwen3.5-2B-Q4_K_M on an M4 / macOS 26.4.1.

The underlying fix landed upstream around b8882 (ggml-org/llama.cpp#20141): AGX_RELAX_CDM_CTXSTORE_TIMEOUT is set unconditionally inside ggml-metal, preventing the IOGPU command-buffer timeout that left the loader stuck. b8665 predates that fix. b9310 was validated working on the reporter's hardware.

swift build + swift test pass against b9310; the C++ wrapper compiles without changes, so CotabbyInferenceEngine.h ABI is unchanged and downstream Cotabby picks this up with no further edits.

FuJacob merged commit facf5b3 into main May 28, 2026
1 check passed

FuJacob deleted the bump/llama-b9310 branch May 28, 2026 08:38

FuJacob mentioned this pull request May 28, 2026

App deadlocks at launch when a Qwen3.5 GGUF is selected (bundled llama.framework hangs in ggml_metal_rsets_init) FuJacob/cotabby#262

Open
