feat(cli): add --num-threads option for multi-process server (Linux)#2980
feat(cli): add --num-threads option for multi-process server (Linux)#2980
Conversation
🧪 BenchmarkShould we run the Virtual MCP strategy benchmark for this PR? React with 👍 to run the benchmark.
Benchmark will run on the next push after you react. |
Release OptionsSuggested: Minor ( React with an emoji to override the release type:
Current version:
|
There was a problem hiding this comment.
2 issues found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="apps/mesh/src/cli.ts">
<violation number="1" location="apps/mesh/src/cli.ts:269">
P0: Validate `--num-threads` as a finite positive integer before passing it to `startServer`; current parsing allows `Infinity` and fractional values, which can cause unbounded or unintended worker spawning.
(Based on your team's feedback about only flagging numeric validation for dynamic/user-provided inputs.) [FEEDBACK_USED]</violation>
</file>
<file name="apps/mesh/src/cli/commands/serve.ts">
<violation number="1" location="apps/mesh/src/cli/commands/serve.ts:182">
P1: Spreading `...process.env` before `...workerEnv` re-introduces every parent env var—including `POD_NAME`—into worker processes. This defeats `buildChildEnv`'s allowlist design and will cause NATS heartbeat KV collisions when `POD_NAME` is set in the parent environment (workers share the same pod name instead of generating their own UUIDs). Use `workerEnv` alone as the env.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
ed77720 to
1cdbeec
Compare
There was a problem hiding this comment.
If you aims to use this command with the chart, you have to bumps the chart version and changing it in the application to reflect it. values.yaml here are the default values of the chart
There was a problem hiding this comment.
Bumped chart version from 0.1.43 to 0.1.44 in Chart.yaml to reflect the values.yaml changes (7fbf69b).
0a3cf83 to
8ab211c
Compare
Adds a --num-threads CLI flag that spawns N-1 worker processes sharing the same port via SO_REUSEPORT (Linux only). Workers are spawned with an explicit env allowlist via buildChildEnv(), have their stdout/stderr piped through the TUI log store, and are killed on SIGINT/SIGTERM/exit. Local-mode seeding is guarded to the primary process only to prevent concurrent DB races. On non-Linux, emits a warning and falls back to 1 thread. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…p process.env spread - Use Number.isInteger(n) && n > 0 instead of Math.max to reject Infinity and fractional values for --num-threads - Remove ...process.env spread in worker Bun.spawn env to preserve buildChildEnv's allowlist design and prevent POD_NAME leakage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Dockerfile.test and docker-compose.test.yml to spin up PostgreSQL, NATS, and the mesh server from source inside Docker, enabling local testing of SO_REUSEPORT features that require Linux. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…0, bump passthrough timeout - Add --no-tui flag to production Dockerfile CMD for consistency - Change test compose to expose port 8000 and set BASE_URL accordingly - Increase passthrough client list timeout from 1s to 2s Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ustom Dockerfiles - Update resilience Dockerfile.mesh to use CLI entrypoint with --num-threads support via NUM_THREADS env var (defaults to 1) - Set NUM_THREADS=4 in resilience docker-compose to run all tests with multi-core enabled via SO_REUSEPORT - Add multi-core.test.ts with health check, tool call, and concurrent request smoke tests - Remove deploy/docker-compose/Dockerfile.test and docker-compose.test.yml (superseded by resilience test infrastructure) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8ab211c to
6472bbd
Compare
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What is this contribution about?
Adds a
--num-threads <n>CLI flag todeco servethat spawns N worker processes sharing the same port viaSO_REUSEPORT(Linux only). On non-Linux platforms, a warning is emitted and the server runs single-threaded as before.Key design decisions:
serve.tsafterbuildSettings()completes (Postgres/NATS already running), usingprocess.execPath+ the correct server entry for dev vs prodbuildChildEnv(settings)helper incli/build-child-env.tspasses an explicit allowlist of all required settings to workers (including secrets), avoiding the...process.envspread pattern and making the secret surface auditablePOD_NAMEis intentionally omitted from worker env so each worker generates its own UUID, preventing NATS heartbeat KV collisionsSIGINT/SIGTERM/exitfrom the primary processseedLocalModeis guarded to the primary process only (DECOCMS_IS_WORKER=1) to prevent concurrent DB racesREUSE_PORT=trueis set as an internal signal (bypasses the Settings pipeline by design — it is set programmatically immediately beforeimport("../../index")and is not user-facing config)Screenshots/Demonstration
N/A — CLI-only change.
How to Test
deco serve --num-threads 4ss -tlnp | grep <PORT>)deco serve --num-threads 4→ should warn and run single-threadeddeco --help→--num-threadsappears in the help textdeco completion→--num-threadsincluded in bash/zsh completionsMigration Notes
No migrations required. No breaking changes —
--num-threadsdefaults to1(existing behavior).Review Checklist
Summary by cubic
Add a Linux-only
--num-threads <n>flag todeco serveto run multiple worker processes viaSO_REUSEPORTfor better concurrency. Updates help/completions, server boot, logging, and Helm/test defaults; non-Linux stays single process.New Features
--num-threads(default1); strict integer validation; warns on non-Linux and runs single process.--no-tui); workers killed onSIGINT/SIGTERM/exit.buildChildEnv()allowlists env (noprocess.envspread) and omitsPOD_NAME; workers signaled viaREUSE_PORT=true; Bun.serve usesreusePort; local-mode seeding runs only in the primary.--num-threads 4(chart0.1.44); resilience suite runs withNUM_THREADS=4and adds multi-core smoke tests.Bug Fixes
--no-tuito the productionDockerfilefor consistent container logs.Written for commit 7fbf69b. Summary will update on new commits.