feat(cli): add --num-threads option for multi-process server (Linux)#2980

Merged
tlgimenes merged 8 commits into main from tlgimenes/reuse-port-option on Apr 6, 2026

@tlgimenes tlgimenes commented Apr 1, 2026

What is this contribution about?

Adds a --num-threads <n> CLI flag to deco serve that runs N server processes (the primary plus N-1 spawned workers) sharing the same port via SO_REUSEPORT (Linux only). On non-Linux platforms, a warning is emitted and the server runs as a single process, as before.

Key design decisions:

  • Worker processes are spawned in serve.ts after buildSettings() completes (Postgres/NATS already running), using process.execPath + the correct server entry for dev vs prod
  • A new buildChildEnv(settings) helper in cli/build-child-env.ts passes an explicit allowlist of all required settings to workers (including secrets), avoiding the ...process.env spread pattern and making the secret surface auditable
  • POD_NAME is intentionally omitted from worker env so each worker generates its own UUID, preventing NATS heartbeat KV collisions
  • Worker stdout/stderr is piped through the TUI log store to avoid corrupting Ink's cursor rendering
  • Workers are killed on SIGINT/SIGTERM/exit from the primary process
  • seedLocalMode runs in the primary process only (workers are marked with DECOCMS_IS_WORKER=1 and skip it) to prevent concurrent DB races
  • REUSE_PORT=true is set as an internal signal (bypasses the Settings pipeline by design — it is set programmatically immediately before import("../../index") and is not user-facing config)
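
The allowlist idea behind buildChildEnv can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the Settings fields and the PORT, DATABASE_URL, and NATS_URL names are assumptions, and where exactly DECOCMS_IS_WORKER and REUSE_PORT get set in the real code is not shown in this thread.

```typescript
// Hypothetical settings shape; the real Settings type is not shown in this PR thread.
interface SketchSettings {
  port: number;
  databaseUrl: string;
  natsUrl: string;
}

// Build a worker environment from an explicit allowlist.
// Nothing is copied from the parent environment, so every key a worker
// receives (including secrets) is visible in this one function.
function buildChildEnvSketch(settings: SketchSettings): Record<string, string> {
  return {
    PORT: String(settings.port), // assumed variable name
    DATABASE_URL: settings.databaseUrl, // assumed variable name; a secret, listed explicitly
    NATS_URL: settings.natsUrl, // assumed variable name
    DECOCMS_IS_WORKER: "1", // marks the process as a worker so it can skip seeding
    REUSE_PORT: "true", // internal signal to bind with SO_REUSEPORT
    // POD_NAME is deliberately absent so each worker generates its own UUID.
  };
}
```

Because the function returns a fresh object instead of spreading the parent environment, a variable such as POD_NAME can only reach a worker if someone adds it to the list on purpose.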

Screenshots/Demonstration

N/A — CLI-only change.

How to Test

  1. On a Linux machine: deco serve --num-threads 4
  2. Confirm 4 Bun processes are listening on the same port (ss -tlnp | grep <PORT>)
  3. On macOS: deco serve --num-threads 4 → should warn and run single-threaded
  4. deco --help → --num-threads appears in the help text
  5. deco completion → --num-threads included in bash/zsh completions

Migration Notes

No migrations required. No breaking changes — --num-threads defaults to 1 (existing behavior).

Review Checklist

  • PR title is clear and descriptive
  • Changes are tested and working
  • Documentation is updated (if needed)
  • No breaking changes

Summary by cubic

Add a Linux-only --num-threads <n> flag to deco serve to run multiple worker processes via SO_REUSEPORT for better concurrency. Updates help/completions, server boot, logging, and Helm/test defaults; non-Linux stays single process.

  • New Features

    • Linux-only multi-process with --num-threads (default 1); strict integer validation; warns on non-Linux and runs single process.
    • Spawns N-1 workers; stdout/stderr piped to the TUI (or inherited with --no-tui); workers killed on SIGINT/SIGTERM/exit.
    • buildChildEnv() allowlists env (no process.env spread) and omits POD_NAME; workers signaled via REUSE_PORT=true; Bun.serve uses reusePort; local-mode seeding runs only in the primary.
    • Help/completions updated; Helm runs with --num-threads 4 (chart 0.1.44); resilience suite runs with NUM_THREADS=4 and adds multi-core smoke tests.
  • Bug Fixes

    • Added --no-tui to the production Dockerfile for consistent container logs.
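
As a rough illustration of the platform fallback and the N-1 worker count summarized above, the decision can be written as a small pure function. The function name, the result shape, the warning text, and the choice to enable port reuse only when more than one process runs are all assumptions made for this sketch; the PR's actual code is not shown here.

```typescript
// Hypothetical result shape for this sketch.
interface ProcessPlan {
  workers: number; // extra processes to spawn besides the primary
  reusePort: boolean; // whether the listeners should share the port
  warning?: string; // set when the request cannot be honored
}

// Decide how many workers to spawn for a requested --num-threads value.
// `platform` is expected to look like Node's/Bun's process.platform ("linux", "darwin", ...).
function planProcesses(numThreads: number, platform: string): ProcessPlan {
  if (numThreads > 1 && platform !== "linux") {
    // Non-Linux: warn and fall back to a single process.
    return {
      workers: 0,
      reusePort: false,
      warning: "--num-threads requires Linux; running a single process", // illustrative text
    };
  }
  // The primary also serves requests, so only N-1 workers are spawned.
  return { workers: numThreads - 1, reusePort: numThreads > 1 };
}
```

With this shape, planProcesses(4, "linux") asks for 3 workers, while the same request on macOS yields 0 workers plus a warning.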

Written for commit 7fbf69b. Summary will update on new commits.

github-actions bot commented Apr 1, 2026

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

| Reaction | Action |
|---|---|
| 👍 | Run quick benchmark (10 & 128 tools) |

Benchmark will run on the next push after you react.

github-actions bot commented Apr 1, 2026

Release Options

Suggested: Minor (2.237.0) — based on feat: prefix

React with an emoji to override the release type:

| Reaction | Type | Next Version |
|---|---|---|
| 👍 | Prerelease | 2.236.1-alpha.1 |
| 🎉 | Patch | 2.236.1 |
| ❤️ | Minor | 2.237.0 |
| 🚀 | Major | 3.0.0 |

Current version: 2.236.0

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 5 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/cli.ts">

<violation number="1" location="apps/mesh/src/cli.ts:269">
P0: Validate `--num-threads` as a finite positive integer before passing it to `startServer`; current parsing allows `Infinity` and fractional values, which can cause unbounded or unintended worker spawning.

(Based on your team's feedback about only flagging numeric validation for dynamic/user-provided inputs.) [FEEDBACK_USED]</violation>
</file>

<file name="apps/mesh/src/cli/commands/serve.ts">

<violation number="1" location="apps/mesh/src/cli/commands/serve.ts:182">
P1: Spreading `...process.env` before `...workerEnv` re-introduces every parent env var—including `POD_NAME`—into worker processes. This defeats `buildChildEnv`'s allowlist design and will cause NATS heartbeat KV collisions when `POD_NAME` is set in the parent environment (workers share the same pod name instead of generating their own UUIDs). Use `workerEnv` alone as the env.</violation>
</file>
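
The second issue comes down to object spread order. A small self-contained sketch, using made-up environment values and hypothetical helper names, shows why spreading the parent environment first still leaks POD_NAME even though the allowlist omits it:

```typescript
type Env = Record<string, string>;

// Leaky variant: every parent key survives unless the allowlist overrides it.
function mergeWithParent(parentEnv: Env, workerEnv: Env): Env {
  return { ...parentEnv, ...workerEnv };
}

// Strict variant: only allowlisted keys reach the worker.
function allowlistOnly(workerEnv: Env): Env {
  return { ...workerEnv };
}

// Made-up values for illustration only.
const parentEnv: Env = { POD_NAME: "pod-a", PATH: "/usr/bin" };
const workerEnv: Env = { PORT: "3000" };

const leaky = mergeWithParent(parentEnv, workerEnv);
const strict = allowlistOnly(workerEnv);

console.log("POD_NAME" in leaky); // true: the worker would share the parent's pod name
console.log("POD_NAME" in strict); // false: the worker is free to generate its own
```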

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread apps/mesh/src/cli.ts Outdated
Comment thread apps/mesh/src/cli/commands/serve.ts Outdated
@tlgimenes tlgimenes force-pushed the tlgimenes/reuse-port-option branch from ed77720 to 1cdbeec Compare April 1, 2026 15:39
Comment thread deploy/helm/values.yaml
Contributor

If you aim to use this command with the chart, you have to bump the chart version and update it in the application to reflect the change. The values.yaml here holds the chart's default values.

Contributor Author

Bumped chart version from 0.1.43 to 0.1.44 in Chart.yaml to reflect the values.yaml changes (7fbf69b).

@tlgimenes tlgimenes force-pushed the tlgimenes/reuse-port-option branch 4 times, most recently from 0a3cf83 to 8ab211c Compare April 6, 2026 20:43
tlgimenes and others added 7 commits April 6, 2026 17:57
Adds a --num-threads CLI flag that spawns N-1 worker processes sharing
the same port via SO_REUSEPORT (Linux only). Workers are spawned with an
explicit env allowlist via buildChildEnv(), have their stdout/stderr piped
through the TUI log store, and are killed on SIGINT/SIGTERM/exit. Local-mode
seeding is guarded to the primary process only to prevent concurrent DB races.
On non-Linux, emits a warning and falls back to 1 thread.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…p process.env spread

- Use Number.isInteger(n) && n > 0 instead of Math.max to reject Infinity
  and fractional values for --num-threads
- Remove ...process.env spread in worker Bun.spawn env to preserve
  buildChildEnv's allowlist design and prevent POD_NAME leakage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
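
The difference between the Math.max clamp and the Number.isInteger check named in this commit can be sketched as below. The function names and the error message are hypothetical; only the `Number.isInteger(n) && n > 0` condition comes from the commit message.

```typescript
// Clamp-style parsing, shown only for contrast: Infinity and fractions pass through.
function clampNumThreads(raw: string): number {
  return Math.max(1, Number(raw));
}

// Strict parsing: accept only finite positive integers, as the commit describes.
function parseNumThreads(raw: string): number {
  const n = Number(raw);
  if (!(Number.isInteger(n) && n > 0)) {
    throw new Error(`--num-threads must be a positive integer, got "${raw}"`); // illustrative message
  }
  return n;
}
```

Here clampNumThreads("Infinity") returns Infinity and clampNumThreads("2.5") returns 2.5, whereas parseNumThreads throws for both, as well as for "0", "-1", and non-numeric input.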
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Dockerfile.test and docker-compose.test.yml to spin up PostgreSQL,
NATS, and the mesh server from source inside Docker, enabling local
testing of SO_REUSEPORT features that require Linux.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…0, bump passthrough timeout

- Add --no-tui flag to production Dockerfile CMD for consistency
- Change test compose to expose port 8000 and set BASE_URL accordingly
- Increase passthrough client list timeout from 1s to 2s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ustom Dockerfiles

- Update resilience Dockerfile.mesh to use CLI entrypoint with --num-threads
  support via NUM_THREADS env var (defaults to 1)
- Set NUM_THREADS=4 in resilience docker-compose to run all tests with
  multi-core enabled via SO_REUSEPORT
- Add multi-core.test.ts with health check, tool call, and concurrent
  request smoke tests
- Remove deploy/docker-compose/Dockerfile.test and docker-compose.test.yml
  (superseded by resilience test infrastructure)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tlgimenes tlgimenes force-pushed the tlgimenes/reuse-port-option branch from 8ab211c to 6472bbd Compare April 6, 2026 21:01
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tlgimenes tlgimenes merged commit d61b72a into main Apr 6, 2026
15 checks passed
@tlgimenes tlgimenes deleted the tlgimenes/reuse-port-option branch April 6, 2026 21:48