fix(ai): isolate cursor agent config dirs for parallel execution #440

Merged
janhesters merged 1 commit into master from fix/cursor-parallel-config-race
May 1, 2026

@janhesters
Collaborator

Summary

  • When multiple Cursor CLI agents run concurrently (result agents + judge agents), they all race on ~/.cursor/cli-config.json.tmp during atomic config writes, causing ENOENT crashes on rename
  • Fix: each spawned cursor agent process gets a unique CURSOR_CONFIG_DIR (using cuid2) pointing to an isolated temp directory, eliminating the race
  • Only applies to cursor agents — claude and opencode agents are unaffected

Root cause

The Cursor CLI writes config atomically via a cli-config.json.tmp → cli-config.json rename. With concurrent processes, one process completes the rename while another tries to rename the same .tmp file that no longer exists. This is the same class of bug reported in Gemini CLI and Claude Code.
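
The failure mode can be reproduced deterministically in a few lines. This is a minimal sketch, not Cursor's actual code: the temp directory is a hypothetical stand-in for ~/.cursor, and the two "processes" are simulated sequentially in one script to show why the second rename fails.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical stand-in for the shared ~/.cursor directory.
const sharedDir = fs.mkdtempSync(path.join(os.tmpdir(), 'race-demo-'));
const tmpFile = path.join(sharedDir, 'cli-config.json.tmp');
const finalFile = path.join(sharedDir, 'cli-config.json');

// Both simulated processes write the same .tmp path...
fs.writeFileSync(tmpFile, '{"writer":"A"}');
fs.writeFileSync(tmpFile, '{"writer":"B"}');

// ...then process A renames first, so the .tmp file is gone.
fs.renameSync(tmpFile, finalFile);

// Process B now renames a source that no longer exists.
let raceError = null;
try {
  fs.renameSync(tmpFile, finalFile);
} catch (error) {
  raceError = error;
}

// Remove the demo directory again.
fs.rmSync(sharedDir, { recursive: true, force: true });
```

Giving each process its own directory removes the shared .tmp path, so neither rename can pull the file out from under the other.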

Test plan

  • New test: each cursor agent spawn receives a unique CURSOR_CONFIG_DIR
  • New test: concurrent cursor spawns all get different CURSOR_CONFIG_DIR values
  • New test: non-cursor agents (claude) do NOT get CURSOR_CONFIG_DIR set
  • All 238 existing tests pass with no regressions

Collaborator

@ianwhitedeveloper ianwhitedeveloper left a comment

Nice, focused fix — correct for the documented race and tests pass locally (238/238). Requesting a couple of changes before merge; the rest are nits.

Must-fix

1. Temp dirs are never cleaned up. Each spawn creates tmpdir()/riteway-cursor-<cuid> and nothing removes it. macOS purges /var/folders/... only on reboot, and Linux often never cleans /tmp. CI workers and dev machines running riteway ai repeatedly will leak directories indefinitely. Please add a best-effort await rm(dir, { recursive: true, force: true }) in a finally block in runAgentProcess.
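
A minimal sketch of the requested cleanup shape, under assumptions: withIsolatedConfigDir is a hypothetical helper (not the project's runAgentProcess), and randomUUID() stands in for cuid2 so the example has no dependencies.

```javascript
const { rm } = require('node:fs/promises');
const { tmpdir } = require('node:os');
const { join } = require('node:path');
const { randomUUID } = require('node:crypto');

// Hypothetical helper, not the project's runAgentProcess.
// randomUUID() is a stand-in for cuid2's id generator.
const withIsolatedConfigDir = async (prefix, run) => {
  const dir = join(tmpdir(), `${prefix}${randomUUID()}`);
  try {
    // `run` would spawn the agent with its config env var set to `dir`.
    return await run(dir);
  } finally {
    // Best effort: `force` ignores a missing dir, and any other failure is
    // swallowed so cleanup never masks the agent's own result or error.
    await rm(dir, { recursive: true, force: true }).catch(() => {});
  }
};
```

Because the removal sits in finally, the directory is deleted whether the agent run resolves or rejects.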

Should-fix

2. Cursor detection by literal command string is fragile.

Why this matters and a suggested shape
const isCursorAgent = (command) => command === 'agent';

A user-supplied registry (riteway.agent-config.json) or --agent-config file can use any command — /usr/local/bin/agent, cursor-agent, a wrapper script — and silently lose isolation. Conversely, any unrelated tool literally named agent would get a spurious CURSOR_CONFIG_DIR.

The PR description acknowledges Claude Code and Gemini CLI have the same class of bug, so the mechanism wants to be generic. Move the knowledge into agent-config.js:

cursor: {
  command: 'agent',
  args: ['--print', '--output-format', 'json'],
  outputFormat: 'json',
  isolateConfigEnv: 'CURSOR_CONFIG_DIR',
  isolateConfigPrefix: 'riteway-cursor-'
}

Then buildSpawnOptions(agentConfig) becomes generic and self-documenting, and the agentConfigFileSchema gains an explicit opt-in field for custom registries.
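
One way the generic buildSpawnOptions could look, as a sketch rather than the merged implementation. Assumptions: the function signature and the 'riteway-' default prefix are illustrative, and randomUUID() stands in for cuid2. Only the isolateConfigEnv and isolateConfigPrefix field names come from the suggestion above.

```javascript
const { tmpdir } = require('node:os');
const { join } = require('node:path');
const { randomUUID } = require('node:crypto');

// Sketch only: signature and default prefix are assumptions.
// Returns undefined when the agent config does not opt in, so agents
// without isolation get no spawn options at all.
const buildSpawnOptions = (
  { isolateConfigEnv, isolateConfigPrefix = 'riteway-' } = {},
  env = process.env
) =>
  isolateConfigEnv
    ? {
        env: {
          ...env,
          // randomUUID() is a stand-in for cuid2's id generator.
          [isolateConfigEnv]: join(tmpdir(), `${isolateConfigPrefix}${randomUUID()}`),
        },
      }
    : undefined;
```

The result can be passed straight through as the third argument to child_process.spawn, which accepts undefined options. No command-name check is needed, so a custom registry entry opts in simply by declaring the env var it wants isolated.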

3. Verify (or guarantee) the dir exists. The code points CURSOR_CONFIG_DIR at a path that doesn't exist on disk. This works only if the Cursor CLI creates the directory (the equivalent of mkdir -p) before writing .tmp. Either add a comment documenting the assumption against a specific cursor-cli version, or call mkdirSync(dir, { recursive: true }) before spawn to be defensive.
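
The defensive option could be sketched as follows. createConfigDir is a hypothetical helper name, and randomUUID() again stands in for cuid2.

```javascript
const { mkdirSync } = require('node:fs');
const { tmpdir } = require('node:os');
const { join } = require('node:path');
const { randomUUID } = require('node:crypto');

// Hypothetical helper: build the unique path and make sure it exists
// before the child process is spawned.
const createConfigDir = (prefix) => {
  const dir = join(tmpdir(), `${prefix}${randomUUID()}`);
  // recursive: true creates missing parents and does not throw if the
  // directory already exists.
  mkdirSync(dir, { recursive: true });
  return dir;
};
```

With the directory created up front, the spawn no longer depends on what a particular CLI version does on first write.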

Nits

Test assertions could be tightened
  • 'does not set CURSOR_CONFIG_DIR for non-cursor agents' asserts on options?.env?.CURSOR_CONFIG_DIR === undefined. The actual contract today is that the third arg is undefined entirely. Stronger:

    actual: options,
    expected: undefined
  • 'passes a unique CURSOR_CONFIG_DIR env to each cursor agent spawn' only type-checks dir1. If dir2 were undefined, dir1 !== dir2 would still be true and mask the bug. Type-check both.
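
To illustrate the masking problem in the second nit: a bare inequality check passes when one side is undefined. The sketch below uses a hypothetical areUniqueStrings predicate (not part of the test suite) that folds the type check and the uniqueness check together.

```javascript
// Hypothetical predicate: true only if every value is a string and no
// two values are equal.
const areUniqueStrings = (...values) =>
  values.every((value) => typeof value === 'string') &&
  new Set(values).size === values.length;

// The weak check the review warns about: this is true even though the
// second value is missing entirely.
const weakCheckPasses = '/tmp/riteway-cursor-a' !== undefined;

// The stronger check rejects the same input.
const strongCheckPasses = areUniqueStrings('/tmp/riteway-cursor-a', undefined);
```

Here weakCheckPasses is true while strongCheckPasses is false, which is the gap that type-checking both dirs closes.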

Docs
  • executeAgent JSDoc doesn't mention the implicit env mutation when the command is cursor — a one-liner note (or a comment near buildSpawnOptions citing the upstream Cursor CLI bug) would help future readers.
  • No RELEASING.md / changelog entry; this unblocks parallel --agent cursor runs and probably deserves a note.

Out of scope but worth noting

No new untrusted input; spreading process.env into the child is appropriate; trust boundary documented in agent-config.js is preserved. No OWASP concerns from this diff.

@ianwhitedeveloper
Collaborator

ianwhitedeveloper commented Apr 19, 2026

Forgot to mention: manual testing corroborates the fix — the ENOENT on cli-config.json.tmp rename that was very common before this change has not reproduced for me since applying it. So the core mechanism works; the requested changes above are about hardening (cleanup, generality) rather than correctness of the fix itself.

Collaborator

@ianwhitedeveloper ianwhitedeveloper left a comment

Approving, with the caveat that the requested changes that seem salient are tracked as follow-up action items.

@janhesters
Collaborator Author

Addressed the review feedback in d5ccedd — here's the rundown:

1. Temp dir cleanup (must-fix) — We don't think this needs explicit cleanup. Cursor's CLI itself cleans up these config dirs (that's actually how we discovered the race — the concurrent cleanup/rename was the crash site). The OS also clears /tmp on reboot. Adding a finally block with rm would race against Cursor's own cleanup and add complexity for no real benefit.

2. Cursor detection is fragile (should-fix) — Fixed. Isolation is now config-driven via isolateConfigEnv and isolateConfigPrefix fields in agent-config.js. buildSpawnOptions is generic — no more isCursorAgent check. Custom registries can opt in for any CLI with the same class of bug.

3. Verify dir exists (should-fix) — Added a comment documenting that the CLI creates the directory on first write. cuid2 makes path collisions effectively impossible (~1.4×10⁻³⁶ probability).

4. Non-cursor test assertion (nit) — Fixed. Now asserts options === undefined instead of drilling into options?.env?.CURSOR_CONFIG_DIR.

5. Type-check both dirs (nit) — Fixed. Both dir1 and dir2 are now individually type-checked as string before the uniqueness assertion.

6. JSDoc (nit) — Fixed. Added isolateConfigEnv and isolateConfigPrefix params to the executeAgent JSDoc.

7. Changelog (nit) — The changelog is auto-generated by release-it during the release process, so no manual entry is needed.

…ndition

When multiple Cursor CLI agents run concurrently, they all race on the
same ~/.cursor/cli-config.json.tmp file during atomic config writes.
One process renames the .tmp file while another tries to do the same,
causing ENOENT crashes.

Fix: set a unique config dir (via cuid2) per spawned agent process when
the agent config specifies isolateConfigEnv. The isolation mechanism is
generic and config-driven via isolateConfigEnv/isolateConfigPrefix fields
in agent-config.js, so custom registries can opt in for any CLI with the
same class of bug. Only the built-in cursor config enables it by default.

@janhesters janhesters force-pushed the fix/cursor-parallel-config-race branch from d5ccedd to d1efa7f May 1, 2026 08:26
@janhesters janhesters merged commit 9e8c2ce into master May 1, 2026
2 checks passed
@janhesters janhesters deleted the fix/cursor-parallel-config-race branch May 1, 2026 08:27