Skip to content

chore(agent-setup): centralize shared agent config#1578

Open
hassiebp wants to merge 3 commits intomainfrom
codex/review-agent-best-practices
Open

chore(agent-setup): centralize shared agent config#1578
hassiebp wants to merge 3 commits intomainfrom
codex/review-agent-best-practices

Conversation

@hassiebp
Copy link
Contributor

@hassiebp hassiebp commented Mar 26, 2026

Summary

  • centralize repo-owned agent guidance, shared config, and generated provider shims under .agents/
  • replace the tracked Claude project file with compatibility symlinks plus generated local tool config for Claude, Cursor, VS Code, Codex, and shared MCP discovery
  • refresh the canonical AGENTS.md guidance using current OpenAI and Anthropic best practices, including concise layered instructions, provider-doc-first updates, CI/tooling notes, and default protection for local .env files

External best-practice review

Reviewed and applied guidance from:

  • OpenAI Codex AGENTS.md guide
  • OpenAI prompting guide
  • OpenAI reasoning best practices guide
  • Anthropic Claude Code memory guide
  • Anthropic Claude Code MCP guide
  • Anthropic prompt engineering best practices

Verification

  • python3 scripts/agents/sync-agent-shims.py
  • python3 scripts/agents/sync-agent-shims.py --check
  • python3 -m py_compile scripts/agents/sync-agent-shims.py tests/test_sync_agent_shims.py
  • fixture-based smoke run for agent shim generation, check mode, and shared-skill symlink projection

Notes

  • I did not run poetry-backed pytest locally because installing dev dependencies required network access from this environment.

Disclaimer: Experimental PR review

Greptile Summary

This PR centralizes all shared agent configuration for Claude, Cursor, VS Code, and Codex under a new .agents/ directory, replacing the previously committed .claude/settings.json and root CLAUDE.md with a generated-shim approach driven by .agents/config.json. A new scripts/agents/sync-agent-shims.py script generates tool-specific files (.mcp.json, .claude/settings.json, .cursor/environment.json, etc.) at developer setup and CI time; the generated files are all gitignored. AGENTS.md and CLAUDE.md at the repo root become compatibility symlinks into .agents/.\n\nKey changes:\n- .agents/config.json is the new single source of truth for MCP servers, Claude permissions, Codex, and Cursor settings\n- sync-agent-shims.py handles write mode (generate), check mode (drift detection), symlink projection, and stale shim cleanup\n- .github/workflows/ci.yml adds a shim-generation smoke test before the poetry install steps in the type-checking job\n- CLAUDE.md (previously a 135-line Claude-specific guide) is replaced with a symlink pointing to AGENTS.md.agents/AGENTS.md; the canonical guide is refreshed with a verification matrix and CI notes\n- One P1 issue: skills_root.iterdir() is called unconditionally in main() and will raise FileNotFoundError if .agents/skills/ is absent, which the test suite never exercises (the fixture always creates the directory)

Confidence Score: 4/5

Safe to merge after addressing the skills_root.iterdir() crash guard; all other concerns are non-blocking P2 suggestions

The overall design is solid and well-documented. The one P1 is a straightforward defensive fix (if skills_root.exists() guard) that doesn't affect the happy path since the directory is always committed. The CI step is intentionally a smoke test rather than a drift guard — the only viable pattern given the gitignore approach. Test coverage is functional but could be broadened for the remaining formatters.

scripts/agents/sync-agent-shims.py (lines 233–237) for the missing directory guard; tests/test_sync_agent_shims.py for missing formatter assertions

Important Files Changed

Filename Overview
scripts/agents/sync-agent-shims.py New script that generates tool-specific shims from .agents/config.json; unconditional skills_root.iterdir() can raise FileNotFoundError if directory is absent
tests/test_sync_agent_shims.py New fixture-based tests for the shim script; covers sync output and check-mode drift detection but leaves several formatter outputs unverified
.agents/config.json New canonical shared agent config — well-structured with MCP servers, Claude permissions, Codex, and Cursor settings
.github/workflows/ci.yml Added "Verify agent shim generation" step to type-checking job; the sync-then-check pattern is effectively an idempotency smoke test rather than a drift guard
.gitignore Added ignore entries for all generated agent shim files; correctly excludes local-only artifacts
.agents/AGENTS.md New canonical root agent guide with comprehensive project structure, verification matrix, architecture, and configuration documentation
CONTRIBUTING.md Added shared agent setup section documenting the .agents/ layout, generated vs committed files, and workflow steps

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[".agents/config.json\n(source of truth)"] --> B["sync-agent-shims.py"]
    A --> C["AGENTS.md\n(canonical guide)"]
    B -->|write mode| D[".claude/settings.json"]
    B -->|write mode| E[".mcp.json"]
    B -->|write mode| F[".cursor/environment.json\n.cursor/mcp.json"]
    B -->|write mode| G[".vscode/mcp.json"]
    B -->|write mode| H[".codex/config.toml\n.codex/environments/environment.toml\n(optional)"]
    B -->|symlinks| I["AGENTS.md → .agents/AGENTS.md"]
    B -->|symlinks| J["CLAUDE.md → AGENTS.md"]
    B -->|symlinks| K[".claude/skills/* → .agents/skills/*"]
    D -.->|gitignored| L["Not committed"]
    E -.->|gitignored| L
    F -.->|gitignored| L
    G -.->|gitignored| L
    H -.->|gitignored| L
    B -->|check mode| M{"Files match\nexpected content?"}
    M -->|yes| N["exit 0"]
    M -->|no| O["exit 1 + stderr"]
Loading

Reviews (1): Last reviewed commit: "chore(agent-setup): centralize shared ag..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

@github-actions
Copy link

@claude review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d5e4d3755

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +167 to +170
def find_unexpected_children(path: Path, expected_children: set[str]) -> list[Path]:
if not path.exists():
return []
return [child for child in path.iterdir() if child.name not in expected_children]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Prevent pruning through symlinked managed directories

The new shim sync can delete files outside the repository when .claude/skills is a symlink: find_unexpected_children iterates whatever that path points to, and sync_managed_directories then removes each “unexpected” child. In a local repro, pointing .claude/skills to another directory caused sync-agent-shims.py to unlink files in that external directory. This is a destructive path-traversal/data-loss risk; guard managed directories against symlinks (or enforce resolved paths stay under repo_root) before deleting children.

Useful? React with 👍 / 👎.

Comment on lines +21 to +25
"Bash(cat:*)",
"Bash(head:*)",
"Bash(tail:*)"
],
"deny": [

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Block .env reads in Bash permission allowlist

The config claims to protect .env secrets, but this allowlist explicitly permits Bash(cat:*), Bash(head:*), and Bash(tail:*) while only denying the Read(./.env*) tool path. In practice, .claude/settings.json generated from this config still allows shell reads like cat .env, so the documented default protection is bypassed whenever Bash is used.

Useful? React with 👍 / 👎.

Comment on lines +233 to +237
continue

path.parent.mkdir(parents=True, exist_ok=True)

if is_matching_symlink(path, target):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 skills_root.iterdir() raises FileNotFoundError when directory is absent

skills_root.iterdir() is called unconditionally. If .agents/skills/ is missing (e.g., sparse checkout, CI environment where the directory was accidentally removed), the script crashes with an unhandled FileNotFoundError rather than a meaningful error message.

The canonical config always has the directory committed (via skills/README.md), but defensive handling avoids surprising failures. The test fixture always creates the directory so this path is never exercised by the test suite.

Suggested change
continue
path.parent.mkdir(parents=True, exist_ok=True)
if is_matching_symlink(path, target):
skills_root = repo_root / ".agents" / "skills"
shared_skill_names = sorted(
entry.name
for entry in (skills_root.iterdir() if skills_root.exists() else [])
if entry.is_dir() and (entry / "SKILL.md").exists()
)

Comment on lines +319 to +324
skills_root = repo_root / ".agents" / "skills"
shared_skill_names = sorted(
entry.name
for entry in skills_root.iterdir()
if entry.is_dir() and (entry / "SKILL.md").exists()
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 In main(), skills_root.iterdir() is called unconditionally at line 322 without guarding against the case where .agents/skills/ does not exist, which would raise FileNotFoundError. The one-line fix is to add if skills_root.exists() else [] to match the guard pattern already used in find_unexpected_children at line 168.

Extended reasoning...

What the bug is and how it manifests

At scripts/agents/sync-agent-shims.py:322, skills_root.iterdir() is called as a generator expression inside sorted(...) with no prior existence check on the directory:

skills_root = repo_root / ".agents" / "skills"
shared_skill_names = sorted(
    entry.name
    for entry in skills_root.iterdir()   # raises FileNotFoundError if absent
    if entry.is_dir() and (entry / "SKILL.md").exists()
)

If .agents/skills/ does not exist, Python raises FileNotFoundError: [Errno 2] No such file or directory and the entire script aborts before writing any output.

The specific code path that triggers it

The script is invoked as python3 scripts/agents/sync-agent-shims.py [--repo-root PATH]. It resolves skills_root = repo_root / ".agents" / "skills" and immediately calls skills_root.iterdir() with no guard. Any caller that passes --repo-root pointing to a checkout that lacks this directory will trigger the crash.

Why existing code doesn't prevent it

The helper find_unexpected_children at line 168 already demonstrates the correct defensive pattern:

def find_unexpected_children(path: Path, expected_children: set[str]) -> list[Path]:
    if not path.exists():
        return []
    return [child for child in path.iterdir() if child.name not in expected_children]

This guard is absent at line 322, creating an inconsistency within the same file.

Addressing the refutation

The refutation correctly notes that .agents/skills/README.md is committed in this PR, so any git clone of this specific repository will have the directory. The normal CI path (GitHub Actions fresh checkout) is not affected. However, the script explicitly supports --repo-root to run against arbitrary repository paths — this is evidenced by both the argument parser and the test fixture in tests/test_sync_agent_shims.py which calls run_script(repo_root) against a synthetic temp directory. The distinction between "generated output directory" and "committed source directory" drawn in the refutation also applies less cleanly here: this script is designed to be portable. The inconsistency with find_unexpected_children — where the analogous path.iterdir() call IS guarded — makes this an oversight rather than intentional behavior.

Step-by-step proof

  1. Create a minimal fixture: mkdir -p /tmp/myrepo/.agents with a minimal config.json but no skills/ subdirectory.
  2. Run: python3 scripts/agents/sync-agent-shims.py --repo-root /tmp/myrepo
  3. Script reaches line 322, calls Path("/tmp/myrepo/.agents/skills").iterdir()
  4. Python raises: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/myrepo/.agents/skills'
  5. Script exits with an unhandled exception traceback instead of gracefully treating skills as an empty list.

How to fix it

Replace the generator on line 322:

# Before
for entry in skills_root.iterdir()
# After
for entry in (skills_root.iterdir() if skills_root.exists() else [])

This is a one-line change consistent with the pattern already used in find_unexpected_children.

Comment on lines +161 to +164
def is_matching_symlink(path: Path, target: Path) -> bool:
if not path.is_symlink():
return False
return path.resolve() == target.resolve()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 is_matching_symlink uses path.resolve() == target.resolve(), which follows all symlinks to their final destination and cannot detect when CLAUDE.md points directly to .agents/AGENTS.md instead of the intended AGENTS.md intermediate. In --check mode this drift goes undetected, and in sync mode the script skips recreating the symlink, leaving the incorrect chain silently in place.

Extended reasoning...

Bug Analysis

is_matching_symlink (lines 161-164 of scripts/agents/sync-agent-shims.py) checks whether a path is a symlink pointing at target using path.resolve() == target.resolve(). The Path.resolve() method recursively follows all symlinks until it reaches a real file. This means the check only verifies that both paths ultimately land on the same inode, not that the immediate link target matches what was intended.

Affected Code Path

For CLAUDE.md, sync_symlink_outputs is called with target = repo_root / "AGENTS.md". The intended symlink chain is:

CLAUDE.md  →  AGENTS.md  →  .agents/AGENTS.md

target.resolve() for AGENTS.md (itself a symlink) resolves through to repo_root/.agents/AGENTS.md. If CLAUDE.md was manually or accidentally set to point directly to .agents/AGENTS.md, then path.resolve() for CLAUDE.md also resolves to repo_root/.agents/AGENTS.md, making the two sides equal and is_matching_symlink return True.

Why Existing Code Does Not Prevent This

The function never calls os.readlink() or otherwise inspects the immediate link destination — it only checks the final resolved path. Because resolve() collapses the entire chain, any path that eventually reaches .agents/AGENTS.md is treated as correct, regardless of how many hops it skips or adds.

The test in tests/test_sync_agent_shims.py has the same blind spot: it asserts (repo_root / "CLAUDE.md").resolve() == (repo_root / "AGENTS.md").resolve(), which would pass even if CLAUDE.md bypassed AGENTS.md entirely.

Step-by-Step Proof

  1. Initial state: CLAUDE.md is a symlink pointing to .agents/AGENTS.md directly (not to AGENTS.md).
  2. sync_symlink_outputs is called with path = repo_root/CLAUDE.md, target = repo_root/AGENTS.md.
  3. is_matching_symlink is called.
  4. path.is_symlink()True (CLAUDE.md is a symlink, so check proceeds).
  5. path.resolve()repo_root/.agents/AGENTS.md (follows the direct link).
  6. target.resolve()repo_root/.agents/AGENTS.md (AGENTS.md is itself a symlink to .agents/AGENTS.md).
  7. Both sides are equal → is_matching_symlink returns True.
  8. In check mode: no drift is reported. In sync mode: remove_path + symlink_to are skipped, leaving the wrong chain.

Impact

Content-level impact is zero — both chains deliver the same bytes. The bug only prevents the drift-detection invariant from being enforced: if someone manually installs CLAUDE.md -> .agents/AGENTS.md (skipping the AGENTS.md hop), neither --check nor the regular sync will notice or fix it.

Fix

Replace path.resolve() == target.resolve() with a direct comparison of the immediate link destination:

def is_matching_symlink(path: Path, target: Path) -> bool:
    if not path.is_symlink():
        return False
    relative_target = Path(os.path.relpath(target, start=path.parent))
    return Path(os.readlink(path)) == relative_target

The test assertion for CLAUDE.md should similarly be updated to check the immediate link value rather than resolve().

@hassiebp
Copy link
Contributor Author

Addressed the valid review feedback in a7b3e5e.

Changes included:

  • guard managed generated directories against symlink traversal before pruning
  • compare immediate symlink targets during sync/check so CLAUDE.md -> AGENTS.md drift is detected correctly
  • handle missing .agents/skills/ defensively
  • stop auto-approving Bash in the shared Claude settings so .env protection cannot be bypassed via shell allow rules
  • expand the shim tests to cover the new edge cases

Verification rerun:

  • python3 scripts/agents/sync-agent-shims.py
  • python3 scripts/agents/sync-agent-shims.py --check
  • python3 -m py_compile scripts/agents/sync-agent-shims.py tests/test_sync_agent_shims.py
  • fixture-based harness covering the shim edge cases

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7b3e5e401

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +387 to +390
has_mismatch = sync_symlink_outputs(symlink_outputs, check_mode) or has_mismatch
has_mismatch = (
sync_managed_directories(managed_directories, repo_root, check_mode)
or has_mismatch

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate managed paths before generating shim outputs

The script writes symlinks/files before it enforces the managed-directory safety check, so a symlinked tool directory can still be modified outside the repo. In main(), sync_file_outputs/sync_symlink_outputs run before sync_managed_directories; fresh evidence on this commit is that making .claude point to an external directory causes the sync run to create external settings.json and skills/example entries before raising the runtime error. This preserves the non-zero exit but still mutates external paths, which is the data-loss/path-traversal risk this guard is meant to prevent.

Useful? React with 👍 / 👎.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5582643e98

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +384 to +388
managed_directories = [
{
"path": repo_root / ".claude" / "skills",
"expected_children": set(shared_skill_names),
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate all shim output paths stay within repo

The safety checks only cover managed_directories for .claude/skills, but sync_file_outputs still writes .cursor/*, .vscode/*, .codex/*, and .mcp.json without any repo-boundary validation. If one of those directories is a symlink (for example .cursor -> /tmp/external), running the sync script writes files outside the repository and exits successfully, which is an unintended path-traversal/data-loss risk during install/postinstall.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant