feat: add --three-file flag to generate .calls/.deps/.impact shards (#120)
Conversation
Benchmark data shows the three-file format dramatically outperforms the single .graph format:

- 68% faster (264s vs 827s on Medusa)
- 78% fewer tool uses (42 vs 190)

Each file contains one relationship type, so grep hits are more targeted — the agent gets call data when searching for function names, deps when searching imports, and impact when checking blast radius.

Usage: `supermodel analyze --three-file`

Closes supermodeltools#117
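The targeting claim is easy to reproduce on a scratch directory. The shard contents below are hypothetical stand-ins for generated files, not real supermodel output:

```shell
# Build a scratch dir with one shard per relationship type.
dir=$(mktemp -d)
printf '// [calls]\n// HashFile <- GetGraph internal/analyze/handler.go:39\n' > "$dir/cache.calls.go"
printf '// [deps]\n// imports internal/api/client.go\n' > "$dir/cache.deps.go"
printf '// [impact]\n// risk HIGH\n' > "$dir/cache.impact.go"

# Searching a function name hits only the calls shard, not a combined blob.
grep -l "HashFile" "$dir"/cache.*.go
```

With a single `.graph` file, the same search would return the combined file and force the agent to scan past deps and impact data it did not ask for.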
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough: Adds a `--three-file` flag that generates `.calls`/`.deps`/`.impact` shards alongside the existing single `.graph` format.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Analyze as AnalyzeCmd
    participant Shard as ShardHandler
    participant Render as Renderer
    participant FS as Filesystem/Cache
    CLI->>Analyze: parse args (--three-file?, --no-shards?)
    Analyze->>Shard: Generate(opts{ThreeFile, Force, ...})
    Shard->>Shard: shards enabled? (cfg.ShardsEnabled() && !noShards)
    Shard->>Render: renderShards(repoDir, cache, files, dryRun, threeFile)
    alt threeFile == true
        Render->>Render: RenderAllThreeFile(files)
        loop per source
            Render->>FS: write/delete .calls, .deps, .impact
        end
    else threeFile == false
        Render->>Render: RenderAll(files)
        Render->>FS: write/delete single .graph
    end
    Render->>Shard: return changedCount
    Shard->>Analyze: done
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches: 🧪 Generate unit tests (beta)
Demo

$ supermodel analyze --three-file --force
✓ Wrote 291 shards for 134 source files (2009 nodes, 4493 relationships)
$ ls internal/cache/cache.*.go
internal/cache/cache.calls.go
internal/cache/cache.deps.go
internal/cache/cache.impact.go
$ cat internal/cache/cache.calls.go
//go:build ignore
package ignore
// @generated supermodel-shard — do not edit
// [calls]
// Evict → dir internal/cache/cache.go:23
// Get ← GetGraph internal/analyze/handler.go:39
// Get → dir internal/cache/cache.go:23
// HashFile ← GetGraph internal/analyze/handler.go:39
// HashFile ← Run internal/deadcode/handler.go:27
// HashFile ← Run internal/blastradius/handler.go:25
// Put ← GetGraph internal/analyze/handler.go:39
// Put → dir internal/cache/cache.go:23
// dir ← Get internal/cache/cache.go:28
// dir ← Put internal/cache/cache.go:44
// dir → Dir internal/config/config.go:25
$ cat internal/cache/cache.deps.go
//go:build ignore
package ignore
// @generated supermodel-shard — do not edit
// [deps]
// imports internal/api/client.go
// imports internal/api/types.go
// imports internal/config/config.go
// imported-by cmd/audit.go
// imported-by internal/analyze/handler.go
// imported-by internal/blastradius/handler.go
// imported-by internal/deadcode/handler.go
// imported-by internal/mcp/server.go
$ cat internal/cache/cache.impact.go
//go:build ignore
package ignore
// @generated supermodel-shard — do not edit
// [impact]
// risk HIGH
// domains ArchDocs · CoreAPI · SupermodelCLI
// direct 8
// transitive 24
// affects cmd/audit.go · internal/analyze/handler.go · internal/blastradius/handler.go · internal/deadcode/handler.go · internal/mcp/server.go

An agent searching for `HashFile` via grep will hit `cache.calls.go` and immediately see every caller — no need to grep through 134 source files to reconstruct the call chain.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/analyze.go (1)
40-45: ⚠️ Potential issue | 🟡 Minor

Handle conflicting flags `--three-file` + `--no-shards`.

Right now `--three-file` is silently ignored when `--no-shards` is set (line 40 gate). A quick validation error makes CLI behavior clearer.

Suggested tweak

```diff
 RunE: func(cmd *cobra.Command, args []string) error {
 	cfg, err := config.Load()
 	if err != nil {
 		return err
 	}
 	if err := cfg.RequireAPIKey(); err != nil {
 		return err
 	}
+	if noShards && threeFile {
+		return fmt.Errorf("--three-file cannot be used with --no-shards")
+	}
 	dir := "."
```

Also applies to: 53-53
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/analyze.go` around lines 40 - 45, Validate and error on the conflicting CLI flags --three-file (threeFile) and --no-shards (noShards) before the shard-gate that checks cfg.ShardsEnabled(); specifically, if noShards is true and threeFile is true return a user-facing error explaining the conflict instead of silently ignoring threeFile. Apply the same check in both places where the code branches to shards.Generate (the block using cfg.ShardsEnabled() && !noShards and the similar branch around analyze.Run) so that callers get a clear validation error rather than silent behavior.
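A standalone sketch of the suggested guard. The flag variables are assumptions for illustration; with cobra v1.5+, `cmd.MarkFlagsMutuallyExclusive("three-file", "no-shards")` would express the same constraint without a hand-rolled check:

```go
package main

import (
	"errors"
	"fmt"
)

// validateShardFlags mirrors the suggested guard: --three-file selects a shard
// format, so it is meaningless when --no-shards disables shard output entirely.
// Names (noShards, threeFile) are hypothetical stand-ins for the bound flags.
func validateShardFlags(noShards, threeFile bool) error {
	if noShards && threeFile {
		return errors.New("--three-file cannot be used with --no-shards")
	}
	return nil
}

func main() {
	fmt.Println(validateShardFlags(true, true))  // conflict: returns an error
	fmt.Println(validateShardFlags(false, true)) // valid combination: <nil>
}
```

Running the check before the `cfg.ShardsEnabled() && !noShards` gate means users get an explicit error instead of silently losing the format they asked for.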
🧹 Nitpick comments (1)
internal/shards/render.go (1)
283-328: Consider pruning legacy `.graph` files when `--three-file` is enabled.

If users switch formats, old `.graph.*` shards can stick around and pollute grep/search results. A small cleanup in this mode would keep output consistent with intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shards/render.go` around lines 283 - 328, When rendering three-file shards in RenderAllThreeFile, also prune any legacy .graph.* shard files for the same source to avoid stale files; after computing callsPath/depsPath/impactPath (in the loop over srcFile) add a cleanup step that uses filepath.Glob to find patterns for the legacy graph shards for that source (e.g. srcFile+".graph.*"), and for each match call os.Remove (respecting dryRun so you don’t delete when dryRun is true), ignoring non-fatal errors and treating path-traversal errors the same way you handle WriteShard errors; place this cleanup before writing the new .calls/.deps/.impact files so old graph shards are removed when switching formats.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/shards/render.go`:
- Around line 311-314: The current branch deletes stale files directly
(os.Remove(full)) and ignores dryRun and the path-safety checks used by
WriteShard; replace the direct delete with a safe helper (e.g., RemoveShard)
that resolves absolute paths, enforces the repoDir prefix check to prevent path
traversal, honors the dryRun flag by logging the intended delete instead of
performing it, and returns errors except for IsNotExist; update the code path
that handles item.content == "" to call that helper with repoDir, item.path and
dryRun instead of calling os.Remove directly.
---
Outside diff comments:
In `@cmd/analyze.go`:
- Around line 40-45: Validate and error on the conflicting CLI flags
--three-file (threeFile) and --no-shards (noShards) before the shard-gate that
checks cfg.ShardsEnabled(); specifically, if noShards is true and threeFile is
true return a user-facing error explaining the conflict instead of silently
ignoring threeFile. Apply the same check in both places where the code branches
to shards.Generate (the block using cfg.ShardsEnabled() && !noShards and the
similar branch around analyze.Run) so that callers get a clear validation error
rather than silent behavior.
---
Nitpick comments:
In `@internal/shards/render.go`:
- Around line 283-328: When rendering three-file shards in RenderAllThreeFile,
also prune any legacy .graph.* shard files for the same source to avoid stale
files; after computing callsPath/depsPath/impactPath (in the loop over srcFile)
add a cleanup step that uses filepath.Glob to find patterns for the legacy graph
shards for that source (e.g. srcFile+".graph.*"), and for each match call
os.Remove (respecting dryRun so you don’t delete when dryRun is true), ignoring
non-fatal errors and treating path-traversal errors the same way you handle
WriteShard errors; place this cleanup before writing the new
.calls/.deps/.impact files so old graph shards are removed when switching
formats.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: de761c66-e3da-47d8-818e-7f7afa5d4ebd
📒 Files selected for processing (3)
cmd/analyze.go
internal/shards/handler.go
internal/shards/render.go
```go
if item.content == "" {
	full := filepath.Join(repoDir, item.path)
	_ = os.Remove(full)
	continue
```
Protect stale-file deletes (dry-run + path safety).
This branch deletes files even in dry-run, and it bypasses the path traversal guard used by WriteShard. Please route deletes through a safe helper and honor dryRun.
Safer pattern
```diff
 for _, item := range []struct {
 	path    string
 	content string
 }{
 	{depsPath, deps},
 	{callsPath, calls},
 	{impactPath, impact},
 } {
 	if item.content == "" {
-		full := filepath.Join(repoDir, item.path)
-		_ = os.Remove(full)
+		if err := RemoveShard(repoDir, item.path, dryRun); err != nil {
+			return written, err
+		}
 		continue
 	}
```

```go
func RemoveShard(repoDir, shardPath string, dryRun bool) error {
	full, err := filepath.Abs(filepath.Join(repoDir, shardPath))
	if err != nil {
		return err
	}
	repoAbs, err := filepath.Abs(repoDir)
	if err != nil {
		return err
	}
	if !strings.HasPrefix(full, repoAbs+string(filepath.Separator)) && full != repoAbs {
		return fmt.Errorf("path traversal blocked: %s", shardPath)
	}
	if dryRun {
		fmt.Printf("  [dry-run] would remove %s\n", full)
		return nil
	}
	if err := os.Remove(full); err != nil && !os.IsNotExist(err) {
		return err
	}
	return nil
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/shards/render.go` around lines 311 - 314, The current branch deletes
stale files directly (os.Remove(full)) and ignores dryRun and the path-safety
checks used by WriteShard; replace the direct delete with a safe helper (e.g.,
RemoveShard) that resolves absolute paths, enforces the repoDir prefix check to
prevent path traversal, honors the dryRun flag by logging the intended delete
instead of performing it, and returns errors except for IsNotExist; update the
code path that handles item.content == "" to call that helper with repoDir,
item.path and dryRun instead of calling os.Remove directly.
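One hedged alternative to the string-prefix boundary test in the pattern above: `filepath.Rel` sidesteps separator and trailing-slash edge cases. This is a sketch, not the PR's code, and `insideRepo` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// insideRepo reports whether joining relPath onto repoDir stays within repoDir.
// It relativizes the resolved path back against the repo root instead of doing
// a raw string-prefix comparison.
func insideRepo(repoDir, relPath string) (bool, error) {
	repoAbs, err := filepath.Abs(repoDir)
	if err != nil {
		return false, err
	}
	full, err := filepath.Abs(filepath.Join(repoDir, relPath))
	if err != nil {
		return false, err
	}
	rel, err := filepath.Rel(repoAbs, full)
	if err != nil {
		return false, err
	}
	// Any path escaping the repo relativizes to ".." or "../...".
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)), nil
}

func main() {
	ok, _ := insideRepo("/tmp/repo", "internal/cache/cache.calls.go")
	fmt.Println(ok) // true
	ok, _ = insideRepo("/tmp/repo", "../etc/passwd")
	fmt.Println(ok) // false
}
```

Either guard works; the `Rel`-based form just avoids the classic bug where `/repo-evil` passes a naive `HasPrefix(full, "/repo")` check.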
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
5-5: ⚠️ Potential issue | 🟠 Major

Update legacy references to `.graph` files.

Your README still talks about "writes a `.graph` file" (line 5), "writes `.graph` files" (line 19), and "Manages `.graph.*` sidecar files" (line 101), but you're now recommending the three-file format as best practice. This creates confusion — are there now four file types? Do `.graph` files still get generated? Either update these references to mention both formats, or clarify when each format is used.

🔄 Suggested clarification

For line 5, you could explain both formats exist:

```diff
-Supermodel maps every file, function, and call relationship in your repo and writes a `.graph` file next to each source file. Your agent reads them automatically via grep and cat. No prompt changes. No extra context windows. No new tools to learn.
+Supermodel maps every file, function, and call relationship in your repo and writes shard files next to each source file (either a single `.graph` file or three separate `.calls`/`.deps`/`.impact` files). Your agent reads them automatically via grep and cat. No prompt changes. No extra context windows. No new tools to learn.
```

For line 101:

```diff
-Manages `.graph.*` sidecar files written next to each source file. Agents read these without making API calls.
+Manages shard files written next to each source file (`.graph.*` or `.calls.*`/`.deps.*`/`.impact.*`). Agents read these without making API calls.
```

Also applies to: 19-19, 101-101
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 5, Update README.md to remove ambiguous legacy-only mentions of ".graph" and ".graph.*" and clearly state both formats: the legacy single-file ".graph" and the newer three-file format (describe when each is produced and which is the recommended best practice). Search for the literal tokens ".graph", ".graph files", and ".graph.*" (e.g., the sentence at line 5, the reference at line 19, and the mention at line 101) and replace those lines with a concise clarification that both formats exist, specify the conditions or versions that produce the single-file versus the three-file format, and indicate that the three-file format is the recommended approach.
🧹 Nitpick comments (1)
README.md (1)
127-127: Qualify the performance claim.

You're stating "68% faster" as an absolute fact, but this comes from specific benchmark data (mentioned in the PR description). Real-world results will vary based on repo size, query patterns, etc. Add a qualifier like "in benchmarks" or "up to" to set accurate expectations.

📊 Suggested wording

```diff
-The three-file format is **68% faster** in benchmarks because grep hits are more targeted — searching for a function name hits only the `.calls` file with caller/callee data, not a combined blob.
+In benchmarks, the three-file format is **up to 68% faster** because grep hits are more targeted — searching for a function name hits only the `.calls` file with caller/callee data, not a combined blob.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 127, Update the claim "The three-file format is **68% faster** in benchmarks" to qualify it—for example change to "up to 68% faster in our benchmarks" or "68% faster in our benchmarks; results may vary by repo size and query patterns"—so the README sentence clearly indicates the source and that real-world performance depends on factors like repo size and query patterns.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 122-124: The README shows inconsistent filename notation between
concrete filenames (e.g., cache.calls.go, cache.deps.go, cache.impact.go) and a
wildcard pattern ("Files ending in .calls.*"); choose and apply one consistent
convention across the document: either replace the wildcard references with
concrete examples matching the generated files (cache.calls.go, cache.deps.go,
cache.impact.go) wherever mentioned (including the earlier example at lines
139-141) or change the concrete example to a wildcard form and add a brief
clarifying note explaining that the wildcard represents language extensions
(e.g., .calls.go, .calls.py) so readers understand whether the suffix includes
the source-language extension; update all occurrences to the chosen style.
- Around line 121-125: The markdown code fences for the file listing line
containing "src/cache.go → src/cache.calls.go # who calls what, with
file:line" and the manual instruction block starting with "This repository has
Supermodel graph shard files next to source files." need language specifiers;
update both triple-backtick fences to use ```text (or ```bash) so the linter
stops flagging them, i.e., add "text" after the opening ``` for the block that
contains the "src/cache.go → src/cache.calls.go" snippet and for the block that
contains the "This repository has Supermodel graph shard files next to source
files." paragraph.
- Line 106: The README currently documents a non-existent CLI command "skill";
either remove the two README lines (the table row at line ~106 and the usage
example at ~132) or add a real command implementation named "skill" under cmd
(e.g., create cmd/skill.go), export a Cobra/CLI subcommand registered with the
root command, and implement its Run/Execute function to print the agent
awareness prompt (same content referenced in CLAUDE.md/AGENTS.md) and exit;
reference symbols to update are the README entries for "skill" and the new
command file cmd/skill.go and its command registration
(root.AddCommand(skillCmd) or equivalent) so users no longer get "command not
found".
---
Outside diff comments:
In `@README.md`:
- Line 5: Update README.md to remove ambiguous legacy-only mentions of ".graph"
and ".graph.*" and clearly state both formats: the legacy single-file ".graph"
and the newer three-file format (describe when each is produced and which is the
recommended best practice). Search for the literal tokens ".graph", ".graph
files", and ".graph.*" (e.g., the sentence at line 5, the reference at line 19,
and the mention at line 101) and replace those lines with a concise
clarification that both formats exist, specify the conditions or versions that
produce the single-file versus the three-file format, and indicate that the
three-file format is the recommended approach.
---
Nitpick comments:
In `@README.md`:
- Line 127: Update the claim "The three-file format is **68% faster** in
benchmarks" to qualify it—for example change to "up to 68% faster in our
benchmarks" or "68% faster in our benchmarks; results may vary by repo size and
query patterns"—so the README sentence clearly indicates the source and that
real-world performance depends on factors like repo size and query patterns.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
```diff
 |---|---|
-| `analyze [path]` | Upload repo, run full analysis, write `.graph.*` files (use `--no-files` to skip) |
+| `analyze [path]` | Upload repo, run full analysis, write graph files (use `--three-file` for best results, `--no-shards` to skip) |
 | `skill` | Print agent awareness prompt — pipe to `CLAUDE.md` or `AGENTS.md` |
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if the skill command is implemented in the codebase
rg -n "Use.*skill" --type=go
rg -n "\"skill\"" cmd/ --type=go -A5 -B5

Repository: supermodeltools/cli
Length of output: 45
🏁 Script executed:
# Search for skill across all files
rg -i "skill" --max-count=20
# Read the README around the mentioned lines
head -140 README.md | tail -40

Repository: supermodeltools/cli
Length of output: 2139
🏁 Script executed:
# Find main.go or command setup
fd -type f -name "main.go"
# Look for command registration patterns
rg -i "command.*skill|skill.*command" --type=go
# Look for cobra/cli command definitions
rg "cmd\." --type=go -B2 -A2 | head -50
# List the cmd/ directory structure
ls -la cmd/ 2>/dev/null || echo "cmd/ directory not found"
git ls-files | grep -E "^cmd/|main\.go"

Repository: supermodeltools/cli
Length of output: 4084
🏁 Script executed:
# Check archdocs.go for skill-related code
cat cmd/archdocs.go
# Double-check: search for "skill" in all cmd files
rg "skill" cmd/

Repository: supermodeltools/cli
Length of output: 2369
Remove the skill command from README or implement it.
The README documents a skill command (lines 106, 132) with usage examples, but it doesn't exist in the codebase. There's no cmd/skill.go file like the other documented commands (analyze, clean, hook, etc. all have their implementation files).
Either remove these lines from the README, or implement the command. Right now users following the docs will hit a "command not found" error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` at line 106, The README currently documents a non-existent CLI
command "skill"; either remove the two README lines (the table row at line ~106
and the usage example at ~132) or add a real command implementation named
"skill" under cmd (e.g., create cmd/skill.go), export a Cobra/CLI subcommand
registered with the root command, and implement its Run/Execute function to
print the agent awareness prompt (same content referenced in
CLAUDE.md/AGENTS.md) and exit; reference symbols to update are the README
entries for "skill" and the new command file cmd/skill.go and its command
registration (root.AddCommand(skillCmd) or equivalent) so users no longer get
"command not found".
```text
src/cache.go → src/cache.calls.go    # who calls what, with file:line
           → src/cache.deps.go      # imports and imported-by
           → src/cache.impact.go    # risk level, domains, blast radius
```
Add language specifiers to code blocks.
The markdown linter is flagging these code blocks because they're missing language identifiers. Add `text` or `bash` to keep the linter happy.
🔧 Quick fix
For the file listing at line 121:

````diff
-```
+```text
 src/cache.go → src/cache.calls.go    # who calls what, with file:line
````

For the manual instruction at line 137:

````diff
-```
+```text
 This repository has Supermodel graph shard files next to source files.
````

Also applies to: 137-143
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 121-121: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 121 - 125, The markdown code fences for the file
listing line containing "src/cache.go → src/cache.calls.go # who
calls what, with file:line" and the manual instruction block starting with "This
repository has Supermodel graph shard files next to source files." need language
specifiers; update both triple-backtick fences to use ```text (or ```bash) so
the linter stops flagging them, i.e., add "text" after the opening ``` for the
block that contains the "src/cache.go → src/cache.calls.go" snippet and for the
block that contains the "This repository has Supermodel graph shard files next
to source files." paragraph.
```text
src/cache.go → src/cache.calls.go    # who calls what, with file:line
           → src/cache.deps.go      # imports and imported-by
           → src/cache.impact.go    # risk level, domains, blast radius
```
Fix the inconsistent file extension notation.
Your example at lines 122-124 shows specific filenames like cache.calls.go (with .go), but then lines 139-141 say "Files ending in .calls.*" using a wildcard pattern. This is confusing—pick one notation and stick with it.
If the actual generated files are cache.calls.go, cache.deps.go, etc., then use that concrete pattern throughout. If the * represents different language extensions (like .calls.go for Go, .calls.py for Python), make that explicit.
✏️ Suggested fix for consistency
Option 1: If files always include the source language extension, update the manual text to match the example:

```diff
-Files ending in .calls.* contain function call relationships.
-Files ending in .deps.* contain dependency relationships.
-Files ending in .impact.* contain blast radius data.
+Files ending in .calls.go (or .calls.py, etc.) contain function call relationships.
+Files ending in .deps.go (or .deps.py, etc.) contain dependency relationships.
+Files ending in .impact.go (or .impact.py, etc.) contain blast radius data.
```

Option 2: If you want to use wildcards for brevity, update the example to match:

```diff
-src/cache.go → src/cache.calls.go    # who calls what, with file:line
-           → src/cache.deps.go      # imports and imported-by
-           → src/cache.impact.go    # risk level, domains, blast radius
+src/cache.go → src/cache.calls.*     # who calls what, with file:line
+           → src/cache.deps.*       # imports and imported-by
+           → src/cache.impact.*     # risk level, domains, blast radius
```

Also applies to: 139-141
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 122 - 124, The README shows inconsistent filename
notation between concrete filenames (e.g., cache.calls.go, cache.deps.go,
cache.impact.go) and a wildcard pattern ("Files ending in .calls.*"); choose and
apply one consistent convention across the document: either replace the wildcard
references with concrete examples matching the generated files (cache.calls.go,
cache.deps.go, cache.impact.go) wherever mentioned (including the earlier
example at lines 139-141) or change the concrete example to a wildcard form and
add a brief clarifying note explaining that the wildcard represents language
extensions (e.g., .calls.go, .calls.py) so readers understand whether the suffix
includes the source-language extension; update all occurrences to the chosen
style.
When switching from single .graph to --three-file (or back), stale files from the old format are removed before writing the new format.

- RenderAll removes .calls/.deps/.impact before writing .graph
- RenderAllThreeFile removes .graph before writing .calls/.deps/.impact
- isShardFile now recognizes all four shard tags (graph, calls, deps, impact)
- clean command handles both formats

3 new tests: stale three-file cleanup, stale graph cleanup, isShardFile recognizes all formats.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/shards/zip.go (1)
357-369: ⚠️ Potential issue | 🟠 Major

Only treat supported source files as shards.

Line 144 runs this check for every file, so names like `docs/api.calls.md` now get silently dropped from the upload even though we never generate Markdown shards. Tighten `isShardFile` so it only matches when the final extension is one of `SourceExtensions`.

Possible fix

```diff
 func isShardFile(filename string) bool {
-	ext := filepath.Ext(filename)
-	if ext == "" {
+	ext := strings.ToLower(filepath.Ext(filename))
+	if ext == "" || !SourceExtensions[ext] {
 		return false
 	}
 	stem := strings.TrimSuffix(filename, ext)
 	tag := filepath.Ext(stem)
 	if tag == "" {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shards/zip.go` around lines 357 - 369, The isShardFile function currently treats any filename with a secondary extension as a shard (e.g., docs/api.calls.md), causing non-source files to be dropped; update isShardFile to first verify that the final extension (ext) is one of the allowed SourceExtensions before extracting the stem and tag, returning false otherwise. Modify isShardFile to check ext against the SourceExtensions set/list (or call an existing helper that validates source extensions), then proceed to derive stem/tag and lookup shardTags[tag] as before, using the existing function name isShardFile and the SourceExtensions and shardTags symbols to locate the code to change.
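A self-contained sketch of the tightened check. The `sourceExtensions` and `shardTags` sets here are illustrative stand-ins for whatever the real package defines:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Illustrative stand-ins; the real package presumably defines its own sets.
var sourceExtensions = map[string]bool{".go": true, ".py": true, ".ts": true}
var shardTags = map[string]bool{".graph": true, ".calls": true, ".deps": true, ".impact": true}

// isShardFile matches name.<tag>.<srcExt> only when srcExt is a supported
// source extension, so docs/api.calls.md is no longer treated as a shard.
func isShardFile(filename string) bool {
	ext := strings.ToLower(filepath.Ext(filename))
	if ext == "" || !sourceExtensions[ext] {
		return false
	}
	stem := strings.TrimSuffix(filename, ext)
	return shardTags[filepath.Ext(stem)]
}

func main() {
	fmt.Println(isShardFile("internal/cache/cache.calls.go")) // true
	fmt.Println(isShardFile("docs/api.calls.md"))             // false: .md is not a source extension
	fmt.Println(isShardFile("internal/cache/cache.go"))       // false: no shard tag
}
```

Checking the final extension first is what keeps Markdown (and any other unsupported file) out of the shard filter, which is exactly the upload regression the review flags.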
♻️ Duplicate comments (1)
internal/shards/render.go (1)
246-257: ⚠️ Potential issue | 🔴 Critical

Route shard deletes through the same safety path as writes.

Lines 247-256 and lines 329-331 still call `os.Remove` directly. That means `--dry-run` can delete real files, and these paths skip the repo-boundary guard that `WriteShard` already has.

Safer pattern

```diff
+func RemoveShard(repoDir, shardPath string, dryRun bool) error {
+	full, err := filepath.Abs(filepath.Join(repoDir, shardPath))
+	if err != nil {
+		return err
+	}
+	repoAbs, err := filepath.Abs(repoDir)
+	if err != nil {
+		return err
+	}
+	if !strings.HasPrefix(full, repoAbs+string(filepath.Separator)) && full != repoAbs {
+		return fmt.Errorf("path traversal blocked: %s", shardPath)
+	}
+	if dryRun {
+		fmt.Printf("  [dry-run] would remove %s\n", full)
+		return nil
+	}
+	if err := os.Remove(full); err != nil && !os.IsNotExist(err) {
+		return err
+	}
+	return nil
+}
+
-func removeStaleThreeFile(repoDir, srcFile string) {
+func removeStaleThreeFile(repoDir, srcFile string, dryRun bool) error {
 	c, d, i := ThreeFileShardNames(srcFile)
 	for _, p := range []string{c, d, i} {
-		_ = os.Remove(filepath.Join(repoDir, p))
+		if err := RemoveShard(repoDir, p, dryRun); err != nil {
+			return err
+		}
 	}
+	return nil
 }
-func removeStaleGraph(repoDir, srcFile string) {
-	_ = os.Remove(filepath.Join(repoDir, ShardFilename(srcFile)))
+func removeStaleGraph(repoDir, srcFile string, dryRun bool) error {
+	return RemoveShard(repoDir, ShardFilename(srcFile), dryRun)
 }
```

```diff
-	removeStaleThreeFile(repoDir, srcFile)
+	if err := removeStaleThreeFile(repoDir, srcFile, dryRun); err != nil {
+		return written, err
+	}
 	...
-	removeStaleGraph(repoDir, srcFile)
+	if err := removeStaleGraph(repoDir, srcFile, dryRun); err != nil {
+		return written, err
+	}
 	...
-	full := filepath.Join(repoDir, item.path)
-	_ = os.Remove(full)
+	if err := RemoveShard(repoDir, item.path, dryRun); err != nil {
+		return written, err
+	}
 	continue
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shards/render.go` around lines 246 - 257, removeStaleThreeFile and removeStaleGraph call os.Remove directly, bypassing the repo-boundary and dry-run safety checks used by WriteShard; change these to route deletes through the same safe path (e.g., call the existing WriteShard-style guard or add/use a DeleteShard helper that enforces dry-run and repo-boundary checks) so file removals for c,d,i and ShardFilename(srcFile) respect the same validation and dry-run behavior as writes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@internal/shards/zip.go`:
- Around line 357-369: The isShardFile function currently treats any filename
with a secondary extension as a shard (e.g., docs/api.calls.md), causing
non-source files to be dropped; update isShardFile to first verify that the
final extension (ext) is one of the allowed SourceExtensions before extracting
the stem and tag, returning false otherwise. Modify isShardFile to check ext
against the SourceExtensions set/list (or call an existing helper that validates
source extensions), then proceed to derive stem/tag and lookup shardTags[tag] as
before, using the existing function name isShardFile and the SourceExtensions
and shardTags symbols to locate the code to change.
---
Duplicate comments:
In `@internal/shards/render.go`:
- Around line 246-257: removeStaleThreeFile and removeStaleGraph call os.Remove
directly, bypassing the repo-boundary and dry-run safety checks used by
WriteShard; change these to route deletes through the same safe path (e.g., call
the existing WriteShard-style guard or add/use a DeleteShard helper that
enforces dry-run and repo-boundary checks) so file removals for c,d,i and
ShardFilename(srcFile) respect the same validation and dry-run behavior as
writes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c7bfedcb-e5ab-40a0-8b7a-86863c3cecd8
📒 Files selected for processing (4)
internal/shards/graph.go
internal/shards/render.go
internal/shards/render_stale_test.go
internal/shards/zip.go
…guard tests

- Happy-path content: verify .calls contains [calls] + function names, .deps contains [deps] + import targets, .impact contains [impact] + risk
- Single .graph content: verify combined file has all sections
- Empty section: stale .impact file removed when a source file has no impact data (no importers, no callers)
- Path traversal: safeRemove blocks ../traversal, allows inside-repo deletes
- isShardFile: recognizes all four tags across .go/.ts/.py extensions
- Hardened empty-content removal in RenderAllThreeFile to use safeRemove

12 tests total in render_stale_test.go.
♻️ Duplicate comments (1)
internal/shards/render.go (1)
246-260: ⚠️ Potential issue | 🔴 Critical

Dry-run still performs real deletes, and delete errors are swallowed.

Cleanup paths call `safeRemove` unconditionally, so dry-run can still delete shard files. Also, failed removals are ignored, which can leave stale shards behind silently.

Suggested fix (thread dry-run + errors through cleanup)

```diff
-// safeRemove removes a file only if it resolves inside repoDir (traversal guard).
-func safeRemove(repoDir, relPath string) {
+// safeRemove removes a file only if it resolves inside repoDir (traversal guard).
+func safeRemove(repoDir, relPath string, dryRun bool) error {
 	full, err := filepath.Abs(filepath.Join(repoDir, relPath))
 	if err != nil {
-		return
+		return err
 	}
 	repoAbs, err := filepath.Abs(repoDir)
 	if err != nil {
-		return
+		return err
 	}
 	if !strings.HasPrefix(full, repoAbs+string(filepath.Separator)) && full != repoAbs {
-		return
+		return fmt.Errorf("path traversal blocked: %s", relPath)
 	}
-	_ = os.Remove(full)
+	if dryRun {
+		fmt.Printf("  [dry-run] would remove %s\n", full)
+		return nil
+	}
+	if err := os.Remove(full); err != nil && !os.IsNotExist(err) {
+		return err
+	}
+	return nil
 }

-// removeStaleThreeFile removes .calls/.deps/.impact files for a source file.
-func removeStaleThreeFile(repoDir, srcFile string) {
+func removeStaleThreeFile(repoDir, srcFile string, dryRun bool) error {
 	c, d, i := ThreeFileShardNames(srcFile)
 	for _, p := range []string{c, d, i} {
-		safeRemove(repoDir, p)
+		if err := safeRemove(repoDir, p, dryRun); err != nil {
+			return err
+		}
 	}
+	return nil
 }

-// removeStaleGraph removes the single .graph file for a source file.
-func removeStaleGraph(repoDir, srcFile string) {
-	safeRemove(repoDir, ShardFilename(srcFile))
+func removeStaleGraph(repoDir, srcFile string, dryRun bool) error {
+	return safeRemove(repoDir, ShardFilename(srcFile), dryRun)
 }

-	removeStaleThreeFile(repoDir, srcFile)
+	if err := removeStaleThreeFile(repoDir, srcFile, dryRun); err != nil {
+		return written, err
+	}

-	removeStaleGraph(repoDir, srcFile)
+	if err := removeStaleGraph(repoDir, srcFile, dryRun); err != nil {
+		return written, err
+	}
 ...
-	safeRemove(repoDir, item.path)
+	if err := safeRemove(repoDir, item.path, dryRun); err != nil {
+		return written, err
+	}
```

Also applies to: 262-268, 270-273, 282-283, 321-322, 345-347
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shards/render.go` around lines 246 - 260, safeRemove currently performs real deletions and swallows errors; change it to accept a dryRun bool (or return an error) and ensure callers check the dry-run flag before attempting removal so no deletes happen during dry-run, and propagate os.Remove errors instead of ignoring them; update callers of safeRemove (the cleanup paths noted around the other ranges) to pass the dryRun flag (or handle the returned error) and log/return removal failures so stale shards are not silently left or missed.
🧹 Nitpick comments (1)
internal/shards/render_stale_test.go (1)
66-117: Please add dry-run regression tests for stale cleanup.

Current tests only exercise `dryRun=false`. Add cases that pre-create stale files, run with `dryRun=true`, and assert files are unchanged.

Test cases worth adding

```diff
+func TestRenderAll_DryRun_DoesNotRemoveStaleThreeFiles(t *testing.T) { /* ... */ }
+func TestRenderAllThreeFile_DryRun_DoesNotRemoveGraphOrEmptySectionFiles(t *testing.T) { /* ... */ }
```

Also applies to: 230-253
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shards/render_stale_test.go` around lines 66 - 117, Add regression tests that cover dry-run behavior by creating stale files first, invoking RenderAll and RenderAllThreeFile with the dryRun flag true, and asserting no files are removed; specifically, add tests (e.g., TestRenderAll_DryRun_DoesNotRemoveStaleFiles and TestRenderAllThreeFile_DryRun_DoesNotRemoveGraphFile) that pre-create the stale three-file set (index.calls.ts, index.deps.ts, index.impact.ts) and a graph file as appropriate, call RenderAll(dir, cache, []string{"src/index.ts"}, true) and RenderAllThreeFile(dir, cache, []string{"src/index.ts"}, true), then assert the original files still exist (use os.Stat and os.IsNotExist checks) to ensure dryRun=true leaves files unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@internal/shards/render.go`:
- Around line 246-260: safeRemove currently performs real deletions and swallows
errors; change it to accept a dryRun bool (or return an error) and ensure
callers check the dry-run flag before attempting removal so no deletes happen
during dry-run, and propagate os.Remove errors instead of ignoring them; update
callers of safeRemove (the cleanup paths noted around the other ranges) to pass
the dryRun flag (or handle the returned error) and log/return removal failures
so stale shards are not silently left or missed.
---
Nitpick comments:
In `@internal/shards/render_stale_test.go`:
- Around line 66-117: Add regression tests that cover dry-run behavior by
creating stale files first, invoking RenderAll and RenderAllThreeFile with the
dryRun flag true, and asserting no files are removed; specifically, add tests
(e.g., TestRenderAll_DryRun_DoesNotRemoveStaleFiles and
TestRenderAllThreeFile_DryRun_DoesNotRemoveGraphFile) that pre-create the stale
three-file set (index.calls.ts, index.deps.ts, index.impact.ts) and a graph file
as appropriate, call RenderAll(dir, cache, []string{"src/index.ts"}, true) and
RenderAllThreeFile(dir, cache, []string{"src/index.ts"}, true), then assert the
original files still exist (use os.Stat and os.IsNotExist checks) to ensure
dryRun=true leaves files unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f27907e0-f9aa-450f-9f4b-25f2decde4c5
📒 Files selected for processing (2)
- internal/shards/render.go
- internal/shards/render_stale_test.go
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/analyze.go (1)
31-40: ⚠️ Potential issue | 🟠 Major

Check flag conflicts before API-key/config validation

Right now the `--three-file` + `--no-shards` conflict is validated after `cfg.RequireAPIKey()`. If the API key is missing, users get the wrong error first and never see the flag conflict. Move the conflict check to the top of `RunE` so invalid CLI args fail fast.

Suggested fix

```diff
 RunE: func(cmd *cobra.Command, args []string) error {
+	if noShards && threeFile {
+		return fmt.Errorf("--three-file cannot be used with --no-shards")
+	}
+
 	cfg, err := config.Load()
 	if err != nil {
 		return err
 	}
 	if err := cfg.RequireAPIKey(); err != nil {
 		return err
 	}
-	if noShards && threeFile {
-		return fmt.Errorf("--three-file cannot be used with --no-shards")
-	}
```
Verify each finding against the current code and only fix it if needed. In `@cmd/analyze.go` around lines 31 - 40, Move the CLI flag conflict check for --three-file and --no-shards to the top of the command RunE so invalid flags fail fast; specifically, relocate the conditional that checks "if noShards && threeFile { return fmt.Errorf(\"--three-file cannot be used with --no-shards\") }" so it runs before calling config.Load() and cfg.RequireAPIKey() (the surrounding RunE function where cfg, err := config.Load() and cfg.RequireAPIKey() are invoked). Ensure the check still uses the existing noShards and threeFile variables and returns the same error message when the conflict is detected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@cmd/analyze.go`:
- Around line 31-40: Move the CLI flag conflict check for --three-file and
--no-shards to the top of the command RunE so invalid flags fail fast;
specifically, relocate the conditional that checks "if noShards && threeFile {
return fmt.Errorf(\"--three-file cannot be used with --no-shards\") }" so it
runs before calling config.Load() and cfg.RequireAPIKey() (the surrounding RunE
function where cfg, err := config.Load() and cfg.RequireAPIKey() are invoked).
Ensure the check still uses the existing noShards and threeFile variables and
returns the same error message when the conflict is detected.
What
`supermodel analyze --three-file` generates three shard files per source file instead of one combined `.graph` file.

Why
Benchmark data shows the three-file format dramatically outperforms the single .graph format:
Each file contains one relationship type. Grep hits are more targeted — searching for a function name hits the `.calls` file with just caller/callee data, not a blob of deps+calls+impact mixed together.

Changes

- internal/shards/render.go: Add `ThreeFileShardNames`, `RenderAllThreeFile`
- internal/shards/handler.go: Add `ThreeFile` bool to `GenerateOptions` and `RenderOptions`, dispatch via `renderShards` helper
- cmd/analyze.go: Add `--three-file` flag

3 files, 69 lines added.
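The naming scheme implied by the tests (index.calls.ts, index.deps.ts, index.impact.ts) suggests a helper along these lines. This is a guess at `ThreeFileShardNames`'s behavior, lower-cased here to mark it as a sketch rather than the actual implementation:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// threeFileShardNames derives the .calls/.deps/.impact sibling names from a
// source path, inserting the tag before the original extension.
func threeFileShardNames(srcFile string) (calls, deps, impact string) {
	ext := filepath.Ext(srcFile)             // ".ts"
	stem := strings.TrimSuffix(srcFile, ext) // "src/index"
	return stem + ".calls" + ext, stem + ".deps" + ext, stem + ".impact" + ext
}

func main() {
	c, d, i := threeFileShardNames("src/index.ts")
	fmt.Println(c, d, i) // three shard names sharing the source's extension
}
```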
Closes #117
Summary by CodeRabbit
New Features
Behavior
Documentation
Tests