[DO NOT MERGE] feat: linearization tour over the code graph (draft) by jonathanpopham · Pull Request #127 · supermodeltools/cli

jonathanpopham · 2026-04-13T22:16:03Z

Objective

LLMs are one-dimensional — they consume a token stream left-to-right. Graphs are not. Today's .graph.* shards linearize at the file level (each sidecar is a flat [deps] / [calls] / [impact] view), but agents get N independent shards with no reading order and have to guess where to start.

This PR adds a tour: a single .supermodel/TOUR.md that serializes the graph into a strategy-chosen linear walk, grouped by domain/subdomain, linking each file to its shard. It also adds a prose narrative preamble on each shard so call/import adjacency reads as sentences rather than arrows. Both are the missing linearization layer over the existing sharding feature.

Feature-complete surface

Standalone tour:

supermodel tour [--strategy topo|bfs-seed|dfs-seed|centrality]
                [--seed <file>]
                [--narrate]
                [--budget-tokens <N>]
                [--dry-run] [path]

Integrated with analyze (one command = shards + spine):

supermodel analyze [--tour]
                   [--tour-strategy ...] [--tour-seed <file>]
                   [--tour-budget <N>]
                   [--narrate]
                   [path]

Strategies (all implemented)

Strategy	Ordering	Best for
`topo`	reverse-topological over imports (leaves first)	whole-codebase onboarding
`bfs-seed`	BFS from `--seed` outward (undirected import graph)	focused tasks, blast-radius walks
`dfs-seed`	DFS from `--seed` outward (emit-on-pop)	tracing a request through layers
`centrality`	transitive-dependent count desc (largest blast radius)	"what's the core of this codebase"

`--narrate` (implemented)

Each shard gets a prose preamble:

// Narrative:
// This file (src/auth.ts) belongs to domain Auth.
// It imports 2 file(s): src/db.ts, src/util.ts.
// It is imported by 1 file(s): src/api.ts.
//   login is called by handleRequest and calls query.
// Risk: MEDIUM (2 transitive dependent(s)).

Same data as [deps]/[calls]/[impact] — second rendering targeted at LLM reading style.

`--budget-tokens N` (implemented)

Long tours chunk into TOUR.01.md, TOUR.02.md, ... with prev/next cross-links. TOUR.md becomes an index. Split points are file-entry boundaries — no partial entries. Heuristic: 4 chars ≈ 1 token.

Files changed

File	Purpose
`docs/linearization.md`	Design doc (updated to reflect implemented surface)
`internal/shards/tour.go`	`TourStrategy` interface + all 4 strategies + `ChunkTour` + `WriteTour` + `ResolveStrategy`
`internal/shards/narrative.go`	`RenderNarrative` — prose preamble generator
`internal/shards/render.go`	`RenderAll` / `RenderAllThreeFile` take `narrate bool`
`internal/shards/handler.go`	`GenerateOptions` extended; `maybeWriteTour` runs after shards
`internal/shards/tour_test.go`	13 tests covering strategies, chunking, narrative, narrate-in-render
`internal/shards/daemon.go` + `render_*_test.go`	updated call sites for new signature
`cmd/tour.go`	subcommand with all flags
`cmd/analyze.go`	`--tour*` and `--narrate` flags; wires through `GenerateOptions`

Test plan

go test ./internal/shards/ — all 18 tour/narrative/strategy tests pass (5 from initial PR + 13 new); full shards suite green
go test ./... — full suite green, no regressions
go vet ./... — clean
Smoke test all strategies on a fabricated graph: topo / centrality / bfs-seed / dfs-seed produce correct orderings
Smoke test --budget-tokens 60: 5-file graph chunked into 5 chapters + index with prev/next links
Smoke test --narrate: 5 shards rewritten with prose preamble; narrative + structured sections coexist
Empirical benchmark — wire supermodeltools/supermodel-benchmarks/shard-ab-test/ and mcpbr-eval to measure agent task completion with/without TOUR+narrate. Not in this PR — landing the surface so we can benchmark it.

Design principles

Same vertical slice. Everything lives inside internal/shards/. No cross-slice dependency.
Additive. Default analyze behavior unchanged.
Deterministic. Lex tiebreaks, stable sorts, cycle-safe, tests pin invariants.
Strategy-pluggable. Adding a new ordering is one TourStrategy implementation + one ResolveStrategy case.

Open questions for reviewers

Inline vs. linked shard content in TOUR.md? Currently linked (DRY, requires pointer-following). Could add --inline.
--focus <glob> filter to scope tours to a subtree?
Should tour --budget-tokens 0 clean up stale TOUR.NN.md files from a prior chunked run? (Currently cosmetic leftover.)
Does arch-docs want TOUR.md as its entry point instead of its own traversal?

References

arXiv:2410.19494 — Graph Linearization Methods for Reasoning on Graphs with LLMs
supermodeltools/codegraph-graphrag — BFS narrative walks, thesis doc
supermodeltools/graph2md — per-node markdown emission
supermodeltools/mcp/src/tools/explore-function.ts — describeNode, cross-subsystem markers

Summary by CodeRabbit

New Features
- Added a new CLI command to generate a repository "tour" (TOUR.md) that linearizes files in a chosen order (strategies: topo, bfs-seed, dfs-seed, centrality), supports seed selection, token-budget chaptering, dry-run previews, and an optional narrate mode that prepends prose summaries to shard sidecars.
CLI
- Integrated tour and narrate flags into analysis workflow with non-fatal warnings on tour failures.
Documentation
- Added guide describing linearization, strategies, CLI usage, and output shape.
Tests
- Added comprehensive tests for ordering, rendering, chunking, narration, and strategy resolution.

LLMs are 1D and graphs aren't. Per-file shards already linearize at the file level, but agents get N independent sidecars with no reading order and must guess where to start. This adds a tour: a single TOUR.md that serializes the whole graph into a strategy-chosen walk (reverse-topo by default), grouped by domain/subdomain, linking each file to its shard. - docs/linearization.md — design doc, thesis, strategy taxonomy, prior art - internal/shards/tour.go — TourStrategy interface + TopoStrategy - internal/shards/tour_test.go — determinism, leaves-first, cycle handling - cmd/tour.go — new 'supermodel tour' subcommand Draft PR for design feedback before scoping follow-ons (bfs-seed, centrality, --narrate prose preamble on shards, --budget-tokens chunking).

coderabbitai · 2026-04-13T22:16:19Z

Walkthrough

This PR adds a new CLI command supermodel tour plus tour-generation internals: it reads .supermodel/shards.json, builds a shard cache, chooses a linearization strategy, renders a repository TOUR (optionally chunked by token budget), and can optionally prepend prose narratives to existing per-file shard sidecars.

Changes

Cohort / File(s)	Summary
CLI: tour command `cmd/tour.go`	New Cobra subcommand `tour [path]` that resolves repo path, reads/unmarshals `.supermodel/shards.json`, builds cache, resolves strategy, calls `shards.WriteTour`, supports flags `--strategy`, `--seed`, `--narrate`, `--budget-tokens`, `--dry-run`, and re-renders shards when `--narrate` is set.
Analyze CLI wiring `cmd/analyze.go`	Adds `--narrate`, `--tour`, `--tour-strategy`, `--tour-seed`, `--tour-budget` flags and threads them into `GenerateOptions`; validates incompatible flag combos (`--no-shards` with --tour/--narrate).
Core tour engine `internal/shards/tour.go`	New tour generator: `TourStrategy` interface and implementations (`TopoStrategy`, `BFSSeedStrategy`, `DFSSeedStrategy`, `CentralityStrategy`), `RenderTour`, `ChunkTour`, and `WriteTour` (atomic writes, chunking into `TOUR.01.md`…, index `TOUR.md`).
Narrative generator `internal/shards/narrative.go`	New `RenderNarrative` (exported) plus helpers to synthesize per-file prose describing imports/importers, functions, adjacency, risk, and transitive dependents; returns comment-prefixed narrative block.
Rendering plumbing `internal/shards/render.go`, `internal/shards/render_stale_test.go`, `internal/shards/render_test.go`	Extended `RenderAll`/`RenderAllThreeFile` signatures to accept `narrate bool`; when true, prepends `RenderNarrative` output to shard contents. Updated tests to pass new arg.
Generate & handler changes `internal/shards/handler.go`, `internal/shards/daemon.go`	Extended `GenerateOptions` and `RenderOptions` with tour/narrate fields; updated `renderShards` signature to accept `narrate`; added `ResolveStrategy` and `maybeWriteTour` to optionally write tours and handle strategy resolution (non-fatal warnings on failure). Updated `Daemon` call sites to new `RenderAll` signature.
Tests for tour & narrative `internal/shards/tour_test.go`, `internal/shards/render_stale_test.go`, `internal/shards/render_test.go`	New tests validating ordering strategies, deterministic/cycle-safe topo ordering, BFS/DFS seed behaviors, centrality ordering, tour rendering and chunking, and narrative output & re-rendering. Updated render tests for new signatures.
Docs `docs/linearization.md`	New documentation describing TOUR.md shape, linearization strategies, chunking behavior, `--narrate` opt-in, CLI surfaces, and examples.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as "supermodel tour"
    participant FS as "File System"
    participant ShardIR as "shards.json -> IR"
    participant Cache as "Shard Cache"
    participant Strategy
    participant Renderer as "RenderTour/ChunkTour"
    participant Writer as "WriteTour"

    User->>CLI: run `supermodel tour [path] --strategy=... --narrate`
    CLI->>FS: read `<repo>/.supermodel/shards.json`
    FS-->>CLI: JSON bytes
    CLI->>ShardIR: unmarshal -> IR
    CLI->>Cache: NewCache().Build(&ir)
    Cache-->>CLI: populated cache
    CLI->>Strategy: ResolveStrategy(name, seed) -> Order(cache)
    Strategy->>Cache: traverse graph / compute metrics
    Strategy-->>CLI: ordered file list
    CLI->>Renderer: RenderTour(cache, order, prefix)
    Renderer-->>Cache: file metadata, domains, imports
    Renderer-->>CLI: markdown body (possibly chunked)
    CLI->>Writer: WriteTour(repoDir, cache, strategy, budget, dryRun)
    Writer->>FS: atomic write (tmp -> rename) of TOUR.md / chapters
    FS-->>Writer: write result
    alt --narrate set
        CLI->>Renderer: RenderNarrative per-file
        Renderer->>Writer: overwrite `.graph.*` shards with narrative preamble
    end
    Writer-->>CLI: success / path info
    CLI-->>User: success message (unless --dry-run)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: add --three-file flag to generate .calls/.deps/.impact shards #120 — touches GenerateOptions/RenderOptions and RenderAll/render path signatures; likely overlapping changes to render plumbing.

Suggested reviewers

greynewell

Poem

🗺️ Files line up, a tour unfurled,
Paths and imports step through the world,
Narratives whisper, shards sing along,
Chapters stitched tidy, ordered, and strong,
A little map so readers find home.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 61.54% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: linearization tour over the code graph (draft)' accurately describes the main feature: adding a linearization tour system for the code graph.
Description check	✅ Passed	The PR description is comprehensive, covering objectives, feature surface, test results, and design principles, though it marks some items as not yet completed (e.g., empirical benchmarking).

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

internal/shards/tour_test.go (1)

90-92: Cycle test should assert actual members, not only count.

On Line 90, checking only len(order) == 2 can miss wrong contents. Quick hardening: assert both a.go and b.go are present.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/shards/tour_test.go` around lines 90 - 92, The test currently only
checks the length of the cycle participant slice `order`; modify the assertion
to verify the actual members instead of just the count by asserting that both
expected filenames (e.g., "a.go" and "b.go") appear in `order` (use explicit
equality or a membership check on `order`), so replace the loose `len(order) !=
2` check with assertions that `order` contains "a.go" and "b.go" (and fail the
test with a clear message if either is missing).

docs/linearization.md (1)

43-47: Add languages to fenced code blocks to satisfy markdown lint.

Line 43 and Line 115 use bare fences; adding explicit languages (for example text or bash) will clear MD040 warnings.

Also applies to: 115-121

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/linearization.md` around lines 43 - 47, Replace the bare fenced code
blocks with language-tagged fences to satisfy markdown lint: locate the
triple-backtick block that lists TOUR.md, src/auth/session.go, and
src/auth/session.graph.go and change the opening fence to include a language
(e.g., ```text or ```bash), and do the same for the later multi-line fenced
block (the one currently using plain ``` around the example lines); ensure both
opening fences include an explicit language token.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/linearization.md`:
- Around line 115-129: The docs list several flags/strategies for the
"supermodel tour" CLI (--strategy bfs-seed|dfs-seed|centrality, --seed,
--narrate, --budget-tokens) that are not implemented (current CLI only supports
--strategy topo and --dry-run); either remove or mark those options as TODO in
docs or implement the missing runtime support in the command handler for
"supermodel tour" to accept and act on --strategy(bfs-seed|dfs-seed|centrality),
--seed, --narrate, and --budget-tokens; update the CLI parsing and executor code
that reads flags for "supermodel tour" to validate and handle these options (or
sync the docs to only mention --strategy topo and --dry-run) so the
documentation and implementation match.

In `@internal/shards/tour.go`:
- Around line 178-199: riskFor currently only aggregates domains from
cache.TransitiveDependents(file) and the file itself via cache.FileDomain, which
can diverge from renderImpactSection because that function also includes domains
of direct caller files (function-call edges); update riskFor to also iterate
direct callers of the file (i.e., the files that call this file via
function-call edges) and add their cache.FileDomain entries into the domains set
before computing thresholds so the domain count matches renderImpactSection
(reference symbols: riskFor, cache.TransitiveDependents, cache.FileDomain and
the codepath that obtains direct callers such as the function that returns
direct callers/callers for a file in the cache).
- Around line 21-23: Add one-line doc comments for the exported methods
TopoStrategy.Name and TopoStrategy.Order describing that Name returns the
strategy identifier and Order returns the ordered files for the given cache;
also change the signature of splitDomain to name its two return values (e.g.,
func splitDomain(...) (dom string, sub string)) so callers/readers immediately
know what each returned string represents and to satisfy the linter.

---

Nitpick comments:
In `@docs/linearization.md`:
- Around line 43-47: Replace the bare fenced code blocks with language-tagged
fences to satisfy markdown lint: locate the triple-backtick block that lists
TOUR.md, src/auth/session.go, and src/auth/session.graph.go and change the
opening fence to include a language (e.g., ```text or ```bash), and do the same
for the later multi-line fenced block (the one currently using plain ``` around
the example lines); ensure both opening fences include an explicit language
token.

In `@internal/shards/tour_test.go`:
- Around line 90-92: The test currently only checks the length of the cycle
participant slice `order`; modify the assertion to verify the actual members
instead of just the count by asserting that both expected filenames (e.g.,
"a.go" and "b.go") appear in `order` (use explicit equality or a membership
check on `order`), so replace the loose `len(order) != 2` check with assertions
that `order` contains "a.go" and "b.go" (and fail the test with a clear message
if either is missing).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a4fb4858-cb54-436e-81ce-aa1db99803cb

📥 Commits

Reviewing files that changed from the base of the PR and between 1dbd1d9 and 2ef0846.

📒 Files selected for processing (4)

cmd/tour.go
docs/linearization.md
internal/shards/tour.go
internal/shards/tour_test.go

coderabbitai · 2026-04-13T22:23:30Z

+func (TopoStrategy) Name() string { return "topo" }
+
+func (TopoStrategy) Order(cache *Cache) []string {


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify current lint findings are resolved for this file. golangci-lint run ./... --disable-all -E revive -E gocritic | rg -n 'internal/shards/tour.go:(21|23|161)'

Repository: supermodeltools/cli

Length of output: 168

🏁 Script executed:

# First, check if the file exists and read the relevant lines wc -l internal/shards/tour.go

Repository: supermodeltools/cli

Length of output: 90

🏁 Script executed:

# Read lines around 21, 23, and 161 to understand current state sed -n '19,25p' internal/shards/tour.go

Repository: supermodeltools/cli

Length of output: 271

🏁 Script executed:

# Read lines around 161 to see the splitDomain function sed -n '159,169p' internal/shards/tour.go

Repository: supermodeltools/cli

Length of output: 228

🏁 Script executed:

# Check the correct golangci-lint flags (try without --disable-all) golangci-lint run ./internal/shards/tour.go -E revive -E gocritic --out-format=json 2>&1 | head -100

Repository: supermodeltools/cli

Length of output: 166

🏁 Script executed:

# Try running golangci-lint with simpler syntax golangci-lint run ./internal/shards/tour.go -E revive -E gocritic 2>&1

Repository: supermodeltools/cli

Length of output: 1627

🏁 Script executed:

# Also try with specific go files flag golangci-lint run ./internal/shards/ -E revive -E gocritic 2>&1 | grep -A 2 "tour.go"

Repository: supermodeltools/cli

Length of output: 45

🏁 Script executed:

# Check if there's a golangci-lint config file find . -maxdepth 2 -name ".golangci*" -o -name "golangci*"

Repository: supermodeltools/cli

Length of output: 79

🏁 Script executed:

# Try running golangci-lint on just the repo without typecheck golangci-lint run ./internal/shards/tour.go --no-config -E revive -E gocritic -D typecheck 2>&1 | head -50

Repository: supermodeltools/cli

Length of output: 270

🏁 Script executed:

# Let's just check the Go syntax by looking at exported funcs/methods # Exported methods/functions that lack doc comments in tour.go grep -n "^func" internal/shards/tour.go | head -20

Repository: supermodeltools/cli

Length of output: 752

🏁 Script executed:

# Check the golangci config to understand what linters are enabled cat ./.golangci.yaml

Repository: supermodeltools/cli

Length of output: 2570

🏁 Script executed:

# Now let's try to run revive specifically on just the exported methods/functions # Focus on lines 21, 23, and check if doc comments are missing sed -n '20,24p' internal/shards/tour.go echo "---" sed -n '160,162p' internal/shards/tour.go

Repository: supermodeltools/cli

Length of output: 262

🏁 Script executed:

# Let's check which exported functions/methods lack doc comments # Lines 21, 23, 92, 132, 147, 154, 161, 171, 181, 202, 212 # Some are unexported (lowercase), so let's filter for actually exported (uppercase start) grep -n "^func ([A-Z]\|^func [A-Z]" internal/shards/tour.go

Repository: supermodeltools/cli

Length of output: 361

🏁 Script executed:

# Check if these functions have doc comments by looking at the line before each sed -n '20,21p' internal/shards/tour.go echo "---" sed -n '22,23p' internal/shards/tour.go echo "---" sed -n '160,161p' internal/shards/tour.go

Repository: supermodeltools/cli

Length of output: 222

🏁 Script executed:

# Try running revive specifically on the file to see exported comment warnings revive -config ./.golangci.yaml ./internal/shards/tour.go 2>&1 | grep -E "(exported|Name|Order|splitDomain)"

Repository: supermodeltools/cli

Length of output: 45

🏁 Script executed:

# Try running just revive with simpler config revive -severity=warning ./internal/shards/tour.go 2>&1

Repository: supermodeltools/cli

Length of output: 109

🏁 Script executed:

# Check for gocritic unnamedResult specifically # gocritic is harder to run standalone, so let's search for functions with unnamed returns rg 'func.*\(.*\)\s*\(' internal/shards/tour.go -A 1 | head -30

Repository: supermodeltools/cli

Length of output: 273

🏁 Script executed:

# Let's see what the issue actually is by checking the exact code again cat -n internal/shards/tour.go | sed -n '19,25p' cat -n internal/shards/tour.go | sed -n '159,170p'

Repository: supermodeltools/cli

Length of output: 571

Add doc comments and name the return values to fix lint issues.

Go's linter expects exported methods (ones that start with a capital letter) to have doc comments explaining what they do. Right now, Name() and Order() on TopoStrategy are missing these. Think of it like: if you're sharing code with others, tell them what each public method does.

Also, splitDomain() has two return values but they're unnamed — just (string, string). The linter prefers them to be named (like (dom string, sub string)) so anyone reading the code immediately knows what each returned value represents.

The suggested fixes above will resolve all three issues:

Add a one-line doc comment before Name() explaining it returns the strategy identifier

Add a one-line doc comment before Order() explaining it returns the ordered files

Name the return values in splitDomain() as dom and sub

🧰 Tools

🪛 GitHub Check: golangci-lint

[failure] 23-23:
exported: exported method TopoStrategy.Order should have comment or be unexported (revive)

[failure] 21-21:
exported: exported method TopoStrategy.Name should have comment or be unexported (revive)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/tour.go` around lines 21 - 23, Add one-line doc comments for the exported methods TopoStrategy.Name and TopoStrategy.Order describing that Name returns the strategy identifier and Order returns the ordered files for the given cache; also change the signature of splitDomain to name its two return values (e.g., func splitDomain(...) (dom string, sub string)) so callers/readers immediately know what each returned string represents and to satisfy the linter.

coderabbitai · 2026-04-13T22:23:30Z

+// riskFor is a narrow re-derivation of the impact section's risk tier for the
+// tour line. It matches renderImpactSection's thresholds so tour and shard stay
+// in agreement.
+func riskFor(file string, cache *Cache) string {
+	transitive := cache.TransitiveDependents(file)
+	domains := map[string]bool{}
+	if d := cache.FileDomain[file]; d != "" {
+		domains[d] = true
+	}
+	for f := range transitive {
+		if d := cache.FileDomain[f]; d != "" {
+			domains[d] = true
+		}
+	}
+	switch {
+	case len(transitive) > 20 || len(domains) > 2:
+		return "HIGH"
+	case len(transitive) > 5 || len(domains) > 1:
+		return "MEDIUM"
+	default:
+		return "LOW"
+	}


⚠️ Potential issue | 🟠 Major

Risk tier derivation can drift from shard impact risk.

On Line 179, the comment says this matches shard impact thresholds, but Line 187 only folds in transitive importer domains. renderImpactSection also counts domains from direct caller files (function-call edges), so TOUR risk can disagree with shard risk for the same file.

Suggested fix

func riskFor(file string, cache *Cache) string { transitive := cache.TransitiveDependents(file) + directCallerFiles := make(map[string]bool) + for id, fn := range cache.FnByID { + if fn.File != file { + continue + } + for _, caller := range cache.Callers[id] { + if caller.File != "" && caller.File != file { + directCallerFiles[caller.File] = true + } + } + } + domains := map[string]bool{} if d := cache.FileDomain[file]; d != "" { domains[d] = true } for f := range transitive { if d := cache.FileDomain[f]; d != "" { domains[d] = true } } + for f := range directCallerFiles { + if d := cache.FileDomain[f]; d != "" { + domains[d] = true + } + } switch { case len(transitive) > 20 || len(domains) > 2: return "HIGH"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// riskFor is a narrow re-derivation of the impact section's risk tier for the

// tour line. It matches renderImpactSection's thresholds so tour and shard stay

// in agreement.

func riskFor(file string, cache *Cache) string {

transitive := cache.TransitiveDependents(file)

domains := map[string]bool{}

if d := cache.FileDomain[file]; d != "" {

domains[d] = true

}

for f := range transitive {

if d := cache.FileDomain[f]; d != "" {

domains[d] = true

}

}

switch {

case len(transitive) > 20 || len(domains) > 2:

return "HIGH"

case len(transitive) > 5 || len(domains) > 1:

return "MEDIUM"

default:

return "LOW"

}

// riskFor is a narrow re-derivation of the impact section's risk tier for the

// tour line. It matches renderImpactSection's thresholds so tour and shard stay

// in agreement.

func riskFor(file string, cache *Cache) string {

transitive := cache.TransitiveDependents(file)

directCallerFiles := make(map[string]bool)

for id, fn := range cache.FnByID {

if fn.File != file {

continue

}

for _, caller := range cache.Callers[id] {

if caller.File != "" && caller.File != file {

directCallerFiles[caller.File] = true

}

}

}

domains := map[string]bool{}

if d := cache.FileDomain[file]; d != "" {

domains[d] = true

}

for f := range transitive {

if d := cache.FileDomain[f]; d != "" {

domains[d] = true

}

}

for f := range directCallerFiles {

if d := cache.FileDomain[f]; d != "" {

domains[d] = true

}

}

switch {

case len(transitive) > 20 || len(domains) > 2:

return "HIGH"

case len(transitive) > 5 || len(domains) > 1:

return "MEDIUM"

default:

return "LOW"

}

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/tour.go` around lines 178 - 199, riskFor currently only aggregates domains from cache.TransitiveDependents(file) and the file itself via cache.FileDomain, which can diverge from renderImpactSection because that function also includes domains of direct caller files (function-call edges); update riskFor to also iterate direct callers of the file (i.e., the files that call this file via function-call edges) and add their cache.FileDomain entries into the domains set before computing thresholds so the domain count matches renderImpactSection (reference symbols: riskFor, cache.TransitiveDependents, cache.FileDomain and the codepath that obtains direct callers such as the function that returns direct callers/callers for a file in the cache).

… analyze wiring Completes the linearization layer over sharding. All four strategies, prose narrative preamble, token-budget chapter chunking, and analyze integration so one command produces shards + spine. - internal/shards/tour.go - BFSSeedStrategy / DFSSeedStrategy: undirected walk from --seed, emit-on-pop so DFS descends one branch fully before the next. - CentralityStrategy: transitive-dependent count desc, lex tiebreak. - ChunkTour: splits at file-entry boundaries to fit a token budget; TOUR.md becomes an index linking to TOUR.01.md..TOUR.NN.md with prev/next chapter links. - WriteTour now accepts budgetTokens; chapter files written atomically. - ResolveStrategy: shared strategy lookup (used by tour + analyze). - internal/shards/narrative.go - RenderNarrative: prose preamble — domain/subdomain, import and importer counts with named examples, defined functions + call adjacency, risk tier with transitive-dependent count. - internal/shards/render.go - RenderAll / RenderAllThreeFile take a narrate bool; when set, each shard is prefixed with the narrative block. - cmd/tour.go - Flags: --strategy, --seed, --narrate, --budget-tokens, --dry-run. - --narrate re-renders existing shards using the cached graph. - cmd/analyze.go - --tour, --tour-strategy, --tour-seed, --tour-budget, --narrate — so a single 'supermodel analyze --tour --narrate' emits shards + tour in one pass. - internal/shards/handler.go: wires opts through the cached, fresh, and stale-fallback code paths; maybeWriteTour runs after shards succeed. - internal/shards/daemon.go + tests: updated call sites for new signature. Tests: 13 new cases covering each strategy (leaves-first, determinism, cycles, reachable-only, BFS-vs-DFS ordering, centrality, chunk boundaries, budget fit, strategy resolution, narrative content, narrate in RenderAll). Full suite green, go vet clean. Smoke-tested end-to-end on a fabricated graph: all four strategies produce correct orderings; --budget-tokens 60 chunks 5 files into 5 chapters with index; --narrate prefixes each shard with prose.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cmd/analyze.go (1)

62-76: ⚠️ Potential issue | 🟡 Minor

Shard-dependent flags can silently no-op when shards are disabled in config.

Right now --tour / --narrate can be accepted but ignored if cfg.ShardsEnabled() is false, which is confusing behavior.

Suggested guard

 			if noShards && (narrate || tour) {
 				return fmt.Errorf("--narrate and --tour require shards (cannot combine with --no-shards)")
 			}
+			if !cfg.ShardsEnabled() && (threeFile || narrate || tour) {
+				return fmt.Errorf("--three-file, --narrate, and --tour require shards to be enabled")
+			}
 			dir := "."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/analyze.go` around lines 62 - 76, The CLI accepts shard-dependent flags
(e.g., tour, narrate and the tour* options) but currently they silently no-op
when cfg.ShardsEnabled() is false; add a guard before choosing the generation
path that checks if cfg.ShardsEnabled() is false and any shard-dependent flags
(tour, narrate, tourStrategy, tourSeed, tourBudget) are set, and if so return a
clear error (or non-zero exit) telling the user those flags require shards to be
enabled; locate this check in cmd/analyze.go around the existing
cfg.ShardsEnabled() branch that calls shards.Generate(...) and analyze.Run(...),
and perform the validation there so analyze.Run is not invoked while ignoring
the shard-specific flags.

♻️ Duplicate comments (2)

docs/linearization.md (1)
43-47: ⚠️ Potential issue | 🟡 Minor

Add language tags to fenced code blocks to satisfy markdown lint.

The unlabeled fences trigger MD040; adding explicit languages keeps docs lint-clean.
Suggested patch
-```
+```text
 TOUR.md                         ← linear walk (this feature)
 src/auth/session.go             ← source file
 src/auth/session.graph.go       ← existing shard (per-file linearization)
@@
- +bash
supermodel tour [--strategy topo|bfs-seed|dfs-seed|centrality]
[--seed ]
[--narrate]
[--budget-tokens ]
[--dry-run]
[path]
@@
-```
+```bash
supermodel analyze [--tour]
                   [--tour-strategy topo|bfs-seed|dfs-seed|centrality]
                   [--tour-seed <file>]
                   [--tour-budget <N>]
                   [--narrate]
                   [path]
</details>


Also applies to: 117-124, 128-135

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @docs/linearization.md around lines 43 - 47, The markdown in
docs/linearization.md uses unlabeled fenced code blocks which trigger MD040;
update each fence to include a language tag (e.g., change the file-list block
fence to text and the command examples to bash) so the three shown blocks
(the TOUR.md/source file shard listing and the two supermodel command examples)
and the additional blocks referenced (around the other examples at the same
sections) all have explicit language tags; edit the fences in those code blocks
to match the suggested patch (use text for plain lists and bash for CLI
examples) throughout the file.
</details>

</blockquote></details>
<details>
<summary>internal/shards/tour.go (1)</summary><blockquote>

`21-50`: _⚠️ Potential issue_ | _🟡 Minor_

**Lint is still red in this file.**

The exported `Name`/`Order` methods still need doc comments, and `splitDomain` still returns two anonymous strings. That matches the current `revive`/`gocritic` failures, so CI will keep failing until this is cleaned up.  
  


Also applies to: 137-137, 275-275

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@internal/shards/tour.go` around lines 21 - 50, Exported methods like Name on
TopoStrategy, BFSSeedStrategy, DFSSeedStrategy, CentralityStrategy and all Order
methods must have proper doc comments (start with the method name) and the
splitDomain function must return two named string results instead of anonymous
returns; add comments above each exported Name/Order function using the exact
function names (e.g., "TopoStrategy.Name ...", "BFSSeedStrategy.Order ...") and
change splitDomain's signature to return (domain, suffix string) and update its
return statements and callers to use those named results so revive/gocritic
warnings are resolved.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @cmd/tour.go:

Around line 81-84: When --narrate is set we must not rewrite other shard
formats; instead, before calling shards.RenderAll from the narrate branch (where
cache.SourceFiles() is used), filter the files list to only those that currently
use the .graph shard format (i.e. detect existing .graph.* shards via the
cache/metadata API), or replace the call with a new helper like
shards.RenderExistingGraphShards(repoDir, cache, graphFiles, true, dryRun) that
only rewrites .graph shards; update the narrate branch to use the filtered list
(or new helper) so RenderAll does not convert repos from .calls/.deps/.impact to
.graph and avoid stale-file cleanup deleting those other formats.

In @internal/shards/handler.go:

Around line 51-67: The function maybeWriteTour currently proceeds to write a
tour even when a seed (for bfs-seed/dfs-seed) resolves to no files; update
maybeWriteTour to validate the chosen seed before calling WriteTour by checking
that the resolved seed files appear in cache.SourceFiles() (or that
seededTraversal returned a non-empty set) and, if the seed is missing or yields
no files, log a warning via ui.Warn and return early; keep the existing
ResolveStrategy and WriteTour calls but add this seed existence check (using
cache.SourceFiles() and strategy/seed identifiers) to prevent reporting success
for an absent seed.

Around line 72-89: The error messages in ResolveStrategy incorrectly reference
a non-existent CLI flag (--seed); update the validation here to be flag-agnostic
by changing the fmt.Errorf messages in ResolveStrategy for the "bfs-seed" and
"dfs-seed" branches to something like "bfs-seed requires a seed file" and
"dfs-seed requires a seed file" (or move the flag-specific check into the CLI
layer that parses --tour-seed), ensuring references to BFSSeedStrategy,
DFSSeedStrategy and ResolveStrategy remain consistent so maybeWriteTour and the
analyze flow no longer point users to the wrong flag.

In @internal/shards/narrative.go:

Around line 114-124: uniqueCallerNames currently dedupes by function name
which collapses distinct functions that share the same name; change the dedupe
key to the unique function identifier instead. In uniqueCallerNames, use
r.FuncID (or another unique ID from CallerRef) as the seen-map key (e.g.,
map[FuncID]bool) while still using cache.FuncName(r.FuncID) to produce the human
name to append; skip when the name is empty or the FuncID was already seen, mark
the FuncID as seen, and append the name. This preserves distinct functions with
identical names while keeping return type as []string.

In @internal/shards/tour.go:

Around line 462-485: The current generation logic can leave stale TOUR.NN.md
files when the new chapters slice is smaller; before writing new multi-chapter
files (i.e. in the branch that iterates over chapters and calls
writeFile(fmt.Sprintf("TOUR.%02d.md", ...), chapter)), delete any existing
TOUR..md files that won't be overwritten (or delete all TOUR.NN.md files up
front) so only the newly written set remains; use a glob for the TOUR..md
pattern, remove matches that are not in the current expected filenames, and then
proceed to write the new TOUR.%02d.md files and the index (update both the
single-chapter and multi-chapter branches to clean up stale TOUR.NN.md files
appropriately).

Around line 215-241: The current loop over entries (built from order using
splitDomain and written with writeTourEntry) only emits headings when domain
changes but doesn't regroup files, so domains reappear; fix by collecting files
into a two-level ordered map: track a slice of seen domains and for each domain
a slice of seen subdomains, and append each file into domainMap[dom][sub] in the
same order they appear in order; then iterate the seen domains and their seen
subdomains (preserving per-group file order) to emit the "## Domain" and
optional "### Subdomain" headings and call writeTourEntry for each file (use the
existing variables entries, splitDomain, cache.FileDomain, repoRelPrefix, and b)
so output is grouped by domain/subdomain but retains intra-group order.

Around line 357-417: The parser currently resets currentHeading after adding
the first domainedEntry, so subsequent "- **" entries lose their domain heading;
remove the currentHeading.Reset() call in the block parser (the loop that builds
entries from rest) so that currentHeading persists across multiple entries until
a new "## " or "### " heading block updates it, ensuring later chapters retain
the proper heading context for entries (domainedEntry.heading).

Outside diff comments:
In @cmd/analyze.go:

Around line 62-76: The CLI accepts shard-dependent flags (e.g., tour, narrate
and the tour* options) but currently they silently no-op when
cfg.ShardsEnabled() is false; add a guard before choosing the generation path
that checks if cfg.ShardsEnabled() is false and any shard-dependent flags (tour,
narrate, tourStrategy, tourSeed, tourBudget) are set, and if so return a clear
error (or non-zero exit) telling the user those flags require shards to be
enabled; locate this check in cmd/analyze.go around the existing
cfg.ShardsEnabled() branch that calls shards.Generate(...) and analyze.Run(...),
and perform the validation there so analyze.Run is not invoked while ignoring
the shard-specific flags.

Duplicate comments:
In @docs/linearization.md:

Around line 43-47: The markdown in docs/linearization.md uses unlabeled fenced
code blocks which trigger MD040; update each fence to include a language tag
(e.g., change the file-list block fence to ```text and the command examples to
the two supermodel command examples) and the additional blocks referenced
(around the other examples at the same sections) all have explicit language
tags; edit the fences in those code blocks to match the suggested patch (use
text for plain lists and bash for CLI examples) throughout the file.

In `@internal/shards/tour.go`:
- Around line 21-50: Exported methods like Name on TopoStrategy,
BFSSeedStrategy, DFSSeedStrategy, CentralityStrategy and all Order methods must
have proper doc comments (start with the method name) and the splitDomain
function must return two named string results instead of anonymous returns; add
comments above each exported Name/Order function using the exact function names
(e.g., "TopoStrategy.Name ...", "BFSSeedStrategy.Order ...") and change
splitDomain's signature to return (domain, suffix string) and update its return
statements and callers to use those named results so revive/gocritic warnings
are resolved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)

Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c2fc073c-bd19-4e5d-ae23-5bc25994bee5

📥 Commits

Reviewing files that changed from the base of the PR and between 2ef0846 and 57ef344.

📒 Files selected for processing (11)

cmd/analyze.go

cmd/tour.go

docs/linearization.md

internal/shards/daemon.go

internal/shards/handler.go

internal/shards/narrative.go

internal/shards/render.go

internal/shards/render_stale_test.go

internal/shards/render_test.go

internal/shards/tour.go

internal/shards/tour_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

internal/shards/tour_test.go

coderabbitai · 2026-04-13T23:02:59Z

+			if narrate {
+				files := cache.SourceFiles()
+				written, rerr := shards.RenderAll(repoDir, cache, files, true, dryRun)
+				if rerr != nil {


⚠️ Potential issue | 🟠 Major

--narrate can destructively switch shard format.

Calling RenderAll here rewrites as .graph.* and triggers stale three-file cleanup, so repos using .calls/.deps/.impact can lose those files during supermodel tour --narrate.

Safer approach (rewrite only existing .graph shards)

if narrate { files := cache.SourceFiles() - written, rerr := shards.RenderAll(repoDir, cache, files, true, dryRun) + var graphFiles []string + for _, f := range files { + if _, statErr := os.Stat(filepath.Join(repoDir, shards.ShardFilename(f))); statErr == nil { + graphFiles = append(graphFiles, f) + } + } + written, rerr := shards.RenderAll(repoDir, cache, graphFiles, true, dryRun) if rerr != nil { return fmt.Errorf("re-rendering shards with narrative: %w", rerr) }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cmd/tour.go` around lines 81 - 84, When --narrate is set we must not rewrite other shard formats; instead, before calling shards.RenderAll from the narrate branch (where cache.SourceFiles() is used), filter the files list to only those that currently use the .graph shard format (i.e. detect existing .graph.* shards via the cache/metadata API), or replace the call with a new helper like shards.RenderExistingGraphShards(repoDir, cache, graphFiles, true, dryRun) that only rewrites .graph shards; update the narrate branch to use the filtered list (or new helper) so RenderAll does not convert repos from .calls/.deps/.impact to .graph and avoid stale-file cleanup deleting those other formats.

coderabbitai · 2026-04-13T23:02:59Z

+func maybeWriteTour(repoDir string, cache *Cache, opts GenerateOptions) {
+	if !opts.Tour {
+		return
+	}
+	strategy, err := ResolveStrategy(opts.TourStrategy, opts.TourSeed)
+	if err != nil {
+		ui.Warn("Tour skipped: %v", err)
+		return
+	}
+	out, err := WriteTour(repoDir, cache, strategy, opts.TourBudget, opts.DryRun)
+	if err != nil {
+		ui.Warn("Tour write failed: %v", err)
+		return
+	}
+	if !opts.DryRun {
+		ui.Success("Wrote tour to %s (strategy: %s)", out, strategy.Name())
+	}


⚠️ Potential issue | 🟠 Major

Don't report success for a seed file that isn't in the cache.

For bfs-seed / dfs-seed, a typo in opts.TourSeed currently falls through to seededTraversal, which returns no files, and then maybeWriteTour still logs success. That gives the user an almost-empty TOUR.md instead of a clear warning. Please validate the seed against cache.SourceFiles() before writing.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/handler.go` around lines 51 - 67, The function maybeWriteTour currently proceeds to write a tour even when a seed (for bfs-seed/dfs-seed) resolves to no files; update maybeWriteTour to validate the chosen seed before calling WriteTour by checking that the resolved seed files appear in cache.SourceFiles() (or that seededTraversal returned a non-empty set) and, if the seed is missing or yields no files, log a warning via ui.Warn and return early; keep the existing ResolveStrategy and WriteTour calls but add this seed existence check (using cache.SourceFiles() and strategy/seed identifiers) to prevent reporting success for an absent seed.

coderabbitai · 2026-04-13T23:02:59Z

+func ResolveStrategy(name, seed string) (TourStrategy, error) {
+	switch name {
+	case "", "topo":
+		return TopoStrategy{}, nil
+	case "bfs-seed":
+		if seed == "" {
+			return nil, fmt.Errorf("bfs-seed requires --seed <file>")
+		}
+		return BFSSeedStrategy{Seed: seed}, nil
+	case "dfs-seed":
+		if seed == "" {
+			return nil, fmt.Errorf("dfs-seed requires --seed <file>")
+		}
+		return DFSSeedStrategy{Seed: seed}, nil
+	case "centrality":
+		return CentralityStrategy{}, nil
+	default:
+		return nil, fmt.Errorf("unknown strategy %q (supported: topo, bfs-seed, dfs-seed, centrality)", name)


⚠️ Potential issue | 🟡 Minor

The seed error message points at the wrong flag for analyze.

This helper says --seed, but the analyze flow in this PR uses --tour-seed. So the warning from maybeWriteTour sends users to a flag that doesn't exist in that command. I'd keep this message flag-agnostic here, or move the flag-specific validation up into the CLI layer.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/handler.go` around lines 72 - 89, The error messages in ResolveStrategy incorrectly reference a non-existent CLI flag (--seed); update the validation here to be flag-agnostic by changing the fmt.Errorf messages in ResolveStrategy for the "bfs-seed" and "dfs-seed" branches to something like "bfs-seed requires a seed file" and "dfs-seed requires a seed file" (or move the flag-specific check into the CLI layer that parses --tour-seed), ensuring references to BFSSeedStrategy, DFSSeedStrategy and ResolveStrategy remain consistent so maybeWriteTour and the analyze flow no longer point users to the wrong flag.

coderabbitai · 2026-04-13T23:02:59Z

+func uniqueCallerNames(refs []CallerRef, cache *Cache) []string {
+	seen := make(map[string]bool, len(refs))
+	var out []string
+	for _, r := range refs {
+		n := cache.FuncName(r.FuncID)
+		if n == "" || seen[n] {
+			continue
+		}
+		seen[n] = true
+		out = append(out, n)
+	}


⚠️ Potential issue | 🟡 Minor

Caller/callee dedupe is collapsing distinct functions with the same name.

uniqueCallerNames dedupes on function name, so different functions named the same get merged into one mention. That makes the narrative adjacency lossy.

Suggested fix

func uniqueCallerNames(refs []CallerRef, cache *Cache) []string { - seen := make(map[string]bool, len(refs)) + seen := make(map[string]bool, len(refs)) var out []string for _, r := range refs { - n := cache.FuncName(r.FuncID) - if n == "" || seen[n] { + if r.FuncID == "" || seen[r.FuncID] { continue } - seen[n] = true - out = append(out, n) + n := cache.FuncName(r.FuncID) + if n == "" { + continue + } + seen[r.FuncID] = true + if r.File != "" { + out = append(out, fmt.Sprintf("%s (%s)", n, r.File)) + } else { + out = append(out, n) + } } sort.Strings(out) return out }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/narrative.go` around lines 114 - 124, uniqueCallerNames currently dedupes by function name which collapses distinct functions that share the same name; change the dedupe key to the unique function identifier instead. In uniqueCallerNames, use r.FuncID (or another unique ID from CallerRef) as the seen-map key (e.g., map[FuncID]bool) while still using cache.FuncName(r.FuncID) to produce the human name to append; skip when the name is empty or the FuncID was already seen, mark the FuncID as seen, and append the name. This preserves distinct functions with identical names while keeping return type as []string.

coderabbitai · 2026-04-13T23:02:59Z

+	// Group by domain then subdomain while preserving order within each group.
+	type entry struct {
+		file string
+		dom  string
+		sub  string
+	}
+	entries := make([]entry, 0, len(order))
+	for _, f := range order {
+		dom, sub := splitDomain(cache.FileDomain[f])
+		entries = append(entries, entry{file: f, dom: dom, sub: sub})
+	}
+
+	lastDom, lastSub := "", ""
+	for _, e := range entries {
+		if e.dom != lastDom {
+			fmt.Fprintf(&b, "## Domain: %s\n\n", displayOrUnassigned(e.dom))
+			lastDom = e.dom
+			lastSub = ""
+		}
+		if e.sub != lastSub {
+			if e.sub != "" {
+				fmt.Fprintf(&b, "### Subdomain: %s\n\n", e.sub)
+			}
+			lastSub = e.sub
+		}
+		writeTourEntry(&b, e.file, cache, repoRelPrefix)
+	}


⚠️ Potential issue | 🟠 Major

This labels domains, but it doesn't really group them.

Right now you keep the raw strategy order and only emit a heading when the domain changes. So an order like api/a.go, cmd/b.go, api/c.go will render Domain: api, then Domain: cmd, then Domain: api again. That misses the PR goal of grouping files by domain/subdomain into a stable reading spine.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/tour.go` around lines 215 - 241, The current loop over entries (built from order using splitDomain and written with writeTourEntry) only emits headings when domain changes but doesn't regroup files, so domains reappear; fix by collecting files into a two-level ordered map: track a slice of seen domains and for each domain a slice of seen subdomains, and append each file into domainMap[dom][sub] in the same order they appear in order; then iterate the seen domains and their seen subdomains (preserving per-group file order) to emit the "## Domain" and optional "### Subdomain" headings and call writeTourEntry for each file (use the existing variables entries, splitDomain, cache.FileDomain, repoRelPrefix, and b) so output is grouped by domain/subdomain but retains intra-group order.

coderabbitai · 2026-04-13T23:02:59Z

+	// Split entries on blank line (entries are separated by "\n\n").
+	type domainedEntry struct {
+		heading string // most recent "## Domain" or "### Subdomain" heading block
+		text    string // the "- **..." entry
+	}
+	var entries []domainedEntry
+	var currentHeading strings.Builder
+	for _, block := range strings.Split(rest, "\n\n") {
+		block = strings.TrimRight(block, "\n")
+		if block == "" {
+			continue
+		}
+		if strings.HasPrefix(block, "## ") || strings.HasPrefix(block, "### ") {
+			currentHeading.WriteString(block)
+			currentHeading.WriteString("\n\n")
+			continue
+		}
+		if strings.HasPrefix(block, "- **") {
+			entries = append(entries, domainedEntry{heading: currentHeading.String(), text: block})
+			currentHeading.Reset()
+		}
+	}
+
+	var chapters [][]domainedEntry
+	var currentChapter []domainedEntry
+	currentSize := approxTokens(preamble)
+	for _, e := range entries {
+		entrySize := approxTokens(e.heading) + approxTokens(e.text) + 2
+		if len(currentChapter) > 0 && currentSize+entrySize > budgetTokens {
+			chapters = append(chapters, currentChapter)
+			currentChapter = nil
+			currentSize = approxTokens(preamble)
+		}
+		currentChapter = append(currentChapter, e)
+		currentSize += entrySize
+	}
+	if len(currentChapter) > 0 {
+		chapters = append(chapters, currentChapter)
+	}
+
+	out := make([]string, len(chapters))
+	total := len(chapters)
+	for i, chapter := range chapters {
+		var b strings.Builder
+		b.WriteString(preamble)
+		fmt.Fprintf(&b, "> Chapter %d of %d", i+1, total)
+		if i > 0 {
+			fmt.Fprintf(&b, " · [prev](TOUR.%02d.md)", i)
+		}
+		if i < total-1 {
+			fmt.Fprintf(&b, " · [next](TOUR.%02d.md)", i+2)
+		}
+		b.WriteString("\n\n")
+		lastHeading := ""
+		for _, e := range chapter {
+			if e.heading != "" && e.heading != lastHeading {
+				b.WriteString(e.heading)
+				lastHeading = e.heading
+			}
+			b.WriteString(e.text)
+			b.WriteString("\n\n")


⚠️ Potential issue | 🟠 Major

Chunking can drop the heading at the start of later chapters.

Only the first entry after a ## Domain / ### Subdomain block keeps that heading. If a chapter split happens in the middle of that section, the next chapter can start with a bare - **file** entry and no domain context.

Suggested fix

var entries []domainedEntry var currentHeading strings.Builder + lastHeading := "" for _, block := range strings.Split(rest, "\n\n") { block = strings.TrimRight(block, "\n") if block == "" { continue } @@ } if strings.HasPrefix(block, "- **") { - entries = append(entries, domainedEntry{heading: currentHeading.String(), text: block}) - currentHeading.Reset() + heading := currentHeading.String() + if heading == "" { + heading = lastHeading + } else { + lastHeading = heading + currentHeading.Reset() + } + entries = append(entries, domainedEntry{heading: heading, text: block}) } }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/tour.go` around lines 357 - 417, The parser currently resets currentHeading after adding the first domainedEntry, so subsequent "- **" entries lose their domain heading; remove the currentHeading.Reset() call in the block parser (the loop that builds entries from rest) so that currentHeading persists across multiple entries until a new "## " or "### " heading block updates it, ensuring later chapters retain the proper heading context for entries (domainedEntry.heading).

coderabbitai · 2026-04-13T23:02:59Z

+	if len(chapters) == 1 {
+		if err := writeFile("TOUR.md", chapters[0]); err != nil {
+			return "", fmt.Errorf("write tour: %w", err)
+		}
+		return outPath, nil
+	}
+
+	// Multi-chapter: write TOUR.NN.md files and an index TOUR.md.
+	for i, chapter := range chapters {
+		if err := writeFile(fmt.Sprintf("TOUR.%02d.md", i+1), chapter); err != nil {
+			return "", fmt.Errorf("write chapter: %w", err)
+		}
+	}
+	var idx strings.Builder
+	fmt.Fprintf(&idx, "# Repository Tour — Index\n\n")
+	fmt.Fprintf(&idx, "**Strategy:** `%s` · %d chapters\n\n", strategy.Name(), len(chapters))
+	fmt.Fprintf(&idx, "Read chapters in order; each fits within the token budget.\n\n")
+	for i := range chapters {
+		fmt.Fprintf(&idx, "- [Chapter %d](TOUR.%02d.md)\n", i+1, i+1)
+	}
+	if err := writeFile("TOUR.md", idx.String()); err != nil {
+		return "", fmt.Errorf("write tour index: %w", err)
+	}
+	return outPath, nil


⚠️ Potential issue | 🟠 Major

Re-runs can leave stale TOUR.NN.md files behind.

If a repo used to produce, say, 5 chapters and now produces 1 or 2, this code only writes the new set. The old chapter files stay in .supermodel/, which is easy for tools or humans to pick up by mistake when they glob TOUR*.md.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/shards/tour.go` around lines 462 - 485, The current generation logic can leave stale TOUR.NN.md files when the new chapters slice is smaller; before writing new multi-chapter files (i.e. in the branch that iterates over chapters and calls writeFile(fmt.Sprintf("TOUR.%02d.md", ...), chapter)), delete any existing TOUR.*.md files that won't be overwritten (or delete all TOUR.NN.md files up front) so only the newly written set remains; use a glob for the TOUR.*.md pattern, remove matches that are not in the current expected filenames, and then proceed to write the new TOUR.%02d.md files and the index (update both the single-chapter and multi-chapter branches to clean up stale TOUR.NN.md files appropriately).

jonathanpopham · 2026-04-14T01:46:53Z

Closing as not planned — benchmark showed tour+narrate was 2.96× more expensive and 1.35× slower than main shards on the Django change_tracking task. Reboot in a new PR with a sharper interpretation of linearization: render the shard relationships themselves as human-language sentences (no tour, no preamble headers). See upcoming follow-up PR.

jonathanpopham changed the title ~~feat: linearization tour over the code graph (draft)~~ [DO NOT MERGE] feat: linearization tour over the code graph (draft) Apr 13, 2026

coderabbitai bot reviewed Apr 13, 2026

View reviewed changes

jonathanpopham closed this Apr 14, 2026

		func (TopoStrategy) Name() string { return "topo" }

		func (TopoStrategy) Order(cache *Cache) []string {

Conversation

jonathanpopham commented Apr 13, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Objective

Feature-complete surface

Strategies (all implemented)

--narrate (implemented)

--budget-tokens N (implemented)

Files changed

Test plan

Design principles

Open questions for reviewers

References

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

jonathanpopham commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jonathanpopham commented Apr 13, 2026 •

edited by coderabbitai bot

Loading

`--narrate` (implemented)

`--budget-tokens N` (implemented)

coderabbitai bot commented Apr 13, 2026 •

edited

Loading