15 commits
0c6aa63
FE-800: Add spec-to-cook-plan frontier and spike findings
kostandinang Jun 3, 2026
64a86e0
FE-800: Reconcile spec-to-cook-plan with petri-graph-compilation
kostandinang Jun 3, 2026
7e94ce7
FE-800 Slice 1: deterministic projection — completed spec → cook plan…
kostandinang Jun 3, 2026
c00c8fb
FE-800 Slice 2: LLM planning pass — depends_on DAG + epic grouping + …
kostandinang Jun 3, 2026
c1da487
FE-800 Slice 3: deterministic reconciliation — projected Plan + LLM e…
kostandinang Jun 3, 2026
32f665a
FE-800 Slice 4: CLI wiring — brunch plan emits .brunch/cook/plan.yaml…
kostandinang Jun 3, 2026
5e9e7a9
FE-800 Slice 5: warning-model hardening — single audit stream, synthe…
kostandinang Jun 3, 2026
dbc95e7
FE-800: Rename cook-plan-* → plan-* (orchestrator package, not cook c…
kostandinang Jun 3, 2026
ae7e672
FE-800: brunch plan <specId> — server-side snapshot builder
kostandinang Jun 3, 2026
8f4960a
FE-800: harden brunch plan CLI surface
kostandinang Jun 3, 2026
ab2a462
FE-800: spec-scoped plan output + cook --spec routing
kostandinang Jun 4, 2026
2a67e67
FE-800: extract spec-plan-paths as single owner of the spec-scoped la…
kostandinang Jun 4, 2026
bfe36e0
FE-800: enrich cook progress lines with slug derived from slice defin…
kostandinang Jun 4, 2026
cdfd869
FE-800: align slice-label docs, pi-actions style, pin non-empty-defin…
kostandinang Jun 4, 2026
3d21c13
FE-800: Note integration-blind verification follow-on in PLAN
kostandinang Jun 4, 2026
499 changes: 0 additions & 499 deletions memory/CARDS.md

This file was deleted.

26 changes: 23 additions & 3 deletions memory/PLAN.md
@@ -28,6 +28,7 @@ The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agen
1. `agent-fixture-substrate` — branch-complete off main, reconciling — FE-705 integration substrate for JSONL agent capability CLI and LLM-as-user probes.
2. `chat-runtime-secondary-chats` — FE-716; V1 done — PR #141 merged to main.
3. **Petrinaut integration sub-track** — umbrella **FE-760** (Orchestrator ⇄ Petrinaut). FE-761 (semantics), FE-762 (`net.json` + SDCPN export), FE-763 (event stream), and FE-784 (colour fold) have **landed**. **`petri-sync-server` (FE-764)** is the active piece, reshaped (2026-06-01 meeting) into an **ephemeral cook-hosted SSE live stream** for the Bristol demo — no-colour, replay-on-connect, brunch-initiated session, supersedes the dropped static-bundle idea. Replaces the POC interpreter's visualization role with Petrinaut as canonical surface.
4. `spec-to-cook-plan` — **FE-800**; **done — branch-complete off FE-764**, PR #167 pending re-description. Six slices landed: 1 (deterministic projection) + 2 (LLM planning pass) + 3 (deterministic reconciliation) + 4 (CLI wiring) + 5 (warning-model hardening) + 6 (read from spec id — `brunch plan <specId>`, server-side snapshot builder `buildCompletedSpecSnapshot` over `getEntitiesForSpecificationOnActivePath`, plan driver moved into `src/server/plan-runner.ts`, orchestrator `plan-cli.ts` deleted). Bristol-demo front half (`brunch plan <specId>` → `.brunch/cook/plan.yaml` → `brunch cook --petrinaut-stream`) is now operational against any completed spec in the project DB. Two proving spikes done 2026-06-03. Move to **Recently Completed** on PR merge.

### Recently Completed

@@ -39,7 +40,7 @@ The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agen
#### Follow-ons surfaced by the 2026-05-26 cook-codebase-mode smoke

- **pi-actions evaluate-done collapses the TDD workflow** — `pi-actions.ts:70` passes `--tools read,write,edit,bash` to every action including `evaluate-done`. Real pi fixed the buggy file *during evaluation* and reported `done: true` on the first call; write-tests / write-code / run-tests never executed. Affects both modes but is more visible in brownfield. Either restrict evaluator tools to `read` or accept this as the intended pi-as-agent behavior. Worth its own frontier.
- **`cook-artifact-lifecycle` frontier (proposed, not yet authored)** — slice 3's hybrid mechanism creates real slice branches (`cook-slice/<runId>/<sliceId>`) but never commits to them; the cook branch (`cook/<runId>`) still has HEAD === source HEAD and the modification lives in untracked subdirs of the cook branch's working tree. To close the loop: (a) commit slice work to the slice branch on slice completion, (b) replace `mergeSlicesIntoEpicSandbox`'s file-copy with `git merge` of slice branches into an epic branch surfacing real conflicts (today's file-copy is silent last-slice-wins), (c) merge epic branches back to `cook/<runId>` so `git merge cook/<runId>` from main becomes the promotion path. Pairs with worktree + branch GC story. ~2-3 days of structural work; slice 3 set up the substrate (real branches per slice) so this frontier can land cleanly on top.
- **cook output promotion (follow-on)** — slice 3 creates real slice branches (`cook-slice/<runId>/<sliceId>`) but never commits; `cook/<runId>` HEAD === source HEAD with modifications in untracked subdirs, so there is no promotion path into the user's checkout. To close: commit slice work, `git merge` slice → epic → `cook/<runId>`, then `git merge cook/<runId>` from the working branch. Pairs with worktree/branch GC. Quality-of-life; the run worktree is already inspectable by hand.

### Next

@@ -106,9 +107,10 @@ The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agen
- **Name:** Petri graph compilation — compile nets from plan-graph + relation policy
- **Linear:** unassigned in this plan snapshot
- **Kind:** structural
- **Status:** horizon (blocked on `intent-graph-semantics` FE-700)
- **Status:** horizon (blocked on `intent-graph-semantics` FE-700) — **premise weakened, partially subsumed by `spec-to-cook-plan` (FE-800); see Reconciliation**
- **Objective:** Compile Petri nets from workspace plan-graph nodes and relation-policy edges rather than from YAML plan fixtures. Relation kinds (`plan.depends_on`, `plan.verified_by_oracle`, `plan.introduces_design`, etc.) compile into topology-level requirements (prerequisite tokens, guard predicates, semantic-lane join conditions). Extends the FE-700 relation-policy registry.
- **Why now / unlocks:** Without graph compilation, the Petri engine only runs hand-authored YAML plans. Graph compilation makes the engine a planning oracle (simulate before executing) and connects execution to the semantic workspace.
- **Reconciliation with `spec-to-cook-plan` (FE-800, 2026-06-03):** This frontier's premise — compile from `plan.depends_on` relation-policy edges — quietly assumed those execution-order edges exist in the graph (to be supplied by FE-700). The FE-800 spikes proved **execution order is not spec truth and FE-700 will not conjure it** (the observer captures only epistemic deps; requirements are pure sinks of `depends_on`). So the ordering this frontier wanted to read must be **synthesized** — exactly what FE-800's LLM planning pass does at the `plan.yaml` layer, after which the existing `net-compiler.ts` (plan.yaml → net) already produces the net. Net effect: FE-800 + the existing compiler cover the graph→executable-net path; this frontier's remaining **distinct** value is the **Phase-4 simulation oracle** (analyze/simulate the net before running) and richer synthesized token/gate payloads, *not* a separate graph→net compiler. Reframe or fold accordingly before scheduling; do not treat as independent of FE-800.
- **Open design constraints (from PR #143 / FE-743 review):**
- **Declarative output arcs:** Extracted to its own frontier `petri-declarative-routing` (lands ahead of Phase 3; independent of FE-700).
- **Token state enrichment:** Open question whether more metadata should move from reports into tokens (richer typed token payloads per spec §3). FE-738 added `reworkCount`, FE-743 added pool tokens with `agentPoolSize`, but the boundary between control state (tokens) and substantive handoff state (reports) is a design choice this frontier needs to resolve as the token taxonomy gets richer.
@@ -284,6 +286,23 @@ The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agen
- **Artifacts:** contract `src/orchestrator/src/petrinaut-stream-contract.ts` + `docs/petrinaut-stream-contract.md`; validated sample export from run `904d205d`.
- **Traceability:** §Lexicon `folded net` (export reuse; demo deviates via identity fold); I122-K (tokens are pointers → per-place counts suffice for marking deltas); execution-authority posture (Petrinaut renders; brunch's interpreter runs the net).

### spec-to-cook-plan

- **Name:** Spec → orchestrator plan emitter — project + plan a `brunch cook` plan.yaml from a completed intent graph
- **Linear:** FE-800 (standalone; not parented under FE-760)
- **Kind:** structural
- **Status:** done — branch-complete off FE-764, PR #167 pending re-description. Six slices landed: 1 (deterministic projection), 2 (LLM planning pass), 3 (deterministic reconciliation — id existence, self-loops, cycle break via Kahn lex-tie-break, non-buildable slice + dep dropping, epic grouping with default-epic fallback, synthesized unit-test verification targets, all transformations surfaced as typed `ReconciliationWarning[]`), 4 (CLI wiring composing the three stages, writes `.brunch/cook/plan.yaml`, surfaces warnings on stderr; emitter falls back to empty enrichment when the LLM throws so a usable orderless plan still emits), 5 (warning-model hardening — single `EmitterWarning` audit stream, synthesis demoted to verbose-only, formatter co-located), 6 (read from spec id — `brunch plan <specId> [--out=<dir>] [--verbose]`, server-side snapshot builder `buildCompletedSpecSnapshot(db, specId)` over `getEntitiesForSpecificationOnActivePath` mapping accepted requirements/criteria + active-path relationships filtered to accepted ids, plan driver moved to `src/server/plan-runner.ts`, orchestrator `plan-cli.ts` deleted). Two proving spikes done 2026-06-03 (see memory `spec-to-cook-plan-spike`); branch stacks on FE-764. Bristol-demo end-to-end path (`brunch plan <specId>` → `brunch cook --petrinaut-stream`) is now operational
- **Objective:** Emit a `brunch cook` plan.yaml from a completed brunch specification's intent graph. Three-stage emitter: **projection** (deterministic) — `requirement` items → slices, `criterion --verifies--> requirement` edges → per-slice verification linkage, stable slice ids; **planning pass** (LLM) — infer the execution-order `depends_on` DAG + epic grouping + non-buildable-constraint detection, since execution order is not spec truth and reads as zero from the graph; **reconciliation** (deterministic) — validate the LLM output for cook (drop/redirect deps onto non-buildable constraints, guarantee acyclicity, synthesize conventional verification targets, flag contradictions). Output is a reviewable artifact, not a silent input.
- **Why now / unlocks:** The missing front-half of the Bristol end-to-end demo (SPEC → generated plan → cook → Petri → Petrinaut). TRACK F execution + Petrinaut visualization are done/active (FE-760 umbrella, FE-764 streaming) and `cook-codebase-mode` runs brownfield, but every cook run still starts from a hand-authored plan.yaml. This is the smallest bridge from "fixture-driven orchestrator" to "brunch spec drives the orchestrator."
- **Spike findings (2026-06-03, against real completed spec 2 "brunch_graphs"):** (1) projection works today; verification linkage fully covered (every requirement has ≥1 verifying criterion). (2) graph-read dependency synthesis yields **zero** — requirements are only sinks of epistemic `depends_on`; **not fixable by FE-700** (it types relations, it doesn't make the observer emit execution order). (3) one `generateObject` call (claude-sonnet-4, ~900/640 tokens) produced a credible acyclic DAG + free non-buildable detection, but dangled deps onto constraints → requires the reconciliation stage. Not blocked by FE-700/FE-701/FE-705; spec 2 is a usable demo input that exists now.
- **Acceptance:** (1) `brunch` emits `<dir>/.brunch/cook/plan.yaml` from a completed specification (all phases confirmed). (2) Projection is deterministic: requirements → slices, verifies edges → verification linkage, stable slice ids. (3) Planning pass produces an acyclic `depends_on` DAG and flags non-buildable constraint-style requirements. (4) Reconciliation guarantees no dangling/cyclic deps and emits cook-valid schema (epics/slices/depends_on/verification). (5) The generated plan round-trips through `loadPlan` and drives a `brunch cook <repo> --petrinaut-stream` run end-to-end against a brownfield fixture. (6) Demo mode: ordering can be authored/overridden deterministically (reviewable) instead of LLM-generated, for a controlled Bristol run.
- **Open / pending decisions:** ordering LLM-by-default vs authored-by-default for the demo; whether the emitter lives server-side (capability contract) or in the orchestrator package; brownfield verification-target convention (criterion prose → runnable test path is synthesized, agent authors the test).
- **Follow-on — integration-blind verification (2026-06-04):** the first brownfield cook of `spatial_graph_layout` produced *orphan* feature modules (+ a Ladle story) that satisfied criteria like AC1 ("toggling the layout switch swaps between list and canvas") **without the feature existing in the running app**. Root cause sits in this emitter: the convention-synthesized `verification.target` is integration-blind, so the agent authored a test that passes in isolation. Productizing brownfield cook ("a cooked feature is real and visible in brunch") needs (a) the emitter to emit *integration-shaped* slices + verification that demands host-wiring — an **integration oracle** (product reachability, enforced in the FE-738 semantic lane; distinct from `petri-simulation-oracle`'s *net* reachability), and (b) run-output **promotion** into the checkout (see the cook-codebase-mode promotion follow-on). Not on the demo critical path — the Bristol path shows execution/visualization, which orphan-but-executed does not break. Revisit when brownfield cook moves from "executes a plan" to "ships a feature."
- **Relationship to `petri-graph-compilation` (Phase 3):** these are NOT independent. This frontier projects graph → `plan.yaml` then reuses the working `net-compiler.ts` (plan.yaml → net); Phase 3 wanted to compile graph → net directly from `plan.depends_on` relation edges. The spikes showed those execution-order edges don't exist and FE-700 won't supply them, so Phase 3's ordering input must itself be synthesized — i.e. FE-800 is the grounded source of what Phase 3 assumed it could read. FE-800 partially **subsumes** Phase 3; Phase 3's residual value is the simulation oracle (Phase 4), not the compile path. Keep the two reconciled.
- **Verification:** projection golden tests (spec fixture → plan.yaml); planning-pass acyclicity/contract tests (mock + opt-in real-provider); reconciliation tests (dangling-dep redirect, cycle break, non-buildable handling); end-to-end integration feeding a generated plan into the existing brownfield-smoke harness.
- **Traceability:** Requirements 46–50; D155-K–D160-K (new D160-K); A97 (validated); resolves SPEC §Constraints non-goal tension via D160-K. Spike memory: `spec-to-cook-plan-spike`.
- **Design docs:** `docs/design/orchestrator.md`; `docs/next/architecture/plan-graph-petri-orchestration.md`; umbrella H-6476.
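
The reconciliation stage described above (id existence, self-loops, cycle break via Kahn with a lexicographic tie-break) can be sketched roughly as follows. This is a minimal illustration, not the emitter's code: `SliceDeps`, `reconcileDeps`, and the shape given to `ReconciliationWarning` are assumptions, and so is the cycle-break rule itself (when no slice is ready, take the lexicographically smallest pending slice and drop its unmet deps). Non-buildable handling, epic grouping, and verification-target synthesis are left out.

```typescript
// Hypothetical shapes for illustration; the real plan and warning types may differ.
type SliceDeps = { id: string; dependsOn: string[] };
type ReconciliationWarning =
  | { kind: "dangling-dep"; slice: string; dep: string }
  | { kind: "self-loop"; slice: string }
  | { kind: "cycle-break"; slice: string; dep: string };

function reconcileDeps(slices: SliceDeps[]): {
  slices: SliceDeps[];
  order: string[];
  warnings: ReconciliationWarning[];
} {
  const warnings: ReconciliationWarning[] = [];
  // Sorted unique ids make every later choice deterministic.
  const ids = Array.from(new Set(slices.map((s) => s.id))).sort();
  const known = new Set(ids);

  // Pass 1: drop self-loops and deps on ids that do not exist.
  const deps = new Map<string, Set<string>>();
  for (const s of slices) {
    const kept = deps.get(s.id) ?? new Set<string>();
    for (const d of s.dependsOn) {
      if (d === s.id) warnings.push({ kind: "self-loop", slice: s.id });
      else if (!known.has(d)) warnings.push({ kind: "dangling-dep", slice: s.id, dep: d });
      else kept.add(d);
    }
    deps.set(s.id, kept);
  }

  // Pass 2: Kahn-style ordering; ties go to the lexicographically smallest id.
  const done = new Set<string>();
  const order: string[] = [];
  const unmet = (id: string): string[] =>
    Array.from(deps.get(id)!).filter((d) => !done.has(d)).sort();
  while (order.length < ids.length) {
    const pending = ids.filter((id) => !done.has(id));
    const ready = pending.find((id) => unmet(id).length === 0);
    const next = ready ?? pending[0]!;
    if (ready === undefined) {
      // Every pending slice is stuck in a cycle: break it at `next` (assumed rule).
      for (const d of unmet(next)) {
        deps.get(next)!.delete(d);
        warnings.push({ kind: "cycle-break", slice: next, dep: d });
      }
    }
    done.add(next);
    order.push(next);
  }

  return {
    slices: ids.map((id) => ({ id, dependsOn: Array.from(deps.get(id)!).sort() })),
    order,
    warnings,
  };
}
```

Because a slice is only emitted once its remaining deps are done, the returned `order` is a valid topological order of the surviving edges, so the output is acyclic by construction. Every dropped edge shows up as a warning rather than disappearing silently, which matches the "reviewable artifact" posture.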

### petrinaut-colour-fold

- **Name:** Petrinaut export — colour-fold per-slice subnet
@@ -685,7 +704,8 @@ orchestrator-poc (Phase 0: compiler extraction — done)
│ └──→ petri-event-stream (FE-763: initial markings + transition firings — done)
│ ├──→ petrinaut-colour-fold (FE-784: colour-fold export projection — done; set aside for the no-colour demo)
│ └──→ petri-sync-server (FE-764: ACTIVE — ephemeral cook-hosted SSE live stream; replay-on-connect; brunch-initiated session; Bristol demo)
├──→ petri-graph-compilation (Phase 3: compile from plan-graph + relation policy; needs FE-700)
├──→ spec-to-cook-plan (demo front-half: completed intent graph → cook plan.yaml; projection + LLM planning pass + reconciliation; spikes done; feeds FE-764 stream; NOT blocked by FE-700)
├──→ petri-graph-compilation (Phase 3: compile from plan-graph + relation policy; needs FE-700; premise weakened — partially subsumed by spec-to-cook-plan; residual value = Phase 4 sim oracle)
└──→ petri-simulation-oracle (Phase 4: reachability, deadlock, resume; declarative-routing structural prerequisite now satisfied; Phase 3 still needed for graph-derived gates)

LOWER-PRIORITY / DEFERRED
2 changes: 2 additions & 0 deletions memory/SPEC.md
@@ -156,6 +156,7 @@ Brunch operates inside a **workspace**: the cwd-backed software context whose lo
| A94 | Durable secondary chats can replace independent side-chat persistence while preserving reloadable side, reconciliation, qa, and strategy conversations inside one workspace surface without introducing a `thread` table yet. | medium | open | D138, D153, Requirement 45 | In-stream secondary-chat rendering/reload walkthroughs over the existing chat/turn substrate. |
| A95 | Transcript-first context with explicit context snapshots on turn rows plus active graph-item handles on chats can keep secondary chats useful across multi-chat item changes without a persisted context-spec table. Handles only need re-snapshotting when the referenced item's version/fingerprint advances. | medium | open | D139, D140, D154, Requirement 45 | Context-provision tests for snapshot insertion, item-list/neighborhood/economic-graph snapshot builders, stale-handle refresh, and prompt/context-pack rendering. |
| A96 | Async-by-default reconciliation can move Pending review into an in-stream target-grouped reconciliation chat without hiding judgment work or surfacing auto-confirmed noise. | medium | open | D135, D137, D138, D146, D153 | Track 3 classifier scheduling, target-ordering tests, and dense reconciliation walkthroughs. |
| A97 | A completed intent graph can be projected + planned into a valid `brunch cook` plan.yaml: `requirement` items and `criterion --verifies--> requirement` edges read deterministically, but execution-order `depends_on` is **not** spec truth (the observer captures only epistemic deps; FE-700 does not change this) and must come from an LLM planning pass plus a deterministic reconciliation stage, not a graph read. | high | validated | D160-K, Requirements 46–50 | Two spikes 2026-06-03 against real completed spec 2 ("brunch_graphs"): projection clean + verification fully covered; graph-read req→req deps = 0; one `generateObject` call yielded a credible acyclic DAG + free non-buildable-constraint detection, but dangled deps onto constraints (needs reconciliation). |
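
The deterministic half of A97 can be illustrated with a small sketch. The `Item`, `Edge`, and `ProjectedSlice` shapes, the `projectSlices` name, and the `slice-<requirementId>` id scheme are assumptions made for illustration; they are not the emitter's actual types or its stable-id rule. The sketch reads only `requirement` items and `criterion --verifies--> requirement` edges, and deliberately ignores `depends_on` edges, since A97 holds that they do not encode execution order.

```typescript
// Hypothetical graph shapes; the real snapshot types may differ.
type Item = { id: string; kind: "requirement" | "criterion"; title: string };
type Edge = { from: string; to: string; kind: "verifies" | "depends_on" };
type ProjectedSlice = { id: string; title: string; verifiedBy: string[] };

function projectSlices(items: Item[], edges: Edge[]): ProjectedSlice[] {
  const kindOf = new Map(items.map((i) => [i.id, i.kind] as const));

  // Collect criterion -> requirement "verifies" edges; everything else is ignored.
  const verifiers = new Map<string, string[]>();
  for (const e of edges) {
    if (e.kind !== "verifies") continue;
    if (kindOf.get(e.from) !== "criterion" || kindOf.get(e.to) !== "requirement") continue;
    verifiers.set(e.to, [...(verifiers.get(e.to) ?? []), e.from]);
  }

  // Sort by requirement id so the output does not depend on input order.
  return items
    .filter((i) => i.kind === "requirement")
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .map((r) => ({
      id: `slice-${r.id}`, // assumed stable-id scheme
      title: r.title,
      verifiedBy: [...(verifiers.get(r.id) ?? [])].sort(),
    }));
}
```

Sorting both the slices and each `verifiedBy` list is what makes the projection reproducible: the same graph yields the same output regardless of the order in which items and edges are read.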

### Active Decisions

@@ -206,6 +207,7 @@ Brunch operates inside a **workspace**: the cwd-backed software context whose lo
157. **Action dispatch is name-keyed and extensible** — engines orchestrate which action fires when; handlers own how. POC uses inline dispatch per engine; promote to a real `ActionRegistry` when a 3rd action type lands. Depends on: Requirement 46.
158. **Plan model is two-level (epics → slices), no milestones in POC** — schema is provisional pending canonical brunch plan emission. Forward-compatible for intent/design/oracle pointers.
159. **Worktree isolation per run** — agents write freely inside `<cwd>/.brunch/cook/runs/<runId>/worktree/` (cwd-scoped, not fixture-scoped); fixture dir and source repo untouched. Fixtures stay byte-identical before and after a run. Depends on: Requirement 49.
160. **Spec→cook-plan emission is a CLI/orchestrator-track seam, not a V1 product UI surface** — projecting and planning a cook `plan.yaml` from a completed intent graph is dev-layer orchestrator capability extending Requirements 46–50, so it does not breach the V1 product non-goal "Brunch elicits specs and stops at the handoff/export boundary," which governs interactive product UX. The emitter is three-stage: projection (deterministic graph read of requirements + verifies edges) + planning pass (LLM-inferred execution-order DAG, epic grouping, non-buildable detection) + reconciliation (deterministic validation: no dangling/cyclic deps, cook-valid schema, synthesized verification targets). Generated plans are reviewable artifacts, not silent inputs. Depends on: Requirements 46–50; A97.
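
How the three stages in decision 160 might compose can be sketched as below, including a fallback to an empty enrichment when the planning pass throws, so that an orderless but usable plan still emits. All names and shapes here (`Projected`, `Enrichment`, `Plan`, `emitPlan`, and the fields given to `EmitterWarning`) are assumptions for illustration. The planning pass and the reconciliation stage are passed in as functions, so the sketch makes no claim about how either is implemented.

```typescript
// Hypothetical shapes; only the stage boundaries are taken from the decision text.
type Projected = { sliceIds: string[] };
type Enrichment = {
  dependsOn: Record<string, string[]>;
  epics: Record<string, string>;
  nonBuildable: string[];
};
type Plan = { slices: { id: string; epic: string; dependsOn: string[] }[] };
type EmitterWarning = { stage: "planning" | "reconciliation"; message: string };
type Reconciled = { plan: Plan; warnings: EmitterWarning[] };

function emptyEnrichment(): Enrichment {
  return { dependsOn: {}, epics: {}, nonBuildable: [] };
}

async function emitPlan(
  projected: Projected,
  planningPass: (p: Projected) => Promise<Enrichment>, // the LLM stage, injected
  reconcile: (p: Projected, e: Enrichment) => Reconciled, // the deterministic stage, injected
): Promise<Reconciled> {
  const warnings: EmitterWarning[] = [];
  let enrichment: Enrichment;
  try {
    enrichment = await planningPass(projected);
  } catch (err) {
    // Planning failed: continue with no ordering, grouping, or non-buildable flags.
    enrichment = emptyEnrichment();
    warnings.push({
      stage: "planning",
      message: `planning pass failed, emitting orderless plan: ${String(err)}`,
    });
  }
  const reconciled = reconcile(projected, enrichment);
  // One warning stream for the caller, in stage order.
  return { plan: reconciled.plan, warnings: warnings.concat(reconciled.warnings) };
}
```

Injecting the planning pass also gives a natural place to swap in an authored, deterministic ordering for a controlled run, since any function returning an `Enrichment` fits the same slot.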

#### Provider, prompt/context, and agent substrate
