Merged
15 changes: 12 additions & 3 deletions AGENTS.md
@@ -22,7 +22,7 @@
* **Monorepo structure: @loreai/core + opencode-lore packages with Bun workspaces**: Bun workspace monorepo, 3 packages: \`packages/core/\` (\`@loreai/core\`, runtime-agnostic, esbuild → \`dist/node/\` + \`dist/bun/\`), \`packages/opencode/\` (\`opencode-lore\`, ships raw TS — build is no-op echo so \`bun --filter '\*' build\` works uniformly; OpenCode loader runs TS under Bun), \`packages/pi/\` (\`@loreai/pi\`, ~132KB esbuild bundle). Root \`package.json\` is private with \`workspaces: \["packages/\*"]\` but MUST have \`main\`/\`exports\` pointing to \`./packages/opencode/src/index.ts\` — trampoline required because OpenCode's \`file:///\` plugin loader resolves from repo root; without it plugin loading silently fails. Declarations via \`tsc -p tsconfig.build.json\` into \`dist/types/\`, copied to both target dirs for barrel re-exports. Tests: \`bun test\` from root with preload at \`packages/core/test/setup.ts\`. Workspace \`package.json\` \`main\`/\`types\` MUST point to built \`dist/\` (Node can't run \`.ts\`); workspace-internal consumers use tsconfig \`paths\` mapping \`@loreai/core\` → \`../core/src/index.ts\`. Build tsconfig MUST NOT include this paths mapping (TS6059).

<!-- lore:019dbbb8-fbcd-7205-89de-e6197a32c6a7 -->
* **OpenCode built-in compaction fully disabled — Lore owns all context management**: OpenCode built-in compaction fully disabled — Lore owns all context management. \`config\` hook sets \`cfg.compaction = { auto: false, prune: false }\`. Lore overrides manual \`/compact\` via \`experimental.session.compacting\`: (a) chunked distillation via \`backgroundDistill\`, (b) \`loadForSession\` (archived=0 post-F2) injected as \`output.context\`, (c) \`findPreviousCompactSummary(client, sessionID)\` walks newest-first for assistant messages with truthy \`info.summary\`, joins text parts with \`\n\n\`, (d) \`buildCompactPrompt({ hasDistillations, knowledge, previousSummary })\` emits \`\<previous-summary>\` anchor before SUMMARY\_TEMPLATE when present. Pi's \`session\_before\_compact\` is deterministic substitution, no anchor. If upstream adds compaction not gated by \`compaction.auto\`, Lore breaks silently. Hooks Lore registers (6 in \`packages/opencode/src/index.ts\`): \`config\`, \`event\` (message.updated/session.error/session.idle), \`experimental.chat.system.transform\` (LTM), \`experimental.chat.messages.transform\` (gradient), \`experimental.session.compacting\`, \`tool\` (recall). Any rename of these breaks Lore silently.
* **OpenCode built-in compaction fully disabled — Lore owns all context management**: OpenCode built-in compaction fully disabled — Lore owns all context management. \`config\` hook sets \`cfg.compaction = { auto: false, prune: false }\`. Lore overrides manual \`/compact\` via \`experimental.session.compacting\`: chunked distillation via \`backgroundDistill\`, \`loadForSession\` (archived=0 post-F2) injected as \`output.context\`, \`findPreviousCompactSummary\` walks newest-first for assistant messages with truthy \`info.summary\`, \`buildCompactPrompt\` emits \`\<previous-summary>\` anchor before SUMMARY\_TEMPLATE. Pi's \`session\_before\_compact\` is deterministic substitution, no anchor. 6 hooks Lore registers in \`packages/opencode/src/index.ts\`: \`config\`, \`event\` (message.updated/session.error/session.idle), \`experimental.chat.system.transform\` (LTM), \`experimental.chat.messages.transform\` (gradient), \`experimental.session.compacting\`, \`tool\` (recall). Any rename of these or upstream compaction not gated by \`compaction.auto\` breaks Lore silently.

<!-- lore:019d9af0-d691-77c7-9a8b-cc5a21037b0c -->
* **SQLite #db/driver subpath import for Bun/Node dual-runtime**: Core uses Node subpath imports (\`#db/driver\` in package.json) to resolve \`bun:sqlite\` or \`node:sqlite\` at runtime. \`driver.bun.ts\` re-exports \`Database\` from \`bun:sqlite\` + \`sha256\` via \`node:crypto\`. \`driver.node.ts\` extends \`DatabaseSync\` from \`node:sqlite\` with \`.query(sql)\` shim using WeakMap statement caching — API parity with \`bun:sqlite\`. Tests run under Bun; esbuild bundles use \`conditions: \["node"|"bun"]\`. API differences: \`.query()\` vs \`.prepare()\`, \`{create:true}\` Bun-only, \`.transaction(fn)\` Bun-only (use manual BEGIN/COMMIT/ROLLBACK for cross-runtime). FTS5/pragmas identical. \`node:sqlite\` stable in Node 22.5+, no native addons.
@@ -37,6 +37,9 @@
<!-- lore:019c91d6-04af-7334-8374-e8bbf14cb43d -->
* **Calibration used DB message count instead of transformed window count — caused layer 0 false passthrough**: Gradient/calibration traps: (1) Calibration must use transformed window count via \`getLastTransformedCount()\`, not DB count — delta≈1 → layer 0 passthrough → overflow. (2) \`actualInput\` must include \`cache.write\` — cold-cache otherwise falls to layer 0. (3) Trailing pure-text assistant messages cause Anthropic prefill errors; drop loop must run at ALL layers (layer 0 shares ref with output). Never drop messages with tool parts (\`hasToolParts\`) — infinite loop. (4) Unregistered projects get zero context management → stuck compaction loops; recovery deletes messages after last good assistant message.

<!-- lore:019de7f5-4a14-7bb6-ad30-714cb662bf0a -->
* **Craft release: stale release branch must be deleted before re-triggering**: If \`release.yml\` was triggered before a needed PR merged, the resulting \`release/X.Y.Z\` branch and publish issue contain a stale snapshot. To re-cut: (1) \`gh issue close \<publish-issue-num>\`, (2) \`git push origin --delete release/X.Y.Z\`, (3) re-run \`gh workflow run release.yml -f version=auto\`. Craft will re-use the same version number if no new conventional-commit \`feat:\` landed since, or bump minor if a \`feat:\` is now included. Then accept the new publish issue. Publish workflow auto-closes the issue on success.

<!-- lore:019c8f4f-67ca-7212-a8c4-8a75b230ceea -->
* **Test DB isolation via LORE\_DB\_PATH and Bun test preload**: Lore test suite uses isolated temp DB via \`packages/core/test/setup.ts\` preload (\`bunfig.toml\` at repo root: \`preload = \["./packages/core/test/setup.ts"]\`). Preload sets \`LORE\_DB\_PATH\` to \`mkdtempSync\` path before any imports of \`src/db.ts\`; \`afterAll\` cleans up. \`src/db.ts\` checks \`LORE\_DB\_PATH\` first. \`agents-file.test.ts\` needs \`beforeEach\` cleanup for intra-file isolation and \`TEST\_UUIDS\` cleanup in \`afterAll\` (shared with \`ltm.test.ts\`). Tests covering OpenCode-specific code (plugin init, recovery functions) live in \`packages/opencode/test/\`. Driver-level tests in \`packages/core/test/db-driver.test.ts\`.

@@ -45,6 +48,9 @@

### Pattern

<!-- lore:019de7fb-feb8-7ea2-b341-2c5d2baf36ed -->
* **Context health diagnostics: C\_norm + R\_compression**: \`packages/core/src/temporal.ts\` exports \`temporalCnorm(timestamps, now?)\` — normalized variance of relative-existence weights ∈\[0,1]; 0=uniform, 1=concentrated in past. \`packages/core/src/distillation.ts\` exports \`compressionRatio(distilledTokens, sourceTokens)\` returning \`k/√N\`; <1.0 = aggressive/likely-lossy. Both logged per-segment in \`distillSegment\` via \`log.info\` (gated by LORE\_DEBUG=1 — currently invisible without env var). Heuristics adapted from D7x7z49/llm-context-idea (temporal clustering + square-root boundary). Credited in README 'Standing on the shoulders of'. Phase 4 (meta-threshold tuning from R\_compression) deferred pending real data — favor storing metrics as nullable columns on \`distillations\` table over log scraping.

<!-- lore:019dc6bd-8425-7452-af95-9d0cf4c95f5c -->
* **F2 metaDistill anchoring + loadForSession archive default**: F2 metaDistill anchoring: \`metaDistill\` (\`packages/core/src/distillation.ts\`) anchors on prior gen>0 meta via \`latestMetaObservations(projectPath, sessionID)\` and only consolidates NEW gen-0 since. \`recursiveUser()\` emits \`\<previous-meta-summary>\` block when anchored; byte-identical when absent. Generation chain: \`Math.max(existing.gen, priorMeta.gen) + 1\` — \`loadGen0\` only returns gen=0, must fold in \`priorMeta.generation\`. Threshold: anchored ≥1 new gen-0; first-round ≥3. \`loadForSession\` defaults to \`archived=0\`. storeDistillation→archive isn't transactional; mid-crash leaves stale meta + un-archived gen-0s, next run re-consolidates.

@@ -58,7 +64,7 @@
* **Layer 4 token-budget tail (F7)**: Gradient layer 4 token-budget tail (F7): \`gradient.ts\` layer 4 ('nuclear') replaced fixed \`slice(-3)\` with \`tailBudget = clamp(usable \* 0.25, 2\_000, 8\_000)\`. Walks backward from \`currentTurnStart()\` accumulating \`estimateMessage()\` tokens until exhausted. Current turn (last user + subsequent assistants) ALWAYS included even if it exceeds budget — terminal layer must always return. Tool parts NOT stripped (would cause infinite tool-call loop). Distilled prefix unchanged. Variable scoping: \`transformInner()\` declares \`const turnStart\` inside layer 3; layer 4 must use a different name (e.g. \`nuclearTurnStart\`).

<!-- lore:019cb050-ef48-7cbe-8e58-802f17c34591 -->
* **Lore logging: LORE\_DEBUG gating for info/warn, always-on for errors**: \`packages/core/src/log.ts\`: \`log.info()\`/\`log.warn()\` suppressed unless \`LORE\_DEBUG=1|true\`; \`log.error()\` always emits. All write to stderr with \`\[lore]\` prefix. Exists because OpenCode TUI renders all stderr as red error text — routine status messages were alarming users. Use \`log.info()\` for status, \`log.warn()\` for non-actionable oddities, \`log.error()\` only in catch blocks for real failures. Never use \`console.error\` directly. LORE\_DEBUG also gates per-turn gradient diagnostics (layer, tokens, cap, prefix hash, system prompt hash) added for cache-bust investigation — invaluable for identifying byte-identity breaks but too noisy for default mode.
* **Lore logging: LORE\_DEBUG gating for info/warn, always-on for errors**: \`packages/core/src/log.ts\`: \`log.info()\`/\`log.warn()\` suppressed unless \`LORE\_DEBUG=1|true\`; \`log.error()\` always emits to stderr with \`\[lore]\` prefix. Exists because OpenCode TUI renders all stderr as red error text. Use \`log.info()\` for status, \`log.warn()\` for non-actionable oddities, \`log.error()\` only in catch blocks. Never use \`console.error\` directly. LORE\_DEBUG also gates per-turn gradient diagnostics (layer, tokens, cap, prefix hash, system prompt hash) for cache-bust investigation. TRAP: any diagnostic added via \`log.info()\` is invisible by default — for metrics that must be observable without env-var gating, write to DB columns rather than logs.

<!-- lore:019cb12a-c957-7e24-b3f5-6869f3429d13 -->
* **Lore release process: craft + issue-label publish**: Release process (craft + issue-label publish): publishes 4 tarballs (\`@loreai/core\`, \`@loreai/opencode\`, \`@loreai/pi\`, \`opencode-lore\` legacy mirror). CI: \`bun pm pack\` each, extract, swap \`name\` via jq, repack. \`npm version --workspaces\` fails EUNSUPPORTEDPROTOCOL on \`workspace:\*\` — workaround: \`preReleaseCommand: scripts/bump-version.sh\` rewrites \`version\` via jq AND patches \`bun.lock\` workspace version fields via awk. CRITICAL: \`bun pm pack\` rewrites \`workspace:\*\` from \`bun.lock\`, NOT package.json — without lockfile patch, tarballs ship stale cross-workspace deps (ETARGET). \`actions/setup-node@v4\` MUST set \`registry-url: https://registry.npmjs.org\` or OIDC fails ENEEDAUTH.
@@ -73,7 +79,10 @@
* **Recall logic extracted to core, thin tool wrappers per host**: Recall search+format logic lives in \`packages/core/src/recall.ts\` as host-agnostic \`runRecall({projectPath, sessionID, query, scope, llm, knowledgeEnabled, searchConfig})\` returning a formatted markdown string. Host adapters wrap it in their tool framework: \`packages/opencode/src/reflect.ts\` uses \`tool({args, execute})\` with OpenCode's Zod-ish schema; \`packages/pi/src/reflect.ts\` uses \`pi.registerTool({parameters: Type.Object(...), execute})\` with Typebox. Both adapters are ~75 lines — all BM25/FTS5/RRF fusion, vector search, cross-project discovery, lat.md section search stays in core. Pattern applies to any future host (ACP, CLI): keep logic in \`@loreai/core\`, write a thin tool-framework wrapper per host.

<!-- lore:019dc5f4-9ecc-777a-bf51-855749f86e2a -->
* **SQLite migration runner: per-statement atomic, not transactional**: SQLite migration runner per-statement atomic, not transactional: \`packages/core/src/db.ts\` \`migrate()\` calls \`database.exec(MIGRATIONS\[i])\` with no \`BEGIN\`/\`COMMIT\` wrapper. Multi-statement migrations crash mid-run leave partial progress — write idempotent SQL (e.g. F3b's WHERE clause matches 0 rows after first run). VACUUM special-cased at \`VACUUM\_MIGRATION\_INDEX = 2\` (cannot run in transaction). FTS5 content-table-backed: per-row UPDATE auto-triggers reindex via \`temporal\_fts\_update\`; bulk reindex via \`INSERT INTO temporal\_fts(temporal\_fts) VALUES('rebuild')\`.
* **SQLite migration runner: per-statement atomic, not transactional**: SQLite migration runner (\`packages/core/src/db.ts\`) is per-statement atomic, NOT transactional: \`migrate()\` calls \`database.exec(MIGRATIONS\[i])\` with no BEGIN/COMMIT — write idempotent SQL. Traps: (1) \`ALTER TABLE ADD COLUMN\` is NOT idempotent — duplicate-column error aborts later statements (migration 7 killed \`kv\_meta\` on re-run). Fix: catch duplicate-column errors and re-exec remaining statements; \`recoverMissingObjects()\` runs every startup to idempotently \`CREATE TABLE IF NOT EXISTS\` critical tables (structural only, no backfill — add new critical tables here). (2) \`db()\` must NOT assign \`instance\` until \`migrate()\` succeeds. VACUUM special-cased at \`VACUUM\_MIGRATION\_INDEX = 2\`. FTS5 content-table-backed: per-row UPDATE auto-reindexes; bulk via \`INSERT INTO temporal\_fts(temporal\_fts) VALUES('rebuild')\`.

<!-- lore:019de7fb-fec3-75a5-ae26-88193fc59abf -->
* **Time-gap-aware detectSegments + recall recency RRF**: \`detectSegments\` (\`packages/core/src/distillation.ts\`) prefers splitting at the largest inter-message time gap when that gap is ≥3× the median gap; falls back to count-based splitting for uniform timestamps (preserves legacy behavior). Min segment size 3 — tiny trailing segments still merged into previous. Exported for tests. Recall pipeline (\`packages/core/src/recall.ts:runRecall\`) adds a recency-sorted list of temporal results to RRF fusion alongside the BM25 list, both keyed \`t:\<id>\` so RRF naturally boosts items appearing in both — no new thresholds. Pure additive; no schema changes; existing BM25-only behavior preserved when recency list is empty.
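
The time-gap rule in the last entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `packages/core/src/distillation.ts`: the `Msg` shape, the function name `findTimeGapSplit`, and the choice to decline a split that would leave fewer than three messages on either side (rather than merging a tiny trailing segment) are all hypothetical. A `null` result stands for "fall back to count-based splitting".

```typescript
// Hypothetical message shape: only an id and a creation time (epoch ms).
interface Msg {
  id: string;
  createdAt: number;
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns the index i at which to split msgs into [0, i) and [i, n),
// or null when no gap qualifies (caller falls back to count-based splitting).
function findTimeGapSplit(
  msgs: Msg[],
  ratio = 3, // gap must be >= ratio * median gap
  minSize = 3, // assumed minimum messages on each side of the split
): number | null {
  if (msgs.length < 2 * minSize) return null;

  // gaps[i] is the time between msgs[i] and msgs[i + 1].
  const gaps = msgs.slice(1).map((m, i) => m.createdAt - msgs[i].createdAt);
  const med = median(gaps);

  // Locate the largest gap.
  let best = 0;
  for (let i = 1; i < gaps.length; i++) {
    if (gaps[i] > gaps[best]) best = i;
  }

  // Uniform timestamps never pass this check, so they keep legacy behavior.
  if (gaps[best] <= 0 || gaps[best] < ratio * med) return null;

  const splitAt = best + 1;
  if (splitAt < minSize || msgs.length - splitAt < minSize) return null;
  return splitAt;
}
```

With eight messages where the fourth and fifth are separated by a pause far longer than the others, the sketch splits between them; with evenly spaced messages it returns `null`.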

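The `k/√N` metric from the "Context health diagnostics" entry above is small enough to show in full. The sketch below is an assumption-labeled illustration, not the exported `compressionRatio`: the name `compressionRatioSketch` and the `null` return for a non-positive source count (chosen to line up with a nullable column) are hypothetical.

```typescript
// k / sqrt(N), where k = distilled token count and N = source token count.
// Returning null for N <= 0 is an assumption made for this sketch.
function compressionRatioSketch(
  distilledTokens: number,
  sourceTokens: number,
): number | null {
  if (sourceTokens <= 0) return null;
  return distilledTokens / Math.sqrt(sourceTokens);
}

// Values below 1.0 are the ones the entry flags as aggressive or likely lossy.
function isLikelyLossy(ratio: number | null): boolean {
  return ratio !== null && ratio < 1.0;
}
```

For a 10,000-token source, the square-root boundary sits at 100 distilled tokens: 50 tokens gives 0.5 (flagged), 200 tokens gives 2.0 (not flagged).
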
### Preference

18 changes: 17 additions & 1 deletion packages/core/src/db.ts
@@ -2,7 +2,7 @@ import { Database } from "#db/driver";
import { join, dirname } from "path";
import { mkdirSync } from "fs";

const SCHEMA_VERSION = 11;
const SCHEMA_VERSION = 12;

const MIGRATIONS: string[] = [
`
@@ -333,6 +333,22 @@ const MIGRATIONS: string[] = [
WHERE content LIKE '%' || char(10) || '[tool:%'
OR content LIKE '%' || char(10) || '[reasoning] %';
`,
`
-- Version 12: Context health diagnostic columns on distillations.
--
-- r_compression: k/√N where k = distilled token count, N = source token
-- count. Values < 1.0 signal likely lossy compression. NULL for rows
-- created before this migration or for meta-distillations (gen > 0)
-- where the metric is not computed.
--
-- c_norm: normalized variance of relative-existence weights over source
-- message timestamps. Range [0, 1]; 0 = uniform distribution, 1 = attention
-- dominated by distant past. NULL for pre-migration rows or meta-distillations.
--
-- Both columns are nullable REALs — cheap to add, no backfill needed.
ALTER TABLE distillations ADD COLUMN r_compression REAL;
ALTER TABLE distillations ADD COLUMN c_norm REAL;
`,
];
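
The version 12 migration issues two `ALTER TABLE ... ADD COLUMN` statements, and the AGENTS.md note in this same change warns that such statements are not idempotent. One way to tolerate a re-run is to execute statements individually and skip only SQLite's "duplicate column name" error. The sketch below is hypothetical and is not the project's `migrate()`: `Exec`, `isDuplicateColumnError`, and `execTolerant` are illustrative names, and the database is abstracted behind a callback so the example does not depend on a particular driver.

```typescript
// Hypothetical callback that executes a single SQL statement and throws on failure.
type Exec = (sql: string) => void;

// SQLite reports a re-added column as "duplicate column name: <name>".
function isDuplicateColumnError(err: unknown): boolean {
  return err instanceof Error && /duplicate column name/i.test(err.message);
}

// Runs each statement in order. A duplicate-column failure is counted and
// skipped so later statements still run; any other error is rethrown.
function execTolerant(exec: Exec, statements: string[]): number {
  let skipped = 0;
  for (const sql of statements) {
    try {
      exec(sql);
    } catch (err) {
      if (!isDuplicateColumnError(err)) throw err;
      skipped++;
    }
  }
  return skipped;
}

// The two statements from migration 12, split one per call.
const V12_STATEMENTS = [
  "ALTER TABLE distillations ADD COLUMN r_compression REAL",
  "ALTER TABLE distillations ADD COLUMN c_norm REAL",
];
```

If a crash leaves `r_compression` in place but not `c_norm`, a second pass over `V12_STATEMENTS` skips the first statement and still applies the second.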

function dataDir() {