feat(ce-demo-reel): add demo reel skill with Python capture pipeline #541
…move feature-video Replace the hardcoded bin/dev + agent-browser + phantom imgup block in shipping-workflow.md with a project-type-aware evidence-capture skill that works across web apps, CLI tools, libraries, and desktop apps. The skill auto-detects project type, checks available tools (agent-browser, vhs, silicon, ffmpeg), recommends a capture tier (browser reel, terminal recording, screenshot reel, static screenshots, or skip), and uploads evidence to catbox.moe. A bash pipeline script handles ffmpeg stitching, frame normalization, palette generation, size optimization, and upload. Removes feature-video -- its GitHub-native MP4 upload via agent-browser DOM manipulation was over-engineered. GIFs uploaded to catbox render inline everywhere without platform-specific upload hacks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ullet logic Convert three dense prose paragraphs into structured bullet/sub-bullet lists for better agent instruction-following and human reviewability: - DU-3: 6-action paragraph -> numbered step list with sub-bullets - Step 1 clean tree: interleaved prose/bullets -> labeled decision tree - Step 7 existing PR: dense paragraph -> 5-step numbered list Also adds evidence-capture integration to Step 6 with a user-facing question gate for when the PR has observable behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3d75fa0c4f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh
plugins/compound-engineering/skills/evidence-capture/references/upload-and-approval.md
- Upload fallback now stages, commits, and pushes evidence to the branch so GitHub can render the relative path (previously left as local-only) - Add capture-evidence.sh test suite covering arg validation and ffmpeg stitch integration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: 3d2825b13a
plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh
- Replace negative array indexing (bash 4+) with portable index computation for macOS /bin/bash 3.2 compatibility - Fix ffmpeg availability check in tests: use `which` instead of `command` (shell builtin, not spawnable as executable) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move frame existence validation before ffmpeg/ffprobe tool checks so the test gets the correct "Frame not found" error on CI runners that don't have ffmpeg installed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: 7495bef165
plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh
Adds a "Frame the narrative before sizing" pre-writing step that forces before/after/scope articulation before drafting. Also strengthens the "lead with value" principle with a mechanism-vs-outcome anti-example, updates the medium sizing tier to reference the narrative frame, and adds guidance to name new test files in test plans. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: 1e9bdeae29
plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh
…ror handling - Frame reduction step minimum raised from 1 to 2 so 3-4 frame GIFs actually drop middle frames instead of re-adding all of them - Curl upload failures now use || true to prevent set -euo pipefail from exiting before the retry/fallback logic can run Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
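A minimal sketch of why the step floor matters, written in Python for illustration (the fix itself landed in the bash script). The function name `reduce_frames`, the `target` parameter, and the step formula are assumptions; only the minimum step of 2 comes from the commit message.

```python
def reduce_frames(frames, target=4):
    """Drop intermediate frames, always keeping the first and the last."""
    n = len(frames)
    if n <= 2:
        return list(frames)
    # Hypothetical step formula. The floor of 2 is the fix: with a floor of 1,
    # a 3 or 4 frame reel gets step 1 and every frame is selected again.
    step = max(2, n // target)
    kept = list(range(0, n, step))
    last = n - 1  # computed index, no negative indexing needed
    if kept[-1] != last:
        kept.append(last)
    return [frames[i] for i in kept]
```

With a 3-frame input this keeps the first and last frame and drops the middle one, which is the behavior the commit describes.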
Move all deterministic logic into a Python script with 7 subcommands: preflight, detect, recommend, stitch, screenshot-reel, terminal-recording, upload. The agent's role shrinks to judgment calls (what to capture, which tier, user approval) while the script handles how. - Remove capture-evidence.sh and project-detection.md (logic moved to Python detect subcommand) - Update SKILL.md Steps 2/4/6 to call script instead of inline logic - Update tier references to use script for stitching and recording - Fix Codex review findings: require ffprobe for stitched tiers, rewrite VHS tape Output when --output overrides it - Remove docs/evidence commit fallback — artifacts are ephemeral, use OS temp not repo tree - Tests expanded from 8 to 21 covering preflight, detect (8 project type scenarios), recommend (6 tier mappings), stitch, and upload Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
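One way a single script can expose the 7 subcommands named above is an `argparse` subparser layout. This is a sketch under that assumption; the real script's arguments and flags are not shown in this PR conversation, so none are invented here, and the help strings are paraphrased from the PR description.

```python
import argparse

# Subcommand names come from the commit message; help text is paraphrased.
SUBCOMMANDS = {
    "preflight": "report tool availability",
    "detect": "classify project type from manifests",
    "recommend": "look up the recommended capture tier",
    "stitch": "normalize frames and stitch a GIF",
    "screenshot-reel": "render screenshots and stitch them",
    "terminal-recording": "run VHS and validate the output",
    "upload": "upload the artifact with retry",
}


def build_parser():
    """Build a parser with one subparser per pipeline subcommand."""
    parser = argparse.ArgumentParser(prog="capture-demo.py")
    sub = parser.add_subparsers(dest="command", required=True)
    for name, help_text in SUBCOMMANDS.items():
        sub.add_parser(name, help=help_text)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.command)
```

Keeping dispatch in one table like this makes each subcommand easy to test in isolation, which fits the split described above: the script handles the how, the agent makes the judgment calls.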
Add rule of thumb: .context/ for workflow state that other skills read, mktemp -d for throwaway artifacts (screenshots, GIFs, recordings) that get uploaded and discarded. Keeps ephemeral files out of the repo tree. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename skill directory, frontmatter, script (capture-demo.py), and all cross-skill references. Adds "capture evidence" and "add proof to a PR" as trigger phrases in the description so the old vocabulary still works. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: f129c2a63c
plugins/compound-engineering/skills/demo-reel/scripts/capture-demo.py
plugins/compound-engineering/skills/demo-reel/scripts/capture-demo.py
…mo-reel - Fix Angular detection: use @angular/core instead of angular (the npm package key is scoped) - Catch subprocess.TimeoutExpired on curl upload so timeouts fall through to retry logic instead of raising a traceback - Rename demo-reel -> ce-demo-reel to align with ce: namespace convention (ce-plan, ce-review, ce-work, ce-demo-reel) - Rename test file to ce-demo-reel.test.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
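The Angular fix comes down to matching the exact dependency key in `package.json` rather than a bare name. A minimal sketch, assuming detection reads the `dependencies` and `devDependencies` sections; the function name and the mapping table are hypothetical, and the `@remix-run/react` key is taken from a later commit in this PR.

```python
import json

# Hypothetical table: exact npm dependency key -> framework label.
FRAMEWORK_KEYS = {
    "@angular/core": "angular",
    "@remix-run/react": "remix",
}


def detect_node_frameworks(package_json_text):
    """Return framework labels whose exact dependency key is present."""
    manifest = json.loads(package_json_text)
    deps = {}
    for section in ("dependencies", "devDependencies"):
        deps.update(manifest.get(section) or {})
    # Exact key lookup: a bare "angular" key would never match the scoped package.
    return sorted(label for key, label in FRAMEWORK_KEYS.items() if key in deps)
```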
💡 Codex Review
Reviewed commit: 100e5ab31d
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
120s per attempt meant a worst-case 245s blocking the agent when catbox is down. With 30s timeout + 10s connect timeout + 2s retry sleep, worst case is ~64s. Healthy uploads complete in 2-5s regardless. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
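A sketch of a bounded upload loop using the timeouts named above. The function name, the attempt count of 2 (inferred from the worst-case figures), and the injectable `run` and `sleep` parameters are assumptions. The catbox endpoint and form fields reflect my understanding of its public API, not anything quoted in this PR.

```python
import subprocess
import time


def upload_with_retry(path, attempts=2, max_time=30, connect_timeout=10,
                      retry_sleep=2, run=subprocess.run, sleep=time.sleep):
    """Upload a file with curl; return the URL, or None after all attempts fail."""
    cmd = [
        "curl", "--silent", "--fail",
        "--max-time", str(max_time),                # total time budget per attempt
        "--connect-timeout", str(connect_timeout),  # give up early if unreachable
        "-F", "reqtype=fileupload",                 # assumed catbox form field
        "-F", f"fileToUpload=@{path}",              # assumed catbox form field
        "https://catbox.moe/user/api.php",          # assumed endpoint
    ]
    for attempt in range(1, attempts + 1):
        try:
            # The Python-side timeout is a backstop in case curl itself hangs.
            result = run(cmd, capture_output=True, text=True, timeout=max_time + 5)
        except subprocess.TimeoutExpired:
            result = None  # fall through to the retry logic, no traceback
        if result is not None and result.returncode == 0:
            url = result.stdout.strip()
            if url.startswith("https://"):
                return url
        if attempt < attempts:
            sleep(retry_sleep)
    return None
```

Because the per-attempt budget is capped by `--max-time`, the worst case grows with the number of attempts plus the sleeps between them, rather than with however long the server stalls.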
- Use #0d1117 (GitHub dark) as background instead of #aaaaff (lavender) - Add --no-round-corner to avoid corner artifacts against dark bg - Add --no-line-number for cleaner terminal output frames - Match ffmpeg padding color to #0d1117 so stitched frames are seamless Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
20px was too tight — window dots sat at the top edge with no breathing room. 40px gives a comfortable margin above and below. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mp tapes - run_cmd catches TimeoutExpired and returns a controlled failure instead of crashing with a traceback - Frame normalization aborts on ffmpeg failure instead of silently continuing with potentially stale output - VHS tape rewrite uses tempfile.mkstemp instead of deterministic .tmp suffix to avoid overwriting user files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
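A minimal sketch of the two hardening patterns in this commit. The name `run_cmd` comes from the commit message, but its signature and return shape here are assumptions, and `write_temp_tape` is a hypothetical helper.

```python
import os
import subprocess
import sys
import tempfile


def run_cmd(cmd, timeout=60):
    """Run a command; return (ok, output) instead of raising on timeout."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    except FileNotFoundError:
        return False, f"not found: {cmd[0]}"
    ok = result.returncode == 0
    return ok, result.stdout if ok else result.stderr


def write_temp_tape(text):
    """Write rewritten tape contents to a unique temp file and return its path."""
    # mkstemp creates a new, uniquely named file, so it cannot overwrite a
    # user's existing file the way a fixed ".tmp" suffix could.
    fd, path = tempfile.mkstemp(suffix=".tape")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


if __name__ == "__main__":
    print(run_cmd([sys.executable, "-c", "print('ok')"]))
```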
💡 Codex Review
Reviewed commit: b425cdf468
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
Active voice, no em dashes, plain English, varied sentence length, digits over words, no filler phrases. Technical jargon stays when it's the clearest term. User style preferences override these defaults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The -- pattern is just an em dash in disguise. Updated the writing voice rule to explicitly ban it and suggest periods, commas, colons, or parentheses instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Palette generation and final GIF encoding ignored return codes, so a failed ffmpeg could report success if a stale output file existed from a previous run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
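The stale-file failure mode can be sketched as a 2-pass encode that checks each return code and clears any previous output first. The function name, parameters, and injectable `run` are assumptions; the ffmpeg invocations use the standard `palettegen` and `paletteuse` filters, not necessarily the exact arguments in the script.

```python
import os
import subprocess


def stitch_gif(frames_pattern, palette, output, run=subprocess.run):
    """Two-pass GIF encode that fails loudly if either ffmpeg pass fails."""
    # Remove any output from a previous run so it cannot mask a failure.
    if os.path.exists(output):
        os.remove(output)
    # Pass 1: build a palette from the input frames.
    pass1 = run(["ffmpeg", "-y", "-i", frames_pattern, "-vf", "palettegen", palette])
    if pass1.returncode != 0:
        raise RuntimeError("palette generation failed")
    # Pass 2: encode the GIF using that palette.
    pass2 = run(["ffmpeg", "-y", "-i", frames_pattern, "-i", palette,
                 "-lavfi", "paletteuse", output])
    if pass2.returncode != 0 or not os.path.exists(output):
        raise RuntimeError("GIF encoding failed")
    return output
```

Checking both the return code and the existence of a fresh output file means a successful report always corresponds to a file written by this run.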
- Specify argument format when invoking ce-demo-reel (pass a target description inferred from the diff) - Document how to detect failure from ce-demo-reel output (check Tier, URL, and Embed fields) - DU-3 now checks for existing evidence in the PR body and preserves it unless the user asks to refresh or remove - Fix bare "demo-reel" reference in example to "ce-demo-reel" - Simplify ce-demo-reel output: remove redundant Label field, clarify that Embed is the deliverable with heading included Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: cc724649e4
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
- detect command now uses git rev-parse --show-toplevel so it finds manifests from the repo root regardless of working directory - Remix detection uses @remix-run/react (the actual npm package key) instead of bare "remix" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detection now scopes to the relevant subdirectory based on the diff target from Step 0. In monorepos, the agent passes the changed package's root instead of the repo root. If the agent's understanding of the change contradicts the script's classification, the agent wins. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These documented the GitHub native video upload and agent-browser auth patterns used by feature-video, which was replaced by ce-demo-reel's simpler catbox GIF approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Knowledge track learning from the ce-demo-reel build. Bash hit 4 bug classes across review rounds (set -e footguns, bash 3.2 compat, frame reduction math, builtin spawning). Python subprocess model eliminates all of them. Includes when bash is still the right choice. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: 54a738ad05
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
ce-demo-reel now returns Tier, Description (1-liner of what the evidence shows), and URL. The caller formats the markdown. Removes the Embed field and markdown generation from upload-and-approval.md. Cleaner boundary: ce-demo-reel captures, the caller presents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Short tokens like "gin", "echo", "chi" false-positive on unrelated module names (e.g., "engine" contains "gin"). Now matches full paths like github.com/gin-gonic/gin. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
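A sketch of full-path matching against `go.mod`, to show why it avoids the "engine" false positive. The function name and the module table are hypothetical: only `github.com/gin-gonic/gin` appears in the commit message, and the echo and chi paths are my assumption of their usual module paths.

```python
# Hypothetical table of web framework module paths.
GO_WEB_MODULES = (
    "github.com/gin-gonic/gin",
    "github.com/labstack/echo",  # assumed path
    "github.com/go-chi/chi",     # assumed path
)


def go_web_modules(go_mod_text):
    """Return framework module paths required by a go.mod file."""
    found = set()
    for line in go_mod_text.splitlines():
        parts = line.strip().split()
        if parts and parts[0] == "require":
            parts = parts[1:]  # single-line form: require <module> <version>
        if not parts:
            continue
        module = parts[0]
        for base in GO_WEB_MODULES:
            # Match the whole path, or a major-version suffix such as /v4.
            if module == base or module.startswith(base + "/v"):
                found.add(base)
    return sorted(found)
```

Comparing whole module paths means a project named `engine` no longer trips the `gin` check, because no substring test is involved.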
Node: express, fastify, koa, hono Go: net/http Python: sanic, litestar Rust: poem, tide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: 0964cf4d0d
plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py
net/http is stdlib and never appears in go.mod. The agent detects stdlib web servers from source imports in the diff and overrides the classification per the "signal, not a gate" design. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Visual evidence for PRs now works for any project type. CLI tools, libraries, desktop apps all get demos. Previously, the capture flow assumed a Rails-style dev server and agent-browser, so CLI and library PRs got nothing.
`ce-demo-reel` replaces the hardcoded capture block with a tiered architecture. A Python script handles the deterministic parts (project detection, tool checks, stitching, upload). The agent just decides what to capture and asks the user to approve.
Demo
3 capture tiers, 3 GIFs. Each was produced by a different tier of `ce-demo-reel` to show how the skill picks its approach based on project type and available tools.
- Terminal recording (VHS tier): the pipeline detecting project types and recommending capture tiers in a live terminal session.
- Screenshot reel (silicon + ffmpeg tier): detect a CLI tool from its package.json, then get the recommended capture tier.
- Browser reel (agent-browser + ffmpeg tier): headless browser screenshots across 3 pages on bun.sh, stitched into an animated GIF.
What changed
New `ce-demo-reel` skill with tiered capture:
Python pipeline script (`scripts/capture-demo.py`) with 7 subcommands:
- `preflight`: tool availability (JSON output, replaces 4 separate `command -v` calls)
- `detect`: project type from manifests (replaces agent interpreting a reference file)
- `recommend`: tier recommendation lookup (replaces agent interpreting a markdown table)
- `stitch`: frame normalization + 2-pass GIF stitching
- `screenshot-reel`: silicon rendering + stitch in 1 call
- `terminal-recording`: VHS execution + output validation
- `upload`: catbox.moe with retry (30s timeout, 10s connect timeout)
git-commit-push-pr improved:
Scratch space guidance added to AGENTS.md. Rule of thumb: `.context/` for workflow state other skills read, `mktemp -d` for throwaway artifacts like screenshots and GIFs.
feature-video removed. GitHub-native MP4 upload via DOM manipulation was over-engineered for what GIFs on catbox.moe do more simply.
Key design decisions
- … `set -e` footguns, negative array indexing, frame reduction logic, `command` builtin). Python's subprocess handling is safer and the script is testable without BATS.
- … `mktemp -d` keeps them out of the repo tree entirely.
Test plan
- `bun run release:validate` passes (42 skills, 51 agents)
- `bun test` passes (657 tests)
- `tests/ce-demo-reel.test.ts`: 21 tests covering:
- … `evidence-capture` or `feature-video` in skill files