feat(ce-demo-reel): add demo reel skill with Python capture pipeline by tmchow · Pull Request #541 · EveryInc/compound-engineering-plugin

tmchow · 2026-04-09T22:28:28Z

Summary

Visual evidence for PRs now works for any project type: CLI tools, libraries, and desktop apps all get demos. Previously, the capture flow assumed a Rails-style dev server and agent-browser, so CLI and library PRs got nothing.

ce-demo-reel replaces the hardcoded capture block with a tiered architecture. A Python script handles the deterministic parts (project detection, tool checks, stitching, upload). The agent just decides what to capture and asks the user to approve.

Demo

3 capture tiers, 3 GIFs. Each was produced by a different tier of ce-demo-reel to show how the skill picks its approach based on project type and available tools.

Terminal recording (VHS tier): the pipeline detecting project types and recommending capture tiers in a live terminal session.

Screenshot reel (silicon + ffmpeg tier): detect a CLI tool from its package.json, then get the recommended capture tier.

Browser reel (agent-browser + ffmpeg tier): headless browser screenshots across 3 pages on bun.sh, stitched into an animated GIF.

What changed

New ce-demo-reel skill with tiered capture:

| Tier | Best for | Tools | Output |
| --- | --- | --- | --- |
| Browser reel | Web apps, Electron via CDP | agent-browser + ffmpeg | Animated GIF |
| Terminal recording | CLI tools with motion | vhs (charmbracelet) | Animated GIF |
| Screenshot reel | CLI discrete steps | silicon + ffmpeg | Animated GIF |
| Static screenshots | Fallback | agent-browser or silicon | PNGs |
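
A rough sketch of how a tier lookup like this could work. Tool requirements come from the table; the tier slugs, project-type strings, preference order, and the `recommend` signature are assumptions for illustration and may not match capture-demo.py.

```python
# Hypothetical sketch. Tool requirements mirror the tier table; everything
# else (slugs, preference order) is assumed for illustration.
TIER_TOOLS = {
    "browser-reel": [{"agent-browser", "ffmpeg"}],
    "terminal-recording": [{"vhs"}],
    "screenshot-reel": [{"silicon", "ffmpeg"}],
    "static-screenshots": [{"agent-browser"}, {"silicon"}],  # either tool works
}

# Assumed preference order per project type (not taken from the PR).
PREFERENCES = {
    "web-app": ["browser-reel", "static-screenshots"],
    "desktop-app": ["browser-reel", "static-screenshots"],
    "cli-tool": ["terminal-recording", "screenshot-reel", "static-screenshots"],
    "library": ["screenshot-reel", "static-screenshots"],
}


def recommend(project_type: str, available: set[str]) -> str:
    """Return the first preferred tier whose tool requirements are met."""
    for tier in PREFERENCES.get(project_type, []):
        if any(required <= available for required in TIER_TOOLS[tier]):
            return tier
    return "skip"  # unknown project type or no usable tools
```

A lookup table keeps the decision deterministic, which is the point of moving it out of a markdown reference the agent had to interpret.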

Python pipeline script (scripts/capture-demo.py) with 7 subcommands:

- preflight: tool availability (JSON output, replaces 4 separate command -v calls)
- detect: project type from manifests (replaces agent interpreting a reference file)
- recommend: tier recommendation lookup (replaces agent interpreting a markdown table)
- stitch: frame normalization + 2-pass GIF stitching
- screenshot-reel: silicon rendering + stitch in 1 call
- terminal-recording: VHS execution + output validation
- upload: catbox.moe with retry (30s timeout, 10s connect timeout)
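
A minimal sketch of a 7-subcommand entry point with a working preflight. The subcommand names come from the list above. The JSON shape, the `TOOLS` list (including ffprobe), and the function names are assumptions, not the script's actual layout.

```python
import argparse
import json
import shutil

# Tool names appear in this PR; checking ffprobe here is an assumption.
TOOLS = ["agent-browser", "vhs", "silicon", "ffmpeg", "ffprobe"]

SUBCOMMANDS = ("preflight", "detect", "recommend", "stitch",
               "screenshot-reel", "terminal-recording", "upload")


def preflight() -> dict[str, bool]:
    """Report which tools are on PATH, the job `command -v` did in bash."""
    return {tool: shutil.which(tool) is not None for tool in TOOLS}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="capture-demo.py")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in SUBCOMMANDS:
        sub.add_parser(name)
    return parser


def main(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    if args.command == "preflight":
        print(json.dumps(preflight()))  # one JSON object for the agent to read
        return 0
    print(f"{args.command}: not implemented in this sketch")
    return 1
```

Calling `main(["preflight"])` prints one JSON object, so the agent reads a single result instead of running a separate check per tool.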

git-commit-push-pr improved:

Restructured 3 prose-heavy sections into bullet logic for better agent instruction-following. Shipped together because ce-demo-reel integrates into the PR description flow.
New "Frame the narrative" step forces before/after/scope articulation before drafting. Also strengthens "lead with value" with mechanism-vs-outcome guidance.
New writing voice defaults to catch AI slop: active voice, no em dashes, plain English, varied sentence length, digits over words.

Scratch space guidance added to AGENTS.md. Rule of thumb: .context/ for workflow state other skills read, mktemp -d for throwaway artifacts like screenshots and GIFs.

feature-video removed. GitHub-native MP4 upload via DOM manipulation was over-engineered for what GIFs on catbox.moe do more simply.

Key design decisions

Python over bash. The original bash script hit 4 bugs across review rounds: set -e footguns, negative array indexing that breaks on macOS bash 3.2, frame reduction logic, and spawning the command builtin as an executable. Python's subprocess handling is safer and the script is testable without BATS.
Script handles execution, skill handles judgment. Manifest parsing, tool checks, ffmpeg commands, and upload retry are in the script. The agent decides what to capture and which pages to visit.
Artifacts in OS temp, not the repo. Evidence files get uploaded and discarded. mktemp -d keeps them out of the repo tree entirely.
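
The Python equivalent of `mktemp -d` is `tempfile.mkdtemp`. A small sketch of the throwaway-artifact pattern follows; the helper name, prefix, and file name are illustrative.

```python
import shutil
import tempfile
from pathlib import Path


def capture_workspace() -> Path:
    """Create a throwaway directory in the OS temp area, like `mktemp -d`."""
    return Path(tempfile.mkdtemp(prefix="demo-reel-"))  # prefix is illustrative


workdir = capture_workspace()
frame = workdir / "frame-001.png"
frame.write_bytes(b"")       # stand-in for a real screenshot
# ... stitching and upload would happen here ...
shutil.rmtree(workdir)       # discard once the evidence is uploaded
```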

Test plan

bun run release:validate passes (42 skills, 51 agents)
bun test passes (657 tests)
tests/ce-demo-reel.test.ts: 21 tests covering:
- Preflight JSON output
- Project detection (8 scenarios: web-app, cli-tool, desktop-app, library, text-only, Electron priority, Rails, Go CLI)
- Tier recommendation (6 scenarios: all tool combos, no-tools fallback, available-list filtering)
- Stitch arg validation + ffmpeg integration (GIF magic bytes, multi-frame)
- Upload error paths
No stale references to evidence-capture or feature-video in skill files
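
The GIF magic-bytes check from the test plan can be sketched in a few lines. The real tests are in TypeScript (tests/ce-demo-reel.test.ts); this Python version shows the same idea, and the helper name is hypothetical.

```python
from pathlib import Path

GIF_SIGNATURES = (b"GIF87a", b"GIF89a")  # the two valid GIF header variants


def looks_like_gif(path: Path) -> bool:
    """Return True when the file starts with a GIF signature."""
    with path.open("rb") as handle:
        return handle.read(6) in GIF_SIGNATURES
```

Checking the header catches the case where ffmpeg wrote an empty or truncated file but still left something at the output path.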

…move feature-video

Replace the hardcoded bin/dev + agent-browser + phantom imgup block in shipping-workflow.md with a project-type-aware evidence-capture skill that works across web apps, CLI tools, libraries, and desktop apps.

The skill auto-detects project type, checks available tools (agent-browser, vhs, silicon, ffmpeg), recommends a capture tier (browser reel, terminal recording, screenshot reel, static screenshots, or skip), and uploads evidence to catbox.moe. A bash pipeline script handles ffmpeg stitching, frame normalization, palette generation, size optimization, and upload.

Removes feature-video -- its GitHub-native MP4 upload via agent-browser DOM manipulation was over-engineered. GIFs uploaded to catbox render inline everywhere without platform-specific upload hacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ullet logic

Convert three dense prose paragraphs into structured bullet/sub-bullet lists for better agent instruction-following and human reviewability:

- DU-3: 6-action paragraph -> numbered step list with sub-bullets
- Step 1 clean tree: interleaved prose/bullets -> labeled decision tree
- Step 7 existing PR: dense paragraph -> 5-step numbered list

Also adds evidence-capture integration to Step 6 with a user-facing question gate for when the PR has observable behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3d75fa0c4f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh

plugins/compound-engineering/skills/evidence-capture/references/upload-and-approval.md

- Upload fallback now stages, commits, and pushes evidence to the branch so GitHub can render the relative path (previously left as local-only)
- Add capture-evidence.sh test suite covering arg validation and ffmpeg stitch integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3d2825b13a

tests/capture-evidence.test.ts

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh

- Replace negative array indexing (bash 4+) with portable index computation for macOS /bin/bash 3.2 compatibility
- Fix ffmpeg availability check in tests: use `which` instead of `command` (shell builtin, not spawnable as executable)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move frame existence validation before ffmpeg/ffprobe tool checks so the test gets the correct "Frame not found" error on CI runners that don't have ffmpeg installed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7495bef165

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh

Adds a "Frame the narrative before sizing" pre-writing step that forces before/after/scope articulation before drafting. Also strengthens the "lead with value" principle with a mechanism-vs-outcome anti-example, updates the medium sizing tier to reference the narrative frame, and adds guidance to name new test files in test plans. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1e9bdeae29

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh

…ror handling

- Frame reduction step minimum raised from 1 to 2 so 3-4 frame GIFs actually drop middle frames instead of re-adding all of them
- Curl upload failures now use || true to prevent set -euo pipefail from exiting before the retry/fallback logic can run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
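
To see why a step of 1 drops nothing, here is a hypothetical Python sketch of step-based frame reduction. The original was bash and its exact logic is not shown in this PR, so the function and its sampling rule are assumptions.

```python
def reduce_frames(frames: list[str], step: int) -> list[str]:
    """Keep every `step`-th frame plus the last one; step is clamped to >= 2.

    With step == 1 the slice keeps every frame, so no frame is dropped.
    Clamping to 2 makes short reels lose a middle frame.
    """
    step = max(2, step)
    kept = frames[::step]
    if frames and (len(frames) - 1) % step != 0:
        kept.append(frames[-1])  # always end on the final state
    return kept
```

With the clamp, a 3-frame reel reduces to its first and last frame, and a 4-frame reel drops its second frame.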

Move all deterministic logic into a Python script with 7 subcommands: preflight, detect, recommend, stitch, screenshot-reel, terminal-recording, upload. The agent's role shrinks to judgment calls (what to capture, which tier, user approval) while the script handles how.

- Remove capture-evidence.sh and project-detection.md (logic moved to Python detect subcommand)
- Update SKILL.md Steps 2/4/6 to call script instead of inline logic
- Update tier references to use script for stitching and recording
- Fix Codex review findings: require ffprobe for stitched tiers, rewrite VHS tape Output when --output overrides it
- Remove docs/evidence commit fallback: artifacts are ephemeral, use OS temp not repo tree
- Tests expanded from 8 to 21 covering preflight, detect (8 project type scenarios), recommend (6 tier mappings), stitch, and upload

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add rule of thumb: .context/ for workflow state that other skills read, mktemp -d for throwaway artifacts (screenshots, GIFs, recordings) that get uploaded and discarded. Keeps ephemeral files out of the repo tree. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rename skill directory, frontmatter, script (capture-demo.py), and all cross-skill references. Adds "capture evidence" and "add proof to a PR" as trigger phrases in the description so the old vocabulary still works. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f129c2a63c

plugins/compound-engineering/skills/demo-reel/scripts/capture-demo.py

…mo-reel

- Fix Angular detection: use @angular/core instead of angular (the npm package key is scoped)
- Catch subprocess.TimeoutExpired on curl upload so timeouts fall through to retry logic instead of raising a traceback
- Rename demo-reel -> ce-demo-reel to align with ce: namespace convention (ce-plan, ce-review, ce-work, ce-demo-reel)
- Rename test file to ce-demo-reel.test.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
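
A sketch of manifest-based detection that matches exact dependency keys, which is why the scoped name matters. The dependency names appear in this PR's commits; the set, the function name, and the web-app rule are assumptions.

```python
import json

# Keys exactly as they appear in package.json. Names are mentioned in this
# PR; grouping them into one set is an illustrative simplification.
WEB_FRAMEWORK_DEPS = {
    "@angular/core", "@remix-run/react",
    "express", "fastify", "koa", "hono",
}


def is_web_app(package_json_text: str) -> bool:
    """True when any declared dependency is a known web framework key."""
    manifest = json.loads(package_json_text)
    deps = {
        **manifest.get("dependencies", {}),
        **manifest.get("devDependencies", {}),
    }
    return any(name in WEB_FRAMEWORK_DEPS for name in deps)
```

Because the lookup is by exact key, a bare `angular` entry never matches `@angular/core`.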

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 100e5ab31d

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py

120s per attempt meant a worst-case 245s blocking the agent when catbox is down. With 30s timeout + 10s connect timeout + 2s retry sleep, worst case is ~64s. Healthy uploads complete in 2-5s regardless. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
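
A hedged sketch of a bounded upload with one retry. The 30s and 10s timeouts and the 2s sleep come from the commit above. The attempt count, function names, and the request form (catbox's public upload API as commonly documented) are assumptions, not the script's confirmed code.

```python
import subprocess
import time
from typing import Callable, Optional

CATBOX_API = "https://catbox.moe/user/api.php"


def curl_upload(path: str) -> Optional[str]:
    """One upload attempt; returns the hosted URL or None on any failure."""
    cmd = [
        "curl", "--silent", "--fail",
        "--max-time", "30", "--connect-timeout", "10",
        "-F", "reqtype=fileupload", "-F", f"fileToUpload=@{path}",
        CATBOX_API,
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=35)
    except (subprocess.TimeoutExpired, OSError):
        return None  # fall through to the retry instead of raising
    url = result.stdout.strip()
    return url if result.returncode == 0 and url.startswith("https://") else None


def upload_with_retry(
    path: str,
    attempt: Callable[[str], Optional[str]] = curl_upload,
    attempts: int = 2,          # assumed; the PR does not state the count
    sleep_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    for n in range(attempts):
        url = attempt(path)
        if url:
            return url
        if n < attempts - 1:
            sleep(sleep_s)      # brief pause before the next attempt
    return None
```

Passing `attempt` and `sleep` as parameters lets a test exercise the retry path without touching the network.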

- Use #0d1117 (GitHub dark) as background instead of #aaaaff (lavender)
- Add --no-round-corner to avoid corner artifacts against dark bg
- Add --no-line-number for cleaner terminal output frames
- Match ffmpeg padding color to #0d1117 so stitched frames are seamless

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

20px was too tight — window dots sat at the top edge with no breathing room. 40px gives a comfortable margin above and below. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…mp tapes

- run_cmd catches TimeoutExpired and returns a controlled failure instead of crashing with a traceback
- Frame normalization aborts on ffmpeg failure instead of silently continuing with potentially stale output
- VHS tape rewrite uses tempfile.mkstemp instead of deterministic .tmp suffix to avoid overwriting user files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
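
A minimal sketch of a `run_cmd` wrapper that reports failure as data. The name `run_cmd` comes from the commit; the `CmdResult` type, the signature, and the default timeout are assumptions.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CmdResult:
    ok: bool
    stdout: str
    stderr: str


def run_cmd(cmd: list[str], timeout: float = 60.0) -> CmdResult:
    """Run a command and return a controlled failure instead of raising."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return CmdResult(False, "", f"timed out after {timeout}s")
    except OSError as exc:  # e.g. the executable is not installed
        return CmdResult(False, "", str(exc))
    return CmdResult(proc.returncode == 0, proc.stdout, proc.stderr)
```

Callers branch on `ok` and never see a traceback, so a hung or missing tool becomes an ordinary failure the pipeline can report.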

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b425cdf468

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py

Active voice, no em dashes, plain English, varied sentence length, digits over words, no filler phrases. Technical jargon stays when it's the clearest term. User style preferences override these defaults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The -- pattern is just an em dash in disguise. Updated the writing voice rule to explicitly ban it and suggest periods, commas, colons, or parentheses instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Palette generation and final GIF encoding ignored return codes, so a failed ffmpeg could report success if a stale output file existed from a previous run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
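
A sketch of 2-pass GIF stitching that checks both return codes and clears stale output first. `palettegen` and `paletteuse` are standard ffmpeg filters. The function names, the 1 fps frame rate, and the palette file name are assumptions.

```python
import subprocess
from pathlib import Path


def palette_cmd(frames_pattern: str, palette: Path) -> list[str]:
    # Pass 1: derive an optimized color palette from the frames.
    return ["ffmpeg", "-y", "-framerate", "1", "-i", frames_pattern,
            "-vf", "palettegen", str(palette)]


def encode_cmd(frames_pattern: str, palette: Path, output: Path) -> list[str]:
    # Pass 2: encode the GIF using that palette.
    return ["ffmpeg", "-y", "-framerate", "1", "-i", frames_pattern,
            "-i", str(palette), "-lavfi", "paletteuse", str(output)]


def stitch(frames_pattern: str, output: Path, run=subprocess.run) -> bool:
    palette = output.with_suffix(".palette.png")
    output.unlink(missing_ok=True)  # a stale GIF must not mask a failed run
    for cmd in (palette_cmd(frames_pattern, palette),
                encode_cmd(frames_pattern, palette, output)):
        if run(cmd, capture_output=True).returncode != 0:
            return False  # stop at the first failing pass
    return output.exists()
```

Deleting the old output before encoding means a leftover file from a previous run can never be mistaken for success.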

- Specify argument format when invoking ce-demo-reel (pass a target description inferred from the diff)
- Document how to detect failure from ce-demo-reel output (check Tier, URL, and Embed fields)
- DU-3 now checks for existing evidence in the PR body and preserves it unless the user asks to refresh or remove
- Fix bare "demo-reel" reference in example to "ce-demo-reel"
- Simplify ce-demo-reel output: remove redundant Label field, clarify that Embed is the deliverable with heading included

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc724649e4

plugins/compound-engineering/skills/ce-demo-reel/SKILL.md

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py

- detect command now uses git rev-parse --show-toplevel so it finds manifests from the repo root regardless of working directory
- Remix detection uses @remix-run/react (the actual npm package key) instead of bare "remix"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
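
Resolving the repo root from Python might look like the sketch below. `git rev-parse --show-toplevel` is the command named in the commit; the helper name and the fall-back-to-`start` behavior are assumptions.

```python
import subprocess
from pathlib import Path


def repo_root(start: Path) -> Path:
    """Return the git top-level directory, or `start` when git cannot tell."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=start, capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return start  # git missing or hung: use the given directory
    top = proc.stdout.strip()
    return Path(top) if proc.returncode == 0 and top else start
```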

Detection now scopes to the relevant subdirectory based on the diff target from Step 0. In monorepos, the agent passes the changed package's root instead of the repo root. If the agent's understanding of the change contradicts the script's classification, the agent wins. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

These documented the GitHub native video upload and agent-browser auth patterns used by feature-video, which was replaced by ce-demo-reel's simpler catbox GIF approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Knowledge track learning from the ce-demo-reel build. Bash hit 4 bug classes across review rounds (set -e footguns, bash 3.2 compat, frame reduction math, builtin spawning). Python subprocess model eliminates all of them. Includes when bash is still the right choice. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 54a738ad05

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py

ce-demo-reel now returns Tier, Description (1-liner of what the evidence shows), and URL. The caller formats the markdown. Removes the Embed field and markdown generation from upload-and-approval.md. Cleaner boundary: ce-demo-reel captures, the caller presents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Short tokens like "gin", "echo", "chi" false-positive on unrelated module names (e.g., "engine" contains "gin"). Now matches full paths like github.com/gin-gonic/gin. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
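
A sketch of the difference between substring matching and full module-path matching on go.mod. The three module paths are real; the set, the helper names, and the simplified go.mod parsing are assumptions for illustration.

```python
import re

# Full module paths, compared exactly (after dropping a /vN major suffix).
GO_WEB_MODULES = {
    "github.com/gin-gonic/gin",
    "github.com/labstack/echo",
    "github.com/go-chi/chi",
}


def strip_major(path: str) -> str:
    """Drop a trailing /vN so echo/v4 and chi/v5 match their base path."""
    return re.sub(r"/v\d+$", "", path)


def required_modules(go_mod_text: str) -> set[str]:
    """Collect module paths from `require` lines and blocks."""
    modules: set[str] = set()
    in_block = False
    for raw in go_mod_text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        if line == "require (":
            in_block = True
        elif line == ")":
            in_block = False
        elif in_block or line.startswith("require "):
            parts = line.removeprefix("require ").split()
            if parts:
                modules.add(strip_major(parts[0]))
    return modules


def uses_web_framework(go_mod_text: str) -> bool:
    return bool(required_modules(go_mod_text) & GO_WEB_MODULES)
```

A substring test like `"gin" in "engine"` is true, which is the false positive the commit describes. Set intersection on full paths avoids it.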

Node: express, fastify, koa, hono
Go: net/http
Python: sanic, litestar
Rust: poem, tide

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0964cf4d0d

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py

net/http is stdlib and never appears in go.mod. The agent detects stdlib web servers from source imports in the diff and overrides the classification per the "signal, not a gate" design. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

tmchow and others added 2 commits April 9, 2026 15:27

chatgpt-codex-connector bot reviewed Apr 9, 2026

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh (outdated, resolved)

plugins/compound-engineering/skills/evidence-capture/references/upload-and-approval.md (outdated, resolved)

chatgpt-codex-connector bot reviewed Apr 9, 2026

tests/capture-evidence.test.ts (outdated, resolved)

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh (outdated, resolved)

tmchow and others added 2 commits April 9, 2026 16:34

chatgpt-codex-connector bot reviewed Apr 9, 2026

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh (outdated, resolved)

chatgpt-codex-connector bot reviewed Apr 10, 2026

plugins/compound-engineering/skills/evidence-capture/scripts/capture-evidence.sh (outdated, resolved)

tmchow and others added 4 commits April 9, 2026 19:11

tmchow changed the title ~~feat(evidence-capture): generalize evidence capture across project types~~ feat(demo-reel): add demo-reel skill with Python capture pipeline Apr 10, 2026

chatgpt-codex-connector bot reviewed Apr 10, 2026

plugins/compound-engineering/skills/demo-reel/scripts/capture-demo.py (outdated, resolved)

plugins/compound-engineering/skills/demo-reel/scripts/capture-demo.py (outdated, resolved)

tmchow changed the title ~~feat(demo-reel): add demo-reel skill with Python capture pipeline~~ feat(ce-demo-reel): add demo-reel skill with Python capture pipeline Apr 10, 2026

chatgpt-codex-connector bot reviewed Apr 10, 2026

tmchow changed the title ~~feat(ce-demo-reel): add demo-reel skill with Python capture pipeline~~ feat(ce-demo-reel): add demo reel skill with Python capture pipeline Apr 10, 2026

tmchow and others added 3 commits April 9, 2026 20:27

fix(ce-demo-reel): increase silicon vertical padding to 40px

b6fbf14

chatgpt-codex-connector bot reviewed Apr 10, 2026

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py (outdated, resolved)

tmchow and others added 4 commits April 9, 2026 20:43

chatgpt-codex-connector bot reviewed Apr 10, 2026

plugins/compound-engineering/skills/ce-demo-reel/SKILL.md (outdated, resolved)

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py (outdated, resolved)

tmchow and others added 5 commits April 9, 2026 20:55

chore: remove completed refactor plan

942dc65

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector bot reviewed Apr 10, 2026

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py (resolved)

tmchow and others added 3 commits April 9, 2026 21:13

fix(ce-demo-reel): add missing web framework deps to detection

0964cf4

chatgpt-codex-connector bot reviewed Apr 10, 2026

plugins/compound-engineering/skills/ce-demo-reel/scripts/capture-demo.py (resolved)

tmchow merged commit b979143 into main Apr 10, 2026
2 checks passed

github-actions bot mentioned this pull request Apr 10, 2026

chore: release main #529

Merged
