73 changes: 34 additions & 39 deletions .agentguard/squads/shellforge/blockers.md
Original file line number Diff line number Diff line change
@@ -1,77 +1,72 @@
# ShellForge Squad — Blockers

**Updated:** 2026-03-29T20:00Z
**Reported by:** EM run 5 (claude-code:opus:shellforge:em)
**Updated:** 2026-03-30T00:45Z
**Reported by:** EM run 6 (claude-code:opus:shellforge:em)

---

## P0 — Critical Blockers (2)
## P0 — Critical Blockers

### 1. All 3 PRs Awaiting Human Review — BLOCKING SQUAD PROGRESS
**Description:** All 3 open PRs are passing CI (5/5 checks each) but blocked on `REVIEW_REQUIRED`. GitHub branch protection prevents the EM (authored as jpleva91) from self-approving.
**PRs blocked:**
- **#83** — `fix(p0): close governance fail-open vulnerabilities` — closes #58, #59, #62, #67, #69, #75
- **#84** — `fix(docs): update stale Crush comments in cmdEvaluate (#74)` — closes #74
- **#85** — `chore(squad): EM state update — run 4` — squad ops housekeeping

**Action Required:** @jpleva91 or a collaborator must review and approve PRs #83, #84, #85.
**Priority:** Review #83 first — it carries all P0/P1 governance security fixes.

### 2. PR Budget AT LIMIT (3/3) — No New Fix PRs Possible
**Description:** Squad has reached the max of 3 open PRs. No new work can be opened until at least one PR merges.
**Impact:** P2 bugs (#65 scheduler silent error, #66 flattenParams dead code, #52 cmdScan glob broken, #53 README stale) remain queued but cannot be addressed.
**Unblocked by:** Merging any of #83, #84, or #85.
**None.** All P0 governance bugs are closed.

---

## P1 — Remaining Work (queued, no new PRs until budget frees)
## P1 — Active Work

### #68 — Zero test coverage across all packages
**Severity:** High — governance runtime with no tests is unshippable
**Impact:** Can't validate fix correctness, no regression protection. Blocks dogfood credibility.
**Assignee:** qa-agent
**URL:** https://github.com/AgentGuardHQ/shellforge/issues/68
### PR #86 — Governance timeout override (awaiting human review)
**Description:** PR #86 removes the hardcoded 60s cap in `runShellWithRTK` and `runShellRaw` that silently overrode the governance engine's timeout value. CI pending; GitHub branch protection prevents self-approval.
**Action Required:** @jpleva91 review and approve PR #86.

### #63 — classifyShellRisk prefix matching too broad
**Severity:** High — false read-only classification on commands starting with `cat`/`ls`/`echo`
**Assignee:** qa-agent
**URL:** https://github.com/AgentGuardHQ/shellforge/issues/63
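
To illustrate why prefix matching over-classifies, here is a hedged sketch comparing a naive prefix check with a stricter token-based one. The function names and the allow-list are hypothetical; the real `classifyShellRisk` implementation is not shown in this diff and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// readOnly is an illustrative allow-list taken from the commands named in #63.
var readOnly = map[string]bool{"cat": true, "ls": true, "echo": true}

// naiveIsReadOnly treats any command string that starts with an allow-listed
// name as read-only, which also accepts "catalog-delete" or "echo hi; rm x".
func naiveIsReadOnly(cmd string) bool {
	for name := range readOnly {
		if strings.HasPrefix(cmd, name) {
			return true
		}
	}
	return false
}

// strictIsReadOnly requires the first whitespace-separated token to equal an
// allow-listed name exactly, and rejects strings containing shell
// metacharacters that could chain, redirect, or substitute commands.
func strictIsReadOnly(cmd string) bool {
	if strings.ContainsAny(cmd, ";|&><`$\n") {
		return false
	}
	fields := strings.Fields(cmd)
	return len(fields) > 0 && readOnly[fields[0]]
}

func main() {
	for _, cmd := range []string{"ls -la", "echo hi; rm -rf /tmp/x", "catalog-delete --all"} {
		fmt.Printf("%-28q naive=%v strict=%v\n", cmd, naiveIsReadOnly(cmd), strictIsReadOnly(cmd))
	}
}
```

The strict variant is deliberately conservative: it also rejects harmless commands such as `echo $HOME`, trading some false positives for fewer false read-only classifications.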

### #68 — Zero test coverage across all packages
**Severity:** High — governance runtime with no tests is unshippable
**Assignee:** qa-agent
**URL:** https://github.com/AgentGuardHQ/shellforge/issues/68

---

## P2 — Unassigned (queued, blocked by PR budget)
## P2 — Queued (unassigned)

| # | Issue | Notes |
|---|-------|-------|
| #76 | Dogfood: run ShellForge swarm on jared box | P0 governance bugs resolved — can now proceed |
| #65 | scheduler.go silent os.WriteFile error | Silent failure on job persistence |
| #66 | flattenParams dead code | Logic bug, result overwritten before use |
| #52 | filepath.Glob ** never matches Go files | cmdScan broken for entire scan feature |
| #52 | filepath.Glob ** never matches Go files | cmdScan scan feature broken |
| #53 | README stale ./shellforge commands | Docs rot |
| #51 | run() helper silently ignores errors | Silent failure in main.go |
| #50 | kernel version comparison lexicographic | setup.sh version gate broken |
| #49 | InferenceQueue not priority-aware | Documented but unimplemented |
| #26 | run-qa/report agents don't build binary if missing | Setup gap |
| #25 | RunResult.Success heuristic incorrect | Agent loop reliability |
| #24 | listFiles() relative paths bug | Path resolution error |

---

## Resolved (pending merge of PR #83)
## Resolved (this cycle)

- **#58** — bounded-execution wildcard policy blocked all run_shell → fix in PR #83
- **#62** — cmdEvaluate fail-open on JSON unmarshal → fix in PR #83
- **#75** — govern-shell.sh printf injection → fix in PR #83
- **#67** — govern-shell.sh fragile sed output parsing → fix in PR #83
- **#69** — rm policy only blocked -rf/-fr, not plain rm → fix in PR #83
- **#59** — misleading `# Mode: monitor` comment with `mode: enforce` → fix in PR #83
- **#74** — stale crush references in cmdEvaluate → fix in PR #84
- **#58** — bounded-execution wildcard policy blocked all run_shell → merged in PR #83
- **#62** — cmdEvaluate fail-open on JSON unmarshal → merged in PR #83
- **#75** — govern-shell.sh printf injection → merged in PR #83
- **#67** — govern-shell.sh fragile sed output parsing → merged in PR #83
- **#69** — rm policy only blocked -rf/-fr, not plain rm → merged in PR #83
- **#74** — stale crush references in cmdEvaluate → merged in PR #84
- **#59** — misleading `# Mode: monitor` comment → fixed in PR #83, closed manually

---

## Status Summary

| Item | Status |
|------|--------|
| PR #83 (P0 fixes) | CI ✅ 5/5 — REVIEW BLOCKED |
| PR #84 (P1 docs) | CI ✅ 5/5 — REVIEW BLOCKED |
| PR #85 (EM state) | CI ✅ 5/5 — REVIEW BLOCKED |
| PR budget | 3/3 AT LIMIT |
| Dogfood (#76) | BLOCKED on #83 merge |
| P0 issues | ✅ All closed |
| PR #86 (P1 timeout fix) | CI pending — REVIEW REQUIRED |
| PR budget | 1/3 |
| Dogfood (#76) | Governance unblocked — needs human trigger |
| QA-agent (#63, #68) | Active |
| New fix PRs | BLOCKED until budget frees |
| Retry loops | None |
| Blast radius | Low |
74 changes: 31 additions & 43 deletions .agentguard/squads/shellforge/state.json
Original file line number Diff line number Diff line change
@@ -1,77 +1,67 @@
{
"squad": "shellforge",
"updated_at": "2026-03-29T20:00:00Z",
"updated_at": "2026-03-30T00:45:00Z",
"sprint": {
"goal": "Harden enforcement runtime — fix all P0/P1 governance bugs before dogfood run",
"focus": "Security correctness: govern-shell.sh JSON safety, cmdEvaluate bypass, bounded-execution policy, test coverage baseline"
"focus": "Security correctness: P0 COMPLETE, P1 #28 in PR #86, test coverage (#68) and classifyShellRisk (#63) assigned to qa-agent"
},
"pr_budget": {
"max_open": 3,
"current_open": 3,
"status": "at-limit"
"current_open": 1,
"status": "ok"
},
"loop_guard": {
"retry_loop_detected": false,
"blast_radius": "low"
},
"issue_queue": {
"p0": [
{ "number": 58, "title": "Critical: bounded-execution policy denies ALL run_shell calls in enforce mode", "assignee": "em", "status": "fix-in-pr-83" },
{ "number": 62, "title": "bug: cmdEvaluate silently ignores JSON unmarshal error — governance bypass", "assignee": "em", "status": "fix-in-pr-83" },
{ "number": 75, "title": "bug: govern-shell.sh unescaped $COMMAND in printf — silently defaults to allow", "assignee": "em", "status": "fix-in-pr-83" }
],
"p0": [],
"p1": [
{ "number": 69, "title": "bug: governance policy gap — plain rm and rm -r not blocked by no-destructive-rm", "assignee": "em", "status": "fix-in-pr-83" },
{ "number": 67, "title": "bug: govern-shell.sh uses fragile sed to parse JSON", "assignee": "em", "status": "fix-in-pr-83" },
{ "number": 28, "title": "bug: bounded-execution policy timeout (300s) is silently overridden to 60s in shell execution", "assignee": "em", "status": "fix-in-pr-86" },
{ "number": 63, "title": "bug: classifyShellRisk prefix matching too broad — false read-only classification", "assignee": "qa-agent" },
{ "number": 68, "title": "test: zero test coverage across all packages", "assignee": "qa-agent" },
{ "number": 74, "title": "bug: stale crush references in cmd/shellforge/main.go", "assignee": "em", "status": "fix-in-pr-84" }
{ "number": 68, "title": "test: zero test coverage across all packages", "assignee": "qa-agent" }
],
"p2": [
{ "number": 65, "title": "bug: scheduler.go silently ignores os.WriteFile error", "assignee": null },
{ "number": 66, "title": "bug: dead code in flattenParams() overwrites result before using it", "assignee": null },
{ "number": 52, "title": "bug: filepath.Glob with ** in cmdScan never matches any Go files", "assignee": null },
{ "number": 59, "title": "agentguard.yaml misleading comment says monitor but mode is enforce", "assignee": "em", "status": "fix-in-pr-83" },
{ "number": 53, "title": "docs/readme: README still shows ./shellforge commands", "assignee": null },
{ "number": 76, "title": "Dogfood: run ShellForge swarm on jared box via RunPod GPU", "assignee": null }
{ "number": 76, "title": "Dogfood: run ShellForge swarm on jared box via RunPod GPU", "assignee": null },
{ "number": 51, "title": "bug: run() helper in main.go silently ignores command errors", "assignee": null },
{ "number": 50, "title": "bug: kernel version comparison in setup.sh is lexicographic, not numeric", "assignee": null },
{ "number": 49, "title": "bug: InferenceQueue is not priority-aware despite being documented as such", "assignee": null },
{ "number": 26, "title": "bug: run-qa-agent.sh and run-report-agent.sh don't build binary if missing", "assignee": null },
{ "number": 25, "title": "bug: agent RunResult.Success heuristic is incorrect", "assignee": null },
{ "number": 24, "title": "bug: listFiles() returns paths relative to cwd, not the listed directory", "assignee": null }
],
"p3": [
{ "number": 81, "title": "feat: OpenClaw as governed execution runtime in ShellForge", "assignee": null },
{ "number": 77, "title": "[research] Evaluate go-agent-framework sandboxing integration", "assignee": null },
{ "number": 71, "title": "[research] lean-ctx — 88% token reduction via shell hook + MCP server", "assignee": null },
{ "number": 73, "title": "[research] ml-explore/mlx-lm — Apple MLX inference backend", "assignee": null },
{ "number": 72, "title": "[research] nono — kernel-enforced agent sandbox via macOS Seatbelt", "assignee": null },
{ "number": 71, "title": "[research] lean-ctx — 88% token reduction via shell hook + MCP server", "assignee": null },
{ "number": 56, "title": "[research] mem0 — persistent cross-run agent memory", "assignee": null },
{ "number": 55, "title": "[research] microsoft/agent-governance-toolkit", "assignee": null },
{ "number": 54, "title": "[research] omlx — SSD KV caching doubles swarm capacity", "assignee": null },
{ "number": 81, "title": "feat: OpenClaw as governed execution runtime in ShellForge", "assignee": null }
{ "number": 11, "title": "[research] RTK integration — 70-90% token savings for agent runs", "assignee": null },
{ "number": 10, "title": "[research] TurboQuant integration — 6x KV cache compression", "assignee": null }
]
},
"pr_queue": [
{
"number": 83,
"title": "fix(p0): close governance fail-open vulnerabilities",
"status": "open",
"ci": "passing (5/5)",
"review_status": "REVIEW_REQUIRED — awaiting human approval (cannot self-approve)",
"issues_closed": [58, 59, 62, 67, 69, 75]
},
{
"number": 84,
"title": "fix(docs): update stale Crush comments in cmdEvaluate (#74)",
"status": "open",
"ci": "passing (5/5)",
"review_status": "REVIEW_REQUIRED — awaiting human approval (cannot self-approve)",
"issues_closed": [74]
},
{
"number": 85,
"title": "chore(squad): EM state update — run 4 (2026-03-29)",
"number": 86,
"title": "fix(governance): honour policy timeout in shell execution — remove 60s cap (#28)",
"status": "open",
"ci": "passing (5/5)",
"review_status": "REVIEW_REQUIRED — awaiting human approval (cannot self-approve)",
"issues_closed": []
"ci": "pending",
"review_status": "REVIEW_REQUIRED — awaiting human approval",
"issues_closed": [28]
}
],
"recently_closed": [
{ "number": 83, "merged": true, "issues_closed": [58, 62, 67, 69, 75], "date": "2026-03-30" },
{ "number": 84, "merged": true, "issues_closed": [74], "date": "2026-03-30" },
{ "number": 85, "merged": true, "issues_closed": [], "date": "2026-03-30" }
],
"agents": {
"qa-agent": { "status": "assigned", "schedule": "4h", "last_issue": 63 },
"report-agent": { "status": "idle", "schedule": "30m", "last_issue": null },
@@ -82,10 +72,8 @@
"No dev-agent in swarm — P0/P1 bugs require EM to author fixes directly"
],
"blockers": [
"PR #83 (P0 fixes): CI passing 5/5, review BLOCKED — GitHub prevents self-approval. Requires human review from @jpleva91 or a collaborator.",
"PR #84 (P1 docs fix): CI passing 5/5, review BLOCKED — same constraint.",
"PR #85 (EM state update): CI passing 5/5, review BLOCKED — same constraint.",
"PR budget AT LIMIT (3/3) — cannot open new fix PRs until at least one merges."
"PR #86 (P1 timeout fix): CI pending, review BLOCKED — GitHub prevents self-approval. Requires human review from @jpleva91.",
"Dogfood (#76): unblocked by P0 merge but no assignee yet — needs human trigger."
],
"notes": "Run 5 (2026-03-29T20:00Z): No new issues since Run 4. All 3 open PRs now passing CI (5/5) but all blocked on REVIEW_REQUIRED — GitHub branch protection prevents self-approval. PR budget at limit (3/3). No new work can be opened. Dogfood run (#76) still blocked pending PR #83 merge. Human review of PRs #83, #84, #85 is the sole critical path item."
"notes": "Run 6 (2026-03-30T00:45Z): PRs #83/#84/#85 all merged — P0 COMPLETE. Issue #59 closed (was already fixed by #83). PR #86 opened for last P1 (#28, timeout override). PR budget 1/3. Sprint goal nearly achieved — remaining P1s are #28 (in PR), #63/#68 (qa-agent). Dogfood run (#76) is unblocked from governance side but needs human to trigger."
}