22 changes: 21 additions & 1 deletion CLAUDE.md
Expand Up @@ -49,7 +49,8 @@ src/clayde/
# find_open_pr(), create_pull_request(), is_blocked(),
# add_pr_reviewer(), get_pr_reviews(),
# get_pr_review_comments(), parse_pr_url(),
# get_issue_author()
# get_issue_author(), get_check_runs(),
# get_required_check_names()
git.py # ensure_repo() — clone or update repos under REPOS_DIR
safety.py # Content filtering & plan approval: is_comment_visible(),
# filter_comments(), is_issue_visible(),
Expand All @@ -63,10 +64,13 @@ src/clayde/
orchestrator.py # main() — single cycle, run_loop() — container entry point
prompts/
work.j2 # Jinja2 template for the unified work prompt
fix_ci.j2 # prompt for diagnosing/fixing a failing PR pipeline
tasks/
__init__.py
work.py # run(issue_url) — unified: Claude decides next action
# (ask, plan, implement, open PR, or address review)
fix_ci.py # run(issue_url, pr_url, branch_name, failed_checks) —
# self-fix a failing CI pipeline on a clayde PR
webhook/
__init__.py
app.py # FastAPI app, /webhook/pebble, /health, OTel enqueue span
Expand Down Expand Up @@ -108,6 +112,7 @@ Plain `KEY=VALUE` file (no shell quoting). All keys use `CLAYDE_` prefix and are
| `CLAYDE_CLAUDE_API_KEY` | Anthropic API key for Claude SDK calls (required when backend=`api`) |
| `CLAYDE_CLAUDE_MODEL` | Model to use (default: `claude-opus-4-6`) |
| `CLAYDE_CLAUDE_BACKEND` | `api` (default) or `cli` — selects Anthropic SDK or Claude Code CLI |
| `CLAYDE_CI_FIX_MAX_ATTEMPTS` | Max autonomous CI-fix attempts per PR before giving up and notifying (default: `3`) |
| `CLAYDE_PEBBLE_ENABLED` | Set to `true` to enable the Pebble webhook |
| `CLAYDE_PEBBLE_TOKEN` | Bearer token the Pebble app sends |
| `CLAYDE_PEBBLE_HOST` | Public hostname for Traefik routing |
Expand Down Expand Up @@ -142,13 +147,26 @@ Per-issue state is stored in `state.json` under
| `pr_url` | PR opened for this issue, once detected via `find_open_pr()` |
| `in_progress` | `True` while the work task runs; a crash leaves it set so the next cycle retries |
| `last_seen_at` | ISO-UTC timestamp of the last completed cycle; used to detect new activity |
| `ci_fix_attempts` | Number of autonomous CI-fix attempts made for this PR (capped at `ci_fix_max_attempts`) |
| `last_ci_fix_attempt_sha` | PR head SHA of the last CI-fix attempt; prevents re-attempting the same commit |
| `ci_fix_exhausted_notified` | `True` once the operator has been alerted that the attempt budget is spent (avoids re-notifying) |

**Activity detection** (`_handle_issue`): the work task is invoked when any of
the following holds: `in_progress` is set (retry), `last_seen_at` is `None`
(never processed), there are new whitelist-visible comments, or there is new PR
review activity (inline comments or a review body). A pure PR approval with no
comments does **not** invoke Claude; it only advances `last_seen_at`.

**CI self-fix**: when there is *no* new human activity but an open PR exists,
`_handle_ci_fix()` checks the PR head commit's check runs (`get_check_runs()`,
filtered to branch-protection-required checks when defined). If a required
check has failed and a fix has not yet been attempted for that head SHA, the
`fix_ci` task is invoked: Claude inspects the failing job logs, pushes a fix to
the PR branch, and a summary is posted as an issue comment. Attempts are capped
per PR by `ci_fix_max_attempts` (default 3); once exhausted, the operator is
notified once via ntfy and Clayde stops attempting. Green/pending CI falls
through to normal review monitoring unchanged.
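
The branching described in this paragraph can be sketched as a pure decision function. This is a minimal illustration, not the code in `orchestrator.py`: the names `CiAction` and `decide_ci_action` are hypothetical, and only the state keys (`ci_fix_attempts`, `last_ci_fix_attempt_sha`, `ci_fix_exhausted_notified`) come from the table above.

```python
from enum import Enum


class CiAction(Enum):
    """Hypothetical labels for the outcomes described in the paragraph above."""
    FALL_THROUGH = "fall_through"   # CI green or pending: normal review monitoring
    NOTIFY_EXHAUSTED = "notify"     # budget spent, operator not yet alerted
    STOP = "stop"                   # budget spent and operator already alerted
    WAIT = "wait"                   # a fix was already attempted for this head SHA
    ATTEMPT_FIX = "attempt_fix"     # invoke the fix_ci task


def decide_ci_action(failed_checks, state, head_sha, max_attempts=3):
    """Pick the next CI action from the failed checks and the per-issue state."""
    if not failed_checks:
        return CiAction.FALL_THROUGH
    if state.get("ci_fix_attempts", 0) >= max_attempts:
        if state.get("ci_fix_exhausted_notified"):
            return CiAction.STOP
        return CiAction.NOTIFY_EXHAUSTED
    if state.get("last_ci_fix_attempt_sha") == head_sha:
        return CiAction.WAIT
    return CiAction.ATTEMPT_FIX
```

The budget check comes before the same-SHA check, so an exhausted PR triggers the notification even when its head commit has not changed.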

**Limits & retries**: `UsageLimitError` / `InvocationTimeoutError` from Claude
leave `in_progress=True` so the next cycle retries automatically. Other
exceptions clear `in_progress` and log the error. Closed issues are pruned
Expand Down Expand Up @@ -221,6 +239,8 @@ Key functions:
- `get_pr_reviews()` / `get_pr_review_comments()` — fetch PR review data
- `edit_comment()` — edit an existing issue comment
- `parse_pr_url()` — parse PR URL into (owner, repo, pr_number)
- `get_check_runs(g, owner, repo, ref)` — failed check runs for a commit SHA (name, conclusion, details_url)
- `get_required_check_names(g, owner, repo, branch)` — required status-check names from branch protection (empty set when unprotected)
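
As an illustration of how the two helpers above combine, the sketch below filters a list of failed checks by the required names, treating an empty set as "no filter". The function name `blocking_checks` and the sample data are hypothetical; the dict shape (`name`, `conclusion`, `details_url`) follows the description of `get_check_runs()`.

```python
def blocking_checks(failed, required):
    """Return the failed checks that should block the PR.

    `failed` is a list of dicts shaped like the items from get_check_runs();
    `required` is a set of names shaped like the result of
    get_required_check_names(). An empty set means no filter is applied.
    """
    if not required:
        return list(failed)
    return [check for check in failed if check["name"] in required]


# Hypothetical sample data for illustration only.
failed = [
    {"name": "tests", "conclusion": "failure", "details_url": ""},
    {"name": "docs-preview", "conclusion": "timed_out", "details_url": ""},
]

protected = blocking_checks(failed, {"tests", "lint"})
unprotected = blocking_checks(failed, set())
```

With branch protection in place only `tests` counts as blocking; on an unprotected branch both failures do.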

---

Expand Down
4 changes: 4 additions & 0 deletions README.md
Expand Up @@ -17,6 +17,7 @@ Clayde is assigned GitHub issues in software repositories. For each issue it:
3. Posts a summary comment after each work cycle
4. Opens a pull request (Claude creates the PR directly with a description and, for diffs spanning more than 3 files, a recommended reading order) and assigns the issue author as reviewer
5. Monitors the PR and addresses review comments when they appear
6. Monitors the PR's CI pipeline and, if a required check fails, autonomously diagnoses the failing job and pushes a fix

Clayde runs as a Docker container in a continuous loop (default: every 5 minutes). Rather than a rigid state machine, it uses **timestamp-based activity detection**: each issue records the last time it was processed, and only new visible activity since that timestamp triggers a new Claude invocation.

Expand All @@ -34,6 +35,7 @@ Clayde's loop is event-driven and stateless by design:
6. **Crash recovery**: `in_progress` is set before invoking Claude and cleared after. If the process crashes mid-run, the next cycle retries automatically.
7. **Pure PR approvals** (no comments) update `last_seen_at` without invoking Claude.
8. **Closed issues** are pruned from state automatically.
9. **CI self-fix**: when an issue's PR is open and there is no new human activity, Clayde checks the PR head commit's CI status. If a required check has failed (and a fix has not already been attempted for that commit), Claude inspects the failing job logs and pushes a fix to the branch. This happens up to `CLAYDE_CI_FIX_MAX_ATTEMPTS` times per PR, after which the operator is notified once via ntfy. Green or pending CI falls through to normal review monitoring.

---

Expand All @@ -59,6 +61,7 @@ Whitelisted users are configured via `CLAYDE_WHITELISTED_USERS` in `data/config.
- **Full issue lifecycle**: Engage → implement → PR → review, all driven by new activity
- **PR creation by Claude**: Claude writes the PR description and a recommended reading order for larger diffs
- **PR review handling**: Reads and addresses reviewer feedback automatically
- **CI self-healing**: Detects failing required checks on its own PRs and pushes fixes autonomously, with a per-PR attempt cap and operator notification
- **Rate-limit resilience**: Detects Claude usage limits and automatically retries
- **Crash recovery**: `in_progress` flag ensures interrupted runs are retried next cycle
- **Safety filtering**: Whitelist-based content filtering prevents acting on unauthorized content
Expand Down Expand Up @@ -173,6 +176,7 @@ In any repository the bot has access to, assign issues to the bot account. Clayd
| `CLAYDE_CLAUDE_BACKEND` | `api` (default) or `cli` |
| `CLAYDE_CLAUDE_API_KEY` | Anthropic API key (required when backend=`api`) |
| `CLAYDE_CLAUDE_MODEL` | Model to use (default: `claude-opus-4-6`) |
| `CLAYDE_CI_FIX_MAX_ATTEMPTS` | Max autonomous CI-fix attempts per PR before notifying (default: `3`) |
| `CLAYDE_PEBBLE_ENABLED` | Set to `true` to enable the Pebble webhook |
| `CLAYDE_PEBBLE_TOKEN` | Bearer token the Pebble app sends |
| `CLAYDE_PEBBLE_HOST` | Public hostname for Traefik routing |
Expand Down
3 changes: 3 additions & 0 deletions config.env.template
Expand Up @@ -19,6 +19,9 @@ CLAYDE_WHITELISTED_USERS=your-username,your-bot-username
CLAYDE_CLAUDE_BACKEND=api
CLAYDE_CLAUDE_API_KEY=

# Max autonomous CI-fix attempts per PR before giving up and notifying (default 3).
CLAYDE_CI_FIX_MAX_ATTEMPTS=3

# --- Pebble webhook ---
# Set to true to enable the FastAPI webhook on port 8080 (routed via Traefik).
CLAYDE_PEBBLE_ENABLED=false
Expand Down
2 changes: 2 additions & 0 deletions src/clayde/config.py
Expand Up @@ -41,6 +41,8 @@ def effective_git_name(self) -> str:
    # Orchestrator behaviour
    loop_interval_s: int = 300
    implement_max_retries: int = 3
    # Max autonomous CI-fix attempts per PR before giving up and notifying.
    ci_fix_max_attempts: int = 3

    # Pebble webhook
    pebble_enabled: bool = False
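
A minimal sketch of how a `CLAYDE_`-prefixed key in the plain `KEY=VALUE` file could map onto a settings field such as `ci_fix_max_attempts`. This is an assumption for illustration: the real settings class and its loader are not shown in this diff, and `SettingsSketch` and `load_settings` are hypothetical names.

```python
from dataclasses import dataclass, fields


@dataclass
class SettingsSketch:
    """Hypothetical stand-in holding two of the integer fields shown above."""
    loop_interval_s: int = 300
    ci_fix_max_attempts: int = 3


def load_settings(env_text: str, prefix: str = "CLAYDE_") -> SettingsSketch:
    """Parse KEY=VALUE lines (no shell quoting) into SettingsSketch overrides."""
    known = {f.name for f in fields(SettingsSketch)}
    overrides = {}
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not key.startswith(prefix):
            continue
        name = key[len(prefix):].lower()
        # Unknown keys and empty values keep the dataclass defaults.
        if name in known and value:
            overrides[name] = int(value)
    return SettingsSketch(**overrides)


settings = load_settings("# CI budget\nCLAYDE_CI_FIX_MAX_ATTEMPTS=5\n")
```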
Expand Down
49 changes: 49 additions & 0 deletions src/clayde/github.py
Expand Up @@ -193,3 +193,52 @@ def get_pr_title(g: Github, owner: str, repo: str, pr_number: int) -> str:
def get_pull(g: Github, owner: str, repo: str, pr_number: int):
"""Return the PullRequest object for the given PR number."""
return _get_repo(g, owner, repo).get_pull(pr_number)


# ---------------------------------------------------------------------------
# CI / check-run helpers
# ---------------------------------------------------------------------------

# Check-run conclusions that represent a failed / blocking pipeline. "success",
# "neutral", "skipped", "stale" and "cancelled" are not treated as failures;
# queued and in-progress runs are ignored until they complete.
_FAILED_CONCLUSIONS = frozenset(
    {"failure", "timed_out", "action_required", "startup_failure"}
)


def get_check_runs(g: Github, owner: str, repo: str, ref: str) -> list[dict]:
    """Return the *failed* check runs for a commit SHA via the Checks API.

    Only completed runs whose conclusion is in ``_FAILED_CONCLUSIONS`` are
    returned; queued, in-progress, successful, skipped and neutral runs are
    omitted. Each item is a dict with ``name``, ``conclusion`` and
    ``details_url`` (the URL of the failing job's logs).
    """
    commit = _get_repo(g, owner, repo).get_commit(ref)
    failed: list[dict] = []
    for run in commit.get_check_runs():
        if run.status == "completed" and run.conclusion in _FAILED_CONCLUSIONS:
            failed.append({
                "name": run.name,
                "conclusion": run.conclusion,
                "details_url": run.details_url or run.html_url or "",
            })
    return failed


def get_required_check_names(g: Github, owner: str, repo: str, branch: str) -> set[str]:
    """Return the set of required status-check names from branch protection.

    Returns an empty set when the branch is unprotected or the required checks
    cannot be read (e.g. insufficient token permissions). Callers treat an
    empty set as "no required-check filter" — every failed check is then
    considered blocking.
    """
    try:
        b = _get_repo(g, owner, repo).get_branch(branch)
        required = b.get_required_status_checks()
        return set(required.contexts or [])
    except Exception as e:
        log.info("No required status checks for %s/%s@%s: %s", owner, repo, branch, e)
        return set()
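
To see the conclusion filter in isolation, the sketch below applies the same rule to stand-in objects instead of real PyGithub check runs. The `SimpleNamespace` objects, the `failed_runs` helper, and the `ci.example` URLs are assumptions for illustration; only the attribute names (`name`, `status`, `conclusion`, `details_url`, `html_url`) and the failed-conclusion set mirror the code above.

```python
from types import SimpleNamespace

FAILED_CONCLUSIONS = frozenset(
    {"failure", "timed_out", "action_required", "startup_failure"}
)


def failed_runs(runs):
    """Keep completed runs with a failed conclusion, as get_check_runs() does."""
    return [
        {
            "name": run.name,
            "conclusion": run.conclusion,
            # Fall back to html_url, then to an empty string, when no details URL exists.
            "details_url": run.details_url or run.html_url or "",
        }
        for run in runs
        if run.status == "completed" and run.conclusion in FAILED_CONCLUSIONS
    ]


# Hypothetical stand-ins for check-run objects.
runs = [
    SimpleNamespace(name="lint", status="completed", conclusion="success",
                    details_url="https://ci.example/lint", html_url=None),
    SimpleNamespace(name="tests", status="completed", conclusion="failure",
                    details_url=None, html_url="https://ci.example/tests"),
    SimpleNamespace(name="build", status="in_progress", conclusion=None,
                    details_url=None, html_url=None),
    SimpleNamespace(name="deploy", status="completed", conclusion="cancelled",
                    details_url=None, html_url=None),
]

result = failed_runs(runs)
```

Only `tests` survives the filter: the successful, in-progress and cancelled runs are all dropped, and the missing `details_url` falls back to `html_url`.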
162 changes: 138 additions & 24 deletions src/clayde/orchestrator.py
Expand Up @@ -34,9 +34,11 @@
fetch_issue,
fetch_issue_comments,
get_assigned_issues,
get_check_runs,
get_pr_review_comments,
get_pr_reviews,
get_pull,
get_required_check_names,
is_blocked,
is_pull_request_item,
issue_ref,
Expand All @@ -45,8 +47,9 @@
)
from clayde.safety import filter_pr_reviews, get_new_visible_comments, has_visible_content
from clayde.state import get_issue_state, load_state, save_state, update_issue_state
from clayde.tasks import work, wrap_up, pr_work
from clayde.tasks import fix_ci, work, wrap_up, pr_work
from clayde.telemetry import get_tracer, init_tracer
from clayde.webhook.notify import send_ntfy_sync

log = logging.getLogger("clayde.orchestrator")

Expand Down Expand Up @@ -162,34 +165,145 @@ def _handle_issue(g: Github, issue: Issue, url: str) -> None:

should_invoke = in_progress or (last_seen_at is None) or bool(new_comments) or has_new_review_activity

if not should_invoke:
log.info("[%s] No new activity — skipping", label)
span.set_attribute("issue.skip_reason", "no_new_activity")
return
if should_invoke:
# Mark in_progress before invoking Claude so a crash leaves a retry marker
update_issue_state(url, {"in_progress": True})

# Mark in_progress before invoking Claude so a crash leaves a retry marker
update_issue_state(url, {"in_progress": True})
log.info("[%s] New activity — invoking work task", label)
try:
work.run(url)
except (UsageLimitError, InvocationTimeoutError) as e:
log.warning("[%s] Usage/timeout limit — will retry next cycle: %s", label, e)
span.set_attribute("issue.status", "retry")
# in_progress stays True so the next cycle retries automatically
return
except Exception as e:
log.error("[%s] ERROR in work task: %s", label, e)
span.set_status(StatusCode.ERROR, str(e))
span.record_exception(e)
update_issue_state(url, {"in_progress": False})
return

log.info("[%s] New activity — invoking work task", label)
try:
work.run(url)
except (UsageLimitError, InvocationTimeoutError) as e:
log.warning("[%s] Usage/timeout limit — will retry next cycle: %s", label, e)
span.set_attribute("issue.status", "retry")
# in_progress stays True so the next cycle retries automatically
# Successful completion — update last_seen_at to prevent re-triggering on
# Clayde's own comments posted during this run
update_issue_state(url, {"in_progress": False, "last_seen_at": _now_utc()})
span.set_attribute("issue.status", "completed")
log.info("[%s] Cycle complete", label)
return
except Exception as e:
log.error("[%s] ERROR in work task: %s", label, e)
span.set_status(StatusCode.ERROR, str(e))
span.record_exception(e)
update_issue_state(url, {"in_progress": False})

# No new human activity. If a PR is open, monitor its CI and self-fix a
# failing pipeline before falling back to "nothing to do".
if pr_url and _handle_ci_fix(g, owner, repo, pr_url, url, label, span):
return

# Successful completion — update last_seen_at to prevent re-triggering on
# Clayde's own comments posted during this run
update_issue_state(url, {"in_progress": False, "last_seen_at": _now_utc()})
span.set_attribute("issue.status", "completed")
log.info("[%s] Cycle complete", label)
log.info("[%s] No new activity — skipping", label)
span.set_attribute("issue.skip_reason", "no_new_activity")


def _handle_ci_fix(g: Github, owner: str, repo: str, pr_url: str, url: str,
                   label: str, span) -> bool:
    """Monitor CI on the open PR and self-fix a failing required pipeline.

    Returns True when CI handling has consumed this cycle (a fix was attempted,
    the PR is waiting on a previous fix for the same commit, or the attempt
    budget is exhausted) so the caller should stop. Returns False when CI is
    green, still pending, or has no failing required checks — the caller then
    falls through to its normal "no new activity" handling.

    Loop-safety: the attempt counter and the attempted head SHA are recorded
    *before* invoking Claude, so a crash or usage limit can never cause an
    endless retry on the same commit.
    """
    settings = get_settings()
    try:
        _, _, pr_number = parse_pr_url(pr_url)
        pr = get_pull(g, owner, repo, pr_number)
        head_sha = pr.head.sha
        base_branch = pr.base.ref
    except Exception as e:
        log.warning("[%s] Failed to fetch PR for CI check: %s", label, e)
        return False

    try:
        failed = get_check_runs(g, owner, repo, head_sha)
        required = get_required_check_names(g, owner, repo, base_branch)
        if required:
            # Branch protection defines required checks — only act on those.
            failed = [f for f in failed if f["name"] in required]
        # When no required checks are configured, every failed check is treated
        # as blocking (fallback for unprotected branches).
    except Exception as e:
        log.warning("[%s] Failed to fetch CI status: %s", label, e)
        return False

    if not failed:
        return False  # CI green / pending — proceed with review monitoring

    issue_state = get_issue_state(url)
    attempts = issue_state.get("ci_fix_attempts", 0)
    max_attempts = settings.ci_fix_max_attempts

    if attempts >= max_attempts:
        if not issue_state.get("ci_fix_exhausted_notified"):
            log.warning("[%s] CI still failing after %d attempts — notifying operator",
                        label, attempts)
            _notify_ci_exhausted(settings, owner, repo, pr_number, attempts)
            update_issue_state(url, {"ci_fix_exhausted_notified": True})
        span.set_attribute("issue.skip_reason", "ci_fix_exhausted")
        return True

    if issue_state.get("last_ci_fix_attempt_sha") == head_sha:
        # Already attempted a fix for this exact commit — wait for new activity
        # (a new push, review, or comment) rather than looping on the same SHA.
        log.info("[%s] CI failing but fix already attempted for %s — waiting",
                 label, head_sha[:7])
        span.set_attribute("issue.skip_reason", "ci_fix_already_attempted")
        return True

    branch_name = issue_state.get("branch_name", pr.head.ref)
    check_names = ", ".join(f["name"] for f in failed)
    log.info("[%s] CI failing (%s) — invoking fix task (attempt %d/%d)",
             label, check_names, attempts + 1, max_attempts)

    # Record the attempt *before* invoking so a crash/limit cannot loop on this
    # SHA, and so the attempt counts toward the max-attempts budget.
    update_issue_state(url, {
        "ci_fix_attempts": attempts + 1,
        "last_ci_fix_attempt_sha": head_sha,
    })

    try:
        fix_ci.run(url, pr_url, branch_name, failed)
    except (UsageLimitError, InvocationTimeoutError) as e:
        log.warning("[%s] Usage/timeout limit during CI fix — will retry on a new commit: %s",
                    label, e)
        span.set_attribute("issue.status", "ci_fix_retry")
        return True
    except Exception as e:
        log.error("[%s] ERROR in CI fix task: %s", label, e)
        span.set_status(StatusCode.ERROR, str(e))
        span.record_exception(e)
        return True

    # Advance last_seen_at so Clayde's own summary comment does not re-trigger.
    update_issue_state(url, {"last_seen_at": _now_utc()})
    span.set_attribute("issue.status", "ci_fix_attempted")
    log.info("[%s] CI fix attempt complete", label)
    return True


def _notify_ci_exhausted(settings, owner: str, repo: str, pr_number: int, attempts: int) -> None:
    """Send an ntfy alert that CI is still failing after the attempt budget."""
    if not settings.ntfy_topic:
        return
    send_ntfy_sync(
        title="Clayde: CI still failing",
        body=f"CI still failing after {attempts} attempts on {owner}/{repo}#{pr_number}",
        success=False,
        base_url=settings.ntfy_base_url,
        topic=settings.ntfy_topic,
        timeout_s=settings.ntfy_timeout_s,
    )
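
The loop-safety property stated in the `_handle_ci_fix` docstring (record the attempt before invoking the task) can be checked in isolation with an in-memory state dict. This is a simplified sketch: `run_cycle` and the crashing task are hypothetical, and the real function also handles PR lookup, required-check filtering, telemetry and notification.

```python
def run_cycle(state, head_sha, fix_task, max_attempts=3):
    """Run one simplified CI-fix cycle and report what happened."""
    attempts = state.get("ci_fix_attempts", 0)
    if attempts >= max_attempts:
        return "exhausted"
    if state.get("last_ci_fix_attempt_sha") == head_sha:
        return "waiting"
    # Record the attempt first, so a failure below still consumes budget
    # and the same SHA is never retried.
    state["ci_fix_attempts"] = attempts + 1
    state["last_ci_fix_attempt_sha"] = head_sha
    try:
        fix_task()
    except Exception:
        return "failed"
    return "attempted"


def crashing_task():
    """Hypothetical fix task that always fails."""
    raise RuntimeError("simulated failure during the fix task")


state = {}
outcomes = [
    run_cycle(state, "sha1", crashing_task),  # first attempt fails
    run_cycle(state, "sha1", crashing_task),  # same commit: no retry
    run_cycle(state, "sha2", crashing_task),  # new commit: second attempt
    run_cycle(state, "sha3", crashing_task),  # third attempt
    run_cycle(state, "sha4", crashing_task),  # budget of 3 is spent
]
```

Because the counter and SHA are written before the task runs, even a task that always fails ends in `"exhausted"` after three distinct commits instead of looping.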


def _handle_standalone_pr(g: Github, url: str) -> None:
Expand Down