Commit 7ab351b

feat(bots): reviewer-bot live workflows (review + follow-up)
Third of the stacked reviewer-bot migration. Adds the live workflows that run the bot on PRs:

- reviewer-bot.yml: reviews on pull_request (opened/synchronize/reopened/ready_for_review) + manual workflow_dispatch (dry-run capable). Fork-guarded; protected runner; mints a peco-review-bot App token; setup-claude-sdk for the SDK/CLI install. Reads/explores the PR's own checkout (no driver clone).
- reviewer-bot-followup.yml: responds to pull_request_review_comment with the cheap pre-checkout filter + the marker-based loop guards.

Adapted from the driver-test workflows: removed the driver-repo clone auth (INTEGRATION_TEST_APP_TOKEN, N/A here) and made MODEL_ENDPOINT a secret rather than a hardcoded workspace URL.

PREREQS (these workflows stay inert until provided):

- peco-review-bot GitHub App installed on this repo (Pull requests / Issues / Contents: Read & Write).
- Secrets: REVIEW_BOT_APP_ID, REVIEW_BOT_APP_PRIVATE_KEY, MODEL_ENDPOINT; DATABRICKS_TOKEN authorized for that serving endpoint.

Co-authored-by: Isaac
Signed-off-by: Eric Wang <e.wang@databricks.com>
1 parent 46ff1e7 commit 7ab351b

2 files changed

Lines changed: 243 additions & 0 deletions

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+name: Reviewer Bot — Follow-up
+
+on:
+  pull_request_review_comment:
+    types: [created]
+
+permissions:
+  # The workflow GITHUB_TOKEN is not used to interact with the PR — we mint a
+  # dedicated peco-review-bot App installation token and use that everywhere.
+  # Required App permissions on the installation (NOT this workflow):
+  # Pull requests: Read & Write — posting inline replies
+  # Issues: Read & Write — comment plumbing
+  # Contents: Read & Write — resolveReviewThread mutation
+  # (Pull-requests:write is NOT sufficient for the resolve mutation;
+  # GitHub gates it behind Contents.)
+  contents: read
+  id-token: write # JFrog OIDC exchange for the SDK/CLI install (setup-claude-sdk)
+
+jobs:
+  followup:
+    # SECURITY: skip fork PRs — keep DATABRICKS_TOKEN out of untrusted code's
+    # reach. Mirrors the guard in reviewer-bot.yml.
+    if: github.event.pull_request.head.repo.fork == false && github.event.pull_request.state == 'open'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: [linux-ubuntu-latest]
+    timeout-minutes: 10
+    steps:
+      - name: Mint review-bot App token
+        id: app-token
+        uses: actions/create-github-app-token@fee1f7d63c2ff003460e3d139729b119787bc349 # v2.2.2
+        with:
+          app-id: ${{ secrets.REVIEW_BOT_APP_ID }}
+          private-key: ${{ secrets.REVIEW_BOT_APP_PRIVATE_KEY }}
+
+      - name: Cheap pre-checkout filter
+        id: filter
+        env:
+          GH_TOKEN: ${{ steps.app-token.outputs.token }}
+          REPO: ${{ github.repository }}
+          TRIGGER_ID: ${{ github.event.comment.id }}
+          IN_REPLY_TO: ${{ github.event.comment.in_reply_to_id }}
+          COMMENT_USER: ${{ github.event.comment.user.login }}
+          COMMENT_BODY: ${{ github.event.comment.body }}
+        run: |
+          # Cheap filters first — skip the expensive checkout / python setup
+          # when the event is already known to be irrelevant. The Python entry
+          # point repeats these checks (defense in depth), so being slightly
+          # over-permissive here is safe.
+          #
+          # Filter 1: must be a reply to another inline comment.
+          if [ -z "$IN_REPLY_TO" ] || [ "$IN_REPLY_TO" = "null" ]; then
+            echo "skip=true" >> "$GITHUB_OUTPUT"
+            echo "reason=no in_reply_to_id (top-level review comment, not a thread reply)" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+          # Filter 2: skip our own follow-up AND reconcile replies (loop
+          # prevention). MARKER-based — never login-based.
+          if printf '%s' "$COMMENT_BODY" | grep -q '<!-- pr-review-bot:v1 followup'; then
+            echo "skip=true" >> "$GITHUB_OUTPUT"
+            echo "reason=trigger comment is itself a bot follow-up (loop prevention)" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+          if printf '%s' "$COMMENT_BODY" | grep -q '<!-- pr-review-bot:v1 reconcile -->'; then
+            echo "skip=true" >> "$GITHUB_OUTPUT"
+            echo "reason=trigger comment is itself a bot reconcile reply (loop prevention)" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+          echo "skip=false" >> "$GITHUB_OUTPUT"
+
+      - name: Announce skip in step summary
+        if: steps.filter.outputs.skip == 'true'
+        run: |
+          {
+            echo "## Reviewer Bot — Follow-up"
+            echo ""
+            echo "**Skipped:** ${{ steps.filter.outputs.reason }}"
+          } >> "$GITHUB_STEP_SUMMARY"
+
+      - name: Checkout
+        if: steps.filter.outputs.skip != 'true'
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          fetch-depth: 0
+          ref: ${{ github.event.pull_request.head.sha }}
+          # The followup reads this checkout via read_paths/grep, so the
+          # persisted GITHUB_TOKEN must NOT sit in .git/config. The followup
+          # only POSTS replies via the minted App token.
+          persist-credentials: false
+
+      - name: Setup Python
+        if: steps.filter.outputs.skip != 'true'
+        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
+        with:
+          python-version: '3.11'
+
+      - name: Setup Claude Agent SDK + CLI
+        if: steps.filter.outputs.skip != 'true'
+        uses: ./.github/actions/setup-claude-sdk
+
+      - name: Run follow-up agent
+        if: steps.filter.outputs.skip != 'true'
+        env:
+          GH_TOKEN: ${{ steps.app-token.outputs.token }}
+          GITHUB_REPOSITORY: ${{ github.repository }}
+          PR_NUMBER: ${{ github.event.pull_request.number }}
+          TRIGGER_COMMENT_ID: ${{ github.event.comment.id }}
+          # PR SHA range — used by followup.py to restrict `git show` to commits
+          # actually in this PR (allowlist for SHA-diff verification).
+          PR_BASE_SHA: ${{ github.event.pull_request.base.sha }}
+          HEAD_SHA: ${{ github.event.pull_request.head.sha }}
+          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
+          MODEL_ENDPOINT: ${{ secrets.MODEL_ENDPOINT }}
+          DRY_RUN: 'false'
+          RUNNER_TEMP: ${{ runner.temp }}
+        run: |
+          python -m scripts.reviewer_bot.followup
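
The filter step's comments say the Python entry point repeats the same checks as defense in depth. That entry point is not part of this commit, so the following is only a minimal sketch of what such a re-check might look like. The function name `should_skip` and its return shape are hypothetical; the marker strings and reason texts are copied from the shell filter in the follow-up workflow.

```python
from typing import Optional, Tuple

# Marker strings copied from the workflow's shell filter.
FOLLOWUP_MARKER = "<!-- pr-review-bot:v1 followup"
RECONCILE_MARKER = "<!-- pr-review-bot:v1 reconcile -->"


def should_skip(in_reply_to: Optional[str], body: str) -> Tuple[bool, str]:
    """Hypothetical helper: return (skip, reason) for a review-comment event."""
    # Filter 1: only thread replies qualify; a top-level comment has no parent id.
    if not in_reply_to or in_reply_to == "null":
        return True, "no in_reply_to_id (top-level review comment, not a thread reply)"
    # Filter 2: marker-based loop prevention, independent of the author's login.
    if FOLLOWUP_MARKER in body:
        return True, "trigger comment is itself a bot follow-up (loop prevention)"
    if RECONCILE_MARKER in body:
        return True, "trigger comment is itself a bot reconcile reply (loop prevention)"
    return False, ""
```

Matching on the marker rather than on the comment author means the guard keeps working if the App is renamed, and it does not misfire on human comments posted from an account that happens to share the bot's login.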

.github/workflows/reviewer-bot.yml

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+name: Reviewer Bot
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened, ready_for_review]
+  workflow_dispatch:
+    inputs:
+      pr_number:
+        description: 'PR number to review'
+        required: true
+        type: string
+      dry_run:
+        description: 'Print what would be posted instead of posting'
+        required: false
+        default: 'true'
+        type: string
+
+permissions:
+  # The workflow GITHUB_TOKEN is not used to interact with the PR — we mint a
+  # dedicated peco-review-bot App installation token and use that everywhere.
+  # Required App permissions on the installation (NOT this workflow):
+  # Pull requests: Read & Write — posting findings + replies
+  # Issues: Read & Write — review status updates
+  # Contents: Read & Write — resolveReviewThread mutation
+  # (Pull-requests:write is NOT sufficient for the resolve mutation;
+  # GitHub gates it behind Contents.)
+  contents: read
+  id-token: write # JFrog OIDC exchange for the SDK/CLI install (setup-claude-sdk)
+
+jobs:
+  review:
+    # SECURITY: the fork == false guard keeps DATABRICKS_TOKEN + the App token
+    # out of untrusted fork code. Do not remove without alternative isolation.
+    if: github.event_name == 'workflow_dispatch' || (github.event.pull_request.draft == false && github.event.pull_request.head.repo.fork == false)
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: [linux-ubuntu-latest]
+    timeout-minutes: 15
+    steps:
+      - name: Checkout
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          fetch-depth: 0
+          # The reviewer reads this checkout via its read_paths/grep tools, so
+          # the persisted GITHUB_TOKEN must NOT sit in .git/config (it would be
+          # readable + leakable into a posted review). The bot posts via the
+          # minted App token, not the checkout's git creds.
+          persist-credentials: false
+
+      - name: Mint review-bot App token
+        id: app-token
+        uses: actions/create-github-app-token@fee1f7d63c2ff003460e3d139729b119787bc349 # v2.2.2
+        with:
+          app-id: ${{ secrets.REVIEW_BOT_APP_ID }}
+          private-key: ${{ secrets.REVIEW_BOT_APP_PRIVATE_KEY }}
+
+      - name: Setup Python
+        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
+        with:
+          python-version: '3.11'
+
+      - name: Resolve trigger inputs
+        id: inputs
+        run: |
+          if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
+            RAW_PR="${{ inputs.pr_number }}"
+            DRY_RUN="${{ inputs.dry_run }}"
+          else
+            RAW_PR="${{ github.event.pull_request.number }}"
+            DRY_RUN="false"
+          fi
+          if ! [[ "$RAW_PR" =~ ^[0-9]+$ ]]; then
+            echo "::error::Invalid pr_number '$RAW_PR'"; exit 1
+          fi
+          HEAD_SHA=$(gh pr view "$RAW_PR" --repo "${{ github.repository }}" \
+            --json headRefOid -q .headRefOid)
+          echo "pr_number=$RAW_PR" >> "$GITHUB_OUTPUT"
+          echo "head_sha=$HEAD_SHA" >> "$GITHUB_OUTPUT"
+          echo "dry_run=$DRY_RUN" >> "$GITHUB_OUTPUT"
+        env:
+          GH_TOKEN: ${{ steps.app-token.outputs.token }}
+
+      - name: Setup Claude Agent SDK + CLI
+        uses: ./.github/actions/setup-claude-sdk
+
+      - name: Checkout PR head into a SEPARATE dir for exploration (workflow_dispatch)
+        # On `pull_request` the initial checkout is already refs/pull/N/merge, so
+        # the reviewer's read_paths/grep explore the files under review (and the
+        # fork guard gates untrusted code). On `workflow_dispatch` the initial
+        # checkout is the dispatched ref (default branch) while the review targets
+        # inputs.pr_number — so check the PR head into a SEPARATE `pr-head/` dir
+        # and read its CONTENT via REVIEW_CONTENT_ROOT below.
+        #
+        # SECURITY: do NOT re-point the primary tree. `Run review bot` still runs
+        # scripts/reviewer_bot/* from the primary (trusted, default-branch)
+        # checkout; only the PR's file *content* is read out of pr-head/. The
+        # job's if: exempts workflow_dispatch from the fork guard, so swapping
+        # the primary tree would execute a fork PR's own bot code with secrets in
+        # scope. Reading content is safe (read_paths/grep enforce
+        # path-escape/.git/symlink guards); executing its code is not.
+        if: github.event_name == 'workflow_dispatch'
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          ref: ${{ steps.inputs.outputs.head_sha }}
+          fetch-depth: 0
+          persist-credentials: false
+          path: pr-head
+
+      - name: Run review bot
+        env:
+          GH_TOKEN: ${{ steps.app-token.outputs.token }}
+          GITHUB_REPOSITORY: ${{ github.repository }}
+          PR_NUMBER: ${{ steps.inputs.outputs.pr_number }}
+          HEAD_SHA: ${{ steps.inputs.outputs.head_sha }}
+          # Where the bot READS PR-head content from (repo rules, read_paths/grep).
+          # Set ONLY on workflow_dispatch (primary tree = trusted default branch,
+          # PR head in pr-head/). On `pull_request` it's empty, so the bot reads
+          # its own merge-ref checkout (fork-gated). Bot code always runs from
+          # the primary trusted checkout.
+          REVIEW_CONTENT_ROOT: ${{ github.event_name == 'workflow_dispatch' && format('{0}/pr-head', github.workspace) || '' }}
+          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
+          MODEL_ENDPOINT: ${{ secrets.MODEL_ENDPOINT }}
+          DRY_RUN: ${{ steps.inputs.outputs.dry_run }}
+          RUNNER_TEMP: ${{ runner.temp }}
+        run: |
+          python -m scripts.reviewer_bot.run_review
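
The workflow comments describe two behaviours on the Python side that this commit does not show: falling back to the primary checkout when `REVIEW_CONTENT_ROOT` is empty, and guarding reads against path escapes, `.git`, and symlinks. The sketch below illustrates one way such a guard could be written. Both helper names (`content_root`, `safe_resolve`) are hypothetical and are not taken from `scripts/reviewer_bot`; the real `read_paths`/`grep` tools may work differently.

```python
from pathlib import Path
from typing import Mapping


def content_root(env: Mapping[str, str], workspace: str) -> Path:
    """Hypothetical helper: pick the directory PR content is read from."""
    # An empty or unset REVIEW_CONTENT_ROOT means "read the primary checkout".
    root = env.get("REVIEW_CONTENT_ROOT") or workspace
    return Path(root).resolve()


def safe_resolve(root: Path, relative: str) -> Path:
    """Hypothetical guard: resolve `relative` under `root` or raise ValueError."""
    # resolve() collapses ".." and follows symlinks, so a link that points
    # outside the root is caught by the containment check below.
    candidate = (root / relative).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes content root: {relative}")
    # Keep repository metadata (and any credentials in it) out of reach.
    if ".git" in candidate.relative_to(root).parts:
        raise ValueError(f"refusing to read inside .git: {relative}")
    return candidate
```

A caller would compute the root once, for example `content_root(os.environ, os.getcwd())`, and pass every path requested by the model through `safe_resolve` before opening it. This only restricts what is read; it does nothing about executing PR code, which the workflow avoids separately by always running the bot from the primary checkout.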
