Commit 93af320

[SEA-NodeJS] Pin the kernel by SHA (KERNEL_REV) + kernel-e2e CI
The SEA napi binding is built from the kernel's private Rust source, not a published/versioned artifact, and the actual `.node` binary is gitignored. As a result, nothing in the repo records which kernel revision the committed `native/sea/index.d.ts` / `index.js` correspond to, and the standard e2e job never builds or exercises the binding (its SEA suite skips).

- `KERNEL_REV`: a single 40-char kernel commit SHA at the repo root, the one source of truth for the kernel version the driver is built against. Bumping it is the only way to pick up a new kernel, so a driver change and its kernel dependency always land together in one bisectable diff. Pinned to the kernel main SHA whose napi surface the committed binding (landed by the SEA stack) was generated from.
- `.github/workflows/kernel-e2e.yml`: reads `KERNEL_REV`, checks the kernel out at that SHA via a GitHub App token, builds the napi binding (`npm run build:native` against the pinned checkout, cargo via the JFrog proxy), asserts the committed binding still matches the pin (drift-guard: `git diff --exit-code native/sea/index.*`), and runs the SEA e2e suite (`tests/e2e/sea/**`) against the dogfood warehouse. Synthetic-success on plain PRs; real run in the merge queue or via the `kernel-e2e` label; change-detection auto-passes when no SEA-relevant files moved. The test step invokes mocha directly (`npx mocha --config … "tests/e2e/sea/**"`) rather than `npm run e2e -- <glob>`: routing a glob through the npm-script's inner shell mangles `**` and silently resolves to zero files (a false pass).
- `native/sea/README.md`: documents the pin and how to match it locally.

Stacked on the SEA async/options PR: the committed b4d8822 binding (and the driver code + test fakes that consume its `submitStatement`/metadata surface) land there, so `KERNEL_REV` and the binding are consistent at every commit and the drift-guard passes.

Requires one-time repo-admin setup (GitHub App allowlist for the kernel repo, the `kernel-e2e` label, warehouse secrets in azure-prod); see the workflow header.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
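The glob caveat in the message can be illustrated with a small bash sketch. It shows one plausible way a `**` pattern loses files when a shell expands it instead of mocha: with `globstar` off, bash treats `**` like `*` and descends only one directory level. This is not necessarily the exact failure the commit hit (which resolved to zero files); the directory layout and helper names below are made up for the demo.

```shell
#!/usr/bin/env bash
# Illustrative only: how an unquoted `**` behaves when bash expands it
# without globstar. The tree and function names are hypothetical.
set -euo pipefail

# Build a throwaway tree with test files at three depths.
make_demo_tree() {
  local root
  root=$(mktemp -d)
  mkdir -p "$root/tests/e2e/sea/nested/deeper"
  touch "$root/tests/e2e/sea/top.test.ts" \
        "$root/tests/e2e/sea/nested/mid.test.ts" \
        "$root/tests/e2e/sea/nested/deeper/deep.test.ts"
  echo "$root"
}

# Count matches when the shell itself expands the pattern with globstar off.
# In that mode `**` is just `*`, so exactly one directory level is matched.
count_shell_expanded() {
  (
    cd "$1"
    shopt -u globstar 2>/dev/null || true   # older bash has no globstar option
    shopt -s nullglob                       # unmatched patterns expand to nothing
    files=(tests/e2e/sea/**/*.test.ts)
    echo "${#files[@]}"
  )
}

# Count every *.test.ts under the suite directory, at any depth.
count_all_tests() {
  find "$1/tests/e2e/sea" -type f -name '*.test.ts' | wc -l | tr -d '[:space:]'
}

demo_root=$(make_demo_tree)
echo "shell-expanded, globstar off: $(count_shell_expanded "$demo_root")"
echo "all test files on disk:       $(count_all_tests "$demo_root")"
```

In this demo tree the shell-expanded pattern finds one file while three exist on disk. A runner that receives a shrunken or empty file list can still exit 0, which is why passing the quoted pattern to the test runner is the safer choice.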
1 parent 32929aa commit 93af320

3 files changed

Lines changed: 398 additions & 0 deletions

.github/workflows/kernel-e2e.yml

Lines changed: 377 additions & 0 deletions
@@ -0,0 +1,377 @@
+name: Kernel E2E Tests
+
+# Runs the SEA backend e2e suite (tests/e2e/sea/**) against a real
+# Databricks warehouse with a freshly-built napi-rs kernel binding.
+#
+# The kernel is a private repo with no published binary artifact. We pin
+# a kernel SHA in the `KERNEL_REV` file at the repo root, check the kernel
+# out via a GitHub App token, and run `npm run build:native` to compile
+# the napi binding into native/sea/ in the same checkout the tests run
+# against. Bumping `KERNEL_REV` is the ONLY way to pick up a new kernel
+# version — this keeps the driver <-> kernel pair bisectable, so a driver
+# change and the kernel revision it depends on always land together.
+#
+# Why this exists: the committed native/sea/index.d.ts + index.js are the
+# TypeScript declarations and the napi-rs platform router; the actual
+# `.node` binary is gitignored (large, per-platform) and is NOT in the
+# repo. The standard `main.yml` e2e job has no binary, so its SEA suite
+# skips (it gates on DATABRICKS_PECOTESTING_* secrets it doesn't set).
+# This workflow is what actually exercises the SEA path end-to-end against
+# a known kernel revision.
+#
+# Gate semantics:
+#   - Plain PR events post a synthetic-success check so the required
+#     "Kernel E2E" check doesn't block PRs that don't touch the SEA path.
+#     Real tests run in the merge queue.
+#   - `kernel-e2e` label triggers a preview run on the PR; the label is
+#     auto-removed on `synchronize` so new commits can't re-run it unreviewed.
+#   - merge_group fires the real gate — runs when SEA-relevant files
+#     changed, auto-passes otherwise.
+#
+# Required external setup (one-time, by a repo admin):
+#   1. `kernel-e2e` label exists in this repo.
+#   2. `INTEGRATION_TEST_APP_ID` / `INTEGRATION_TEST_PRIVATE_KEY` secrets
+#      exist and the GitHub App's repo allowlist includes
+#      `databricks/databricks-sql-kernel`.
+#   3. `KERNEL_REV` at the repo root contains a 40-char kernel commit SHA.
+#   4. `azure-prod` environment exposes DATABRICKS_HOST /
+#      TEST_PECO_WAREHOUSE_HTTP_PATH / DATABRICKS_TOKEN.
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened, labeled]
+  merge_group:
+
+permissions:
+  contents: read
+  id-token: write
+
+concurrency:
+  group: kernel-e2e-${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.event_name == 'pull_request' }}
+
+jobs:
+  # ───────────────────────────────────────────────────────────────
+  # Security: auto-remove `kernel-e2e` label on new commits so a
+  # labelled preview run can't be re-triggered with unreviewed code.
+  # ───────────────────────────────────────────────────────────────
+  strip-label:
+    if: github.event_name == 'pull_request' && github.event.action == 'synchronize'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    permissions:
+      pull-requests: write
+    steps:
+      - name: Remove kernel-e2e label
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            try {
+              await github.rest.issues.removeLabel({
+                owner: context.repo.owner,
+                repo: context.repo.repo,
+                issue_number: context.payload.pull_request.number,
+                name: 'kernel-e2e',
+              });
+            } catch (error) {
+              if (error.status !== 404) throw error;
+            }
+
+  # ───────────────────────────────────────────────────────────────
+  # Synthetic success on every non-label PR event so the required
+  # "Kernel E2E" check doesn't permablock PRs that don't touch SEA
+  # code. Real run happens in the merge queue (or via explicit label).
+  # ───────────────────────────────────────────────────────────────
+  skip-kernel-e2e-pr:
+    if: github.event_name == 'pull_request' && github.event.action != 'labeled'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    permissions:
+      checks: write
+    steps:
+      - name: Post synthetic-success check
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: context.payload.pull_request.head.sha,
+              status: 'completed',
+              conclusion: 'success',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Skipped on PR — runs in merge queue',
+                summary: 'Kernel E2E is skipped on PRs and runs as a required gate in the merge queue. Add the `kernel-e2e` label to preview on this PR.'
+              }
+            });
+
+  # ───────────────────────────────────────────────────────────────
+  # Detect whether SEA-relevant files changed. Used by both the
+  # labelled-PR path and the merge-queue path to decide between
+  # "really run the suite" and "auto-pass the check".
+  # ───────────────────────────────────────────────────────────────
+  detect-changes:
+    if: |
+      github.event_name == 'merge_group' ||
+      (github.event_name == 'pull_request' &&
+       github.event.action == 'labeled' &&
+       contains(github.event.pull_request.labels.*.name, 'kernel-e2e'))
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    outputs:
+      run_tests: ${{ steps.changed.outputs.run_tests }}
+      head_sha: ${{ steps.refs.outputs.head_sha }}
+    steps:
+      - name: Resolve head SHA
+        id: refs
+        env:
+          MERGE_QUEUE_REF: ${{ github.event.merge_group.head_ref }}
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          script: |
+            if (context.eventName === 'pull_request') {
+              core.setOutput('head_sha', context.payload.pull_request.head.sha);
+              return;
+            }
+            core.setOutput('head_sha', context.payload.merge_group.head_sha);
+
+      - name: Check out repo at head SHA
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+        with:
+          ref: ${{ steps.refs.outputs.head_sha }}
+          fetch-depth: 0
+
+      - name: Detect SEA-relevant changes
+        id: changed
+        env:
+          HEAD_SHA: ${{ steps.refs.outputs.head_sha }}
+          BASE_SHA: ${{ github.event_name == 'merge_group' && github.event.merge_group.base_sha || github.event.pull_request.base.sha }}
+        run: |
+          CHANGED=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA")
+          echo "Changed files:"
+          echo "$CHANGED"
+          # Run when the SEA driver layer, the napi binding contract, SEA
+          # e2e tests, this workflow, the kernel revision pin, or core deps
+          # move.
+          if echo "$CHANGED" | grep -qE "^(lib/sea/|native/sea/|tests/e2e/sea/|tests/unit/sea/|\.github/workflows/kernel-e2e\.yml|KERNEL_REV|package\.json|package-lock\.json)"; then
+            echo "run_tests=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "run_tests=false" >> "$GITHUB_OUTPUT"
+          fi
+
+  # ───────────────────────────────────────────────────────────────
+  # Real test job. Builds the napi binding from the pinned kernel SHA
+  # and runs the SEA e2e suite against the dogfood warehouse.
+  # ───────────────────────────────────────────────────────────────
+  run-kernel-e2e:
+    needs: detect-changes
+    if: needs.detect-changes.outputs.run_tests == 'true'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    environment: azure-prod
+    permissions:
+      contents: read
+      checks: write
+      id-token: write
+    env:
+      # SEA e2e tests gate on the DATABRICKS_PECOTESTING_* vars; map the
+      # warehouse secrets onto them so the suite actually runs (it skips
+      # when they are absent).
+      DATABRICKS_PECOTESTING_SERVER_HOSTNAME: ${{ secrets.DATABRICKS_HOST }}
+      DATABRICKS_PECOTESTING_HTTP_PATH: ${{ secrets.TEST_PECO_WAREHOUSE_HTTP_PATH }}
+      DATABRICKS_PECOTESTING_TOKEN_PERSONAL: ${{ secrets.DATABRICKS_TOKEN }}
+    steps:
+      - name: Check out driver
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+        with:
+          ref: ${{ needs.detect-changes.outputs.head_sha }}
+
+      - name: Read pinned kernel SHA
+        id: kernel-rev
+        run: |
+          if [[ ! -f KERNEL_REV ]]; then
+            echo "::error::KERNEL_REV file missing"
+            exit 1
+          fi
+          REV=$(tr -d '[:space:]' < KERNEL_REV)
+          if [[ ! "$REV" =~ ^[0-9a-f]{40}$ ]]; then
+            echo "::error::KERNEL_REV must be a 40-char commit SHA, got: $REV"
+            exit 1
+          fi
+          echo "rev=$REV" >> "$GITHUB_OUTPUT"
+          echo "Pinned kernel SHA: $REV"
+
+      - name: Generate GitHub App token (kernel repo read access)
+        id: app-token
+        uses: actions/create-github-app-token@f8d387b68d61c58ab83c6c016672934102569859 # v3.0.0
+        with:
+          app-id: ${{ secrets.INTEGRATION_TEST_APP_ID }}
+          private-key: ${{ secrets.INTEGRATION_TEST_PRIVATE_KEY }}
+          owner: databricks
+          repositories: databricks-sql-kernel
+
+      - name: Check out kernel at pinned SHA
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+        with:
+          repository: databricks/databricks-sql-kernel
+          ref: ${{ steps.kernel-rev.outputs.rev }}
+          token: ${{ steps.app-token.outputs.token }}
+          path: databricks-sql-kernel
+
+      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
+        with:
+          node-version: 20
+
+      - name: Set up Rust toolchain
+        uses: actions-rust-lang/setup-rust-toolchain@1780873c7b576612439a134613cc4cc74ce5538c # v1.15.2
+        with:
+          cache: false
+
+      - name: Cache cargo build artifacts (keyed on kernel SHA)
+        uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
+        with:
+          workspaces: databricks-sql-kernel
+          key: kernel-${{ steps.kernel-rev.outputs.rev }}
+
+      - name: Set up JFrog (npm registry proxy)
+        uses: ./.github/actions/setup-jfrog
+
+      - name: Configure Cargo for JFrog proxy
+        shell: bash
+        # databricks-protected-runner-group blocks direct egress to
+        # index.crates.io, so cargo must route through JFrog's
+        # db-cargo-remote proxy. Reuses the JFrog token setup-jfrog
+        # exported into the environment.
+        run: |
+          set -euo pipefail
+          mkdir -p ~/.cargo
+          cat > ~/.cargo/config.toml << 'EOF'
+          [source.crates-io]
+          replace-with = "jfrog"
+          [source.jfrog]
+          registry = "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
+          [registries.jfrog]
+          index = "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
+          credential-provider = ["cargo:token"]
+          EOF
+          cat > ~/.cargo/credentials.toml << EOF
+          [registries.jfrog]
+          token = "Bearer ${JFROG_ACCESS_TOKEN}"
+          EOF
+          echo "CARGO_REGISTRIES_JFROG_TOKEN=Bearer ${JFROG_ACCESS_TOKEN}" >> "$GITHUB_ENV"
+
+      - name: Install driver deps
+        run: npm ci
+
+      - name: Build napi binding from pinned kernel
+        # build:native cd's into ${DATABRICKS_SQL_KERNEL_REPO}/napi, runs the
+        # napi-rs build, and copies index.* into native/sea/. Pointing it at
+        # the SHA-pinned kernel checkout is what makes the binary match
+        # KERNEL_REV exactly.
+        env:
+          DATABRICKS_SQL_KERNEL_REPO: ${{ github.workspace }}/databricks-sql-kernel
+        run: npm run build:native
+
+      - name: Assert committed binding matches KERNEL_REV
+        # The committed native/sea/index.d.ts + index.js are the consumer-facing
+        # type contract + platform router; they MUST correspond to the pinned
+        # kernel. build:native just regenerated them from the KERNEL_REV
+        # checkout, so any diff means the committed contract drifted from the
+        # pin — fail loudly and tell the author to commit the regenerated files.
+        # (The .node binaries are gitignored, so git diff only sees the contract.)
+        run: |
+          if ! git diff --exit-code -- native/sea/index.d.ts native/sea/index.js; then
+            echo "::error::native/sea/index.d.ts / index.js are out of sync with KERNEL_REV ($(tr -d '[:space:]' < KERNEL_REV)). Run 'npm run build:native' against that kernel SHA and commit native/sea/index.*."
+            exit 1
+          fi
+          echo "Committed binding matches KERNEL_REV."
+
+      - name: Smoke-check binding loads
+        run: node -e "const b=require('./native/sea'); if(typeof b.version!=='function'){throw new Error('napi binding failed to load')} console.log('kernel binding ok:', b.version())"
+
+      - name: Run SEA e2e tests
+        # Invoke mocha directly rather than via `npm run e2e -- <glob>`: routing a
+        # glob through the npm-script's inner shell mangles `**` and silently
+        # resolves to ZERO files (a false pass). mocha expands the quoted glob
+        # itself, reliably matching every tests/e2e/sea file.
+        run: NODE_OPTIONS="--max-old-space-size=4096" npx mocha --config tests/e2e/.mocharc.js "tests/e2e/sea/**/*.test.ts"
+
+      - name: Post Kernel E2E check (success)
+        if: success()
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: '${{ needs.detect-changes.outputs.head_sha }}',
+              status: 'completed',
+              conclusion: 'success',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Kernel E2E passed',
+                summary: 'tests/e2e/sea ran green against the pinned kernel SHA.'
+              }
+            });
+
+      - name: Post Kernel E2E check (failure)
+        if: failure()
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: '${{ needs.detect-changes.outputs.head_sha }}',
+              status: 'completed',
+              conclusion: 'failure',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Kernel E2E failed',
+                summary: 'See workflow logs for details.'
+              }
+            });
+
+  # ───────────────────────────────────────────────────────────────
+  # Auto-pass the Kernel E2E check in the merge queue when no SEA-
+  # relevant files changed.
+  # ───────────────────────────────────────────────────────────────
+  auto-pass-merge-queue:
+    needs: detect-changes
+    if: github.event_name == 'merge_group' && needs.detect-changes.outputs.run_tests != 'true'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    permissions:
+      checks: write
+    steps:
+      - name: Auto-pass
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: '${{ github.event.merge_group.head_sha }}',
+              status: 'completed',
+              conclusion: 'success',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Skipped — no SEA-relevant changes',
+                summary: 'No files under lib/sea/, native/sea/, tests/e2e/sea/, tests/unit/sea/, KERNEL_REV, package.json, or package-lock.json changed.'
+              }
+            });
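
The change-detection gate in `detect-changes` can be tried outside CI. The sketch below copies the workflow's regex verbatim and wraps it in a hypothetical helper, `should_run_tests`, which is not part of the commit. The sample paths are invented to show which inputs would flip `run_tests`.

```shell
#!/usr/bin/env bash
# Sketch of the workflow's path filter; the helper name and sample paths are
# assumptions made for illustration.
set -eu

# Same prefix-anchored pattern as the "Detect SEA-relevant changes" step.
SEA_PATTERN='^(lib/sea/|native/sea/|tests/e2e/sea/|tests/unit/sea/|\.github/workflows/kernel-e2e\.yml|KERNEL_REV|package\.json|package-lock\.json)'

# Reads changed paths on stdin (one per line, as `git diff --name-only`
# prints them) and echoes "true" if any path is SEA-relevant, else "false".
should_run_tests() {
  if grep -qE "$SEA_PATTERN"; then
    echo true
  else
    echo false
  fi
}

# One SEA path among unrelated ones is enough to trigger the suite.
printf '%s\n' 'README.md' 'lib/sea/client.ts' | should_run_tests

# No SEA-relevant paths: the merge-queue check would be auto-passed.
printf '%s\n' 'README.md' 'docs/guide.md' | should_run_tests

# The pattern is anchored at the start of the path, so a nested
# package.json does not count as a core dependency change.
printf '%s\n' 'examples/package.json' | should_run_tests
```

Because the pattern is anchored only at the start, anything beginning with `KERNEL_REV` or `package.json` at the repo root matches. That errs on the side of running the suite.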

KERNEL_REV

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+b4d88220cdfad8dba1cfa89892269342ae26feeb
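
For matching the pin locally, the same validation the workflow applies to `KERNEL_REV` can be reproduced in a few lines of bash. This is a minimal sketch, not the contents of `native/sea/README.md` (which this diff view does not show). The helper name `read_kernel_rev` and the local clone path in the comments are assumptions.

```shell
#!/usr/bin/env bash
# Sketch mirroring the workflow's "Read pinned kernel SHA" step.
# read_kernel_rev is a hypothetical helper, not part of the commit.
set -euo pipefail

# Prints the pinned SHA on success; returns non-zero if the file is missing
# or does not hold exactly 40 lowercase hex characters.
read_kernel_rev() {
  local file=$1 rev
  if [[ ! -f "$file" ]]; then
    echo "error: $file missing" >&2
    return 1
  fi
  rev=$(tr -d '[:space:]' < "$file")
  if [[ ! "$rev" =~ ^[0-9a-f]{40}$ ]]; then
    echo "error: $file must hold a 40-char commit SHA, got: $rev" >&2
    return 1
  fi
  echo "$rev"
}

# Demo against a temp file holding the SHA pinned by this commit.
pin_file=$(mktemp)
printf 'b4d88220cdfad8dba1cfa89892269342ae26feeb\n' > "$pin_file"
rev=$(read_kernel_rev "$pin_file")
echo "pinned kernel SHA: $rev"

# With a local kernel clone (the path here is an assumption), the next steps
# would look roughly like:
#   git -C ../databricks-sql-kernel checkout "$rev"
#   DATABRICKS_SQL_KERNEL_REPO=../databricks-sql-kernel npm run build:native
```

The regex accepts only full-length lowercase SHAs, so the abbreviated `b4d8822` form and branch names are rejected. That is what keeps the pin unambiguous.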
