Merged
1 change: 1 addition & 0 deletions .docs-introspect-sha
@@ -0,0 +1 @@
+f6eb9f97dfd21135c407658ac911ced0ed0bd097
25 changes: 25 additions & 0 deletions .github/workflows/archive.yml
@@ -28,6 +28,23 @@ jobs:
           # Need full history to push a tag from the workflow.
           fetch-depth: 0

+      # Precondition: the archive step needs to fetch introspect.py
+      # from anyscale/docs at the SHA pinned in .docs-introspect-sha.
+      # That requires GH_TOKEN with contents:read on anyscale/docs.
+      # Fail fast and clearly here if the token doesn't have the
+      # scope, rather than mid-archive run.
+      - name: Verify docs-repo read access
+        env:
+          GH_TOKEN: ${{ secrets.DOCS_DISPATCH_TOKEN }}
+        run: |
+          SHA="$(tr -d '[:space:]' < .docs-introspect-sha)"
+          if ! gh api "repos/anyscale/docs/contents/scripts/docgen/introspect.py?ref=${SHA}" \
+              -H "Accept: application/vnd.github.raw" --silent > /dev/null; then
+            echo "::error::DOCS_DISPATCH_TOKEN can't read anyscale/docs at ${SHA}. Verify the secret has contents:read scope, or swap to a dedicated read-access token in this workflow file."
+            exit 1
+          fi
+          echo "OK: docs-repo read access confirmed at ${SHA}."
+
       - name: Determine missing versions
         id: missing
         run: |
@@ -102,6 +119,14 @@ jobs:

       - name: Archive each missing version
         if: steps.missing.outputs.versions != ''
+        env:
+          # archive_version.sh fetches introspect.py + util.py from
+          # anyscale/docs at the SHA pinned in .docs-introspect-sha,
+          # via `gh api`. That needs a token with contents:read on
+          # anyscale/docs. The existing DOCS_DISPATCH_TOKEN is reused
+          # if it carries that scope; otherwise add a separate secret
+          # and reference it here.
+          GH_TOKEN: ${{ secrets.DOCS_DISPATCH_TOKEN }}
         run: |
           set -e
           for v in ${{ steps.missing.outputs.versions }}; do
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
+__pycache__/
+scripts/.docgen/
14 changes: 11 additions & 3 deletions README.md
@@ -27,6 +27,14 @@ https://cdn.jsdelivr.net/gh/anyscale/api-docs-schema@latest/pages.json
 4. Regenerates `versions.json`.
 5. Commits, pushes, and tags. jsDelivr's `@latest` picks up the new tag within ~15 minutes.

+## Where introspect.py lives
+
+The introspector this repo runs against each anyscale wheel is **not** stored here. It lives in `anyscale/docs` at `scripts/docgen/introspect.py` (plus its `util.py` sibling) and powers current-version reference rendering there too. `scripts/archive_version.sh` downloads both files at run time from a SHA pinned in [`.docs-introspect-sha`](./.docs-introspect-sha), so the docs repo stays the single source of truth and the two surfaces can't silently drift.
+
+Because `anyscale/docs` is a private repo, the fetch uses `gh api` rather than raw.githubusercontent.com. Locally that means having an authenticated `gh` session (`gh auth login`). In CI, the archive workflow exports `GH_TOKEN` from `secrets.DOCS_DISPATCH_TOKEN`; that token needs `contents: read` scope on `anyscale/docs`.
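For local runs, a preflight along these lines could confirm that `gh` is installed and able to authenticate before the archive script starts fetching. This is only a sketch and not part of this PR: `require_gh_auth` is a hypothetical helper name. It relies on `gh auth status` exiting non-zero when no login is stored, and on `gh` preferring `GH_TOKEN` over stored credentials when the variable is set.

```shell
#!/bin/bash
# Hypothetical preflight, not part of this PR: fail early with a clear
# message when the gh CLI is missing or has no credentials to use.
require_gh_auth() {
    if ! command -v gh >/dev/null 2>&1; then
        echo "gh CLI not found; install it, then run 'gh auth login'" >&2
        return 1
    fi
    # When GH_TOKEN is set (as in the archive workflow), gh uses it
    # directly, so only check the stored login when it is absent.
    if [ -z "${GH_TOKEN:-}" ] && ! gh auth status >/dev/null 2>&1; then
        echo "gh is not authenticated; run 'gh auth login' or export GH_TOKEN" >&2
        return 1
    fi
}
```

Note that this only shows credentials exist. It does not prove they can read `anyscale/docs`; the workflow's "Verify docs-repo read access" step covers that case in CI.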
+
+To pick up an introspect change after it lands in the docs repo, edit `.docs-introspect-sha` with the new commit SHA and merge. The next archive run uses it.
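Since the pin is edited by hand, a small check before committing could catch a truncated or mistyped SHA. The sketch below assumes the pin should always be a full 40-character lowercase hex SHA, as the value committed in this PR is. `validate_pin` is a hypothetical helper, not something this repo ships.

```shell
#!/bin/bash
# Hypothetical helper, not part of this PR: print the SHA stored in a pin
# file if it is a full 40-character lowercase hex commit SHA, else fail.
validate_pin() {
    local pin_file="$1" sha
    # Strip whitespace the same way archive_version.sh does.
    sha="$(tr -d '[:space:]' < "$pin_file")" || return 1
    if printf '%s' "$sha" | grep -Eq '^[0-9a-f]{40}$'; then
        printf '%s\n' "$sha"
    else
        echo "invalid SHA in ${pin_file}: '${sha}'" >&2
        return 1
    fi
}
```

One possible bump flow, assuming the docs repo's main branch is `master` as the comment in `archive_version.sh` suggests: write the branch tip with `gh api repos/anyscale/docs/commits/master --jq .sha > .docs-introspect-sha`, then run `validate_pin .docs-introspect-sha` before committing. Pinning the branch tip is only appropriate when that commit is the one you intend to propagate.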
+
 ## Manual regeneration

 ```
@@ -37,13 +45,13 @@ python3 scripts/update_manifests.py
 ## Layout

 ```
-<version>.json          # one per anyscale release (0.26.46 - 0.26.100 today)
+<version>.json          # one per anyscale release
 versions.json           # array of versions, sorted desc
 pages.json              # {version: [page_names]} map for the docs redirect generator
+.docs-introspect-sha    # pinned anyscale/docs commit that scripts/archive_version.sh
+                        # pulls introspect.py + util.py from
 scripts/
-  introspect.py         # reads the installed anyscale wheel, emits reference.json
   archive_json.py       # post-processes reference.json into the schema served at <version>.json
-  util.py               # shared helpers
   archive_version.sh    # one-shot wrapper used by the workflow and humans
   update_manifests.py   # regenerates versions.json and pages.json
 .github/workflows/
Binary file removed scripts/__pycache__/util.cpython-310.pyc
Binary file not shown.
34 changes: 31 additions & 3 deletions scripts/archive_version.sh
@@ -1,8 +1,18 @@
 #!/bin/bash
 #
 # Generate the JSON schema for one anyscale version. Installs the
-# matching wheel into a cached venv, runs introspect.py, then
-# archive_json.py to produce <repo-root>/<version>.json.
+# matching wheel into a cached venv, downloads the docs-repo's
+# introspect.py at the SHA pinned in .docs-introspect-sha, runs it
+# against the venv, then post-processes with archive_json.py to
+# produce <repo-root>/<version>.json.
+#
+# Why pin to a docs-repo SHA: introspect.py is also used to render
+# current-version docs in anyscale/docs (`scripts/docgen/introspect.py`).
+# Keeping a copy here would drift. Pulling at a pinned SHA makes the
+# docs repo the single source of truth without coupling our nightly
+# archive to whatever happens to be on master at 4am. Bump
+# .docs-introspect-sha when an introspect change in the docs repo
+# should propagate here.
 #
 # Usage:
 # ./scripts/archive_version.sh 0.26.100
@@ -39,9 +49,27 @@ if [[ ! -x "$VENV_DIR/bin/python" ]]; then
     "$VENV_DIR/bin/pip" install -q "anyscale==$ANYSCALE_VERSION"
 fi

+# Fetch introspect.py + util.py from anyscale/docs at the pinned SHA
+# via `gh api` (works against the private docs repo unlike a plain
+# curl). Downloaded into scripts/.docgen/ (gitignored) so the
+# `from util import ...` inside introspect.py resolves against the
+# sibling file.
+#
+# Locally: your `gh auth login` covers it. In the archive workflow:
+# GH_TOKEN must be set to a token with contents:read on anyscale/docs.
+DOCS_INTROSPECT_SHA="$(tr -d '[:space:]' < "$REPO_ROOT/.docs-introspect-sha")"
+DOCGEN_CACHE_DIR="$SCRIPTS_DIR/.docgen"
+mkdir -p "$DOCGEN_CACHE_DIR"
+for f in introspect.py util.py; do
+    gh api \
+        "repos/anyscale/docs/contents/scripts/docgen/${f}?ref=${DOCS_INTROSPECT_SHA}" \
+        -H "Accept: application/vnd.github.raw" \
+        > "${DOCGEN_CACHE_DIR}/${f}"
+done
+
 # `--allow-duplicate-models` softens the introspector's uniqueness
 # check (anyscale 0.26.48-0.26.52 had a duplicate CloudDeployment).
-"$VENV_DIR/bin/python" "$SCRIPTS_DIR/introspect.py" "$TMP_JSON" --allow-duplicate-models
+"$VENV_DIR/bin/python" "$DOCGEN_CACHE_DIR/introspect.py" "$TMP_JSON" --allow-duplicate-models
 "$VENV_DIR/bin/python" "$SCRIPTS_DIR/archive_json.py" "$TMP_JSON" "$OUT_JSON" "$ANYSCALE_VERSION"

 echo "Wrote $OUT_JSON"
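The fetch loop in this diff redirects `gh api` output straight into the destination file. The shell creates or truncates that file before `gh` runs, so a failed request can leave an empty or partial `introspect.py` in `scripts/.docgen/`. Whether the script then stops depends on error handling outside the visible hunk. One possible hardening, sketched here under those assumptions, is to download to a temporary file and move it into place only on success. `fetch_pinned` is a hypothetical function name; the `gh api` arguments are the ones the script already uses.

```shell
#!/bin/bash
# Hypothetical variant of the fetch loop, not part of this PR: write to a
# temp file first so a failed request never leaves a truncated file behind.
fetch_pinned() {
    # $1 = file name, $2 = commit SHA, $3 = destination directory
    local name="$1" sha="$2" dest_dir="$3" tmp
    tmp="$(mktemp "${dest_dir}/.${name}.XXXXXX")" || return 1
    if gh api \
        "repos/anyscale/docs/contents/scripts/docgen/${name}?ref=${sha}" \
        -H "Accept: application/vnd.github.raw" > "$tmp" && [ -s "$tmp" ]; then
        # Only a successful, non-empty download replaces the cached copy.
        mv "$tmp" "${dest_dir}/${name}"
    else
        rm -f "$tmp"
        echo "failed to fetch ${name} at ${sha}" >&2
        return 1
    fi
}
```

In the script, the loop body would then become something like `fetch_pinned "$f" "$DOCS_INTROSPECT_SHA" "$DOCGEN_CACHE_DIR" || exit 1`.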