This repo stores exploratory Jupyter notebooks shared across the engineering org for later reference. The goal is for ad-hoc analyses to remain readable and re-runnable long after the original author has moved on.
- Install uv.
- Clone the repo and cd into it.
- Run uv sync. This installs Python 3.12 and all dependencies into .venv/.
- Run uv run pre-commit install. This installs the nbstripout Git hook locally. This step is required, not optional. Without it, your commits will leak notebook outputs and CI will fail your PR.
- Start working in notebooks/<your-name>/.
Each contributor gets their own directory under notebooks/. Create notebooks/<your-name>/ the first time you add work. Notebook filenames must follow YYYY-MM-DD_short-description.ipynb so they sort chronologically within a directory. This is enforced by a pre-commit hook (bin/check-notebook-filename.sh) and a CI backstop in .github/workflows/check-notebooks.yml.
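To make the naming rule concrete, here is a minimal Python sketch of the kind of check the filename convention implies. It is not the contents of bin/check-notebook-filename.sh; the regular expression, the lowercase hyphenated description, and the function name `is_valid_notebook_name` are assumptions made for illustration.

```python
import re
from datetime import datetime

# Hypothetical pattern: ISO date, underscore, lowercase hyphenated description, .ipynb.
# The real hook may accept a different character set for the description.
NOTEBOOK_NAME = re.compile(r"([0-9]{4}-[0-9]{2}-[0-9]{2})_[a-z0-9]+(?:-[a-z0-9]+)*\.ipynb")


def is_valid_notebook_name(filename: str) -> bool:
    """Return True if filename looks like YYYY-MM-DD_short-description.ipynb."""
    match = NOTEBOOK_NAME.fullmatch(filename)
    if match is None:
        return False
    try:
        # Reject impossible calendar dates such as 2024-13-40.
        datetime.strptime(match.group(1), "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(is_valid_notebook_name("2024-05-17_churn-by-region.ipynb"))  # True
print(is_valid_notebook_name("churn.ipynb"))  # False
```

Because the date comes first and is zero-padded, a plain alphabetical listing of a directory is also a chronological one.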
Open the repo in VS Code. It should auto-detect .venv/ as the Python interpreter; if it doesn't, open the command palette and run "Python: Select Interpreter", then pick .venv/bin/python. When you open an .ipynb file, use the kernel picker in the top right of the notebook to select the same interpreter as the kernel.
From the repo root, run uv run jupyter lab.
This repo is configured so Claude Code can read, write, and execute notebook cells while you watch them update live in either VS Code or JupyterLab in the browser. The setup is one-time; afterwards, "use notebooks/yourname/foo.ipynb" is enough to point Claude at a notebook.
- Make sure Docker Desktop is installed and running.
- From the repo root, run ./bin/jupyter-mcp-setup.sh. It verifies prerequisites and prints the next steps.
- Follow the printed instructions to connect your notebook client of choice (VS Code and/or browser, see below) and confirm that claude mcp list shows DatalayerJupyter ✓ Connected.
Open two terminals in the repo root:
# Terminal 1 — keep this running while editing notebooks with Claude
./bin/jupyter-mcp-start.sh
# Terminal 2 — your normal Claude Code session
claude
Then open the notebook in whichever client you prefer. Both work simultaneously — you can even have VS Code and the browser open on the same notebook at once.
Option A — Browser (JupyterLab). Simpler, and the most reliable view of live edits (JupyterLab is the native client for the realtime-collaboration protocol the MCP uses). Just open:
http://localhost:8888/lab?token=token
Navigate to the notebook and start working. No additional setup.
Option B — VS Code. One-time per machine: open any .ipynb, click the kernel picker (top right of the notebook), select "Select Another Kernel..." → "Existing Jupyter Server..." → "Enter URL", paste http://localhost:8888/?token=token, and pick the Python kernel that appears. VS Code remembers the server for future sessions. Note that VS Code can lag on structural changes (new cells, reordering) — if you find this disruptive, switch to the browser.
Once your client is connected, tell Claude things like:
- "use notebooks/garrett/foo.ipynb"
- "add a markdown cell at the top explaining what this notebook does"
- "load data from /path/to/file.csv and show a sample"
Cells Claude adds or edits will appear live in whichever client you have open.
- .mcp.json registers the Datalayer Jupyter MCP server at project scope. When you launch claude from this repo, it spawns a Docker container running the MCP server.
- The MCP container connects over HTTP to the Jupyter server you start with bin/jupyter-mcp-start.sh.
- The token is the literal string token. It is for local development only and is never exposed beyond localhost:8888.
- The jupyter-collaboration package (in pyproject.toml) enables the realtime-collaboration protocol that the MCP and VS Code both use to stay in sync.
Anything under private/ is gitignored (except private/README.md). Use it for CSVs, markdown, or other artifacts your notebooks produce that you want on disk but don't want to commit. A common pattern is writing to private/<notebook-name>/ from a notebook so each analysis keeps its own outputs together.
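The per-notebook output pattern could be sketched as a small helper like the one below. The function name `output_dir` and the `repo_root` parameter are hypothetical; nothing in this repo is known to provide them.

```python
from pathlib import Path


def output_dir(notebook_name: str, repo_root: Path) -> Path:
    """Create and return private/<notebook-name>/ under repo_root."""
    # Drop the .ipynb suffix if the caller passes the full filename.
    if notebook_name.endswith(".ipynb"):
        notebook_name = Path(notebook_name).stem
    target = repo_root / "private" / notebook_name
    # Safe to call repeatedly: existing directories are left alone.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

From a notebook in notebooks/<your-name>/, a call such as output_dir("2024-05-17_churn-by-region.ipynb", Path("../..")) would resolve to the repo's private/ directory, assuming the kernel's working directory is the notebook's own folder. If the kernel starts somewhere else, pass the repo root explicitly.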
Use uv add <package> for runtime dependencies and uv add --dev <package> for tooling. Commit pyproject.toml and uv.lock together.
Notebook outputs — plots, large dataframes, base64-encoded images — produce huge, noisy diffs and bloat the repository over time. nbstripout removes outputs at commit time so only the source cells are tracked. The pre-commit hook is the primary line of defense; the GitHub Actions workflow at .github/workflows/check-notebooks.yml is a backstop that fails CI if someone forgets to run pre-commit install and pushes a notebook with outputs intact.
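As a rough illustration of what "outputs intact" means at the file level, the sketch below scans a notebook's JSON for code cells that still carry outputs. It is not the logic of nbstripout or of the CI workflow; the function name `cells_with_outputs` is hypothetical, and treating a leftover execution count as dirty is an assumption of this sketch.

```python
import json


def cells_with_outputs(notebook_json: str) -> list[int]:
    """Return indices of code cells that still carry outputs or an execution count."""
    notebook = json.loads(notebook_json)
    dirty = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no outputs
        if cell.get("outputs") or cell.get("execution_count") is not None:
            dirty.append(index)
    return dirty


# A tiny nbformat-4-style notebook: one executed code cell and one clean one.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["1 + 1"],
         "execution_count": 1,
         "outputs": [{"output_type": "execute_result", "execution_count": 1,
                      "data": {"text/plain": ["2"]}, "metadata": {}}]},
        {"cell_type": "code", "metadata": {}, "source": ["x = 3"],
         "execution_count": None, "outputs": []},
    ],
})

print(cells_with_outputs(example))  # [1]
```

An empty list means the notebook is clean in the sense this sketch checks; any indices returned point at cells that would add noise to a diff.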
- "My commit was modified by the hook." Expected on the first commit of a notebook with outputs. The hook strips them and aborts the commit so you can review. Just git add the file again and re-commit.
- "CI is failing on my PR with a notebook outputs error." You forgot to run pre-commit install. Run it, then pre-commit run --all-files, commit the result, and push.
- "VS Code can't find the kernel." Make sure uv sync succeeded and explicitly pick .venv/bin/python in the interpreter and kernel pickers.