XCell

Interactive web application for exploring and analyzing scRNA-seq and spatial transcriptomics data. Load an .h5ad file, a 10x Genomics .h5 file, a Seurat .rds file, a 10x Cell Ranger matrix folder, or a prefixed 10x file trio from GEO; then visualize cells on a scatter plot, run Scanpy analysis pipelines, and explore results, all from your browser.

Quick Start

Prerequisites

Python 3.9+
Node.js 18+
R with the Seurat and SeuratDisk packages (optional; needed only for loading .rds files)

Backend Setup

cd xcell/backend

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in editable mode
pip install -e .

Frontend Setup

cd xcell/frontend
npm install

Launch

# Terminal 1: Start the backend (from xcell/backend/)
uvicorn xcell.main:app --reload

# Terminal 2: Start the frontend (from xcell/frontend/)
npm run dev

Open http://localhost:5173 in your browser.

A bundled toy dataset (toy_spatial.h5ad) loads automatically if no data path is specified. To load your own data, set the XCELL_DATA_PATH environment variable:

XCELL_DATA_PATH=/path/to/your/data.h5ad uvicorn xcell.main:app --reload  # also supports .h5 and .rds

Getting Started with Toy Data

The included test_data/toy_spatial.h5ad file is a small spatial transcriptomics dataset for exploring XCell's features. Here's a step-by-step walkthrough:

1. Explore the Scatter Plot

Pan by clicking and dragging
Zoom with scroll wheel
Cells are rendered as points at their spatial coordinates

2. Color by Metadata

Open Cell Manager (left panel)
Select a metadata column to color cells by that annotation

3. Select Cells

Click the Select button in the toolbar (use the dropdown arrow to choose between Lasso and Polygon tools)
- Lasso: click and drag to draw a freehand selection
- Polygon: click to add vertices, double-click to close and select cells inside
Hold Shift while selecting to add to the existing selection
Checkboxes in the Cell Manager also select/deselect cells by category
Selected cells can be masked or deleted

4. Run Preprocessing

Open the Scanpy modal (top toolbar)
Go to Preprocessing and run in order:
1. Normalize Total — normalize counts per cell
2. Log1p — log-transform the data
3. Highly Variable Genes — identify informative genes
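
The arithmetic behind the first two steps can be sketched in plain Python. This is a minimal illustration on a small dense count matrix (a list of per-cell lists), not XCell's or Scanpy's implementation. The function names and the `target_sum` default of 10,000 are assumptions made for the example; Scanpy's `sc.pp.normalize_total` uses the median of per-cell totals when no target is given.

```python
import math

def normalize_total(counts, target_sum=10_000.0):
    """Scale each cell (row) so that its counts sum to target_sum."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Cells with no counts at all are left as zeros.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in row])
    return normalized

def log1p_transform(matrix):
    """Apply log(1 + x) to every entry."""
    return [[math.log1p(value) for value in row] for row in matrix]

counts = [
    [1, 3, 0],    # cell A: 4 counts in total
    [10, 30, 0],  # cell B: 40 counts in total
]
normalized = normalize_total(counts, target_sum=100.0)
logged = log1p_transform(normalized)
```

After normalization both cells have the same profile even though cell B was sequenced ten times deeper, which is the point of running Normalize Total before Log1p.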

5. Run Cell Analysis

In the Scanpy modal, go to Cell Analysis and run in order:
1. PCA — reduce dimensionality
2. Neighbors — build cell neighborhood graph (requires PCA)
3. UMAP — compute 2D embedding (requires Neighbors)
4. Leiden — cluster cells (requires Neighbors)
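
The ordering constraints in this list can be written down as a small dependency table. A minimal sketch: the step names, the `REQUIRES` table, and the `check_order` helper are hypothetical and only mirror the list above; they are not part of XCell.

```python
# Prerequisites as stated in the list above (names are illustrative).
REQUIRES = {
    "pca": [],
    "neighbors": ["pca"],
    "umap": ["neighbors"],
    "leiden": ["neighbors"],
}

def check_order(steps):
    """Return the first step whose prerequisite has not run yet, or None."""
    done = set()
    for step in steps:
        missing = [dep for dep in REQUIRES[step] if dep not in done]
        if missing:
            return step
        done.add(step)
    return None
```

Under this table, UMAP and Leiden both depend only on Neighbors, so either can run first once the neighborhood graph exists.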

6. View Clustering Results

In Cell Manager, select the leiden column to color by cluster
Switch the embedding to X_umap to see the UMAP layout

7. Color by Gene Expression

Open Gene Manager (right panel)
If the dataset has alternative gene identifier columns (e.g., gene symbols alongside Ensembl IDs), use the Gene IDs dropdown at the top of the panel to switch
Search or browse genes
Click a gene to color cells by its expression

Gene Mask

To scope the Gene Panel to a relevant gene universe, click the ⋯ button in the Genes panel header and choose Gene mask…. The modal lists all boolean columns in your dataset's .var (for example, highly_variable after running Highly Variable Genes, or spatially_variable after spatial autocorrelation). For each column, choose:

Off — ignore this column
Keep — include genes where this column is True
Hide — exclude genes where this column is True

When you have multiple Keep columns, choose whether to match ANY (union) or ALL (intersection). Hide columns always combine as a union.

The mask applies to the gene browse list, gene search, expanded gene set rows, and gene set score aggregation used for display coloring. It does not apply to analysis operations (Diff Exp, Marker Genes, Gene PCA, etc.) — those have their own gene subset dropdowns. The mask is per-dataset and session-only; reloading the page clears it.
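
The Keep/Hide combination rules can be sketched as follows. This is an illustration of the logic described above, not XCell's code: the function name, the argument layout, and the `mito` column are hypothetical, and treating "no Keep columns" as "keep everything" is an assumption.

```python
def apply_gene_mask(genes, columns, modes, keep_mode="any"):
    """Return the genes that pass the mask.

    genes:     list of gene names
    columns:   dict of column name -> {gene: bool} (boolean .var columns)
    modes:     dict of column name -> "off" | "keep" | "hide"
    keep_mode: "any" (union) or "all" (intersection) across Keep columns
    """
    keep_cols = [c for c, m in modes.items() if m == "keep"]
    hide_cols = [c for c, m in modes.items() if m == "hide"]
    combine = any if keep_mode == "any" else all

    visible = []
    for gene in genes:
        # Assumption: with no Keep columns, every gene starts out included.
        if keep_cols:
            kept = combine(columns[c].get(gene, False) for c in keep_cols)
        else:
            kept = True
        # Hide columns always combine as a union.
        hidden = any(columns[c].get(gene, False) for c in hide_cols)
        if kept and not hidden:
            visible.append(gene)
    return visible

genes = ["Actb", "Gapdh", "Sox2", "mt-Co1"]
columns = {
    "highly_variable": {"Actb": False, "Gapdh": True, "Sox2": True, "mt-Co1": True},
    "spatially_variable": {"Actb": True, "Gapdh": False, "Sox2": True, "mt-Co1": True},
    "mito": {"mt-Co1": True},  # hypothetical column used for Hide
}
modes = {"highly_variable": "keep", "spatially_variable": "keep", "mito": "hide"}

union = apply_gene_mask(genes, columns, modes, keep_mode="any")
intersection = apply_gene_mask(genes, columns, modes, keep_mode="all")
```

In this example ANY leaves three genes visible and ALL leaves only `Sox2`; `mt-Co1` is removed in both cases because a Hide column is True for it.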

8. Gene Sets

Create gene sets manually in Gene Manager
Import gene lists from files

Curating gene sets into folders

The Manual category at the top of the Gene Panel is the home for gene sets you create by hand. Click + 📁 to create a named folder (e.g. "Fig 3 markers"). Inside a folder, click + to add a new empty set, or drag an existing top-level set onto the folder row to move it in. Drag a set back onto the thin strip above the first folder to move it out. Drag sets within the same container to reorder them.

Each gene set and folder row has a ⋯ button with secondary actions. On a gene set row, that's where you find Pin and Cluster genes. On a manual folder row, that's where you find Pin and Export (JSON/GMT/CSV).

Use the Pin/Unpin option in the ⋯ menu on any set or folder to float it to the top of its container. Pinning works in every category — including auto-generated ones — and survives moving a set between folders.

The Export ▸ option in the ⋯ menu on any manual folder lets you export just that folder's gene sets to JSON, GMT, or CSV. Filename defaults to the sanitized folder name. JSON round-trips via the existing Import modal.
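
For readers unfamiliar with GMT, each line holds a set name, a description, and then the genes, all tab-separated. The sketch below writes that layout and derives a filename from a folder name. The sanitization rule (replace anything other than letters, digits, `-` and `_` with `_`), the `NA` description, and the example gene sets are assumptions for illustration; XCell's actual export may differ.

```python
import re

def sanitize_filename(name):
    """Replace runs of characters other than letters, digits, '-' and '_' with '_'."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    return cleaned or "gene_sets"

def to_gmt(gene_sets):
    """Serialize {set name: [genes]} as GMT text, one set per line."""
    lines = []
    for name, genes in gene_sets.items():
        # GMT expects a description column; "NA" is used as a placeholder here.
        lines.append("\t".join([name, "NA", *genes]))
    return "\n".join(lines) + "\n"

folder = "Fig 3 markers"
gmt_text = to_gmt({"Neurons": ["Snap25", "Rbfox3"], "Astrocytes": ["Gfap", "Aqp4"]})
filename = sanitize_filename(folder) + ".gmt"
```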

Use the 👁 button on a category header to hide a whole category from view (useful when an analysis has filled Gene Clusters or Differential Expression with results you're done with). An "N hidden ▸" footer then appears at the bottom of the Gene Panel. Click it and then Unhide to bring a category back.

Tip: double-click any gene set name or manual folder name to rename it inline.

Sub-clustering a gene set

Any gene set with at least 4 genes can be sub-clustered by expression pattern. Click the ⋯ button on a gene set row and choose Cluster genes…. Pick a method (Hierarchical or K-means), a number of clusters K (default 3), and a cell context ("All cells", "Current selection" if you've lasso-picked some cells, or "Annotation category" to restrict to specific categorical values in a .obs column). Clicking Run creates a new folder in Gene Clusters named after the source set, containing one gene set per cluster. Re-running with different K or a different cell context appends another folder so you can compare runs side by side.

Selecting cells by expression threshold

You can select cells based on a gene's expression or a gene set score without needing to eyeball the scatter plot:

In the Gene Panel, click the ⋯ menu on any gene row or gene set row and choose Select cells….
The modal opens and the scatter plot switches to expression coloring for that source. An interactive histogram of the values is shown.
Pick a threshold mode (Above, Below, or Between) and drag the red cutoff line(s). The match counter updates live.
Choose an action:
- Update selection replaces, adds to, or intersects with your current lasso selection.
- Label cells creates a new annotation column with high/low labels for the cells in the chosen context (current selection or all cells). On success, click Open Diff Exp ▸ to immediately run differential expression between the two groups.

Typical workflow for "find DEGs by expression state in a region": lasso a region → ⋯ → Select cells… on a gene → drag the threshold → Label cells → Open Diff Exp.
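
The threshold modes and the Update selection actions amount to simple set operations over cell indices. A minimal sketch under stated assumptions: the function names are hypothetical, and whether XCell treats the cutoffs as strict or inclusive is not documented here, so the comparisons below are one possible choice.

```python
def match_threshold(values, mode, low=None, high=None):
    """Return the set of cell indices whose value passes the threshold.

    mode: "above" (value > low), "below" (value < high),
          or "between" (low <= value <= high)
    """
    if mode == "above":
        return {i for i, v in enumerate(values) if v > low}
    if mode == "below":
        return {i for i, v in enumerate(values) if v < high}
    if mode == "between":
        return {i for i, v in enumerate(values) if low <= v <= high}
    raise ValueError(f"unknown mode: {mode}")

def update_selection(current, matched, action):
    """Combine the matched cells with the current selection."""
    if action == "replace":
        return set(matched)
    if action == "add":
        return current | matched
    if action == "intersect":
        return current & matched
    raise ValueError(f"unknown action: {action}")

expression = [0.0, 0.5, 1.2, 2.0, 3.1]   # one value per cell
lasso = {0, 1, 2}                        # cells already selected
high_cells = match_threshold(expression, "above", low=1.0)
in_region_and_high = update_selection(lasso, high_cells, "intersect")
```

The intersect action corresponds to the "lasso a region, then threshold" workflow: only cells that are both inside the lasso and above the cutoff remain selected.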

9. Compare Cell Groups

Open the Analyze modal (top toolbar) → Cell Analysis → Compare Cells
Select an .obs column (e.g., leiden) from the dropdown
Check 2 or more groups to compare:
- 2 checked → pairwise differential expression
- 3+ checked → one-vs-rest marker gene analysis
Set Top N genes and click Run
You can also use lasso selection: select cells → Set as Group 1 / Set as Group 2 → click Compare in the comparison bar
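
The rule for choosing the comparison type can be sketched as a small dispatcher. The function name and return format are hypothetical, and the sketch assumes that "rest" means the other checked groups; XCell may instead compare each group against all remaining cells.

```python
def plan_comparisons(checked_groups):
    """Map the checked groups to the comparison type described above."""
    groups = list(dict.fromkeys(checked_groups))  # drop duplicates, keep order
    if len(groups) < 2:
        raise ValueError("check at least two groups")
    if len(groups) == 2:
        return "pairwise", [(groups[0], groups[1])]
    # Assumption: "rest" is taken to be the other checked groups.
    plan = [(g, [other for other in groups if other != g]) for g in groups]
    return "one_vs_rest", plan
```

For example, checking leiden clusters `"0"` and `"1"` yields a single pairwise comparison, while checking `"0"`, `"1"`, and `"2"` yields three one-vs-rest comparisons.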

10. Trajectory Analysis

Draw lines on the scatter plot
Click the gear icon on a shape in the Shapes panel to open Line Tools
Under Gene Association, configure:
- Test against: position along line or distance from line
- Gene subset: filter to highly variable genes or other boolean columns
- Spline knots: number of interior knots for the B-spline model (default 5; higher = more flexible fit)
- FDR: significance threshold (default 0.05)
- Max genes/module: cap on genes returned per expression module
Click Find Associated Genes to run the analysis
In the results modal, use the Filters bar to refine results interactively: adjust min R², min amplitude, max FDR, or toggle pattern types (increasing, decreasing, peak, trough, complex)
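
The two "Test against" options correspond to two coordinates of each cell relative to the drawn line. A minimal geometric sketch, assuming the line is a single straight segment and that positions are clamped to its endpoints; the function name is hypothetical, and XCell's own projection (for example for multi-segment lines) may differ.

```python
import math

def project_onto_segment(point, start, end):
    """Return (position, distance) of a point relative to the segment start -> end.

    position: fraction along the segment, clamped to [0, 1]
    distance: Euclidean distance from the point to its nearest point on the segment
    """
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("segment has zero length")
    # Scalar projection of (point - start) onto the segment direction.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    nearest_x, nearest_y = ax + t * dx, ay + t * dy
    return t, math.hypot(px - nearest_x, py - nearest_y)

position, distance = project_onto_segment((5, 3), (0, 0), (10, 0))
```

Here the cell sits halfway along the line (`position` is 0.5) and 3 units away from it (`distance` is 3.0); a gene's expression would then be modeled against one of these two values.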

Multi-section / replicate analysis

Draw a line on each tissue section representing the same biological axis
For each line, select cells (via lasso or clicking a category value in the Cells panel) and click + to associate them with the line
Check the lines to include using the checkboxes that appear on lines with projected cells
Click Find Associated Genes in the action bar
In the multi-line modal, toggle direction per line if needed (arrow button) and set analysis parameters
Results pool cells across all lines for a single, higher-powered analysis

11. Run Gene Analysis

In the Scanpy modal, go to Gene Analysis:
1. Build Gene Graph — compute gene-gene similarity
2. Cluster Genes — group genes by expression pattern

12. Spatial Contouring

Select genes in the Gene Panel (click individual genes or use a gene set)
Open the Scanpy modal, go to Spatial Analysis > Contourize
Adjust smoothing sigma, contour levels, and grid resolution as needed
Click Run — a new categorical column appears in the Cell Panel
Color cells by the contour column to visualize spatial expression zones

13. Load a Second Dataset

Click Load in the toolbar — the modal shows a sidebar with quick-access locations (Home, Desktop, Documents, Downloads) and recently loaded files, plus breadcrumb path navigation for clicking any ancestor directory
Choose Secondary from the "Load into" dropdown
Browse or enter the path to a second h5ad, h5, rds file, 10x matrix folder, or prefixed 10x file trio and click Load
A dataset switcher dropdown appears in the header — switch between Primary and Secondary to compare datasets
Click the Split button to view both datasets side by side
Click on either plot to make it the active dataset — the Cell and Gene panels update accordingly
Each plot has its own embedding selector, legend, and independent pan/zoom

14. Export Results

Click Export in the toolbar to download annotations and results

Features

Interactive scatter plot — deck.gl-powered visualization with pan, zoom, lasso selection
Cell Manager — browse/color by metadata, mask/delete cells
Gene Manager — search genes, create gene sets, import gene lists
Scanpy integration — run preprocessing, cell analysis (PCA, Neighbors, UMAP, Leiden), gene analysis, spatial analysis (contourize), and differential expression directly in the browser. Long-running operations (gene neighbors, spatial neighbors, spatial autocorrelation, contourize, line gene association) can be cancelled mid-run without corrupting session data.
Trajectory analysis — draw lines and associate genes with spatial trajectories
Quilt mode — lasso and rearrange tissue pieces: drag to translate, shift+drag to rotate, flip to reflect selected cell subsets
Display settings — adjust point size, opacity, colormaps, bivariate coloring
Multi-dataset support — load two datasets (h5ad, h5, rds, 10x matrix folders, or prefixed 10x file trios from GEO), switch between them, or view side by side in split mode
Export — download annotations and analysis results

Project Structure

xcell/
├── backend/
│   ├── xcell/
│   │   ├── main.py          # FastAPI app entry point
│   │   ├── adaptor.py       # DataAdaptor class (wraps AnnData)
│   │   ├── diffexp.py       # Differential expression
│   │   ├── data/
│   │   │   └── toy_spatial.h5ad  # Bundled toy dataset
│   │   └── api/
│   │       └── routes.py    # REST API endpoints
│   └── pyproject.toml       # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── App.tsx           # Main app component
│   │   ├── store.ts          # Zustand state management
│   │   ├── main.tsx          # Entry point
│   │   ├── components/
│   │   │   ├── ScatterPlot.tsx        # deck.gl scatter plot
│   │   │   ├── CellPanel.tsx          # Cell metadata manager
│   │   │   ├── GenePanel.tsx          # Gene browser / gene sets
│   │   │   ├── ScanpyModal.tsx        # Scanpy analysis pipeline UI
│   │   │   ├── DiffExpModal.tsx       # Differential expression
│   │   │   ├── LineAssociationModal.tsx # Trajectory analysis
│   │   │   ├── DisplaySettings.tsx    # Visualization settings
│   │   │   ├── ShapeManager.tsx       # Shape/selection tools
│   │   │   └── ImportModal.tsx        # Gene list import
│   │   └── hooks/
│   │       └── useData.ts    # Data fetching hooks
│   ├── package.json          # Node dependencies
│   └── vite.config.ts        # Vite configuration
└── README.md
test_data/
├── toy_spatial.h5ad          # Toy dataset for testing
└── generate_toy.py           # Script to regenerate toy data

Architecture

Backend: FastAPI + AnnData + Scanpy, serving data and running analysis via REST API
Frontend: React + TypeScript + Vite + deck.gl + Zustand for state management
Data flow: h5ad file → DataAdaptor → REST API → React hooks → deck.gl visualization
API docs: Available at http://localhost:8000/docs when the backend is running
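
The same information shown on the docs page can be read programmatically, since FastAPI publishes an OpenAPI schema at `/openapi.json` by default. The sketch below assumes XCell keeps that default URL and the port shown above; the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

def list_routes(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    routes = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items can also carry non-operation keys such as "parameters".
            if key in HTTP_METHODS:
                routes.append((key.upper(), path))
    return sorted(routes)

def fetch_schema(base_url="http://localhost:8000"):
    """Download the schema; assumes FastAPI's default /openapi.json location."""
    with urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)

# With the backend running:
#   for method, path in list_routes(fetch_schema()):
#       print(method, path)
```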
