# Raptor

Raptor is a from-scratch tensor autograd engine built in pure NumPy, paired with `raptorgraph`, a lightweight graph inspection tool for visualizing forward and backward computation graphs.
The project is intentionally small enough to read end to end and deep enough to surface the core engineering ideas behind modern deep learning systems:
- tensor-based reverse-mode autodiff
- broadcasting-aware gradient propagation
- neural network layer abstractions
- optimizers such as SGD and Adam
- MNIST training without relying on PyTorch for the core engine
- graph tracing and interactive inspection of tensor graphs
## Contents

- Project Structure
- Design Goals
- Architecture
- Core Components
- Installation
- Quick Start
- Training on MNIST
- Benchmarking Against PyTorch
- Using RaptorGraph
- Repository Layout
- Current Scope
## Project Structure

Raptor is split into two main parts:

- `raptor/`: the core tensor engine, neural network abstractions, optimizers, and utility code.
- `raptorgraph/`: a FastAPI-backed graph viewer that can render traced tensor graphs from built-in demos or graphs registered from user code.

This separation is deliberate: `raptor` computes, `raptorgraph` observes.
## Design Goals

The project is designed around a few concrete engineering goals.
- Keep the core engine readable.
- Implement tensor autograd directly rather than delegating to PyTorch.
- Handle the non-trivial parts of tensor autodiff explicitly, especially:
- broadcasting
- reductions
- matrix multiplication
- graph traversal and gradient accumulation
- Expose the runtime graph in a form that is inspectable in a browser.
- Make the project useful as both a learning artifact and a practical systems exercise.
## Architecture

The center of the system is the `Tensor` type in `raptor/engine.py`.
Each tensor stores:
- `data`: the NumPy array payload
- `grad`: the accumulated gradient
- `_prev`: parent tensors that produced it
- `_op`: the operation label used for tracing/debugging
- `_backward`: a closure encoding the local backward rule
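As a point of reference, a constructor consistent with these fields might look like the following sketch (illustrative only, not Raptor's exact code):

```python
import numpy as np

class Tensor:
    """Sketch of a tensor node holding the fields listed above."""

    def __init__(self, data, requires_grad=False, _prev=(), _op=""):
        self.data = np.asarray(data, dtype=np.float32)
        self.grad = np.zeros_like(self.data)
        self.requires_grad = requires_grad
        self._prev = list(_prev)        # parent tensors that produced this one
        self._op = _op                  # operation label for tracing/debugging
        self._backward = lambda: None   # local backward rule, set by each op
```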
Backward propagation works by:
- building a topological ordering of the graph from an output tensor
- seeding the output gradient with ones
- running each stored `_backward` closure in reverse topological order
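The three steps above can be sketched as a single free function, assuming the `Tensor` fields listed earlier (`grad`, `_prev`, `_backward`); this is an illustration of the algorithm, not Raptor's exact code:

```python
import numpy as np

def backward(output):
    # 1. Build a topological ordering of the graph rooted at `output`.
    topo, visited = [], set()

    def build(t):
        if id(t) not in visited:
            visited.add(id(t))
            for parent in t._prev:
                build(parent)
            topo.append(t)

    build(output)

    # 2. Seed the output gradient with ones.
    output.grad = np.ones_like(output.data)

    # 3. Run each stored _backward closure in reverse topological order.
    for t in reversed(topo):
        t._backward()
```

Visiting nodes in reverse topological order guarantees that a node's gradient is fully accumulated before its own backward rule fires.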
`raptor/ops.py` contains reusable nonlinear operations such as `relu`, `sigmoid`, and `tanh`.
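For reference, the forward and backward rules for two of these activations follow the textbook definitions (this is a standalone sketch, not necessarily how Raptor factors the code):

```python
import numpy as np

def relu_forward(x):
    return np.maximum(x, 0.0)

def relu_backward(x, grad_out):
    # Gradient passes through only where the input was positive.
    return (x > 0).astype(x.dtype) * grad_out

def sigmoid_forward(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_backward(x, grad_out):
    # d(sigmoid)/dx = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid_forward(x)
    return s * (1.0 - s) * grad_out
```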
Elementwise arithmetic, reductions, shape transforms, and matrix multiplication live on `Tensor` itself.
A key implementation detail is the broadcasting-aware gradient reduction helper used to map broadcasted gradients back to the original input shapes.
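Such a helper, here given the hypothetical name `reduce_grad_to_shape`, typically sums the upstream gradient over any broadcasted axes until it matches the input's original shape:

```python
import numpy as np

def reduce_grad_to_shape(grad, shape):
    """Sum `grad` down to `shape`, undoing NumPy broadcasting.

    Hypothetical sketch; Raptor's actual helper may differ in name
    and details.
    """
    # Remove leading axes that broadcasting prepended.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum over axes where the original dimension was 1 but the
    # broadcasted gradient is wider.
    for axis, dim in enumerate(shape):
        if dim == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad
```

Without this step, a backward rule for something like `x + b` (with `b` of shape `(1, n)` added to a batch of shape `(m, n)`) would hand `b` a gradient of the wrong shape.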
`raptor/nn.py` builds higher-level abstractions on top of the tensor engine:

- `Module`
- `Linear`
- `Sequential`
- activation wrappers
- `MSELoss` and `CrossEntropyLoss`
The layer stack is intentionally minimal but sufficient for real training workloads such as MNIST classification.
`raptor/optim.py` provides `SGD` and `Adam`.
These operate directly on `Tensor.data` using gradients accumulated by the engine.
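A minimal SGD sketch in that style, assuming parameters expose `.data` and `.grad` (illustrative, not Raptor's exact implementation):

```python
import numpy as np

class SGD:
    """Minimal SGD sketch operating on Tensor-like objects that
    expose `.data` and `.grad`."""

    def __init__(self, params, lr=0.01):
        self.params = list(params)
        self.lr = lr

    def step(self):
        # Update each parameter in place from its accumulated gradient.
        for p in self.params:
            if p.grad is not None:
                p.data -= self.lr * p.grad

    def zero_grad(self):
        # Reset accumulated gradients before the next backward pass.
        for p in self.params:
            p.grad = np.zeros_like(p.data)
```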
`raptor/utils.py` includes:
- batch iteration
- classifier evaluation helpers
- MNIST download and IDX parsing
- history export utilities for JSON and CSV
- optional curve plotting helpers
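As an illustration of the first two items, with signatures assumed from the usage examples later in this README, batch iteration and classifier evaluation reduce to a few lines of NumPy:

```python
import numpy as np

def batch_iterator(X, y, batch_size=64, shuffle=True):
    # Yield (X_batch, y_batch) pairs, optionally in shuffled order.
    # Hypothetical sketch; the real helper's signature may differ.
    idx = np.arange(len(X))
    if shuffle:
        np.random.shuffle(idx)
    for start in range(0, len(X), batch_size):
        sel = idx[start:start + batch_size]
        yield X[sel], y[sel]

def accuracy(logits, labels):
    # Fraction of rows whose argmax matches the integer label.
    return float((logits.argmax(axis=1) == labels).mean())
```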
`raptorgraph` traces an output tensor into a browser-friendly graph representation.
`raptorgraph/tracer.py` serializes a graph into:
- nodes
- edges
- compact summaries of tensor values and gradients
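A hypothetical sketch of that serialization, walking `_prev` links to collect nodes and edges (the real tracer's field names and summaries may differ):

```python
def trace_graph(output):
    """Walk `_prev` links from `output` and emit a JSON-friendly dict.

    Hypothetical sketch, not raptorgraph's actual tracer.
    """
    nodes, edges, seen = [], [], set()

    def visit(t):
        if id(t) in seen:
            return
        seen.add(id(t))
        nodes.append({
            "id": id(t),
            "op": getattr(t, "_op", ""),
            "shape": list(getattr(t.data, "shape", [])),
        })
        for parent in t._prev:
            edges.append({"from": id(parent), "to": id(t)})
            visit(parent)

    visit(output)
    return {"nodes": nodes, "edges": edges}
```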
`raptorgraph/server.py` serves:
- a static frontend
- demo graphs
- graph activation endpoints
- custom graph registration endpoints
`raptorgraph/client.py` provides a simple notebook/script-side helper for sending traced graphs to the running server.
## Core Components

### `raptor/engine.py`

Responsibilities:
- dynamic graph construction
- reverse-mode autodiff
- gradient accumulation
- broadcasting-aware backward behavior
- reductions, reshape, transpose, matrix multiply
### `raptor/nn.py`

Responsibilities:
- parameter discovery
- forward composition
- trainable linear layers
- loss computation
### `raptor/optim.py`

Responsibilities:
- parameter update rules
- optimizer state such as momentum or Adam moments
- gradient clearing
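For the Adam case, the optimizer state amounts to two moment buffers per parameter; a hedged sketch with conventional hyperparameter names (not Raptor's exact code):

```python
import numpy as np

class Adam:
    """Illustrative Adam sketch over Tensor-like params with .data/.grad."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
        self.params = list(params)
        self.lr, self.eps = lr, eps
        self.b1, self.b2 = betas
        self.t = 0
        # Per-parameter first and second moment estimates.
        self.m = [np.zeros_like(p.data) for p in self.params]
        self.v = [np.zeros_like(p.data) for p in self.params]

    def step(self):
        self.t += 1
        for i, p in enumerate(self.params):
            g = p.grad
            self.m[i] = self.b1 * self.m[i] + (1 - self.b1) * g
            self.v[i] = self.b2 * self.v[i] + (1 - self.b2) * g * g
            # Bias-correct the moments, then apply the update.
            m_hat = self.m[i] / (1 - self.b1 ** self.t)
            v_hat = self.v[i] / (1 - self.b2 ** self.t)
            p.data -= self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

    def zero_grad(self):
        for p in self.params:
            p.grad = np.zeros_like(p.data)
```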
### `raptorgraph/`

Responsibilities:
- serve the frontend
- expose graph JSON over HTTP
- provide built-in demo graphs
- store custom named graphs registered from user code
## Installation

The project is managed as a Python package via `pyproject.toml`. Install with:

```bash
uv sync
```

or:

```bash
pip install -e .
```

Project metadata currently declares Python `>=3.13` in `pyproject.toml`.
## Quick Start

```python
import numpy as np
from raptor import Tensor

x = Tensor(np.array([1.0, 2.0, 3.0], dtype=np.float32), requires_grad=True)
y = (x * 2).sum()
y.backward()
print(x.grad)  # gradient of sum(2 * x) is 2 for every element
```

```python
import numpy as np
from raptor import Tensor, nn

model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 3),
)

x = Tensor(np.random.randn(2, 4).astype(np.float32), requires_grad=False)
labels = np.array([0, 2], dtype=np.int64)

logits = model(x)
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()
```

## Training on MNIST

Raptor includes a NumPy-based MNIST loader in `raptor/utils.py`.
Example:

```python
from raptor import Tensor, nn
from raptor.optim import Adam
from raptor.utils import load_mnist, batch_iterator, evaluate_classifier

X_train, y_train, X_test, y_test = load_mnist()

model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

criterion = nn.CrossEntropyLoss()
optimizer = Adam(model.parameters(), lr=1e-3)

for X_batch, y_batch in batch_iterator(X_train, y_train, batch_size=64, shuffle=True):
    x = Tensor(X_batch, requires_grad=False)
    logits = model(x)
    loss = criterion(logits, y_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

## Benchmarking Against PyTorch

The current benchmark results in `benchmarks/results/compare_results.json` show that the engine trains a competitive MNIST MLP and lands in essentially the same accuracy range as an equivalent PyTorch implementation.
The benchmark script lives at benchmarks/compare.py.
It compares Raptor and PyTorch under controlled conditions:
- same architecture
- same initial weights
- same batch order per epoch
- same optimizer family and learning rate
- multiple seeds
Run it with:
```bash
python benchmarks/compare.py
```

Representative summary from the saved benchmark results:

- Raptor final test accuracy mean: 0.9767
- PyTorch final test accuracy mean: 0.9764
- Raptor average epoch time mean: 2.59 s
- PyTorch average epoch time mean: 2.06 s
The important result is not that one implementation “wins” a single run, but that the from-scratch engine converges to the same performance regime under controlled comparisons.
## Using RaptorGraph

Start the graph server:

```bash
uvicorn raptorgraph.server:app --reload
```

Then open:

http://127.0.0.1:8000
The viewer exposes several built-in graphs:

- `arithmetic`
- `mlp`
- `mnist_loss`
These are selectable from the minimal top-bar graph picker.
Use the helper in `raptorgraph/client.py`:
```python
import numpy as np
from raptor import Tensor, nn
from raptorgraph.client import register_graph

x = Tensor(np.array([[1.0, -2.0, 3.0, 0.5]], dtype=np.float32), requires_grad=False)

model = nn.Sequential(
    nn.Linear(4, 6),
    nn.ReLU(),
    nn.Linear(6, 3),
)

labels = np.array([2], dtype=np.int64)

logits = model(x)
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()

register_graph(loss, name="quick_test_graph")
```

This sends the traced graph over HTTP to the running raptorgraph server, stores it there, and makes it selectable in the UI dropdown as `custom:quick_test_graph`.
This design avoids the earlier process-boundary issue where notebook memory and server memory were separate.
## Repository Layout

```
raptor/
├── engine.py
├── nn.py
├── ops.py
├── optim.py
└── utils.py
raptorgraph/
├── client.py
├── server.py
├── tracer.py
└── static/
    ├── index.html
    ├── graph.js
    └── style.css
benchmarks/
├── compare.ipynb
├── compare.py
└── results/
    └── compare_results.json
data/
└── mnist/
notebooks/
├── graphlens.ipynb
├── mlp.ipynb
├── tensor_basics.ipynb
└── training.ipynb
```
## Current Scope

What is implemented now:
- tensor autograd with NumPy
- broadcasting-aware backward rules
- linear layers and activations
- MSE and cross-entropy losses
- SGD and Adam
- MNIST loading and training
- controlled benchmark comparison against PyTorch
- browser-based graph inspection with named custom graph registration
What is intentionally still lightweight:
- no GPU backend
- no convolutional layers
- no distributed training support
- no persistent graph database for `raptorgraph`
No license file has been added yet.

