Compile your PyTorch models to Rust for ultra-fast, memory-safe inference.
OxidizedVision is a production-grade toolkit that bridges the gap between Python-based model training and Rust-based deployment. It provides a seamless pipeline to convert, optimize, validate, benchmark, profile, and package your models, taking you from a trained PyTorch `nn.Module` to a deployable Rust binary, REST API, or WebAssembly module.
| Feature | Description |
|---|---|
| Model Conversion | PyTorch → TorchScript → ONNX with a single command |
| Optimization | ONNX graph simplification, constant folding, INT8/FP16 quantization |
| Validation | Numerical consistency checks (MAE, RMSE, cosine similarity) across formats |
| Benchmarking | Latency (avg, p50, p95, p99), throughput, and memory profiling |
| Profiling | Parameter count, model size, per-layer breakdown |
| Packaging | Auto-generate a deployable Rust crate (server or CLI) |
| Multi-Backend | tract (pure Rust), tch (LibTorch), tensorrt (NVIDIA GPU) |
| WASM Support | Run models in the browser via WebAssembly |
| Model Registry | Track all converted models and their metadata locally |
| Rich CLI | Beautiful terminal output with progress indicators and tables |
| Multi-Model Server | Serve multiple models from a single Rust server instance |
| Dynamic Batching | Configurable request batching for efficient inference |
| Structured Logging | tracing (Rust) + Rich/JSON (Python) for full observability |
| Metrics Endpoint | `/metrics` for monitoring request counts and server health |
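The validation step compares outputs across formats using MAE, RMSE, and cosine similarity. As a rough sketch of what those three checks mean, here is a pure-Python version operating on flattened output vectors (the function names are illustrative, not the tool's internals):

```python
import math

def mae(a, b):
    # Mean absolute error between two flat output vectors.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def rmse(a, b):
    # Root mean squared error: penalizes large deviations more than MAE.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def cos_sim(a, b):
    # Cosine similarity: 1.0 means the two outputs point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Outputs from e.g. the PyTorch model vs. the exported ONNX model:
ref = [0.12, 0.34, 0.56, 0.78]
out = [0.12, 0.34, 0.56, 0.78]
assert mae(ref, out) <= 1e-4 and cos_sim(ref, out) >= 0.999
```

The two thresholds in the final assertion mirror the `tolerance_mae` and `tolerance_cos_sim` fields of `config.yml` shown below.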
```mermaid
flowchart TB
    subgraph CLI["Python Client (CLI)"]
        direction TB
        commands["convert | validate | benchmark | optimize | profile<br/>package | serve | list | info"]
        globals["Global Flags:<br/>--verbose | --json-log"]
    end

    CLI -->|Generates| RUST

    subgraph RUST["Rust Runtimes"]
        direction TB
        TCH["runner_tch<br/>(LibTorch)"]
        TRACT["runner_tract<br/>(Pure Rust)"]
        TRT["runner_tensorrt<br/>(GPU / TensorRT)"]
        CORE["runner_core (Shared Trait)<br/>+ tracing structured logging"]
        TCH --> CORE
        TRACT --> CORE
        TRT --> CORE
    end

    RUST -->|Deploys to| NATIVE
    RUST --> REST
    RUST --> WASM

    NATIVE["Native Binary"]
    REST["REST API Server<br/>(multi-model, batching, /metrics)"]
    WASM["WASM Module"]
```
```shell
# From PyPI
pip install oxidizedvision

# From source (development)
pip install -e "./python_client[dev]"
```

```yaml
# config.yml
model:
  path: examples/example_unet/model.py
  class_name: UNet
  input_shape: [1, 3, 256, 256]
export:
  output_dir: out
  model_name: unet
validate:
  tolerance_mae: 1e-4
  tolerance_cos_sim: 0.999
benchmark:
  iters: 100
  device: cpu
```

```shell
# Convert PyTorch → TorchScript + ONNX
oxidizedvision convert config.yml

# Validate numerical consistency
oxidizedvision validate config.yml

# Optimize the ONNX model
oxidizedvision optimize out/unet.onnx --quantize int8

# Benchmark performance
oxidizedvision benchmark out/unet.pt --runners torchscript,tract

# Profile the model
oxidizedvision profile config.yml

# Package into a Rust crate
oxidizedvision package out/unet.onnx --runner tract --template server

# List registered models
oxidizedvision list
```

```shell
# Verbose mode (DEBUG level)
oxidizedvision --verbose convert config.yml

# JSON log output (for CI / log aggregation)
oxidizedvision --json-log convert config.yml
```

| Command | Description | Example |
|---|---|---|
| `convert` | Convert PyTorch → TorchScript + ONNX | `oxidizedvision convert config.yml` |
| `validate` | Check numerical consistency | `oxidizedvision validate config.yml --num-tests 5` |
| `benchmark` | Measure inference performance | `oxidizedvision benchmark out/model.pt --runners torchscript,tract` |
| `optimize` | Optimize an ONNX model | `oxidizedvision optimize out/model.onnx --quantize fp16` |
| `profile` | Analyze model parameters and layers | `oxidizedvision profile config.yml` |
| `package` | Generate deployable Rust crate | `oxidizedvision package out/model.onnx --template server` |
| `serve` | Start inference server | `oxidizedvision serve ./binary --port 8080` |
| `list` | List registered models | `oxidizedvision list` |
| `info` | Detailed model information | `oxidizedvision info unet` |
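The `benchmark` command reports average and tail latencies (p50/p95/p99). As a sketch of where such figures come from, here is the common nearest-rank percentile applied to raw per-iteration timings (the names and the exact percentile method are illustrative, not necessarily what the tool uses internally):

```python
import math

def percentile(samples, p):
    # Nearest-rank percentile: smallest value with at least p% of samples at or below it.
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Per-iteration latencies in milliseconds (e.g. from 100 benchmark iters):
timings_ms = [float(i) for i in range(1, 101)]

stats = {
    "avg": sum(timings_ms) / len(timings_ms),
    "p50": percentile(timings_ms, 50),
    "p95": percentile(timings_ms, 95),
    "p99": percentile(timings_ms, 99),
}
print(stats)  # avg 50.5, p50 50.0, p95 95.0, p99 99.0
```

Tail percentiles (p95/p99) matter more than the average for serving workloads, since they bound what the slowest requests experience.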
| Flag | Description |
|---|---|
| `--verbose` / `-v` | Enable DEBUG-level logging |
| `--json-log` | Emit logs as JSON lines (for CI / production) |
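JSON-lines output means one self-contained JSON object per log event, which log aggregators can ingest directly. A minimal sketch of the format using only the standard library (field names here are illustrative, not the tool's exact schema):

```python
import json
import logging

class JsonLineFormatter(logging.Formatter):
    # Serializes each log record as a single JSON object on one line.
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonLineFormatter())
logger = logging.getLogger("oxidizedvision")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug("converting model")
# emits: {"level": "DEBUG", "logger": "oxidizedvision", "message": "converting model"}
```

Human-readable Rich output and machine-readable JSON lines carry the same events; only the formatter differs.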
All backends implement a common Runner trait:
```rust
pub trait Runner: Send + Sync {
    fn from_config(config: &RunnerConfig) -> Result<Self> where Self: Sized;
    fn run(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>>;
    fn info(&self) -> ModelInfo;
}
```

| Backend | Model Format | GPU | WASM | Dependencies |
|---|---|---|---|---|
| `runner_tract` | ONNX | No | Yes | None (pure Rust) |
| `runner_tch` | TorchScript | Yes | No | LibTorch |
| `runner_tensorrt` | ONNX → Engine | Yes | No | TensorRT SDK |
The built-in image_server example provides a production-ready REST API:
```shell
# Single model
cargo run -p image_server -- --model model.onnx --port 8080

# Multi-model (serve multiple models simultaneously)
cargo run -p image_server -- \
  --model segmenter=models/seg.onnx \
  --model classifier=models/cls.onnx \
  --port 8080

# With dynamic batching
cargo run -p image_server -- \
  --model model.onnx \
  --max-batch-size 8 \
  --max-wait-ms 50

# JSON structured logs
cargo run -p image_server -- --model model.onnx --log-format json
```

| Method | Path | Description |
|---|---|---|
| POST | `/predict` | Inference on the default model |
| POST | `/predict/{model_name}` | Inference on a named model |
| GET | `/health` | Health check with per-model status |
| GET | `/metrics` | Request counts, error counts, batch status |
| GET | `/models` | List all loaded models |
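Dynamic batching as configured above (`--max-batch-size`, `--max-wait-ms`) follows a simple policy: flush a batch as soon as it is full, or as soon as the oldest queued request has waited past the deadline. A small Python sketch of that policy (the real server implements this in Rust; the class and method names here are illustrative):

```python
class DynamicBatcher:
    """Collects requests; flushes when full or when the oldest waits too long."""

    def __init__(self, max_batch_size, max_wait_ms):
        self.max_batch_size = max_batch_size
        self.max_wait_ms = max_wait_ms
        self.queue = []  # list of (arrival_time_ms, request)

    def submit(self, request, now_ms):
        # Enqueue a request; return a full batch to run, or None to keep waiting.
        self.queue.append((now_ms, request))
        if len(self.queue) >= self.max_batch_size:
            return self._flush()
        return None

    def poll(self, now_ms):
        # Called periodically: flush a partial batch once the deadline passes.
        if self.queue and now_ms - self.queue[0][0] >= self.max_wait_ms:
            return self._flush()
        return None

    def _flush(self):
        batch = [req for _, req in self.queue]
        self.queue = []
        return batch

batcher = DynamicBatcher(max_batch_size=8, max_wait_ms=50)
assert batcher.submit("img1", now_ms=0) is None  # queued, batch not yet full
assert batcher.poll(now_ms=10) is None           # deadline not reached
assert batcher.poll(now_ms=60) == ["img1"]       # 50 ms elapsed: flush partial batch
```

The trade-off is latency versus throughput: a larger `--max-wait-ms` lets batches fill up (better hardware utilization) at the cost of added per-request delay.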
```
Oxidized-Vision/
├── python_client/            # Python CLI & pipeline
│   ├── oxidizedvision/
│   │   ├── cli.py            # Typer CLI entry point
│   │   ├── config.py         # Pydantic config models
│   │   ├── convert.py        # Model conversion
│   │   ├── validate.py       # Numerical validation
│   │   ├── benchmark.py      # Performance measurement
│   │   ├── optimize.py       # ONNX optimization
│   │   ├── profile.py        # Model profiling
│   │   ├── registry.py       # Model registry
│   │   └── logging.py        # Structured logging (Rich / JSON)
│   └── tests/                # pytest test suite
├── rust_runtime/             # Rust inference runtimes
│   ├── crates/
│   │   ├── runner_core/      # Shared Runner trait + tracing
│   │   ├── runner_tch/       # LibTorch backend
│   │   ├── runner_tract/     # tract (ONNX) backend
│   │   └── runner_tensorrt/  # TensorRT backend
│   └── examples/
│       ├── image_server/     # Multi-model REST API with batching
│       ├── denoiser_cli/     # Image denoising CLI
│       └── wasm_frontend/    # Browser inference demo
├── tools/                    # Standalone scripts
├── benchmarks/               # Benchmark infrastructure
├── examples/                 # User-facing examples
│   └── example_unet/         # Complete UNet example
├── docs/                     # Architecture docs
└── .github/workflows/        # CI/CD + PyPI auto-deploy
```
```shell
# Python tests
pytest python_client/tests/ -v --cov=oxidizedvision

# Rust tests
cargo test --workspace
```

```shell
pip install pre-commit
pre-commit install
pre-commit run --all-files
```

MIT License. See LICENSE for details.