
🚀 OxidizedVision

License: MIT

Compile your PyTorch models to Rust for ultra-fast, memory-safe inference.

OxidizedVision is a production-grade toolkit that bridges the gap between Python-based model training and Rust-based deployment. It provides a seamless pipeline to convert, optimize, validate, benchmark, profile, and package your models, taking you from a trained PyTorch nn.Module to a deployable Rust binary, REST API, or WebAssembly module.


✨ Key Features

| Feature | Description |
| --- | --- |
| 🔄 Model Conversion | PyTorch → TorchScript → ONNX with a single command |
| ⚡ Optimization | ONNX graph simplification, constant folding, INT8/FP16 quantization |
| ✅ Validation | Numerical consistency checks (MAE, RMSE, cosine similarity) across formats |
| 📊 Benchmarking | Latency (avg, p50, p95, p99), throughput, and memory profiling |
| 🔬 Profiling | Parameter count, model size, per-layer breakdown |
| 📦 Packaging | Auto-generate a deployable Rust crate (server or CLI) |
| 🌐 Multi-Backend | tract (pure Rust), tch (LibTorch), tensorrt (NVIDIA GPU) |
| 🧩 WASM Support | Run models in the browser via WebAssembly |
| 📋 Model Registry | Track all converted models and their metadata locally |
| 🎨 Rich CLI | Beautiful terminal output with progress indicators and tables |
| 🔀 Multi-Model Server | Serve multiple models from a single Rust server instance |
| ⏱️ Dynamic Batching | Configurable request batching for efficient inference |
| 📝 Structured Logging | tracing (Rust) + Rich/JSON (Python) for full observability |
| 📈 Metrics Endpoint | /metrics for monitoring request counts and server health |

πŸ—οΈ Architecture

flowchart TB

subgraph CLI["Python Client (CLI)"]
    direction TB
    commands["convert | validate | benchmark | optimize | profile<br/>package | serve | list | info"]
    globals["Global Flags:<br/>--verbose | --json-log"]
end

CLI -->|Generates| RUST

subgraph RUST["Rust Runtimes"]
    direction TB

    TCH["runner_tch<br/>(LibTorch)"]
    TRACT["runner_tract<br/>(Pure Rust)"]
    TRT["runner_tensorrt<br/>(GPU / TensorRT)"]

    CORE["runner_core (Shared Trait)<br/>+ tracing structured logging"]

    TCH --> CORE
    TRACT --> CORE
    TRT --> CORE
end

RUST -->|Deploys to| NATIVE
RUST --> REST
RUST --> WASM

NATIVE["Native Binary"]
REST["REST API Server<br/>(multi-model, batching, /metrics)"]
WASM["WASM Module"]

⚡ Quickstart

1. Install

# From PyPI
pip install oxidizedvision

# From source (development)
pip install -e "./python_client[dev]"

2. Create a Config

# config.yml
model:
  path: examples/example_unet/model.py
  class_name: UNet
  input_shape: [1, 3, 256, 256]

export:
  output_dir: out
  model_name: unet

validate:
  tolerance_mae: 1e-4
  tolerance_cos_sim: 0.999

benchmark:
  iters: 100
  device: cpu

3. Run the Pipeline

# Convert PyTorch → TorchScript + ONNX
oxidizedvision convert config.yml

# Validate numerical consistency
oxidizedvision validate config.yml

# Optimize the ONNX model
oxidizedvision optimize out/unet.onnx --quantize int8

# Benchmark performance
oxidizedvision benchmark out/unet.pt --runners torchscript,tract

# Profile the model
oxidizedvision profile config.yml

# Package into a Rust crate
oxidizedvision package out/unet.onnx --runner tract --template server

# List registered models
oxidizedvision list
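
The validate command above compares outputs of the same input across formats and reports MAE, RMSE, and cosine similarity against the tolerances in config.yml. As a rough sketch of the agreement math only (the real checks live in python_client/oxidizedvision/validate.py), two output tensors can be compared in Rust with ndarray like this:

// Illustrative only: `a` and `b` stand for outputs of the same input run
// through two different formats (e.g. PyTorch vs. ONNX).
use ndarray::ArrayD;

fn mae(a: &ArrayD<f32>, b: &ArrayD<f32>) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x - y).abs()).sum::<f32>() / a.len() as f32
}

fn cosine_similarity(a: &ArrayD<f32>, b: &ArrayD<f32>) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm = |t: &ArrayD<f32>| t.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

// A pair passes validation when mae(a, b) <= tolerance_mae (1e-4 here) and
// cosine_similarity(a, b) >= tolerance_cos_sim (0.999 here).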

4. Debug with Structured Logging

# Verbose mode (DEBUG level)
oxidizedvision --verbose convert config.yml

# JSON log output (for CI / log aggregation)
oxidizedvision --json-log convert config.yml

📖 CLI Reference

| Command | Description | Example |
| --- | --- | --- |
| convert | Convert PyTorch → TorchScript + ONNX | oxidizedvision convert config.yml |
| validate | Check numerical consistency | oxidizedvision validate config.yml --num-tests 5 |
| benchmark | Measure inference performance | oxidizedvision benchmark out/model.pt --runners torchscript,tract |
| optimize | Optimize an ONNX model | oxidizedvision optimize out/model.onnx --quantize fp16 |
| profile | Analyze model parameters and layers | oxidizedvision profile config.yml |
| package | Generate a deployable Rust crate | oxidizedvision package out/model.onnx --template server |
| serve | Start an inference server | oxidizedvision serve ./binary --port 8080 |
| list | List registered models | oxidizedvision list |
| info | Show detailed model information | oxidizedvision info unet |

Global Options

| Flag | Description |
| --- | --- |
| --verbose / -v | Enable DEBUG-level logging |
| --json-log | Emit logs as JSON lines (for CI / production) |

🦀 Rust Runtimes

Shared Runner Trait

All backends implement a common Runner trait:

pub trait Runner: Send + Sync {
    /// Build a runner from backend-specific configuration.
    fn from_config(config: &RunnerConfig) -> Result<Self> where Self: Sized;
    /// Execute one inference pass on a dynamically shaped f32 tensor.
    fn run(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>>;
    /// Return static metadata about the loaded model.
    fn info(&self) -> ModelInfo;
}
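
Consuming a backend through the trait looks roughly like the sketch below. Note the hedging: the concrete type name TractRunner, the model_path field, and the use of anyhow for the trait's Result are assumptions based on the crate layout, not confirmed public API.

// Hypothetical usage sketch; TractRunner, the `model_path` field, and the
// anyhow-based Result are assumptions, not confirmed public API.
use ndarray::{ArrayD, IxDyn};
use runner_core::{Runner, RunnerConfig};
use runner_tract::TractRunner;

fn main() -> anyhow::Result<()> {
    let config = RunnerConfig {
        model_path: "out/unet.onnx".into(), // assumed field name
        ..Default::default()
    };
    let runner = TractRunner::from_config(&config)?;

    // The input shape must match the exported model: [1, 3, 256, 256].
    let input = ArrayD::<f32>::zeros(IxDyn(&[1, 3, 256, 256]));
    let output = runner.run(&input)?;
    println!("output shape: {:?}", output.shape());
    Ok(())
}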

Available Backends

| Backend | Model Format | GPU | WASM | Dependencies |
| --- | --- | --- | --- | --- |
| runner_tract | ONNX | ❌ | ✅ | None (pure Rust) |
| runner_tch | TorchScript | ✅ | ❌ | LibTorch |
| runner_tensorrt | ONNX → Engine | ✅ | ❌ | TensorRT SDK |
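
Because every backend implements the same Runner trait, application code can stay backend-agnostic and the choice of runner becomes a deployment decision. A minimal sketch, assuming the trait's Result is anyhow-compatible:

// Works unchanged with runner_tract, runner_tch, or runner_tensorrt,
// since all three implement runner_core::Runner.
use ndarray::ArrayD;
use runner_core::Runner;

fn predict_batch<R: Runner>(runner: &R, inputs: &[ArrayD<f32>]) -> anyhow::Result<Vec<ArrayD<f32>>> {
    inputs.iter().map(|x| runner.run(x)).collect()
}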

🖥️ Inference Server

The built-in image_server example provides a production-ready REST API:

# Single model
cargo run -p image_server -- --model model.onnx --port 8080

# Multi-model (serve multiple models simultaneously)
cargo run -p image_server -- \
  --model segmenter=models/seg.onnx \
  --model classifier=models/cls.onnx \
  --port 8080

# With dynamic batching
cargo run -p image_server -- \
  --model model.onnx \
  --max-batch-size 8 \
  --max-wait-ms 50

# JSON structured logs
cargo run -p image_server -- --model model.onnx --log-format json
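
Dynamic batching trades a little latency for throughput: the server queues incoming requests and flushes a batch as soon as it is full (--max-batch-size) or the oldest request has waited long enough (--max-wait-ms). The following is a conceptual sketch of that policy only, not the actual image_server implementation; InferenceRequest and run_batch are placeholders.

// Conceptual sketch of the flush policy; the real code lives in
// rust_runtime/examples/image_server. Requires the tokio crate.
use std::time::Duration;
use tokio::sync::mpsc;
use tokio::time::{sleep, timeout_at, Instant};

struct InferenceRequest; // placeholder for an enqueued request

fn run_batch(batch: Vec<InferenceRequest>) {
    println!("running batch of {}", batch.len());
}

async fn batch_loop(mut rx: mpsc::Receiver<InferenceRequest>, max_batch: usize, max_wait: Duration) {
    // Block until a first request arrives, then keep filling the batch
    // until it is full or the wait deadline passes.
    while let Some(first) = rx.recv().await {
        let mut batch = vec![first];
        let deadline = Instant::now() + max_wait;
        while batch.len() < max_batch {
            match timeout_at(deadline, rx.recv()).await {
                Ok(Some(req)) => batch.push(req),
                _ => break, // deadline reached or channel closed
            }
        }
        run_batch(batch);
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(64);
    tokio::spawn(batch_loop(rx, 8, Duration::from_millis(50))); // mirrors the flags above
    for _ in 0..20 {
        tx.send(InferenceRequest).await.unwrap();
    }
    sleep(Duration::from_millis(100)).await;
}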

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /predict | Inference on the default model |
| POST | /predict/{model_name} | Inference on a named model |
| GET | /health | Health check with per-model status |
| GET | /metrics | Request counts, error counts, batch status |
| GET | /models | List all loaded models |
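
The GET endpoints can be exercised without knowing the predict payload schema (that schema is defined by image_server). A minimal probe in Rust, assuming the server above is listening on localhost:8080 and using the reqwest crate with its blocking feature:

// Requires reqwest with the "blocking" feature enabled.
fn main() -> Result<(), reqwest::Error> {
    let base = "http://localhost:8080";
    for path in ["/health", "/metrics", "/models"] {
        let body = reqwest::blocking::get(format!("{base}{path}"))?.text()?;
        println!("{path}:\n{body}\n");
    }
    Ok(())
}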

🗂️ Project Structure

Oxidized-Vision/
├── python_client/             # Python CLI & pipeline
│   ├── oxidizedvision/
│   │   ├── cli.py             # Typer CLI entry point
│   │   ├── config.py          # Pydantic config models
│   │   ├── convert.py         # Model conversion
│   │   ├── validate.py        # Numerical validation
│   │   ├── benchmark.py       # Performance measurement
│   │   ├── optimize.py        # ONNX optimization
│   │   ├── profile.py         # Model profiling
│   │   ├── registry.py        # Model registry
│   │   └── logging.py         # Structured logging (Rich / JSON)
│   └── tests/                 # pytest test suite
├── rust_runtime/              # Rust inference runtimes
│   ├── crates/
│   │   ├── runner_core/       # Shared Runner trait + tracing
│   │   ├── runner_tch/        # LibTorch backend
│   │   ├── runner_tract/      # tract (ONNX) backend
│   │   └── runner_tensorrt/   # TensorRT backend
│   └── examples/
│       ├── image_server/      # Multi-model REST API with batching
│       ├── denoiser_cli/      # Image denoising CLI
│       └── wasm_frontend/     # Browser inference demo
├── tools/                     # Standalone scripts
├── benchmarks/                # Benchmark infrastructure
├── examples/                  # User-facing examples
│   └── example_unet/          # Complete UNet example
├── docs/                      # Architecture docs
└── .github/workflows/         # CI/CD + PyPI auto-deploy

🧪 Testing

# Python tests
pytest python_client/tests/ -v --cov=oxidizedvision

# Rust tests
cargo test --workspace

Pre-commit Hooks

pip install pre-commit
pre-commit install
pre-commit run --all-files

📄 License

MIT License. See LICENSE for details.
