Expand description
§oximedia-ml — Sovereign ML for Media
oximedia-ml wraps the Pure-Rust OxiONNX
runtime in a set of typed pipelines tailored to multimedia workloads:
scene classification, shot boundary detection, aesthetic scoring,
object detection, face embedding, and more as the feature-gated zoo
grows.
§Design goals
- Default Pure Rust — the default build pulls in zero ONNX
symbols. Enable the
onnxfeature to opt in to inference. - Typed pipelines — callers get domain-shaped inputs and outputs
rather than raw tensors. Every pipeline implements
TypedPipeline. - No unwrap policy — every fallible operation returns
MlResult; doc-tests follow the same rule. - Cache-friendly — loaded models can be shared across pipelines
via
ModelCache(bounded LRU keyed by canonical path). - Device portable —
DeviceType::autopicks the best available backend at runtime (CUDA → DirectML → WebGPU → CPU) and memoises the result.
§Quick start
Load a Places365-compatible scene classifier, run it on a 224×224 RGB frame, and print the top-5 predictions:
use oximedia_ml::pipelines::{SceneClassifier, SceneImage};
use oximedia_ml::{DeviceType, TypedPipeline};
let classifier = SceneClassifier::load("places365.onnx", DeviceType::auto())?;
let image = SceneImage::new(vec![0u8; 224 * 224 * 3], 224, 224)?;
for pred in classifier.run(image)? {
println!("class {} @ {:.3}", pred.class_index, pred.score);
}§Feature matrix
Backend features control which ONNX execution providers are compiled
in; pipeline features enable individual domain adapters. Everything
except cuda is WASM-compatible (see the support table below).
| Feature | Purpose | Notes |
|---|---|---|
onnx | Enables the real OnnxModel backed by OxiONNX. | Required for any inference. |
cuda | Additionally compile oxionnx-cuda for NVIDIA GPU execution. | Native only (no WASM). |
webgpu | Additionally compile oxionnx-gpu (wgpu backend). | Works on native + browsers. |
directml | Additionally compile oxionnx-directml. | Stub outside Windows. |
serde | Derives Serialize on pipeline info/value types. | Opt-in; no runtime cost. |
scene-classifier | Builds the pipelines::SceneClassifier pipeline. | Places365-compatible. |
shot-boundary | Builds the pipelines::ShotBoundaryDetector pipeline. | TransNet V2-compatible. |
aesthetic-score | Builds the pipelines::AestheticScorer pipeline. | NIMA-compatible. |
object-detector | Builds the pipelines::ObjectDetector pipeline. | YOLOv8-compatible. |
face-embedder | Builds the pipelines::FaceEmbedder pipeline. | ArcFace-compatible. |
all-pipelines | Shortcut enabling every pipeline above. | Implies onnx. |
§Device selection
Callers rarely need to hard-code a backend. DeviceType::auto
probes capabilities once (memoised in an OnceLock) and returns the
strongest available device:
use oximedia_ml::{DeviceCapabilities, DeviceType};
// Cached after the first call.
let device = DeviceType::auto();
// Want the full capability report? (panic-safe probes.)
for cap in DeviceCapabilities::probe_all() {
println!(
"{:?}: {}",
cap.device_type,
if cap.is_available { "available" } else { "unavailable" },
);
}Each pipeline constructor accepts a DeviceType. Pass
DeviceType::Cpu to force the pure-Rust path, or pick a specific
GPU backend when you know the deployment target.
§Pipeline ecosystem
All pipelines live under pipelines and implement
TypedPipeline. Each is gated behind its own feature so apps only
compile what they use:
| Pipeline | Feature | Input | Output | Reference model |
|---|---|---|---|---|
pipelines::SceneClassifier | scene-classifier | 224×224 RGB frame | Vec<SceneClassification> | Places365/ResNet |
pipelines::ShotBoundaryDetector | shot-boundary | 48×27 RGB window | Vec<ShotBoundary> | TransNet V2 |
pipelines::AestheticScorer | aesthetic-score | 224×224 RGB frame | AestheticScore | NIMA |
pipelines::ObjectDetector | object-detector | 640×640 RGB frame | Vec<Detection> | YOLOv8 (80 COCO) |
pipelines::FaceEmbedder | face-embedder | 112×112 RGB face | FaceEmbedding (512-dim) | ArcFace |
Value types (AestheticScore, Detection, FaceEmbedding) are
always re-exported at the crate root so callers can handle results
even if they only consume them from another crate.
§WebAssembly (wasm32-unknown-unknown)
oximedia-ml is validated for the WASM target on every release. The
support matrix is:
| Feature set | wasm32-unknown-unknown |
|---|---|
| default (no features) | builds |
onnx | builds |
onnx + any subset of scene-classifier/shot-boundary/aesthetic-score/object-detector/face-embedder/all-pipelines | builds |
webgpu (wgpu browser backend) | builds |
directml (stub on non-Windows) | builds |
cuda | does not build |
The cuda feature transitively depends on oxicuda-driver, which uses
libloading to bind the NVIDIA driver at runtime. libloading gates
its Library type behind cfg(any(unix, windows)), so the crate will
never compile on wasm32-unknown-unknown. This is a fundamental
property of GPU driver loading rather than a limitation of this crate,
so cuda is treated as a native-only feature.
Everything on WASM executes the pure-Rust CPU path (DeviceType::Cpu),
which is what browsers actually want anyway — the WebGPU backend is
opted into by enabling the webgpu feature. There is no mock inference
path; if the onnx feature is disabled, OnnxModel::load returns
MlError::FeatureDisabled as on native.
Re-exports§
pub use cache::ModelCache;pub use cache::DEFAULT_CAPACITY;pub use device::DeviceCapabilities;pub use device::DeviceType;pub use error::MlError;pub use error::MlResult;pub use model::load_auto;pub use model::ModelInfo;pub use model::OnnxModel;pub use model::TensorDType;pub use model::TensorSpec;pub use pipeline::PipelineInfo;pub use pipeline::PipelineTask;pub use pipeline::TypedPipeline;pub use pipelines::AestheticScore;pub use pipelines::Detection;pub use pipelines::FaceEmbedding;pub use postprocess::argmax;pub use postprocess::cosine_similarity;pub use postprocess::iou;pub use postprocess::l2_normalize;pub use postprocess::nms;pub use postprocess::sigmoid;pub use postprocess::sigmoid_slice;pub use postprocess::softmax;pub use postprocess::top_k;pub use postprocess::BoundingBox;pub use preprocess::ImagePreprocessor;pub use preprocess::InputRange;pub use preprocess::PixelLayout;pub use preprocess::TensorLayout;pub use zoo::ModelEntry;pub use zoo::ModelZoo;
Modules§
- cache
- Bounded LRU cache for loaded ONNX models.
- device
- Execution device abstraction.
- error
- Error types for
oximedia-ml. - model
- Pure-Rust ONNX model wrapper.
- pipeline
- Typed ML pipeline abstraction.
- pipelines
- Built-in typed pipelines.
- postprocess
- Output tensor post-processing utilities.
- preprocess
- Image preprocessing for ML inference.
- zoo
- Lightweight model registry (“zoo”).