Crate oximedia_ml


§oximedia-ml — Sovereign ML for Media

oximedia-ml wraps the Pure-Rust OxiONNX runtime in a set of typed pipelines tailored to multimedia workloads: scene classification, shot boundary detection, aesthetic scoring, object detection, face embedding, and more as the feature-gated zoo grows.

§Design goals

  • Default Pure Rust — the default build pulls in zero ONNX symbols. Enable the onnx feature to opt in to inference.
  • Typed pipelines — callers get domain-shaped inputs and outputs rather than raw tensors. Every pipeline implements TypedPipeline.
  • No unwrap policy — every fallible operation returns MlResult; doc-tests follow the same rule.
  • Cache-friendly — loaded models can be shared across pipelines via ModelCache (bounded LRU keyed by canonical path).
  • Device portable — DeviceType::auto picks the best available backend at runtime (CUDA → DirectML → WebGPU → CPU) and memoises the result.
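
The "bounded LRU keyed by canonical path" idea from the list above can be illustrated with a small std-only sketch. This is not the crate's ModelCache; `SketchCache`, `get_or_load`, `len`, and `contains` are hypothetical names chosen for illustration, and the sketch takes the key as given, whereas a real cache would presumably canonicalize it first.

```rust
use std::collections::VecDeque;
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Hypothetical bounded LRU cache; NOT the real `ModelCache` API.
pub struct SketchCache<M> {
    capacity: usize,
    // Front = least recently used, back = most recently used.
    entries: VecDeque<(PathBuf, Arc<M>)>,
}

impl<M> SketchCache<M> {
    /// Creates a cache holding at most `capacity` entries (minimum 1).
    pub fn new(capacity: usize) -> Self {
        Self { capacity: capacity.max(1), entries: VecDeque::new() }
    }

    /// Returns the cached value for `key`, or builds it with `load`.
    /// A real cache would canonicalize `key` first (for example with
    /// `std::fs::canonicalize`) so aliases of one file share an entry.
    pub fn get_or_load<E>(
        &mut self,
        key: &Path,
        load: impl FnOnce(&Path) -> Result<M, E>,
    ) -> Result<Arc<M>, E> {
        if let Some(pos) = self.entries.iter().position(|(k, _)| k.as_path() == key) {
            // Hit: move the entry to the most-recently-used end.
            if let Some(entry) = self.entries.remove(pos) {
                let model = Arc::clone(&entry.1);
                self.entries.push_back(entry);
                return Ok(model);
            }
        }
        // Miss: load, evict the least recently used entry if full, insert.
        let model = Arc::new(load(key)?);
        if self.entries.len() >= self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back((key.to_path_buf(), Arc::clone(&model)));
        Ok(model)
    }

    /// Number of entries currently cached.
    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Whether `key` is currently cached.
    pub fn contains(&self, key: &Path) -> bool {
        self.entries.iter().any(|(k, _)| k.as_path() == key)
    }
}
```

Handing out `Arc<M>` lets several pipelines share one loaded model without copying it, and eviction only drops the cache's own reference, so a model still in use elsewhere stays alive.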

§Quick start

Load a Places365-compatible scene classifier, run it on a 224×224 RGB frame, and print the top-5 predictions:

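use oximedia_ml::pipelines::{SceneClassifier, SceneImage};
use oximedia_ml::{DeviceType, MlResult, TypedPipeline};

// Assumes MlResult<T> is the crate's generic Result alias.
fn main() -> MlResult<()> {
    // Load the model on the best available backend.
    let classifier = SceneClassifier::load("places365.onnx", DeviceType::auto())?;
    // Placeholder frame: 224×224 RGB, all zeros.
    let image = SceneImage::new(vec![0u8; 224 * 224 * 3], 224, 224)?;
    for pred in classifier.run(image)? {
        println!("class {} @ {:.3}", pred.class_index, pred.score);
    }
    Ok(())
}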
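
For readers wondering where a ranked list of (class index, score) pairs typically comes from: classifiers usually emit raw logits, which are turned into probabilities with softmax and then sorted. The crate re-exports softmax and top_k, but their signatures are not shown on this page, so the functions below are standalone illustrations with assumed names (`softmax_sketch`, `top_k_sketch`), not the crate's implementations.

```rust
/// Numerically stable softmax: subtract the maximum before exponentiating
/// so that large logits do not overflow.
pub fn softmax_sketch(logits: &[f32]) -> Vec<f32> {
    if logits.is_empty() {
        return Vec::new();
    }
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// Returns up to `k` `(class_index, score)` pairs, highest score first.
pub fn top_k_sketch(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = scores.iter().copied().enumerate().collect();
    // `total_cmp` gives a total order on floats, so no unwrap is needed.
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}
```

Calling `top_k_sketch(&softmax_sketch(&logits), 5)` would yield the five most probable classes in descending order of score.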

§Feature matrix

Backend features control which ONNX execution providers are compiled in; pipeline features enable individual domain adapters. Everything except cuda is WASM-compatible (see the support table below).

| Feature | Purpose | Notes |
|---|---|---|
| onnx | Enables the real OnnxModel backed by OxiONNX. | Required for any inference. |
| cuda | Additionally compile oxionnx-cuda for NVIDIA GPU execution. | Native only (no WASM). |
| webgpu | Additionally compile oxionnx-gpu (wgpu backend). | Works on native + browsers. |
| directml | Additionally compile oxionnx-directml. | Stub outside Windows. |
| serde | Derives Serialize on pipeline info/value types. | Opt-in; no runtime cost. |
| scene-classifier | Builds the pipelines::SceneClassifier pipeline. | Places365-compatible. |
| shot-boundary | Builds the pipelines::ShotBoundaryDetector pipeline. | TransNet V2-compatible. |
| aesthetic-score | Builds the pipelines::AestheticScorer pipeline. | NIMA-compatible. |
| object-detector | Builds the pipelines::ObjectDetector pipeline. | YOLOv8-compatible. |
| face-embedder | Builds the pipelines::FaceEmbedder pipeline. | ArcFace-compatible. |
| all-pipelines | Shortcut enabling every pipeline above. | Implies onnx. |

§Device selection

Callers rarely need to hard-code a backend. DeviceType::auto probes capabilities once (memoised in an OnceLock) and returns the strongest available device:

use oximedia_ml::{DeviceCapabilities, DeviceType};

// Cached after the first call.
let device = DeviceType::auto();

// Want the full capability report? (panic-safe probes.)
for cap in DeviceCapabilities::probe_all() {
    println!(
        "{:?}: {}",
        cap.device_type,
        if cap.is_available { "available" } else { "unavailable" },
    );
}

Each pipeline constructor accepts a DeviceType. Pass DeviceType::Cpu to force the pure-Rust path, or pick a specific GPU backend when you know the deployment target.
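
The "probe once, memoise in an OnceLock, return the strongest device" behaviour described in this section can be sketched with the standard library alone. Everything here is an assumption made for illustration: `SketchDevice`, `pick_best`, `probe`, and `auto_device` are hypothetical names, and the placeholder probe simply reports that only the CPU is available. The priority order (CUDA → DirectML → WebGPU → CPU) is the one stated in the design goals.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the crate's `DeviceType`; variants are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SketchDevice {
    Cuda,
    DirectMl,
    WebGpu,
    Cpu,
}

/// Picks the first available device in priority order, falling back to CPU.
pub fn pick_best(is_available: impl Fn(SketchDevice) -> bool) -> SketchDevice {
    const PRIORITY: [SketchDevice; 3] =
        [SketchDevice::Cuda, SketchDevice::DirectMl, SketchDevice::WebGpu];
    PRIORITY
        .iter()
        .copied()
        .find(|&device| is_available(device))
        .unwrap_or(SketchDevice::Cpu)
}

/// Placeholder probe: a real one would query drivers or compiled-in features.
fn probe(device: SketchDevice) -> bool {
    matches!(device, SketchDevice::Cpu)
}

/// Memoised selection: the probes run only on the first call.
pub fn auto_device() -> SketchDevice {
    static CHOICE: OnceLock<SketchDevice> = OnceLock::new();
    *CHOICE.get_or_init(|| pick_best(probe))
}
```

Keeping the selection logic (`pick_best`) separate from the memoised entry point makes the priority order testable with fake probes, while `OnceLock` ensures that concurrent first calls still initialise the value only once.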

§Pipeline ecosystem

All pipelines live under pipelines and implement TypedPipeline. Each is gated behind its own feature so apps only compile what they use:

| Pipeline | Feature | Input | Output | Reference model |
|---|---|---|---|---|
| pipelines::SceneClassifier | scene-classifier | 224×224 RGB frame | Vec<SceneClassification> | Places365/ResNet |
| pipelines::ShotBoundaryDetector | shot-boundary | 48×27 RGB window | Vec<ShotBoundary> | TransNet V2 |
| pipelines::AestheticScorer | aesthetic-score | 224×224 RGB frame | AestheticScore | NIMA |
| pipelines::ObjectDetector | object-detector | 640×640 RGB frame | Vec<Detection> | YOLOv8 (80 COCO) |
| pipelines::FaceEmbedder | face-embedder | 112×112 RGB face | FaceEmbedding (512-dim) | ArcFace |

Value types (AestheticScore, Detection, FaceEmbedding) are always re-exported at the crate root so callers can handle results even if they only consume them from another crate.
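
As an example of consuming such a result downstream, face embeddings are commonly compared by cosine similarity, often after L2 normalization. The crate re-exports cosine_similarity and l2_normalize, but their signatures and the layout of FaceEmbedding are not documented on this page, so the sketch below works on plain `f32` slices and uses assumed names (`l2_normalize_sketch`, `cosine_similarity_sketch`).

```rust
/// Scales `v` to unit length in place; an all-zero vector is left untouched.
pub fn l2_normalize_sketch(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Cosine similarity of two vectors, in the range [-1, 1].
/// Returns `None` on a length mismatch or when either vector has zero norm.
pub fn cosine_similarity_sketch(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Returning `Option` instead of panicking on mismatched lengths mirrors the no-unwrap policy stated in the design goals; the decision threshold for "same face" depends on the model and is deliberately left out.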

§WebAssembly (wasm32-unknown-unknown)

oximedia-ml is validated for the WASM target on every release. The support matrix is:

| Feature set | wasm32-unknown-unknown |
|---|---|
| default (no features) | builds |
| onnx | builds |
| onnx + any subset of scene-classifier/shot-boundary/aesthetic-score/object-detector/face-embedder/all-pipelines | builds |
| webgpu (wgpu browser backend) | builds |
| directml (stub on non-Windows) | builds |
| cuda | does not build |

The cuda feature transitively depends on oxicuda-driver, which uses libloading to bind the NVIDIA driver at runtime. libloading gates its Library type behind cfg(any(unix, windows)), so oxicuda-driver, and with it any build that enables cuda, cannot compile on wasm32-unknown-unknown. This is a fundamental property of GPU driver loading rather than a limitation of this crate, so cuda is treated as a native-only feature.

By default, everything on WASM executes the pure-Rust CPU path (DeviceType::Cpu); the WebGPU backend is used only when the webgpu feature is enabled. There is no mock inference path: if the onnx feature is disabled, OnnxModel::load returns MlError::FeatureDisabled, exactly as it does on native targets.
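
The "no mock path, return FeatureDisabled instead" behaviour is a common Cargo-feature pattern and can be sketched as two cfg-gated definitions of the same function. The types and the function name below are hypothetical; only the FeatureDisabled variant name comes from this page, and whether the real variant carries a payload is an assumption.

```rust
/// Hypothetical error type; only the `FeatureDisabled` name is from the docs,
/// and the `&'static str` payload is an assumption.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    FeatureDisabled(&'static str),
}

/// Hypothetical model handle.
#[derive(Debug)]
pub struct SketchModel {
    pub path: String,
}

/// Compiled only when the `onnx` feature is on: the real loader would go here.
#[cfg(feature = "onnx")]
pub fn load_model(path: &str) -> Result<SketchModel, SketchError> {
    Ok(SketchModel { path: path.to_owned() })
}

/// Compiled when `onnx` is off: same signature, but always an explicit error
/// rather than a fake model.
#[cfg(not(feature = "onnx"))]
pub fn load_model(_path: &str) -> Result<SketchModel, SketchError> {
    Err(SketchError::FeatureDisabled("onnx"))
}
```

Because both variants share one signature, calling code compiles unchanged with or without the feature and only has to handle the error case at runtime.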

§Re-exports

pub use cache::ModelCache;
pub use cache::DEFAULT_CAPACITY;
pub use device::DeviceCapabilities;
pub use device::DeviceType;
pub use error::MlError;
pub use error::MlResult;
pub use model::load_auto;
pub use model::ModelInfo;
pub use model::OnnxModel;
pub use model::TensorDType;
pub use model::TensorSpec;
pub use pipeline::PipelineInfo;
pub use pipeline::PipelineTask;
pub use pipeline::TypedPipeline;
pub use pipelines::AestheticScore;
pub use pipelines::Detection;
pub use pipelines::FaceEmbedding;
pub use postprocess::argmax;
pub use postprocess::cosine_similarity;
pub use postprocess::iou;
pub use postprocess::l2_normalize;
pub use postprocess::nms;
pub use postprocess::sigmoid;
pub use postprocess::sigmoid_slice;
pub use postprocess::softmax;
pub use postprocess::top_k;
pub use postprocess::BoundingBox;
pub use preprocess::ImagePreprocessor;
pub use preprocess::InputRange;
pub use preprocess::PixelLayout;
pub use preprocess::TensorLayout;
pub use zoo::ModelEntry;
pub use zoo::ModelZoo;

§Modules

cache
Bounded LRU cache for loaded ONNX models.
device
Execution device abstraction.
error
Error types for oximedia-ml.
model
Pure-Rust ONNX model wrapper.
pipeline
Typed ML pipeline abstraction.
pipelines
Built-in typed pipelines.
postprocess
Output tensor post-processing utilities.
preprocess
Image preprocessing for ML inference.
zoo
Lightweight model registry (“zoo”).