oximedia_ml/lib.rs
//! # oximedia-ml: Sovereign ML for Media
//!
//! `oximedia-ml` wraps the [Pure-Rust OxiONNX](https://crates.io/crates/oxionnx)
//! runtime in a set of typed pipelines tailored to multimedia workloads:
//! scene classification, shot boundary detection, aesthetic scoring,
//! object detection, face embedding, and more as the feature-gated zoo
//! grows.
//!
//! ## Design goals
//!
//! * **Pure Rust by default**: the default build pulls in *zero* ONNX
//!   symbols. Enable the `onnx` feature to opt in to inference.
//! * **Typed pipelines**: callers get domain-shaped inputs and outputs
//!   rather than raw tensors. Every pipeline implements
//!   [`TypedPipeline`].
//! * **No-unwrap policy**: every fallible operation returns
//!   [`MlResult`]; doc-tests follow the same rule.
//! * **Cache-friendly**: loaded models can be shared across pipelines
//!   via [`ModelCache`] (bounded LRU keyed by canonical path).
//! * **Device portable**: [`DeviceType::auto`] picks the best available
//!   backend at runtime (CUDA → DirectML → WebGPU → CPU) and memoises
//!   the result.
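/*!
To make the cache-friendly goal concrete, here is a minimal standalone sketch
of a bounded LRU keyed by path. It is an illustration under stated
assumptions, not the implementation behind [`ModelCache`]: the `LruSketch`
type and its methods are hypothetical, and the caller is assumed to
canonicalise paths (for example with `std::fs::canonicalize`) before lookup.

```rust
use std::collections::{HashMap, VecDeque};
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Hypothetical bounded LRU; values are shared through `Arc`.
pub struct LruSketch<V> {
    capacity: usize,
    entries: HashMap<PathBuf, Arc<V>>,
    /// Keys ordered from least recently used (front) to most recent (back).
    order: VecDeque<PathBuf>,
}

impl<V> LruSketch<V> {
    /// Creates a cache holding at most `capacity` entries (minimum 1).
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity: capacity.max(1),
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Returns the cached value for `key`, calling `load` only on a miss.
    pub fn get_or_insert_with(&mut self, key: &Path, load: impl FnOnce() -> V) -> Arc<V> {
        if let Some(hit) = self.entries.get(key).cloned() {
            // Hit: move the key to the most-recent end of the queue.
            self.order.retain(|k| k.as_path() != key);
            self.order.push_back(key.to_path_buf());
            return hit;
        }
        // Miss: evict the least recently used entry if the cache is full.
        if self.entries.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        let value = Arc::new(load());
        self.entries.insert(key.to_path_buf(), Arc::clone(&value));
        self.order.push_back(key.to_path_buf());
        value
    }

    /// Reports whether `key` is currently cached, without touching recency.
    pub fn contains(&self, key: &Path) -> bool {
        self.entries.contains_key(key)
    }
}
```

With a capacity of 2, loading `a` and `b`, touching `a` again, and then
loading `c` evicts `b`, because `a` was used more recently.
*/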
//!
//! ## Quick start
//!
//! Load a Places365-compatible scene classifier, run it on a 224×224 RGB
//! frame, and print the top-5 predictions:
//!
//! ```no_run
//! # #[cfg(all(feature = "onnx", feature = "scene-classifier"))]
//! # fn demo() -> oximedia_ml::MlResult<()> {
//! use oximedia_ml::pipelines::{SceneClassifier, SceneImage};
//! use oximedia_ml::{DeviceType, TypedPipeline};
//!
//! let classifier = SceneClassifier::load("places365.onnx", DeviceType::auto())?;
//! // Placeholder frame: 224 × 224 pixels × 3 RGB bytes, all zero.
//! let image = SceneImage::new(vec![0u8; 224 * 224 * 3], 224, 224)?;
//! for pred in classifier.run(image)? {
//!     println!("class {} @ {:.3}", pred.class_index, pred.score);
//! }
//! # Ok(())
//! # }
//! ```
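/*!
For readers who want to see what producing a top-k list involves, the
following standalone sketch turns raw logits into probabilities and ranks
them. It assumes the model emits one logit per class; `softmax_sketch` and
`top_k_sketch` are hypothetical helpers written for this illustration and
make no claim about the signatures of the crate's own [`softmax`] and
[`top_k`].

```rust
/// Numerically stable softmax: subtract the maximum before exponentiating.
pub fn softmax_sketch(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Returns up to `k` `(class_index, score)` pairs, highest score first.
pub fn top_k_sketch(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut ranked: Vec<(usize, f32)> = scores.iter().copied().enumerate().collect();
    // `total_cmp` gives a total order on floats, so no unwrap is needed.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked.truncate(k);
    ranked
}
```

For the logits `[1.0, 3.0, 2.0]`, the probabilities sum to one and the two
highest-ranked class indices are 1 and 2, in that order.
*/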
//!
//! ## Feature matrix
//!
//! Backend features control which ONNX execution providers are compiled
//! in; pipeline features enable individual domain adapters. Everything
//! except `cuda` is WASM-compatible (see the support table below).
//!
//! | Feature              | Purpose                                                           | Notes                         |
//! |----------------------|-------------------------------------------------------------------|-------------------------------|
//! | `onnx`               | Enables the real [`OnnxModel`] backed by OxiONNX.                 | Required for any inference.   |
//! | `cuda`               | Additionally compiles `oxionnx-cuda` for NVIDIA GPU execution.    | **Native only** (no WASM).    |
//! | `webgpu`             | Additionally compiles `oxionnx-gpu` (wgpu backend).               | Works on native + browsers.   |
//! | `directml`           | Additionally compiles `oxionnx-directml`.                         | Stub outside Windows.         |
//! | `serde`              | Derives `Serialize` on pipeline info/value types.                 | Opt-in; no runtime cost.      |
//! | `scene-classifier`   | Builds the `pipelines::SceneClassifier` pipeline.                 | Places365-compatible.         |
//! | `shot-boundary`      | Builds the `pipelines::ShotBoundaryDetector` pipeline.            | TransNet V2-compatible.       |
//! | `aesthetic-score`    | Builds the `pipelines::AestheticScorer` pipeline.                 | NIMA-compatible.              |
//! | `object-detector`    | Builds the `pipelines::ObjectDetector` pipeline.                  | YOLOv8-compatible.            |
//! | `face-embedder`      | Builds the `pipelines::FaceEmbedder` pipeline.                    | ArcFace-compatible.           |
//! | `all-pipelines`      | Shortcut enabling every pipeline above.                           | Implies `onnx`.               |
//!
//! ## Device selection
//!
//! Callers rarely need to hard-code a backend. [`DeviceType::auto`]
//! probes capabilities once (memoised in an `OnceLock`) and returns the
//! strongest available device:
//!
//! ```no_run
//! use oximedia_ml::{DeviceCapabilities, DeviceType};
//!
//! // Cached after the first call.
//! let device = DeviceType::auto();
//!
//! // For the full capability report, probe every backend (probes are panic-safe).
//! for cap in DeviceCapabilities::probe_all() {
//!     println!(
//!         "{:?}: {}",
//!         cap.device_type,
//!         if cap.is_available { "available" } else { "unavailable" },
//!     );
//! }
//! ```
//!
//! Each pipeline constructor accepts a [`DeviceType`]. Pass
//! [`DeviceType::Cpu`] to force the pure-Rust path, or pick a specific
//! GPU backend when you know the deployment target.
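/*!
The probe-once behaviour described above can be pictured with a small
standalone sketch built on `std::sync::OnceLock`. Everything in it is an
assumption made for illustration: the `Backend` enum, `pick_backend`, and
`auto_backend` are hypothetical names rather than this crate's API, and the
availability check is a closure standing in for real driver probes.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for a backend enum; not this crate's `DeviceType`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Cuda,
    DirectMl,
    WebGpu,
    Cpu,
}

/// Strongest-first preference order: CUDA, DirectML, WebGPU, then CPU.
const PRIORITY: [Backend; 4] = [
    Backend::Cuda,
    Backend::DirectMl,
    Backend::WebGpu,
    Backend::Cpu,
];

/// Returns the first backend in `PRIORITY` accepted by `is_available`,
/// falling back to the CPU when nothing qualifies.
pub fn pick_backend(is_available: impl Fn(Backend) -> bool) -> Backend {
    PRIORITY
        .iter()
        .copied()
        .find(|b| is_available(*b))
        .unwrap_or(Backend::Cpu)
}

/// Probes on the first call and returns the cached choice afterwards.
pub fn auto_backend() -> Backend {
    static CHOICE: OnceLock<Backend> = OnceLock::new();
    // Assumption for the sketch: only the CPU reports itself as available.
    *CHOICE.get_or_init(|| pick_backend(|b| b == Backend::Cpu))
}
```

Passing the availability check as a closure keeps the priority logic
testable on machines without any GPU.
*/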
//!
//! ## Pipeline ecosystem
//!
//! All pipelines live under [`pipelines`] and implement
//! [`TypedPipeline`]. Each is gated behind its own feature so apps only
//! compile what they use:
//!
//! | Pipeline                              | Feature             | Input             | Output                        | Reference model   |
//! |---------------------------------------|---------------------|-------------------|-------------------------------|-------------------|
//! | `pipelines::SceneClassifier`          | `scene-classifier`  | 224×224 RGB frame | `Vec<SceneClassification>`    | Places365/ResNet  |
//! | `pipelines::ShotBoundaryDetector`     | `shot-boundary`     | 48×27 RGB window  | `Vec<ShotBoundary>`           | TransNet V2       |
//! | `pipelines::AestheticScorer`          | `aesthetic-score`   | 224×224 RGB frame | [`AestheticScore`]            | NIMA              |
//! | `pipelines::ObjectDetector`           | `object-detector`   | 640×640 RGB frame | `Vec<Detection>`              | YOLOv8 (80 COCO)  |
//! | `pipelines::FaceEmbedder`             | `face-embedder`     | 112×112 RGB face  | [`FaceEmbedding`] (512-dim)   | ArcFace           |
//!
//! Value types ([`AestheticScore`], [`Detection`], [`FaceEmbedding`]) are
//! re-exported at the crate root regardless of which pipeline features are
//! enabled, so callers can handle results even when they only receive them
//! from another crate.
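/*!
Embeddings such as the 512-dimensional face vectors above are commonly
compared by cosine similarity. The sketch below shows that comparison on
plain `f32` slices; `unit_vector` and `cosine_sketch` are hypothetical
helpers for illustration and are not the crate's [`l2_normalize`] or
[`cosine_similarity`], whose signatures may differ.

```rust
/// Scales `v` to unit L2 norm; returns `None` for an empty or all-zero vector.
pub fn unit_vector(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        Some(v.iter().map(|x| x / norm).collect())
    } else {
        None
    }
}

/// Cosine similarity of two vectors; `None` when the lengths differ or
/// either vector cannot be normalised.
pub fn cosine_sketch(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let (ua, ub) = (unit_vector(a)?, unit_vector(b)?);
    // The dot product of two unit vectors is their cosine similarity.
    Some(ua.iter().zip(&ub).map(|(x, y)| x * y).sum())
}
```

Returning `Option` instead of panicking keeps the helper in line with the
no-unwrap policy: mismatched lengths and zero vectors surface as `None`.
*/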
//!
//! ## WebAssembly (`wasm32-unknown-unknown`)
//!
//! `oximedia-ml` is validated for the WASM target on every release. The
//! support matrix is:
//!
//! | Feature set                                                              | `wasm32-unknown-unknown` |
//! |--------------------------------------------------------------------------|--------------------------|
//! | *default* (no features)                                                  | builds                   |
//! | `onnx`                                                                   | builds                   |
//! | `onnx` + any subset of the pipeline features (or `all-pipelines`)        | builds                   |
//! | `webgpu` (wgpu browser backend)                                          | builds                   |
//! | `directml` (stub on non-Windows)                                         | builds                   |
//! | `cuda`                                                                   | **does not build**       |
//!
//! The `cuda` feature transitively depends on `oxicuda-driver`, which uses
//! [`libloading`] to bind the NVIDIA driver at runtime. `libloading` gates
//! its `Library` type behind `cfg(any(unix, windows))`, so any build that
//! enables `cuda` fails to compile on `wasm32-unknown-unknown`. This is a
//! fundamental property of GPU driver loading rather than a limitation of
//! this crate, so `cuda` is treated as a **native-only** feature.
//!
//! On WASM, inference runs on the pure-Rust CPU path ([`DeviceType::Cpu`])
//! unless the WebGPU backend is opted into by enabling the `webgpu`
//! feature. There is no mock inference path: if the `onnx` feature is
//! disabled, `OnnxModel::load` returns [`MlError::FeatureDisabled`], just
//! as it does on native targets.
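/*!
The "no mock path" rule can be illustrated with a standalone sketch of a
feature-gated loader. The `LoadError` enum and `load_sketch` function are
hypothetical stand-ins, not this crate's [`MlError`] or `OnnxModel::load`;
the sketch only shows the general `cfg!`-based pattern of returning an
explicit error instead of faking a result.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum LoadError {
    /// The named Cargo feature was not enabled at compile time.
    FeatureDisabled(&'static str),
}

/// Stand-in loader: succeeds only when the `onnx` feature is compiled in.
pub fn load_sketch(path: &str) -> Result<String, LoadError> {
    if cfg!(feature = "onnx") {
        // A real loader would parse the model here; the sketch just echoes.
        Ok(format!("loaded {path}"))
    } else {
        // No mock inference: report the missing feature explicitly.
        Err(LoadError::FeatureDisabled("onnx"))
    }
}
```

When the sketch is compiled without an `onnx` feature, `load_sketch` returns
`Err(LoadError::FeatureDisabled("onnx"))` on every target, native or WASM.
*/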
//!
//! [`libloading`]: https://crates.io/crates/libloading

#![cfg_attr(not(test), deny(clippy::unwrap_used))]
#![cfg_attr(not(test), deny(clippy::expect_used))]

pub mod cache;
pub mod device;
pub mod error;
pub mod model;
pub mod pipeline;
pub mod pipelines;
pub mod postprocess;
pub mod preprocess;
pub mod zoo;

pub use cache::{ModelCache, DEFAULT_CAPACITY};
pub use device::{DeviceCapabilities, DeviceType};
pub use error::{MlError, MlResult};
pub use model::{load_auto, ModelInfo, OnnxModel, TensorDType, TensorSpec};
pub use pipeline::{PipelineInfo, PipelineTask, TypedPipeline};
pub use pipelines::{AestheticScore, Detection, FaceEmbedding};
pub use postprocess::{
    argmax, cosine_similarity, iou, l2_normalize, nms, sigmoid, sigmoid_slice, softmax, top_k,
    BoundingBox,
};
pub use preprocess::{ImagePreprocessor, InputRange, PixelLayout, TensorLayout};
pub use zoo::{ModelEntry, ModelZoo};