
Module onnx


ONNX Runtime local embedding provider.

Provides local embedding inference using ONNX Runtime, eliminating the need for an external API. Supports sentence-transformer models such as all-MiniLM-L6-v2 exported to ONNX format.

§Feature gating

When compiled without the onnx feature, the module provides a stub that validates the model path but returns Error::Embedding from embed() and embed_batch().

When compiled with the onnx feature, the module loads the ONNX session and a HuggingFace tokenizer, then performs real local inference with mean-pooling and L2 normalisation.

[features]
onnx = ["dep:ort", "dep:tokenizers", "dep:ndarray"]

[dependencies]
ort = { version = "2", optional = true }
tokenizers = { version = "0.21", optional = true, default-features = false }
ndarray = { version = "0.16", optional = true }
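The stub/real split described above maps onto Rust's conditional compilation. A minimal sketch of the pattern, assuming a free function and error type with illustrative names (not the crate's actual embed() signature or Error enum):

```rust
// Illustrative error type; the real crate returns Error::Embedding.
#[derive(Debug)]
struct EmbedError(String);

// Stub: compiled when the `onnx` feature is disabled. Always errors.
#[cfg(not(feature = "onnx"))]
fn embed(_text: &str) -> Result<Vec<f32>, EmbedError> {
    Err(EmbedError("onnx feature not enabled".to_string()))
}

// Real path: compiled only with `--features onnx`. Would tokenize the
// input, run the ONNX session, then mean-pool and L2-normalise.
#[cfg(feature = "onnx")]
fn embed(text: &str) -> Result<Vec<f32>, EmbedError> {
    unimplemented!("run tokenizer + ort session for {text}")
}
```

Because both variants share one signature, callers compile unchanged either way; only the behaviour differs.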

§Example (stub)

use mnemo_core::embedding::onnx::OnnxEmbedding;
use mnemo_core::embedding::EmbeddingProvider;

// Will succeed only if the path exists on disk.
let provider = OnnxEmbedding::new("/models/all-MiniLM-L6-v2.onnx", 384)
    .expect("model path must exist");

assert_eq!(provider.dimensions(), 384);
assert_eq!(provider.model_path(), "/models/all-MiniLM-L6-v2.onnx");
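The mean-pooling and L2-normalisation step mentioned under feature gating can be sketched in plain Rust. This is a simplified illustration operating on per-token vectors; the real provider works on ONNX output tensors (via ndarray) and applies the attention mask before pooling:

```rust
/// Average per-token embeddings into one sentence vector, then scale it
/// to unit length. Assumes `token_embeddings` is non-empty and that all
/// token vectors share the same dimensionality.
fn mean_pool_normalize(token_embeddings: &[Vec<f32>]) -> Vec<f32> {
    let dim = token_embeddings[0].len();
    let n = token_embeddings.len() as f32;

    // Mean pooling: average each dimension across all tokens.
    let mut pooled = vec![0.0f32; dim];
    for token in token_embeddings {
        for (p, v) in pooled.iter_mut().zip(token) {
            *p += v / n;
        }
    }

    // L2 normalisation: divide by the Euclidean norm so that cosine
    // similarity between outputs reduces to a dot product.
    let norm = pooled.iter().map(|v| v * v).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in pooled.iter_mut() {
            *v /= norm;
        }
    }
    pooled
}
```

Normalising here lets downstream similarity search compare vectors with a plain dot product instead of a full cosine computation.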

Structs§

OnnxEmbedding
ONNX-based local embedding provider.