Module onnx

Expand description

ONNX Runtime embedding backend.

Loads a local ONNX model (all-MiniLM-L6-v2 quantized) and provides embedding inference. Supports automatic model download and caching.