# OpenCLIP embedding in Rust

Easily run pre-trained open_clip-compatible embedding models in Rust via ONNX Runtime.
## Features

- Run CLIP models in Rust via ONNX.
- Should support any model compatible with open_clip (Python).
- Automatic model downloading: just provide the Hugging Face model ID (it has to point to a HuggingFace repo with ONNX files & an open_clip_config.json).
- Python is only needed if you want to convert new models yourself.
## Prerequisites

- Rust & Cargo.
- (Optional) uv - only if you want to convert models from HuggingFace to ONNX.
- (Optional) onnxruntime - only if you have to link dynamically (on Windows).
## Usage: Embedding text & image

### Option 1: `Clip` struct

The `Clip` struct is built for ease of use, handling both vision and text together, with convenience functions for similarity rankings.
A sketch of the intended usage; the crate path, builder, and ranking method names here are assumptions, so check the crate docs and examples for the real API:

```rust
use clip::Clip; // crate name assumed

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Fetch the model from HuggingFace by ID (default hf-hub feature).
    let clip = Clip::from_hf("RuteNL/MobileCLIP2-S2-OpenCLIP-ONNX").await?;
    // Convenience function for similarity ranking (method name is an assumption).
    let labels = ["A photo of a cat", "A photo of a dog", "A photo of a beignet"];
    let ranking = clip.rank_texts("poekie.jpg", &labels)?;
    println!("{ranking:?}");
    Ok(())
}
```
Input image: Poekie

Outputs:

```
A photo of a cat: 99.99%
A photo of a dog: 0.01%
A photo of a beignet: 0.00%
```
### Option 2: Individual vision & text embedders

Use `VisionEmbedder` or `TextEmbedder` standalone to just produce embeddings from images & text.
A sketch (crate path, constructor, and method names are assumptions):

```rust
use clip::{TextEmbedder, VisionEmbedder}; // crate name assumed

// Embed images & text separately with standalone embedders.
let vision = VisionEmbedder::from_local_id("timm/vit_base_patch32_clip_224.openai")?;
let text = TextEmbedder::from_local_id("timm/vit_base_patch32_clip_224.openai")?;
let image_embedding = vision.embed("poekie.jpg")?;
let text_embedding = text.embed("A photo of a cat")?;
```
## Examples

Run the included examples (ensure you have exported the relevant model first):

```sh
# Simple generic example
cargo run --example basic

# Semantic image search demo
cargo run --example search
```
## Model support

This crate is implemented with `ort`, which runs ONNX models. I've uploaded the following ONNX CLIP embedding models to HuggingFace. To give an idea of the speed/quality tradeoff for these models, I've benchmarked them and listed the results alongside each model's ImageNet zero-shot accuracy.
| Model ID | ImageNet Zero-Shot Accuracy | Vision Embedding (ms)* | Text Embedding (ms)* |
|---|---|---|---|
| RuteNL/ViT-gopt-16-SigLIP2-384-ONNX | 85.0% | 2354 | 128 |
| RuteNL/DFN5B-CLIP-ViT-H-14-378-ONNX | 84.4% | 1860 | 131 |
| RuteNL/ViT-SO400M-16-SigLIP2-384-ONNX | 84.1% | 988 | 136 |
| RuteNL/MobileCLIP2-S3-OpenCLIP-ONNX | 80.7% | 116 | 35 |
| RuteNL/MobileCLIP2-S4-OpenCLIP-ONNX | 79.4% | 192 | 38 |
| RuteNL/MobileCLIP2-S2-OpenCLIP-ONNX | 77.2% | 75 | 19 |
* Embedding speed measured on my CPU; vision embedding includes 10-20 ms of preprocessing.

Source for the MobileCLIP ImageNet accuracy.
Source for the other ImageNet accuracy numbers.
### Other models

If you need a model that hasn't been converted to ONNX on HuggingFace yet, you can easily convert any open_clip-compatible model yourself using `pull_onnx.py` from this repo.

- Make sure you have uv.
- Run:

  ```sh
  uv run pull_onnx.py --id timm/vit_base_patch32_clip_224.openai
  ```

- After the Python script is done, you can load the model in your Rust code:

  ```rust
  let clip = Clip::from_local_id("timm/vit_base_patch32_clip_224.openai").build()?;
  ```
I've tested the following models to work with pull_onnx.py & this crate:
- timm/MobileCLIP2-S4-OpenCLIP *
- timm/ViT-SO400M-16-SigLIP2-384 *
- timm/ViT-SO400M-14-SigLIP-384 *
- timm/vit_base_patch32_clip_224.openai *
- Marqo/marqo-fashionSigLIP *
- laion/CLIP-ViT-B-32-laion2B-s34B-b79K
- microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
- imageomics/bioclip
- timm/PE-Core-bigG-14-448
* Verified equal embedding outputs compared to the reference Python implementation.
## Execution Providers (Nvidia, AMD, Intel, Mac, Arm, etc.)

Since this crate is implemented with `ort`, many execution providers are available to enable hardware acceleration. You can enable an execution provider in this crate with cargo features; a full list of execution providers is available in the `ort` documentation.

To enable CUDA, add the "cuda" feature and pass the CUDA execution provider when creating the embedder:
A sketch of what that looks like; the crate name and builder method are assumptions, so check the crate's docs for the real API:

```rust
use clip::Clip; // crate name assumed
use ort::execution_providers::CUDAExecutionProvider; // module path per ort 2.x

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Builder method name is an assumption.
    let clip = Clip::from_local_id("RuteNL/MobileCLIP2-S2-OpenCLIP-ONNX")
        .with_execution_providers([CUDAExecutionProvider::default().build()])
        .build()?;
    // ... embed images & text as usual
    Ok(())
}
```
## Features

- [default] `hf-hub` - enables the `from_hf` function to fetch a model from HuggingFace; relies on `tokio`.
- [default] `fast_image_resize` - use fast_image_resize instead of `image` to resize images for preprocessing. It is about 77% faster, but differs slightly more from PIL than `image` does, which affects the embedding outputs slightly.
- `load-dynamic` - link ONNX Runtime dynamically instead of statically. See the troubleshooting section below, or the `ort` crate's feature documentation, for more info.
- Forwarded `ort` features - see Cargo.toml for the full list, and the `ort` docs for their explanation.
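For example, enabling the CUDA and dynamic-linking features in Cargo.toml would look something like this (the crate name `clip` and the version are placeholders; substitute this crate's actual name and version):

```toml
[dependencies]
# "clip" is a placeholder for this crate's real name.
clip = { version = "0.1", features = ["cuda", "load-dynamic"] }
# tokio is needed by the default hf-hub feature's async model download.
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```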
## Troubleshooting

### It doesn't build on Windows due to onnxruntime problems

Try the `load-dynamic` feature and point to the onnxruntime DLL as described below.

### [When using the `load-dynamic` feature] ONNX Runtime library not found

With `load-dynamic`, ONNX Runtime is loaded at runtime, so if it isn't found, download the correct onnxruntime library from its GitHub releases. Then put the dll/so/dylib location on your PATH, or point the `ORT_DYLIB_PATH` env var at it.
PowerShell example:

```powershell
# Adjust the path to where the DLL is.
$env:ORT_DYLIB_PATH = "C:/Apps/onnxruntime/lib/onnxruntime.dll"
```

Shell example:

```sh
export ORT_DYLIB_PATH="/usr/local/lib/libonnxruntime.so"
```