Features
- Supports synchronous usage. No dependency on Tokio.
- Uses pykeio/ort for performant ONNX inference.
- Uses huggingface/tokenizers for fast encodings.
Not looking for Rust?
- Python: fastembed
- Go: fastembed-go
- JavaScript: fastembed-js
Models
Text Embedding
- BAAI/bge-small-en-v1.5 - Default
- BAAI/bge-base-en-v1.5
- BAAI/bge-large-en-v1.5
- BAAI/bge-small-zh-v1.5
- BAAI/bge-large-zh-v1.5
- BAAI/bge-m3
- sentence-transformers/all-MiniLM-L6-v2
- sentence-transformers/all-MiniLM-L12-v2
- sentence-transformers/all-mpnet-base-v2
- sentence-transformers/paraphrase-MiniLM-L12-v2
- sentence-transformers/paraphrase-multilingual-mpnet-base-v2
- nomic-ai/nomic-embed-text-v1
- nomic-ai/nomic-embed-text-v1.5 - pairs with nomic-embed-vision-v1.5 for image-to-text search
- intfloat/multilingual-e5-small
- intfloat/multilingual-e5-base
- intfloat/multilingual-e5-large
- mixedbread-ai/mxbai-embed-large-v1
- Alibaba-NLP/gte-base-en-v1.5
- Alibaba-NLP/gte-large-en-v1.5
- lightonai/ModernBERT-embed-large
- Qdrant/clip-ViT-B-32-text - pairs with clip-ViT-B-32-vision for image-to-text search
- jinaai/jina-embeddings-v2-base-code
- jinaai/jina-embeddings-v2-base-en
- google/embeddinggemma-300m
- Qwen/Qwen3-Embedding-0.6B - requires the qwen3 feature (candle backend)
- Qwen/Qwen3-Embedding-4B - requires the qwen3 feature (candle backend)
- Qwen/Qwen3-Embedding-8B - requires the qwen3 feature (candle backend)
- snowflake/snowflake-arctic-embed-xs
- snowflake/snowflake-arctic-embed-s
- snowflake/snowflake-arctic-embed-m
- snowflake/snowflake-arctic-embed-m-long
- snowflake/snowflake-arctic-embed-l
Quantized versions are also available for several models above (append Q to the model enum variant, e.g., EmbeddingModel::BGESmallENV15Q).
Sparse Text Embedding
- prithivida/Splade_PP_en_v1 - Default
- BAAI/bge-m3
Image Embedding
- Qdrant/clip-ViT-B-32-vision - Default
- Qdrant/resnet50-onnx
- Qdrant/Unicom-ViT-B-16
- Qdrant/Unicom-ViT-B-32
- nomic-ai/nomic-embed-vision-v1.5
Reranking
- BAAI/bge-reranker-base - Default
- BAAI/bge-reranker-v2-m3
- jinaai/jina-reranker-v1-turbo-en
- jinaai/jina-reranker-v2-base-multilingual
✊ Support
To support the library, please donate to our primary upstream dependency, ort, the Rust wrapper for ONNX Runtime.
Installation
Run the following in your project directory:

```sh
cargo add fastembed
```

Or add the following to your Cargo.toml:

```toml
[dependencies]
fastembed = "5"
```
Usage
Text Embeddings

```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

// With default options
let mut model = TextEmbedding::try_new(Default::default())?;

// With custom options
let mut model = TextEmbedding::try_new(
    InitOptions::new(EmbeddingModel::BGESmallENV15).with_show_download_progress(true),
)?;

let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
    "passage: This is an example passage.",
    "fastembed-rs is licensed under Apache 2.0",
];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(documents, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 4
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 384
```
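Each returned embedding is a row of f32 values, and a common next step is comparing two of them with cosine similarity. A minimal, self-contained helper in plain Rust (not part of the fastembed API):

```rust
/// Cosine similarity between two equal-length embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let a = vec![1.0_f32, 0.0, 1.0];
    let b = vec![1.0_f32, 1.0, 0.0];
    println!("{:.3}", cosine_similarity(&a, &b)); // 0.500
}
```

Because fastembed returns raw vectors rather than a similarity API, this kind of helper (or a vector database) is how the embeddings are typically consumed.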
Qwen3 Embeddings

Qwen3 embedding models are available behind the qwen3 feature flag (candle backend).

```toml
[dependencies]
fastembed = { version = "5", features = ["qwen3"] }
```

```rust
use candle_core::Device;
use fastembed::Qwen3TextEmbedding;

let device = Device::Cpu;
// Load the model from the Hugging Face Hub; the remaining arguments are
// elided here — see the crate docs for the full signature.
let model = Qwen3TextEmbedding::from_hf(/* model options */ device)?;
let embeddings = model.embed(/* documents, batch size */)?;
println!("Embeddings length: {}", embeddings.len());
```
Sparse Text Embeddings

```rust
use fastembed::{SparseEmbedding, SparseInitOptions, SparseModel, SparseTextEmbedding};

// With default options
let mut model = SparseTextEmbedding::try_new(Default::default())?;

// With custom options
let mut model = SparseTextEmbedding::try_new(
    SparseInitOptions::new(SparseModel::SpladePPV1).with_show_download_progress(true),
)?;

let documents = vec!["passage: Hello, World!", "query: Hello, World!"];

// Generate embeddings with the default batch size, 256
let embeddings: Vec<SparseEmbedding> = model.embed(documents, None)?;
```
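A sparse embedding pairs token indices with nonzero weights, so similarity between two sparse vectors reduces to a dot product over the indices they share. A self-contained sketch of that idea (the HashMap representation here is illustrative, not fastembed's SparseEmbedding type):

```rust
use std::collections::HashMap;

/// Dot product of two sparse vectors given as (index -> weight) maps.
/// Only indices present in both vectors contribute to the score.
fn sparse_dot(a: &HashMap<u32, f32>, b: &HashMap<u32, f32>) -> f32 {
    a.iter()
        .filter_map(|(i, wa)| b.get(i).map(|wb| wa * wb))
        .sum()
}

fn main() {
    let a = HashMap::from([(1_u32, 0.5_f32), (7, 1.2), (42, 0.3)]);
    let b = HashMap::from([(7_u32, 0.8_f32), (42, 1.0)]);
    // Shared indices 7 and 42: 1.2 * 0.8 + 0.3 * 1.0 = 1.26
    println!("{:.2}", sparse_dot(&a, &b)); // 1.26
}
```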
Image Embeddings

```rust
use fastembed::{ImageEmbedding, ImageEmbeddingModel, ImageInitOptions};

// With default options
let mut model = ImageEmbedding::try_new(Default::default())?;

// With custom options
let mut model = ImageEmbedding::try_new(
    ImageInitOptions::new(ImageEmbeddingModel::ClipVitB32).with_show_download_progress(true),
)?;

// Paths to two local images (placeholders)
let images = vec!["image_0.png", "image_1.png"];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(images, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 2
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 512
```
Candidates Reranking

```rust
use fastembed::{RerankInitOptions, RerankerModel, TextRerank};

// With default options
let mut model = TextRerank::try_new(Default::default())?;

// With custom options
let mut model = TextRerank::try_new(
    RerankInitOptions::new(RerankerModel::BGERerankerBase).with_show_download_progress(true),
)?;

// Example documents
let documents = vec!["hi", "panda is an animal", "i like reading"];

// Rerank with the default batch size, 256 and return document contents
let results = model.rerank("what is a panda?", documents, true, None)?;
println!("Rerank result: {:?}", results);
```
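Conceptually, reranking scores each (query, document) pair with a cross-encoder model and sorts the documents by score. A toy sketch of that flow in plain Rust, with word overlap as a stand-in for the model's learned score:

```rust
/// Toy relevance score: number of query words appearing in the document.
/// (A real reranker runs a cross-encoder model over the pair instead.)
fn score(query: &str, doc: &str) -> usize {
    let doc_lower = doc.to_lowercase();
    let doc_words: Vec<&str> = doc_lower.split_whitespace().collect();
    query
        .to_lowercase()
        .split_whitespace()
        .filter(|q| doc_words.contains(q))
        .count()
}

fn main() {
    let query = "what is a panda";
    let documents = ["hi", "the giant panda is a bear", "panda is an animal"];

    // Score every (query, document) pair, then sort indices by descending score.
    let mut ranked: Vec<(usize, usize)> = documents
        .iter()
        .enumerate()
        .map(|(i, d)| (i, score(query, d)))
        .collect();
    ranked.sort_by(|x, y| y.1.cmp(&x.1));

    println!("best match: {:?}", documents[ranked[0].0]); // best match: "the giant panda is a bear"
}
```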
Alternatively, local model files can be used for inference via the try_new_from_user_defined(...) methods of the respective structs.