🍕 Features
- Supports synchronous usage. No dependency on Tokio.
- Uses @pykeio/ort for performant ONNX inference.
- Uses @huggingface/tokenizers for fast encodings.
- Supports batch embedding generation with parallelism using @rayon-rs/rayon.
The default model is Flag Embedding, which is top of the MTEB leaderboard.
🔍 Not looking for Rust?
- Python 🐍: fastembed
- Go 🐳: fastembed-go
- JavaScript 🌐: fastembed-js
🤖 Models
Text Embedding
- BAAI/bge-base-en-v1.5
- BAAI/bge-small-en-v1.5 - Default
- BAAI/bge-large-en-v1.5
- BAAI/bge-small-zh-v1.5
- sentence-transformers/all-MiniLM-L6-v2
- sentence-transformers/paraphrase-MiniLM-L12-v2
- sentence-transformers/paraphrase-multilingual-mpnet-base-v2
- nomic-ai/nomic-embed-text-v1
- nomic-ai/nomic-embed-text-v1.5
- intfloat/multilingual-e5-small
- intfloat/multilingual-e5-base
- intfloat/multilingual-e5-large
- mixedbread-ai/mxbai-embed-large-v1
Reranking
🚀 Installation
Run the following command in your project directory:

```bash
cargo add fastembed
```
Or add the following line to your Cargo.toml:
```toml
[dependencies]
fastembed = "3"
```
📖 Usage
Generating Text Embeddings
```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

// With default InitOptions
let model = TextEmbedding::try_new(Default::default())?;

// With custom InitOptions
let model = TextEmbedding::try_new(InitOptions {
    model_name: EmbeddingModel::AllMiniLML6V2,
    show_download_progress: true,
    ..Default::default()
})?;

let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
    "passage: This is an example passage.",
    // You can leave out the prefix but it's recommended
    "fastembed-rs is licensed under Apache 2.0",
];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(documents, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 4
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 384
```
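Each embedding comes back as a `Vec<f32>`. A common next step is comparing two embeddings with cosine similarity; here is a minimal, dependency-free sketch (the sample vectors are made up for illustration and stand in for rows of the `embeddings` result above):

```rust
/// Cosine similarity between two equal-length vectors:
/// dot(a, b) / (|a| * |b|). Returns a value in [-1.0, 1.0].
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Stand-ins for a query embedding and a passage embedding.
    let query = vec![0.1_f32, 0.9, 0.2];
    let passage = vec![0.2_f32, 0.8, 0.1];
    println!("similarity: {:.3}", cosine_similarity(&query, &passage));
}
```

Because the models produce embeddings of a fixed dimension (384 for the default model), the two slices passed in should always have equal length.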
Candidates Reranking
```rust
use fastembed::{RerankInitOptions, RerankerModel, TextRerank};

let model = TextRerank::try_new(RerankInitOptions {
    model_name: RerankerModel::BGERerankerBase,
    show_download_progress: true,
    ..Default::default()
})
.unwrap();

let documents = vec![
    "hi",
    "The giant panda, sometimes called a panda bear, is a bear species endemic to China.",
    "panda is animal",
    "i dont know",
    "kind of mammal",
];

// Rerank with the default batch size
let results = model.rerank("what is panda?", documents, true, None);

println!("Rerank result: {:?}", results);
```
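The reranker assigns each candidate document a relevance score for the query, where higher means more relevant. Picking the best candidates is then just a descending sort; the sketch below shows that step with plain `(index, score)` pairs and made-up scores, not the actual fastembed result type:

```rust
/// Sort (document index, relevance score) pairs so the
/// highest-scoring candidate comes first.
fn rank_by_score(mut scored: Vec<(usize, f32)>) -> Vec<(usize, f32)> {
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored
}

fn main() {
    // Hypothetical scores for the five candidate documents above.
    let ranked = rank_by_score(vec![(0, -4.2), (1, 7.9), (2, 3.1), (3, -6.0), (4, 0.4)]);
    println!("best candidate index: {}", ranked[0].0); // -> 1
}
```

Cross-encoder rerank scores are unbounded logits, so compare them only within a single query; they are not probabilities.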
Alternatively, raw `.onnx` files can be loaded through the `UserDefinedEmbeddingModel` struct (for "bring your own" text embedding models) using `TextEmbedding::try_new_from_user_defined(...)`.
🚒 Under the hood
Why fast?
It's important we justify the "fast" in FastEmbed. FastEmbed is fast because:
- Quantized model weights
- ONNX Runtime which allows for inference on CPU, GPU, and other dedicated runtimes
Why light?
- No hidden dependencies via Huggingface Transformers
Why accurate?
- Better than OpenAI Ada-002
- Top of the Embedding leaderboards e.g. MTEB
📄 LICENSE
Apache 2.0 © 2024