# embedd

Trait-based embedding interface for Rust with pluggable backends.

Provides `TextEmbedder`, `AsyncTextEmbedder`, `ImageEmbedder`, `AudioEmbedder`, and
extension traits for token-level and sparse embeddings. Backends are feature-gated so
dependents only pull what they need.
## Features

### Sync backends (ureq)

| Feature | Backend | Requires |
|---|---|---|
| `candle-hf` | Local BERT/JinaBERT/DistilBERT via Candle | CPU (no GPU needed) |
| `fastembed` | fastembed dense + sparse (ONNX) | downloads models |
| `openai` | OpenAI-compatible API | API key + network |
| `tei` | TEI server | running TEI instance |
| `hf-inference` | HF Inference API | HF token + network |
### Async backends (reqwest)

| Feature | Backend |
|---|---|
| `async-openai` | OpenAI-compatible API |
| `async-tei` | TEI server |
| `async-hf-inference` | HF Inference API |
### Other

| Feature | What it does |
|---|---|
| `serde` | Serde derives on core types |
| `qdrant` | Qdrant vector DB integration |
| `ort-tokenizers` | ONNX Runtime (stub) |
| `burn-backend` | Burn (stub) |
| `siglip` | SigLIP image (stub) |
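Backend and utility features can be combined in a single dependency line; a sketch (version number illustrative) enabling the fastembed backend together with `serde` and `qdrant`:

```toml
[dependencies]
embedd = { version = "0.1", features = ["fastembed", "serde", "qdrant"] }
```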
## Quick start

```toml
[dependencies]
embedd = { version = "0.1", features = ["fastembed"] }
```
```rust
// Import paths and the EmbedMode variant are illustrative; check the crate docs.
use embedd::{EmbedMode, TextEmbedder};
use embedd::FastembedEmbedder;

let embedder = FastembedEmbedder::new_default().unwrap();
let vec = embedder.embed_text("Hello, world!", EmbedMode::Query).unwrap();
println!("{} dims", vec.len());
```
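Batch embedding goes through `embed_texts`, which maps each input string to one vector; a sketch assuming it, like `embed_text` above, returns a `Result` (mode variant again illustrative):

```rust
let texts = vec!["first document".to_string(), "second document".to_string()];
// One Vec<f32> per input, in the same order.
let vectors = embedder.embed_texts(&texts, EmbedMode::Query).unwrap();
assert_eq!(vectors.len(), texts.len());
```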
### Sparse embeddings

```rust
use embedd::SparseEmbedder;
use embedd::FastembedSparseEmbedder; // exact module path may differ

let sparse = FastembedSparseEmbedder::new_default().unwrap();
// Argument shape assumed to mirror embed_texts (&[String]).
let vecs = sparse.embed_sparse(&["hello world".to_string()]).unwrap();
// Each element of `vecs` is a Vec<(term_id, weight)>.
```
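Those `(term_id, weight)` pairs can be scored directly; a plain-Rust sparse dot product over two such vectors (not part of the crate's API, and the `u32` term-id type is an assumption):

```rust
use std::collections::HashMap;

/// Dot product of two sparse vectors given as (term_id, weight) pairs.
fn sparse_dot(a: &[(u32, f32)], b: &[(u32, f32)]) -> f32 {
    // Index one side by term id, then accumulate products over the other.
    let lookup: HashMap<u32, f32> = a.iter().copied().collect();
    b.iter()
        .filter_map(|(id, w)| lookup.get(id).map(|aw| aw * w))
        .sum()
}
```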
### Async

```rust
use embedd::{AsyncTextEmbedder, EmbedMode};
use embedd::AsyncOpenAiEmbedder; // exact module path may differ

let embedder = AsyncOpenAiEmbedder::new(/* endpoint, API key, model */);
let vec = embedder.embed_text("hello world", EmbedMode::Query).await.unwrap();
```
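Because the async trait hands back futures, independent requests can run concurrently; a sketch with `futures::future::try_join_all`, assuming `embed_text` mirrors the sync signature and yields a `Result`:

```rust
use futures::future::try_join_all;

let texts = vec!["first".to_string(), "second".to_string()];
// Issue one request per text and await them all together.
let futures = texts.iter().map(|t| embedder.embed_text(t, EmbedMode::Query));
let vectors: Vec<Vec<f32>> = try_join_all(futures).await.unwrap();
```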
## Trait overview

- `TextEmbedder` -- `embed_texts(&[String], EmbedMode) -> Vec<Vec<f32>>`, plus the single-text convenience `embed_text(&str, EmbedMode) -> Vec<f32>`.
- `AsyncTextEmbedder` -- async counterpart, object-safe via `BoxFuture`.
- `SparseEmbedder` -- sparse lexical embeddings (`(term_id, weight)` pairs).
- `ImageEmbedder` -- `embed_images(&[Vec<u8>]) -> Vec<Vec<f32>>`.
- `TokenEmbedder` -- multi-vector (late interaction) embeddings.
Wrappers: `PromptedTextEmbedder` (instruction prefix), `L2NormalizedTextEmbedder`,
`TruncateDimTextEmbedder` (matryoshka truncation), `BatchingTextEmbedder` (batch size control).
Compose via `apply_scoping_policy`, `apply_normalization_policy`, `apply_output_dim`.
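The pluggable-backend design is ordinary trait dispatch, so a custom backend is just another `impl`. A self-contained sketch using a simplified stand-in trait (the real `TextEmbedder` also takes an `EmbedMode` and returns a `Result`), to show the shape such an impl takes:

```rust
/// Simplified stand-in for the crate's TextEmbedder trait.
trait SimpleTextEmbedder {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>>;

    /// Single-text convenience, defined in terms of the batch method.
    fn embed_text(&self, text: &str) -> Vec<f32> {
        self.embed_texts(&[text.to_string()]).remove(0)
    }
}

/// Toy backend: hash bytes into a fixed-size vector (no model at all).
struct HashingEmbedder {
    dim: usize,
}

impl SimpleTextEmbedder for HashingEmbedder {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>> {
        texts
            .iter()
            .map(|t| {
                let mut v = vec![0.0f32; self.dim];
                for (i, b) in t.bytes().enumerate() {
                    v[(i + b as usize) % self.dim] += 1.0;
                }
                v
            })
            .collect()
    }
}
```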
## Candle architectures

The `candle-hf` backend auto-detects model architecture from `config.json`:
| `model_type` | Architecture | Notes |
|---|---|---|
| `bert` | BERT | default fallback |
| `bert` + ALiBi | JinaBERT | detected via `position_embedding_type` |
| `distilbert` | DistilBERT | no `token_type_ids` |
| `xlm-roberta` | XLM-RoBERTa | multilingual |
| `modernbert` | ModernBERT | RoPE, sliding window attention |
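A rough illustration of that detection logic, reading the relevant fields from `config.json` with `serde_json` (the enum and field handling here are illustrative, not the crate's actual code):

```rust
use serde_json::Value;

#[derive(Debug)]
enum Arch {
    Bert,
    JinaBert,
    DistilBert,
    XlmRoberta,
    ModernBert,
}

fn detect_arch(config: &Value) -> Arch {
    // Missing fields fall back to plain BERT behavior.
    let model_type = config["model_type"].as_str().unwrap_or("bert");
    let pos_emb = config["position_embedding_type"].as_str().unwrap_or("");
    match model_type {
        // JinaBERT reports model_type "bert" but uses ALiBi position embeddings.
        "bert" if pos_emb == "alibi" => Arch::JinaBert,
        "distilbert" => Arch::DistilBert,
        "xlm-roberta" => Arch::XlmRoberta,
        "modernbert" => Arch::ModernBert,
        _ => Arch::Bert, // default fallback
    }
}
```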
## Related crates

- `innr` -- SIMD vector ops, binary quantization, matryoshka truncation
- `vicinity` -- approximate nearest neighbor search
- `rankops` -- score fusion, reranking (MaxSim, MMR, DPP)
## License
MIT OR Apache-2.0