embedd
Embedding interfaces and local backends. A shared TextEmbedder trait
across local (fastembed, candle) and remote (OpenAI, TEI, HF Inference)
providers.
[dependencies]
embedd = { version = "0.2", features = ["fastembed"] }
The trait
Any backend implements TextEmbedder. Swap by changing the feature flag and constructor;
nothing else changes.
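To make the swap concrete, here is a minimal sketch of calling code that depends only on the trait. Everything in it is a local stand-in, not the crate's actual definitions: the `EmbedMode` variants, the boxed error type, and the two toy backends (`ByteHistogramEmbedder`, `LengthEmbedder`) are assumptions made for illustration.

```rust
use std::error::Error;

/// Hypothetical stand-in for the crate's embedding mode; the variants are assumed.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum EmbedMode {
    Query,
    Document,
}

/// Assumed error type for this sketch only.
pub type EmbedResult<T> = Result<T, Box<dyn Error + Send + Sync>>;

/// Local stand-in mirroring the documented shape: a batch method plus a
/// single-text convenience built on top of it.
pub trait TextEmbedder {
    fn embed_texts(&self, texts: &[String], mode: EmbedMode) -> EmbedResult<Vec<Vec<f32>>>;

    fn embed_text(&self, text: &str, mode: EmbedMode) -> EmbedResult<Vec<f32>> {
        let mut out = self.embed_texts(&[text.to_string()], mode)?;
        out.pop().ok_or_else(|| "backend returned no vectors".into())
    }
}

/// Toy backend: counts the bytes of each text into `dim` buckets.
pub struct ByteHistogramEmbedder {
    pub dim: usize,
}

impl TextEmbedder for ByteHistogramEmbedder {
    fn embed_texts(&self, texts: &[String], _mode: EmbedMode) -> EmbedResult<Vec<Vec<f32>>> {
        if self.dim == 0 {
            return Err("dim must be non-zero".into());
        }
        Ok(texts
            .iter()
            .map(|t| {
                let mut v = vec![0.0f32; self.dim];
                for b in t.bytes() {
                    v[b as usize % self.dim] += 1.0;
                }
                v
            })
            .collect())
    }
}

/// Second toy backend: a single dimension holding the text length in bytes.
pub struct LengthEmbedder;

impl TextEmbedder for LengthEmbedder {
    fn embed_texts(&self, texts: &[String], _mode: EmbedMode) -> EmbedResult<Vec<Vec<f32>>> {
        Ok(texts.iter().map(|t| vec![t.len() as f32]).collect())
    }
}

/// Calling code sees only the trait, so the backend can be swapped freely.
pub fn embedding_dim(embedder: &dyn TextEmbedder, text: &str) -> EmbedResult<usize> {
    Ok(embedder.embed_text(text, EmbedMode::Query)?.len())
}
```

Passing `&ByteHistogramEmbedder { dim: 8 }` or `&LengthEmbedder` to `embedding_dim` needs no change in the function itself, which is the property the trait is meant to give.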
Quick start
Local ONNX inference via fastembed:
use ;
use FastembedEmbedder;
let embedder = new_default?;
let vec = embedder.embed_text?;
println!;
Remote via OpenAI-compatible API:
use ;
use OpenAiEmbedder; // sync
let embedder = new;
let vec = embedder.embed_text?;
Async remote:
use ;
use AsyncOpenAiEmbedder;
let embedder = new;
let vec = embedder.embed_text.await?;
Backends
Sync (ureq)
| Feature | Backend | Notes |
|---|---|---|
| fastembed | fastembed dense + sparse (ONNX) | downloads models on first use |
| candle-hf | Local BERT/JinaBERT/DistilBERT/ModernBERT | CPU inference, no download |
| openai | OpenAI-compatible API | API key + network |
| tei | TEI server | running TEI instance |
| hf-inference | HF Inference API | HF token + network |
Async (reqwest)
| Feature | Backend |
|---|---|
| async-openai | OpenAI-compatible API |
| async-tei | TEI server |
| async-hf-inference | HF Inference API |
Traits
- TextEmbedder -- embed_texts(&[String], EmbedMode) -> Vec<Vec<f32>> + single-text convenience.
- AsyncTextEmbedder -- async counterpart, object-safe via BoxFuture.
- SparseEmbedder -- sparse lexical embeddings ((term_id, weight) pairs).
- ImageEmbedder -- embed_images(&[Vec<u8>]) -> Vec<Vec<f32>>.
- TokenEmbedder -- multi-vector (late interaction) embeddings.
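The "object-safe via BoxFuture" point can be illustrated with a small sketch. It is not the crate's definition: the trait below is a simplified stand-in (no EmbedMode, no error type), `BoxFuture` is spelled out with std types in the same shape as `futures::future::BoxFuture`, and `block_on` is a toy polling loop that exists only so the example runs without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Same shape as `futures::future::BoxFuture`, written with std only.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical async trait. Returning a boxed future instead of using
/// `async fn` keeps the trait usable as `dyn AsyncTextEmbedder`.
pub trait AsyncTextEmbedder: Send + Sync {
    fn embed_texts<'a>(&'a self, texts: &'a [String]) -> BoxFuture<'a, Vec<Vec<f32>>>;
}

/// Toy backend: one dimension holding the text length in bytes.
pub struct LengthEmbedder;

impl AsyncTextEmbedder for LengthEmbedder {
    fn embed_texts<'a>(&'a self, texts: &'a [String]) -> BoxFuture<'a, Vec<Vec<f32>>> {
        Box::pin(async move {
            let out: Vec<Vec<f32>> = texts.iter().map(|t| vec![t.len() as f32]).collect();
            out
        })
    }
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for this demo: polls until the future is ready.
/// A real application would use a runtime such as tokio instead.
pub fn block_on<T>(mut fut: BoxFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Because the method returns a concrete boxed type, a `Box<dyn AsyncTextEmbedder>` can be stored and passed around like any other trait object, at the cost of one heap allocation per call.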
Wrappers: PromptedTextEmbedder (instruction prefix), L2NormalizedTextEmbedder,
TruncateDimTextEmbedder (matryoshka truncation), BatchingTextEmbedder (batch size control).
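As a rough picture of how such wrappers compose, the sketch below implements three of them over a simplified, infallible stand-in trait. The names `L2Normalized`, `TruncateDim`, `Batching`, and `Constant` are hypothetical and deliberately shorter than the crate's types; the real wrappers may differ in signature and behavior.

```rust
/// Simplified stand-in trait (no EmbedMode, no errors) for this sketch only.
pub trait TextEmbedder {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>>;
}

/// Scales a vector to unit L2 norm; an all-zero vector is left unchanged.
pub fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Wrapper that L2-normalizes every vector produced by the inner embedder.
pub struct L2Normalized<E>(pub E);

impl<E: TextEmbedder> TextEmbedder for L2Normalized<E> {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>> {
        let mut out = self.0.embed_texts(texts);
        out.iter_mut().for_each(|v| l2_normalize(v));
        out
    }
}

/// Wrapper that keeps only the first `dim` components (matryoshka-style).
pub struct TruncateDim<E> {
    pub inner: E,
    pub dim: usize,
}

impl<E: TextEmbedder> TextEmbedder for TruncateDim<E> {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>> {
        let mut out = self.inner.embed_texts(texts);
        for v in out.iter_mut() {
            v.truncate(self.dim);
        }
        out
    }
}

/// Wrapper that feeds the inner embedder at most `batch_size` texts per call.
pub struct Batching<E> {
    pub inner: E,
    pub batch_size: usize,
}

impl<E: TextEmbedder> TextEmbedder for Batching<E> {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>> {
        texts
            .chunks(self.batch_size.max(1)) // guard against a zero batch size
            .flat_map(|chunk| self.inner.embed_texts(chunk))
            .collect()
    }
}

/// Toy backend that returns the same vector for every text.
pub struct Constant(pub Vec<f32>);

impl TextEmbedder for Constant {
    fn embed_texts(&self, texts: &[String]) -> Vec<Vec<f32>> {
        texts.iter().map(|_| self.0.clone()).collect()
    }
}
```

Order matters when stacking: `L2Normalized(TruncateDim { .. })` truncates first and then normalizes, so the shortened vectors end up with unit length, which is usually what cosine-similarity search expects.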
Sparse embeddings
use ;
use FastembedSparseEmbedder;
let sparse = new_default?;
let vecs = sparse.embed_sparse?;
// Each vec is Vec<(term_id, weight)>
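One common use of (term_id, weight) pairs is scoring a query against a document by sparse dot product. The helper below is a sketch of that idea, not part of the crate; the `u32`/`f32` element types and the assumption that each term_id appears at most once per vector are mine.

```rust
use std::cmp::Ordering;

/// Dot product of two sparse vectors given as (term_id, weight) pairs.
/// Copies and sorts both inputs by term_id, then merges them in one pass.
/// Assumes each term_id occurs at most once within a vector.
pub fn sparse_dot(a: &[(u32, f32)], b: &[(u32, f32)]) -> f32 {
    let mut a = a.to_vec();
    let mut b = b.to_vec();
    a.sort_by_key(|&(id, _)| id);
    b.sort_by_key(|&(id, _)| id);

    let (mut i, mut j, mut sum) = (0usize, 0usize, 0.0f32);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                // Only terms present in both vectors contribute to the score.
                sum += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    sum
}
```

If the backend already returns pairs sorted by term_id, the two sorts can be dropped and the merge runs directly on the borrowed slices.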
Candle architectures
The candle-hf backend auto-detects model architecture from config.json:
| model_type | Architecture | Notes |
|---|---|---|
| bert | BERT | default fallback |
| bert + ALiBi | JinaBERT | detected via position_embedding_type |
| distilbert | DistilBERT | no token_type_ids |
| xlm-roberta | XLM-RoBERTa | multilingual |
| modernbert | ModernBERT | RoPE, sliding window attention |
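The dispatch in the table can be sketched as a single match over two config.json fields. This is an illustration, not the backend's actual code: the `Architecture` enum and `detect_architecture` function are hypothetical, the ALiBi check assumes the field value is the string "alibi", and "default fallback" is read here as "any unrecognized model_type is treated as BERT".

```rust
/// Hypothetical enum naming the architectures listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Architecture {
    Bert,
    JinaBert,
    DistilBert,
    XlmRoberta,
    ModernBert,
}

/// Picks an architecture from two already-parsed config.json fields.
/// `position_embedding_type` is `None` when the field is absent.
pub fn detect_architecture(
    model_type: &str,
    position_embedding_type: Option<&str>,
) -> Architecture {
    // Assumption: ALiBi models declare position_embedding_type = "alibi".
    let uses_alibi = position_embedding_type
        .map_or(false, |p| p.eq_ignore_ascii_case("alibi"));

    match model_type {
        "bert" if uses_alibi => Architecture::JinaBert,
        "distilbert" => Architecture::DistilBert,
        "xlm-roberta" => Architecture::XlmRoberta,
        "modernbert" => Architecture::ModernBert,
        // Plain "bert" and anything unrecognized fall back to BERT
        // (one reading of "default fallback" in the table).
        _ => Architecture::Bert,
    }
}
```

Keeping the detection as a pure function over parsed fields makes it easy to unit-test without loading any model weights.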
Planned
- Burn backend (stub exists, implementation pending)
- SigLIP image backend (stub exists)
Related
- innr -- SIMD vector ops, binary quantization, matryoshka truncation
- vicinity -- approximate nearest neighbor search
- rankops -- score fusion, reranking (MaxSim, MMR, DPP)
License
MIT OR Apache-2.0