Crate embedrs

§embedrs

Unified embedding solution: cloud APIs and local inference through one interface, with opinionated defaults backed by benchmark data.

§Design: if we build it, it must be great

  • local() → all-MiniLM-L6-v2 (23MB, free, no API key)
  • cloud(key) → OpenAI text-embedding-3-small (best discrimination, cheapest)
  • Both produce the same EmbedResult — write code once, switch backends in one line

Defaults were chosen using an 8-dimension benchmark across 8 models; see benchrs.

§Quick start

// cloud — one key, done
let client = embedrs::cloud("sk-...");
let result = client.embed(vec!["hello world".into()]).await?;
println!("dimensions: {}", result.embeddings[0].len());

With the local feature enabled:

// local — zero config, free, 23MB model downloaded on first use
let client = embedrs::local();
let result = client.embed(vec!["hello world".into()]).await?;

§Batch embedding

let client = embedrs::cloud("sk-...");
let texts: Vec<String> = (0..5000).map(|i| format!("text {i}")).collect();
let result = client.embed_batch(texts)
    .concurrency(5)
    .await?;
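
How embed_batch splits and schedules work internally is not shown in this overview. As a rough, standard-library-only sketch of the general idea (chunk the input, then run a bounded number of chunks at a time while preserving order), something like the following could be used. The names `embed_chunk` and `embed_in_batches`, the chunk size parameter, and the use of threads instead of async tasks are all assumptions made for illustration, not the crate's implementation.

```rust
use std::thread;

/// Hypothetical stand-in for one provider request: returns one vector per
/// input text (here just the text length, so results are easy to check).
fn embed_chunk(chunk: &[String]) -> Vec<Vec<f32>> {
    chunk.iter().map(|t| vec![t.len() as f32]).collect()
}

/// Split `texts` into chunks of `chunk_size` and process at most
/// `concurrency` chunks at a time. Output order matches input order.
fn embed_in_batches(texts: &[String], chunk_size: usize, concurrency: usize) -> Vec<Vec<f32>> {
    assert!(chunk_size > 0 && concurrency > 0);
    let chunks: Vec<&[String]> = texts.chunks(chunk_size).collect();
    let mut out = Vec::with_capacity(texts.len());

    // Each wave runs up to `concurrency` chunks in parallel, then joins
    // them in the order they were spawned.
    for wave in chunks.chunks(concurrency) {
        let results: Vec<Vec<Vec<f32>>> = thread::scope(|s| {
            let handles: Vec<_> = wave
                .iter()
                .map(|&chunk| s.spawn(move || embed_chunk(chunk)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker panicked"))
                .collect()
        });
        out.extend(results.into_iter().flatten());
    }
    out
}
```

Processing in waves is the simplest way to bound concurrency; a production implementation would more likely keep a fixed number of requests in flight continuously so that one slow chunk does not stall the rest of its wave.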

§Provider fallback

Chain fallback providers for automatic failover:

let client = embedrs::Client::openai("sk-...")
    .with_fallback(embedrs::Client::cohere("cohere-key"));
// if OpenAI fails, automatically tries Cohere
let result = client.embed(vec!["hello".into()]).await?;
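
The failover behaviour described above can be pictured with a small, self-contained sketch. The `Embedder` trait, the `WithFallback` wrapper, and the two stand-in providers below are hypothetical names introduced only for this example; they are not part of the embedrs API, and the real client may decide which errors trigger a fallback differently.

```rust
/// Hypothetical provider abstraction used only for this sketch.
trait Embedder {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, String>;
}

/// Tries `primary` first and falls back to `secondary` on any error.
struct WithFallback<P, S> {
    primary: P,
    secondary: S,
}

impl<P: Embedder, S: Embedder> Embedder for WithFallback<P, S> {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, String> {
        match self.primary.embed(texts) {
            Ok(vectors) => Ok(vectors),
            // The primary error is dropped here; a real client might log or wrap it.
            Err(_) => self.secondary.embed(texts),
        }
    }
}

/// Stand-in provider that always fails, simulating an outage.
struct AlwaysFails;

impl Embedder for AlwaysFails {
    fn embed(&self, _texts: &[String]) -> Result<Vec<Vec<f32>>, String> {
        Err("simulated outage".to_string())
    }
}

/// Stand-in provider that returns a zero vector of fixed size per input.
struct Zeros(usize);

impl Embedder for Zeros {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, String> {
        Ok(texts.iter().map(|_| vec![0.0; self.0]).collect())
    }
}
```

Because `WithFallback` itself implements `Embedder`, wrappers can be nested to form longer chains. Note that two providers generally produce vectors of different dimensions from different embedding spaces, so results obtained via a fallback should not be compared against vectors from the primary model.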

§Cost tracking

Enable the cost-tracking feature to get estimated cost per request via tiktoken pricing data. The Usage::cost field will be Some(f64) for models with pricing info, None otherwise.

embedrs = { version = "0.2", features = ["cost-tracking"] }

§Error handling

All fallible operations return Result<T>. Match on Error variants for fine-grained control.
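
The Error variants themselves are not listed in this overview, so the following is only a sketch of the matching pattern using a stand-in enum. `DemoError`, its variants, `Action`, and `decide` are hypothetical and do not reflect the actual variants of embedrs::Error; consult the error module for the real ones.

```rust
use std::fmt;

/// Hypothetical error type for illustration; NOT the variants of embedrs::Error.
#[derive(Debug)]
enum DemoError {
    RateLimited { retry_after_secs: u64 },
    Auth(String),
    Other(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::RateLimited { retry_after_secs } => {
                write!(f, "rate limited, retry after {retry_after_secs}s")
            }
            DemoError::Auth(msg) => write!(f, "authentication failed: {msg}"),
            DemoError::Other(msg) => write!(f, "error: {msg}"),
        }
    }
}

impl std::error::Error for DemoError {}

/// What the caller decides to do after a failed request.
#[derive(Debug, PartialEq)]
enum Action {
    RetryAfter(u64),
    Abort,
    Report,
}

/// Map each error variant to a recovery strategy; `None` means success.
fn decide(result: &Result<Vec<f32>, DemoError>) -> Option<Action> {
    match result {
        Ok(_) => None,
        Err(DemoError::RateLimited { retry_after_secs }) => {
            Some(Action::RetryAfter(*retry_after_secs))
        }
        Err(DemoError::Auth(_)) => Some(Action::Abort),
        Err(DemoError::Other(_)) => Some(Action::Report),
    }
}
```

Matching exhaustively, without a catch-all arm, means the compiler flags any variant that is added later and left unhandled, provided the enum is not marked non-exhaustive.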

§Similarity

let a = vec![1.0, 0.0, 0.0];
let b = vec![0.0, 1.0, 0.0];
let sim = embedrs::cosine_similarity(&a, &b);
assert!(sim.abs() < 1e-6);
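
For reference, the three similarity primitives re-exported by the crate correspond to standard formulas. The sketch below writes them out as standalone functions; the names `dot`, `cosine`, and `euclidean`, the `f32` element type, and the choice to return 0.0 for zero-length vectors are assumptions for this example and may differ from the crate's own functions.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns 0.0 if either vector has zero norm (a convention for this sketch).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}

/// Euclidean (L2) distance between two equal-length vectors.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}
```

For vectors that are already normalized to unit length, cosine similarity equals the dot product, so the cheaper of the two can be used.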

§Re-exports

pub use backoff::BackoffConfig;
pub use client::Client;
pub use client::EmbedResult;
pub use error::Error;
pub use error::Result;
pub use similarity::cosine_similarity;
pub use similarity::dot_product;
pub use similarity::euclidean_distance;
pub use usage::Usage;

§Modules

backoff
Exponential backoff policy with jitter for retryable HTTP failures.
batch
Batch builder — chunk large inputs and dispatch with bounded concurrency.
client
Unified Client facade over cloud providers and local inference.
error
Crate-wide error type and Result alias.
local (requires the local feature)
Local on-device embedding inference (candle + bundled model registry).
prelude
Prelude for convenient imports.
similarity
Vector similarity primitives (cosine, dot product, Euclidean distance).
usage
Token usage and (optional) cost accounting for embedding requests.

§Enums

InputType
The type of input being embedded, used by providers that support it.

§Functions

cloud
Create a cloud embedding client with the recommended default provider (OpenAI text-embedding-3-small).
local (requires the local feature)
Create a local embedding client with the recommended default model (all-MiniLM-L6-v2).