Crate embedrs

§embedrs

Unified embedding solution: cloud APIs and local inference through one interface, with opinionated defaults backed by benchmark data.

§Design: if we build it, it must be great

local() → all-MiniLM-L6-v2 (23MB, free, no API key)
cloud(key) → OpenAI text-embedding-3-small (best discrimination, cheapest)
Both produce the same EmbedResult — write code once, switch backends in one line

The defaults were chosen using an 8-dimension benchmark across 8 models. See benchrs.

§Quick start

// cloud — one key, done
let client = embedrs::cloud("sk-...");
let result = client.embed(vec!["hello world".into()]).await?;
println!("dimensions: {}", result.embeddings[0].len());

With the local feature enabled:

// local — zero config, free, 23MB model downloaded on first use
let client = embedrs::local();
let result = client.embed(vec!["hello world".into()]).await?;

§Batch embedding

let client = embedrs::cloud("sk-...");
let texts: Vec<String> = (0..5000).map(|i| format!("text {i}")).collect();
let result = client.embed_batch(texts)
    .concurrency(5)
    .await?;
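
The batch module is described as chunking large inputs before dispatching them. As a rough sketch of that chunking step only (no networking or concurrency), the following std-only helper splits the inputs into request-sized groups. The name `chunk_inputs` and the limit of 2048 items per request are hypothetical and are not taken from the embedrs API.

```rust
/// Split `texts` into consecutive chunks of at most `max_per_request` items.
/// Hypothetical helper for illustration; not part of the embedrs API.
fn chunk_inputs(texts: &[String], max_per_request: usize) -> Vec<Vec<String>> {
    // `slice::chunks` panics on a zero chunk size, so fail early with a clear message.
    assert!(max_per_request > 0, "chunk size must be non-zero");
    texts
        .chunks(max_per_request)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With 5000 inputs and an assumed limit of 2048, this yields three chunks, the last one shorter than the others. Each chunk could then be sent as one request, with `concurrency(5)` bounding how many are in flight at once.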

§Provider fallback

Chain fallback providers for automatic failover:

let client = embedrs::Client::openai("sk-...")
    .with_fallback(embedrs::Client::cohere("cohere-key"));
// if OpenAI fails, automatically tries Cohere
let result = client.embed(vec!["hello".into()]).await?;
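
To make the failover behaviour concrete, here is a minimal, self-contained sketch of a fallback chain. It is not how embedrs implements `with_fallback`; the `Provider` trait, `embed_with_fallback`, and the two stand-in providers are all hypothetical names introduced for illustration.

```rust
/// Hypothetical provider abstraction; embedrs' real internals may differ.
trait Provider {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, String>;
}

/// Try each provider in order and return the first success.
/// If every provider fails, return the last error seen.
fn embed_with_fallback(
    providers: &[&dyn Provider],
    texts: &[String],
) -> Result<Vec<Vec<f32>>, String> {
    let mut last_err = String::from("no providers configured");
    for provider in providers {
        match provider.embed(texts) {
            Ok(embeddings) => return Ok(embeddings),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}

/// Stand-in provider that always fails, simulating an outage.
struct AlwaysFails;

impl Provider for AlwaysFails {
    fn embed(&self, _texts: &[String]) -> Result<Vec<Vec<f32>>, String> {
        Err("primary unavailable".to_string())
    }
}

/// Stand-in provider that returns a fixed 3-dimensional vector per input.
struct FixedVector;

impl Provider for FixedVector {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, String> {
        Ok(texts.iter().map(|_| vec![0.0, 1.0, 0.0]).collect())
    }
}
```

Passing `[&AlwaysFails, &FixedVector]` returns the second provider's vectors, which mirrors the "if OpenAI fails, try Cohere" flow above. Note that vectors from different models are generally not comparable, so a real fallback should avoid mixing them in one index.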

§Cost tracking

Enable the cost-tracking feature to get estimated cost per request via tiktoken pricing data. The Usage::cost field will be Some(f64) for models with pricing info, None otherwise.

embedrs = { version = "0.2", features = ["cost-tracking"] }
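
The `Some(f64)` / `None` behaviour can be pictured as a price lookup followed by a multiplication. The sketch below assumes a per-million-token price table; the function name, the model names, and the prices are placeholders for illustration, not the data embedrs actually uses.

```rust
/// Estimate request cost in USD from a token count.
/// Returns `None` when no price is known for the model, mirroring the
/// `Some(f64)` / `None` behaviour described for `Usage::cost`.
/// Hypothetical helper; the model names and prices are placeholders.
fn estimate_cost(model: &str, total_tokens: u64) -> Option<f64> {
    // Placeholder prices in USD per one million tokens.
    let price_per_million = match model {
        "example-small" => 0.02,
        "example-large" => 0.13,
        _ => return None,
    };
    Some(total_tokens as f64 / 1_000_000.0 * price_per_million)
}
```

Under these placeholder prices, 500,000 tokens on `example-small` would be estimated at about one cent, while an unlisted model yields `None`.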

§Error handling

All fallible operations return Result<T>. Match on Error variants for fine-grained control:

Error::Api — API returned an error status (e.g., 429 rate limit, 401 unauthorized)
Error::Timeout — request exceeded the configured timeout
Error::Http — network-level failure
Error::Json — response body could not be parsed
Error::InputTooLarge — input exceeded the provider’s batch size limit
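
As an illustration of matching on these variants, the sketch below decides whether a failed request is worth retrying. It uses a local stand-in enum so that it compiles on its own; the variant names follow the list above, but the fields are assumptions, so check the real `embedrs::Error` definition before adapting it.

```rust
/// Local stand-in that mirrors the variant names listed above.
/// The fields are assumptions, not the actual `embedrs::Error` layout.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    Api { status: u16, message: String },
    Timeout,
    Http(String),
    Json(String),
    InputTooLarge { len: usize, max: usize },
}

/// Decide whether retrying the same request could plausibly succeed.
fn is_retryable(err: &Error) -> bool {
    match err {
        // Rate limits and server-side errors are usually transient.
        Error::Api { status, .. } => *status == 429 || *status >= 500,
        // Timeouts and network failures may clear up on a later attempt.
        Error::Timeout | Error::Http(_) => true,
        // A malformed response or oversized input will fail again unchanged.
        Error::Json(_) | Error::InputTooLarge { .. } => false,
    }
}
```

Which errors count as retryable is a policy choice. Here a 401 is treated as permanent because retrying with the same key cannot help.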

§Similarity

let a = vec![1.0, 0.0, 0.0];
let b = vec![0.0, 1.0, 0.0];
let sim = embedrs::cosine_similarity(&a, &b);
assert!(sim.abs() < 1e-6);
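
For readers who want to see what the measure computes, cosine similarity is the dot product of two vectors divided by the product of their magnitudes. The following std-only sketch is one way to write it and is not necessarily how `embedrs::cosine_similarity` is implemented; in particular, returning 0.0 for mismatched lengths or zero vectors is an assumption of this sketch.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns 0.0 if the lengths differ or either vector has zero magnitude;
/// that convention is an assumption of this sketch.
fn cosine_similarity_sketch(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}
```

Orthogonal vectors such as the two above score approximately 0, identical directions score approximately 1, and opposite directions score approximately -1.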

Re-exports§

pub use backoff::BackoffConfig;
pub use client::Client;
pub use client::EmbedResult;
pub use error::Error;
pub use error::Result;
pub use similarity::cosine_similarity;
pub use similarity::dot_product;
pub use similarity::euclidean_distance;
pub use usage::Usage;

Modules§

backoff: Exponential backoff policy with jitter for retryable HTTP failures.
batch: Batch builder — chunk large inputs and dispatch with bounded concurrency.
client: Unified Client facade over cloud providers and local inference.
error: Crate-wide error type and Result alias.
local: Local on-device embedding inference (candle + bundled model registry).
prelude: Prelude for convenient imports.
similarity: Vector similarity primitives (cosine, dot product, Euclidean distance).
usage: Token usage and (optional) cost accounting for embedding requests.

Enums§

InputType: The type of input being embedded, used by providers that support it.

Functions§

cloud: Create a cloud embedding client with the recommended default provider (OpenAI text-embedding-3-small).
local: Create a local embedding client with the recommended default model (all-MiniLM-L6-v2).
