Crate seasoning

§Seasoning

Retrieval-focused embedding and reranking infrastructure with explicit model semantics, rate limiting, retries, and optional local llama.cpp execution.

Config-driven local setups accept the llama.cpp, llamacpp, llama-cpp, or llama_cpp dialect spellings when converting into Dialect::LlamaCpp.
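The spelling tolerance described above can be pictured as a small string-to-enum conversion. This is a minimal sketch, not the crate's implementation: `SketchDialect` is a hypothetical stand-in for `Dialect`, the `"deepinfra"` spelling is an assumption, and matching is exact (the source does not say whether the real conversion is case-insensitive).

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the crate's `Dialect`; only variants that
/// appear in this documentation are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchDialect {
    DeepInfra,
    LlamaCpp,
}

impl FromStr for SketchDialect {
    type Err = String;

    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw {
            // The four spellings listed in the crate documentation.
            "llama.cpp" | "llamacpp" | "llama-cpp" | "llama_cpp" => Ok(Self::LlamaCpp),
            // Assumed spelling, included only to give the sketch a second arm.
            "deepinfra" => Ok(Self::DeepInfra),
            other => Err(format!("unknown dialect: {other}")),
        }
    }
}
```

With this sketch, `"llama_cpp".parse::<SketchDialect>()` yields `Ok(SketchDialect::LlamaCpp)`, while an unlisted spelling such as `"llama cpp"` is rejected with an error.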

Seasoning separates backend/runtime selection from retrieval formatting: Dialect selects transport or local execution, ModelFamily selects retrieval-family formatting, and EmbeddingRole identifies whether a semantic embedding input is a query or document.

The embedding API stays semantic: callers pass text, an optional title, and a role, and the crate formats and prepares the final model payload internally, behind the API boundary.
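To illustrate the split between semantic input and formatted payload, here is a minimal sketch of family- and role-dependent formatting. All names (`SketchFamily`, `SketchRole`, `SketchInput`, `format_payload`, `DEFAULT_TASK`) are hypothetical, and the templates are assumptions for illustration, loosely following the instruction-prefixed query style documented for Qwen3 embedding models. The crate's real templates may differ.

```rust
/// Hypothetical stand-in for `ModelFamily`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchFamily {
    Qwen3,
    Plain,
}

/// Hypothetical stand-in for `EmbeddingRole`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchRole {
    Query,
    Document,
}

/// Semantic input: what the caller knows, with no model-specific markup.
struct SketchInput<'a> {
    role: SketchRole,
    text: &'a str,
    title: Option<&'a str>,
}

/// Assumed default task description; not taken from the crate.
const DEFAULT_TASK: &str = "Given a search query, retrieve relevant passages";

/// Builds the string that would be sent to the model (assumed templates).
fn format_payload(family: SketchFamily, instruction: Option<&str>, input: &SketchInput) -> String {
    match (family, input.role) {
        // Queries for an instruction-aware family get a task prefix.
        (SketchFamily::Qwen3, SketchRole::Query) => {
            let task = instruction.unwrap_or(DEFAULT_TASK);
            format!("Instruct: {task}\nQuery: {}", input.text)
        }
        // Documents keep their text; a title, if present, goes on its own line.
        (_, SketchRole::Document) => match input.title {
            Some(title) => format!("{title}\n{}", input.text),
            None => input.text.to_string(),
        },
        // Families without special query handling pass the text through.
        (SketchFamily::Plain, SketchRole::Query) => input.text.to_string(),
    }
}
```

The point of the design is that callers never build these strings themselves: switching the model family changes the payload without touching the code that constructs inputs.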

§Embeddings

use std::sync::Arc;
use std::time::Duration;

use secrecy::SecretString;
use seasoning::EmbeddingProvider;
use seasoning::embedding::{
    Client as EmbedClient, Dialect, EmbedderConfig, EmbeddingInput, EmbeddingRole,
    ModelFamily, RemoteEmbedderConfig, Tokenizer,
};

let embedder = EmbedClient::new(EmbedderConfig::remote(
    ModelFamily::Qwen3,
    Tokenizer::Tiktoken {
        encoding: "cl100k_base".to_string(),
        tokenizer: Arc::new(tiktoken_rs::cl100k_base().map_err(|e| {
            seasoning::Error::InvalidConfiguration {
                message: e.to_string(),
            }
        })?),
    },
    "Qwen/Qwen3-Embedding-0.6B",
    None,
    RemoteEmbedderConfig {
        api_key: Some(SecretString::from("YOUR_API_KEY")),
        base_url: "https://api.deepinfra.com/v1/openai".to_string(),
        timeout: Duration::from_secs(10),
        dialect: Dialect::DeepInfra,
        embedding_dim: 1024,
        requests_per_minute: 1000,
        max_concurrent_requests: 50,
        tokens_per_minute: 1_000_000,
    },
)?)?;

let inputs = vec![EmbeddingInput {
    role: EmbeddingRole::Query,
    text: "memory-safe systems programming".to_string(),
    title: None,
    token_count: 4,
}];

let _ = embedder.embed(&inputs).await?;
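The `requests_per_minute` and `tokens_per_minute` settings, together with the per-input `token_count`, suggest that requests are admitted against both a request budget and a token budget. The following is a minimal fixed-window sketch of that idea. `MinuteBudget` and `try_acquire` are hypothetical names, and the crate's actual limiter may use a different algorithm (for example a sliding window or token bucket).

```rust
use std::time::{Duration, Instant};

/// Hypothetical fixed-window budget over requests and tokens.
struct MinuteBudget {
    requests_per_minute: u32,
    tokens_per_minute: u64,
    window_start: Instant,
    requests_used: u32,
    tokens_used: u64,
}

impl MinuteBudget {
    fn new(requests_per_minute: u32, tokens_per_minute: u64, now: Instant) -> Self {
        Self {
            requests_per_minute,
            tokens_per_minute,
            window_start: now,
            requests_used: 0,
            tokens_used: 0,
        }
    }

    /// Returns true and records usage if a request of `tokens` tokens fits
    /// in the current one-minute window; returns false otherwise.
    fn try_acquire(&mut self, tokens: u64, now: Instant) -> bool {
        // Start a fresh window once a full minute has elapsed.
        if now.duration_since(self.window_start) >= Duration::from_secs(60) {
            self.window_start = now;
            self.requests_used = 0;
            self.tokens_used = 0;
        }
        let fits = self.requests_used < self.requests_per_minute
            && self.tokens_used + tokens <= self.tokens_per_minute;
        if fits {
            self.requests_used += 1;
            self.tokens_used += tokens;
        }
        fits
    }
}
```

Passing `now` explicitly keeps the sketch deterministic. A caller that receives `false` would typically wait until the window rolls over before retrying.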

§Reranking

use std::time::Duration;

use secrecy::SecretString;
use seasoning::RerankingProvider;
use seasoning::embedding::{Dialect, ModelFamily};
use seasoning::reranker::{Client as RerankerClient, RerankerConfig};

let reranker = RerankerClient::new(RerankerConfig {
    api_key: Some(SecretString::from("YOUR_API_KEY")),
    base_url: "https://api.deepinfra.com/v1".to_string(),
    timeout: Duration::from_secs(10),
    dialect: Dialect::DeepInfra,
    model_family: ModelFamily::Qwen3,
    model: "Qwen/Qwen3-Reranker-0.6B".to_string(),
    instruction: None,
    requests_per_minute: 1000,
    max_concurrent_requests: 50,
    tokens_per_minute: 1_000_000,
})?;

let query = seasoning::RerankQuery {
    text: "memory-safe systems programming".to_string(),
    token_count: 4,
};
let documents = vec![seasoning::RerankDocument {
    text: "Rust uses ownership and borrowing".to_string(),
    token_count: 6,
}];

let scores = reranker.rerank(&query, &documents).await?;
assert_eq!(scores.len(), documents.len());
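Since the example above asserts one score per document, a common next step is to order documents by relevance. The sketch below assumes scores are `f32` values aligned with the document order and that higher means more relevant. Neither the score type nor its direction is confirmed by this page, and `rank_by_score` is a hypothetical helper, not part of the crate.

```rust
/// Pairs each document with its score and sorts by descending score.
/// Assumes `documents` and `scores` are aligned index by index.
fn rank_by_score<'a>(documents: &'a [String], scores: &[f32]) -> Vec<(&'a str, f32)> {
    let mut ranked: Vec<(&'a str, f32)> = documents
        .iter()
        .map(String::as_str)
        .zip(scores.iter().copied())
        .collect();
    // `total_cmp` gives a total order over floats, so NaN cannot panic the sort.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

Because `zip` stops at the shorter input, a length mismatch silently drops items; keeping the `assert_eq!` on lengths before ranking guards against that.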

Modules§

batching
embedding
Text embedding generation with rate limiting and retrieval-aware formatting.
reranker
Document reranking based on query relevance.
service

Structs§

AppConfig
Top-level application config for embedding and reranking clients.
BatchItem
One semantic embedding item for crate::service::EmbedderService.
EmbedOutput
Output from an embedding request.
Embedding
Embedding client configuration.
EmbeddingInput
Input for a single embedding request.
RerankDocument
RerankQuery
Reranker
Reranker client configuration.

Enums§

AddDecision
Batching strategy response for a newly added item.
Dialect
Backend/runtime dialect for embedding and reranking execution.
EmbeddingRole
Retrieval role for an embedding input.
Error
Crate-wide error type.
ModelFamily
Retrieval-family semantics used to format embedding and reranking inputs.
Tokenizer
Preloaded tokenizer instances used by the embedding model layer.

Traits§

BatchingStrategy
Strategy interface for token-aware batch assembly.
EmbeddingProvider
Trait for embedding providers.
RerankingProvider
Trait for reranking providers.

Type Aliases§

ProviderDialect
Backwards-compatible alias for the previous public name.
Result
Crate-wide result alias.