

Trait EmbeddingProvider 

pub trait EmbeddingProvider:
    Send
    + Sync
    + Clone {
    // Required methods
    fn name(&self) -> &'static str;
    fn dimension(&self) -> usize;
    fn generate(
        &self,
        text: &str,
    ) -> impl Future<Output = Result<Vec<f32>, AppError>> + Send;

    // Provided methods
    fn max_batch_size(&self) -> usize { ... }
    fn generate_batch(
        &self,
        texts: &[String],
    ) -> impl Future<Output = Result<Vec<Vec<f32>>, AppError>> + Send { ... }
}

Provider for generating text embeddings.

Implementations convert text into vector representations for semantic search. Different providers may produce vectors of different dimensions:

  • Gemini text-embedding-004: 768 dimensions
  • OpenAI text-embedding-3-small: 1536 dimensions
  • OpenAI text-embedding-3-large: 3072 dimensions
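A minimal sketch of an implementor may help make the contract concrete. It restates the trait signature from this page so the block compiles on its own (Rust 1.75 or later). Several parts are assumptions made for illustration and are not taken from the crate: the `AppError` stand-in, the default method bodies (written to match the behavior documented here), the `MockProvider` type, and the tiny `block_on` helper.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the crate's `AppError`.
#[derive(Debug, PartialEq)]
pub struct AppError(pub String);

/// Trait signature as documented on this page. The default bodies are
/// assumptions that follow the documented behavior.
pub trait EmbeddingProvider: Send + Sync + Clone {
    fn name(&self) -> &'static str;
    fn dimension(&self) -> usize;
    fn generate(&self, text: &str) -> impl Future<Output = Result<Vec<f32>, AppError>> + Send;

    fn max_batch_size(&self) -> usize {
        1
    }

    fn generate_batch(
        &self,
        texts: &[String],
    ) -> impl Future<Output = Result<Vec<Vec<f32>>, AppError>> + Send {
        async move {
            // Sequential fallback: one `generate` call per input text.
            let mut out = Vec::with_capacity(texts.len());
            for text in texts {
                out.push(self.generate(text).await?);
            }
            Ok(out)
        }
    }
}

/// Hypothetical provider that fills a fixed-size vector from the input bytes.
#[derive(Clone)]
pub struct MockProvider {
    pub dim: usize,
}

impl EmbeddingProvider for MockProvider {
    fn name(&self) -> &'static str {
        "mock"
    }

    fn dimension(&self) -> usize {
        self.dim
    }

    fn generate(&self, text: &str) -> impl Future<Output = Result<Vec<f32>, AppError>> + Send {
        async move {
            if text.is_empty() {
                return Err(AppError("empty input".to_string()));
            }
            // Always produce exactly `dim` values, cycling over the input bytes.
            let mut v = vec![0.0_f32; self.dim];
            for (slot, byte) in v.iter_mut().zip(text.bytes().cycle()) {
                *slot = f32::from(byte) / 255.0;
            }
            Ok(v)
        }
    }
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor for demos only. Use a real async runtime
/// in production code.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

With these definitions, `block_on(MockProvider { dim: 8 }.generate("hello"))` yields a vector of length 8. Because `MockProvider` overrides neither provided method, `max_batch_size()` returns 1 and `generate_batch` calls `generate` once per text.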

Required Methods


fn name(&self) -> &'static str

Returns the provider identifier for logging and configuration.

Examples
  • "gemini" for Google Gemini
  • "openai" for OpenAI

fn dimension(&self) -> usize

Returns the embedding dimension this provider generates.

This value must match the database column dimension for vector storage. Mismatched dimensions will cause insertion failures.


fn generate(&self, text: &str) -> impl Future<Output = Result<Vec<f32>, AppError>> + Send

Generates an embedding vector for the given text.

Arguments
  • text - The text to embed
Returns

A vector of floating-point values representing the text embedding. The vector length must equal self.dimension().
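Callers that store embeddings may want to verify this length contract before inserting. The helper below is a hypothetical sketch, not part of the crate: `check_dimension` and its `String` error type are assumptions chosen to keep the example self-contained.

```rust
/// Hypothetical helper: confirms that an embedding has the expected length
/// (for example, the value returned by `dimension()`).
fn check_dimension(expected: usize, embedding: &[f32]) -> Result<(), String> {
    if embedding.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "expected {expected} dimensions, got {}",
            embedding.len()
        ))
    }
}
```

Running a check like this before writing to the vector column reports a mismatch as a clear error instead of a database insertion failure.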

Provided Methods


fn max_batch_size(&self) -> usize

Maximum number of texts supported per batch API call.

The harvest pipeline uses min(config.embedding_batch_size, max_batch_size()) to ensure batches never exceed provider limits.
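The capping rule can be sketched with plain slices. The function names below are hypothetical, and clamping the result to at least 1 is an added assumption (so that `chunks` never receives zero); the harvest pipeline's actual code is not shown on this page.

```rust
/// Configured batch size capped by the provider limit.
/// Clamped to at least 1 (an assumption) because `chunks(0)` panics.
fn effective_batch_size(configured: usize, provider_max: usize) -> usize {
    configured.min(provider_max).max(1)
}

/// Splits the inputs into batches that never exceed the effective size.
fn split_into_batches(
    texts: &[String],
    configured: usize,
    provider_max: usize,
) -> Vec<&[String]> {
    texts
        .chunks(effective_batch_size(configured, provider_max))
        .collect()
}
```

For example, five texts with a configured size of 100 and a provider limit of 2 are split into batches of 2, 2, and 1.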

Defaults

Returns 1 (single-item batches). Providers with native batch support should override this method to enable efficient batching.


fn generate_batch(&self, texts: &[String]) -> impl Future<Output = Result<Vec<Vec<f32>>, AppError>> + Send

Generates embeddings for multiple texts in a batch.

The default implementation calls generate() sequentially, once per text. Providers with a native batch API should override this method for efficiency.

Arguments
  • texts - Slice of texts to embed
Returns

A vector of embedding vectors, one per input text.

Dyn Compatibility

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.
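Given the signature above, both the `Clone` supertrait (which requires `Sized`) and the methods returning `impl Future` rule out `dyn EmbeddingProvider`. The sketch below shows two common workarounds, generics and an enum wrapper. It uses a pared-down stand-in trait, and every type name in it (`Provider`, `Gemini`, `OpenAi`, `AnyProvider`) is hypothetical.

```rust
/// Simplified stand-in for `EmbeddingProvider`: the `Clone` supertrait alone
/// is enough to make the trait unusable as `dyn Provider`.
trait Provider: Clone {
    fn name(&self) -> &'static str;
}

#[derive(Clone)]
struct Gemini;

#[derive(Clone)]
struct OpenAi;

impl Provider for Gemini {
    fn name(&self) -> &'static str {
        "gemini"
    }
}

impl Provider for OpenAi {
    fn name(&self) -> &'static str {
        "openai"
    }
}

// let boxed: Box<dyn Provider> = Box::new(Gemini); // rejected: not dyn compatible

/// Option 1: static dispatch through a generic parameter.
fn provider_label<P: Provider>(provider: &P) -> String {
    format!("provider={}", provider.name())
}

/// Option 2: an enum wrapper when the provider is chosen at runtime.
#[derive(Clone)]
enum AnyProvider {
    Gemini(Gemini),
    OpenAi(OpenAi),
}

impl Provider for AnyProvider {
    fn name(&self) -> &'static str {
        match self {
            AnyProvider::Gemini(p) => p.name(),
            AnyProvider::OpenAi(p) => p.name(),
        }
    }
}
```

Generics keep dispatch static and add no runtime cost. The enum suits cases where configuration selects the provider at startup.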

Implementors