Module embedding

Text embeddings.

§Embedding Module

This module provides types and traits for working with text embeddings.

§What are Embeddings?

Embeddings are dense vector representations of text that capture semantic meaning. An embedding model maps human-readable text to numerical vectors that machine learning models can process effectively. Texts with similar meaning produce vectors that lie close together (for example, as measured by cosine similarity), which makes embeddings useful for:

Semantic search: Finding relevant documents based on meaning rather than exact keywords
Text similarity: Measuring how similar two pieces of text are
Classification: Categorizing text based on content
Clustering: Grouping similar texts together
Recommendation systems: Finding related content
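Most of the uses above reduce to comparing vectors. The following is a minimal sketch, not part of this module: it assumes embeddings are available as slices of `f32`, and the helper names `cosine_similarity` and `most_similar` are hypothetical. It shows one common way to score similarity and pick the closest candidate.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns `None` when the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Index of the candidate most similar to `query`.
/// Candidates that cannot be compared (wrong length, zero norm) are skipped.
fn most_similar(query: &[f32], candidates: &[Vec<f32>]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(query, c).map(|s| (i, s)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

A linear scan like `most_similar` is fine for small collections; larger ones typically use an approximate nearest-neighbor index instead.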

§Embedding Models

An embedding model is a neural network that has been trained to convert text into meaningful vector representations. Different models have different characteristics:

Dimension: The length of the embedding vector (e.g., 768, 1536)
Domain: Some models are optimized for specific types of content
Performance: Trade-offs between speed, accuracy, and resource usage

Popular embedding models include:

OpenAI’s text-embedding-ada-002 (1536 dimensions)
Sentence Transformers like all-MiniLM-L6-v2 (384 dimensions)
Cohere’s embedding models
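The dimension directly affects resource usage. As a rough sketch, assuming each component is stored as a 32-bit float and ignoring any index or metadata overhead, raw storage can be estimated as below. The function name `embedding_storage_bytes` is hypothetical and not part of this module.

```rust
/// Raw bytes needed to store `count` embeddings of `dim` 32-bit floats each.
/// This ignores index structures, metadata, and allocator overhead.
fn embedding_storage_bytes(dim: u64, count: u64) -> u64 {
    dim * std::mem::size_of::<f32>() as u64 * count
}
```

Under these assumptions a single 1536-dimensional vector takes 6,144 bytes and a 384-dimensional one takes 1,536 bytes, so one million 1536-dimensional vectors need about 6.1 GB of raw storage.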

§Usage

This module provides the EmbeddingModel trait that abstracts over different embedding implementations, allowing you to switch between providers while maintaining the same interface.

use ai_types::EmbeddingModel;

async fn example<T: EmbeddingModel>(model: &T) -> ai_types::Result<()> {
    // Get the embedding dimension
    let dim = model.dim();
    println!("Model produces {}-dimensional embeddings", dim);

    // Convert text to embedding
    let embedding = model.embed("Hello, world!").await?;
    assert_eq!(embedding.len(), dim);

    Ok(())
}

Traits§

EmbeddingModel: Converts text to vector representations.

Type Aliases§

Embedding: A type alias for an embedding vector of 32-bit floats.