Module embedding

§Embedding Module

This module provides functionality for interacting with the OpenAI Embeddings API. It allows you to convert text into numerical vector representations (embeddings) that capture semantic meaning, enabling various NLP tasks such as semantic search, clustering, and similarity comparison.

§Key Features

  • Text Embedding Generation: Convert single or multiple texts into vector embeddings
  • Multiple Input Formats: Support for single text strings or arrays of texts
  • Flexible Encoding: Support for both float and base64 encoding formats
  • Various Model Support: Compatible with OpenAI’s embedding models (e.g., text-embedding-3-small, text-embedding-3-large)
  • Multi-dimensional Output: Support for 1D, 2D, and 3D embedding vectors

§Quick Start

use openai_tools::embedding::request::Embedding;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the embedding client
    let mut embedding = Embedding::new()?;
     
    // Configure the model and input text
    embedding
        .model("text-embedding-3-small")
        .input_text("Hello, world!");
     
    // Generate embedding
    let response = embedding.embed().await?;
     
    // Access the embedding vector
    let vector = response.data[0].embedding.as_1d().unwrap();
    println!("Embedding dimension: {}", vector.len());
    Ok(())
}

§Usage Examples

§Single Text Embedding

use openai_tools::embedding::request::Embedding;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut embedding = Embedding::new()?;
     
    embedding
        .model("text-embedding-3-small")
        .input_text("The quick brown fox jumps over the lazy dog.");
     
    let response = embedding.embed().await?;
     
    // The response contains embedding data
    assert_eq!(response.object, "list");
    assert_eq!(response.data.len(), 1);
     
    let vector = response.data[0].embedding.as_1d().unwrap();
    println!("Generated embedding with {} dimensions", vector.len());
    Ok(())
}

§Batch Text Embedding

use openai_tools::embedding::request::Embedding;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut embedding = Embedding::new()?;
     
    // Embed multiple texts at once
    let texts = vec![
        "Hello, world!",
        "こんにちは、世界!",
        "Bonjour le monde!",
    ];
     
    embedding
        .model("text-embedding-3-small")
        .input_text_array(texts);
     
    let response = embedding.embed().await?;
     
    // Each input text gets its own embedding
    for (i, data) in response.data.iter().enumerate() {
        let vector = data.embedding.as_1d().unwrap();
        println!("Text {}: {} dimensions", i, vector.len());
    }
    Ok(())
}

§Using Different Encoding Formats

use openai_tools::embedding::request::Embedding;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut embedding = Embedding::new()?;
     
    embedding
        .model("text-embedding-3-small")
        .input_text("Sample text for embedding")
        .encoding_format("float"); // or "base64"
     
    let response = embedding.embed().await?;
    println!("Model used: {}", response.model);
    println!("Token usage: {:?}", response.usage);
    Ok(())
}
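§Comparing Embeddings

Once you have two embedding vectors, similarity comparison (mentioned above) is typically done with cosine similarity. The sketch below is not part of this crate's API; it is a minimal, self-contained helper operating on plain `f64` slices such as those returned by `embedding.as_1d()`, and the sample vectors are hypothetical stand-ins for real API output.

```rust
/// Cosine similarity between two equal-length vectors:
/// dot(a, b) / (|a| * |b|). Returns a value in [-1.0, 1.0].
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Hypothetical vectors standing in for `response.data[i].embedding.as_1d()` output.
    let a = vec![1.0, 0.0, 1.0];
    let b = vec![1.0, 0.0, 1.0];
    let c = vec![0.0, 1.0, 0.0];
    // Identical vectors score 1.0; orthogonal vectors score 0.0.
    println!("a vs b: {:.3}", cosine_similarity(&a, &b));
    println!("a vs c: {:.3}", cosine_similarity(&a, &c));
}
```

In a semantic-search setting you would embed a query and a set of documents with the same model, then rank documents by their cosine similarity to the query vector.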

§Supported Models

Model                  | Dimensions | Description
-----------------------|------------|----------------------------------------------
text-embedding-3-small | 1536       | Efficient model for most use cases
text-embedding-3-large | 3072       | Higher-quality embeddings for demanding tasks
text-embedding-ada-002 | 1536       | Legacy model (still supported)

§Response Structure

The embedding response contains:

  • object: Always “list” for embedding responses
  • data: Array of embedding objects, each containing:
    • object: Type identifier (“embedding”)
    • embedding: The vector representation (1D, 2D, or 3D)
    • index: Position in the input array
  • model: The model used for embedding
  • usage: Token usage information

Modules§

request
OpenAI Embeddings API Request Module
response
OpenAI Embeddings API Response Types