pub struct Embedder { /* private fields */ }
Text embedding generator using a configurable model (default: BGE-large-en-v1.5).
Automatically downloads the model from the HuggingFace Hub on first use. Detects GPU availability and uses CUDA/TensorRT when available.
§Example

```rust
use cqs::Embedder;
use cqs::embedder::ModelConfig;

let embedder = Embedder::new(ModelConfig::resolve(None, None))?;
let embedding = embedder.embed_query("parse configuration file")?;
println!("Embedding dimension: {}", embedding.len()); // e.g. 1024 for BGE-large-en-v1.5
```

Implementations§
impl Embedder

pub fn new(model_config: ModelConfig) -> Result<Self, EmbedderError>
Create a new embedder with lazy model loading.
When force_cpu is false, automatically detects GPU and uses CUDA/TensorRT
when available, falling back to CPU if no GPU is found.
When force_cpu is true, always uses CPU – use this for single-query
embedding where CPU is faster than GPU due to CUDA context setup overhead.
Note: model download and ONNX session creation are deferred to the first embedding request. This avoids HuggingFace API calls for commands that don't need embeddings.
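The lazy-loading behavior described above can be sketched with `std::sync::OnceLock` (a simplified model, not the crate's actual implementation; `Session` here is a stand-in for the real ONNX session type):

```rust
use std::sync::OnceLock;

// Stand-in for the ONNX session; the real type lives elsewhere.
struct Session {
    provider: &'static str,
}

struct LazyEmbedder {
    session: OnceLock<Session>,
}

impl LazyEmbedder {
    fn new() -> Self {
        // No download, no session creation here: construction stays cheap.
        LazyEmbedder { session: OnceLock::new() }
    }

    fn embed(&self, _text: &str) -> usize {
        // First call pays the model-load cost; later calls reuse the session.
        let session = self.session.get_or_init(|| Session { provider: "cpu" });
        session.provider.len() // dummy work standing in for inference
    }
}
```

Because `OnceLock` initializes at most once, repeated `embed` calls share a single session no matter how many threads race on the first request.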
pub fn new_cpu(model_config: ModelConfig) -> Result<Self, EmbedderError>
Create a CPU-only embedder with lazy model loading.
Convenience wrapper for new() — use this for single-query embedding
where CPU is faster than GPU due to CUDA context setup overhead.
pub fn model_config(&self) -> &ModelConfig
Get the model configuration
pub fn token_count(&self, text: &str) -> Result<usize, EmbedderError>
Counts the number of tokens in the given text using the configured tokenizer.
§Arguments
text - The text string to tokenize and count
§Returns
Returns Ok(usize) containing the number of tokens in the text, or Err(EmbedderError) if tokenization fails.
§Errors
Returns EmbedderError::Tokenizer if the tokenizer is unavailable or if encoding the text fails.
pub fn token_counts_batch(&self, texts: &[&str]) -> Result<Vec<usize>, EmbedderError>
Count tokens for multiple texts in a single batch.
Uses encode_batch for potentially better throughput than individual
token_count calls when processing many texts.
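The contract between the two calls can be illustrated with a toy whitespace tokenizer (hypothetical, not the crate's actual tokenizer): the batch result must match the per-text counts, just computed in one pass.

```rust
// Toy tokenizer: whitespace-separated words stand in for subword tokens.
fn token_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn token_counts_batch(texts: &[&str]) -> Vec<usize> {
    // A real tokenizer's encode_batch can amortize work across texts;
    // the observable contract is the same as calling token_count per text.
    texts.iter().map(|t| token_count(t)).collect()
}
```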
pub fn split_into_windows(&self, text: &str, max_tokens: usize, overlap: usize) -> Result<Vec<(String, u32)>, EmbedderError>
Split text into overlapping windows of max_tokens, with overlap tokens of shared context between adjacent windows. Returns a Vec of (window_content, window_index). If the text fits in max_tokens, returns a single window with index 0.
§Panics
Panics if overlap >= max_tokens / 2, since such a small stride inflates the window count.
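The windowing arithmetic can be sketched over token indices (a hypothetical helper, not the crate's code): each window advances by a stride of `max_tokens - overlap`, which is why a large overlap is rejected — the stride shrinks and the window count grows.

```rust
// Sketch: compute (start, end) token ranges for overlapping windows.
fn window_ranges(n_tokens: usize, max_tokens: usize, overlap: usize) -> Vec<(usize, usize)> {
    assert!(overlap < max_tokens / 2, "overlap must be < max_tokens / 2");
    if n_tokens <= max_tokens {
        return vec![(0, n_tokens)]; // single window, index 0
    }
    let stride = max_tokens - overlap;
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < n_tokens {
        let end = (start + max_tokens).min(n_tokens);
        ranges.push((start, end));
        if end == n_tokens {
            break; // final window reached the end of the text
        }
        start += stride;
    }
    ranges
}
```

For example, 100 tokens with max_tokens = 40 and overlap = 10 yields three windows with a stride of 30, each sharing 10 tokens with its neighbor.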
pub fn embed_documents(&self, texts: &[&str]) -> Result<Vec<Embedding>, EmbedderError>
Embed documents (code chunks). Adds model-specific document prefix.
Large inputs are processed in batches of 64 to cap GPU memory usage.
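The fixed-size batching maps naturally onto slice chunking; a minimal sketch (with a dummy per-batch step standing in for the real inference call):

```rust
const BATCH_SIZE: usize = 64;

// Dummy stand-in for one batched inference call.
fn embed_batch(texts: &[&str]) -> Vec<usize> {
    texts.iter().map(|t| t.len()).collect()
}

// Process inputs in fixed-size batches so peak memory is bounded by
// BATCH_SIZE, not by the total number of documents.
fn embed_documents(texts: &[&str]) -> Vec<usize> {
    texts
        .chunks(BATCH_SIZE)
        .flat_map(|batch| embed_batch(batch))
        .collect()
}
```

With 130 inputs this runs three batches (64 + 64 + 2) and still returns one result per input, in order.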
pub fn embed_query(&self, text: &str) -> Result<Embedding, EmbedderError>
pub fn provider(&self) -> ExecutionProvider
Get the execution provider being used
pub fn clear_session(&self)
Clear the ONNX session to free memory (~500MB).
The session will be lazily re-initialized on the next embedding request. Use this in long-running processes during idle periods to reduce memory footprint.
§Safety constraint
Must only be called during idle periods – not while embedding is in progress. Watch mode guarantees single-threaded access.
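The reclaim-and-reinit pattern can be sketched with a `Mutex<Option<_>>` (simplified; the real session type and locking strategy belong to the crate):

```rust
use std::sync::Mutex;

struct Session; // stand-in for the large ONNX session

struct Embedder {
    session: Mutex<Option<Session>>,
}

impl Embedder {
    fn new() -> Self {
        Embedder { session: Mutex::new(None) }
    }

    // Drop the session; its memory is freed when the Option is emptied.
    fn clear_session(&self) {
        self.session.lock().unwrap().take();
    }

    // Lazily re-create the session on the next embedding request.
    fn embed(&self, text: &str) -> usize {
        let mut guard = self.session.lock().unwrap();
        guard.get_or_insert_with(|| Session);
        text.len() // dummy work standing in for inference
    }
}
```

Because the lock serializes access, clearing during an idle period cannot race with an in-flight `embed` call in this sketch; the single-threaded guarantee of watch mode makes the same property hold in practice.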
pub fn warm(&self) -> Result<(), EmbedderError>
Warm up the model with a dummy inference
pub fn embedding_dim(&self) -> usize
Returns the embedding dimension detected from the model. Falls back to the model config’s declared dimension if no inference has been run yet.
Auto Trait Implementations§
impl !Freeze for Embedder
impl RefUnwindSafe for Embedder
impl Send for Embedder
impl Sync for Embedder
impl Unpin for Embedder
impl UnsafeUnpin for Embedder
impl UnwindSafe for Embedder