pub trait MultiVectorEmbedder: Send + Sync {
    // Required methods
    fn embed_tokens(&self, text: &str) -> Result<MultiVectorEmbedding>;
    fn token_dimension(&self) -> usize;
    fn max_tokens(&self) -> usize;
    fn model_id(&self) -> &str;

    // Provided method
    fn embed_tokens_batch(
        &self,
        texts: &[&str],
    ) -> Result<Vec<MultiVectorEmbedding>> { ... }
}
Trait for models that produce token-level embeddings.
Unlike single-vector embedders (which produce one embedding per text), multi-vector embedders produce one embedding per token, enabling fine-grained late interaction scoring.
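To make "late interaction scoring" concrete, here is a minimal sketch of one common form of it, the ColBERT-style MaxSim score. Whether this crate scores with exactly this formula is an assumption, and the helper names `dot` and `max_sim` are hypothetical. Plain `Vec<Vec<f32>>` values (one inner vector per token) stand in for `MultiVectorEmbedding`, whose internal layout is not shown on this page.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim late interaction score (illustrative, not the crate's API):
/// for each query token, take its best-matching document token,
/// then sum those best matches over all query tokens.
fn max_sim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    // An empty document has nothing to match against.
    if doc.is_empty() {
        return 0.0;
    }
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}
```

Because every query token is compared against every document token, a single strong token match can contribute to the score even when the texts differ overall, which a single pooled vector per text cannot express.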
§Example
use aprender_rag::multivector::{MultiVectorEmbedder, MockMultiVectorEmbedder};
let embedder = MockMultiVectorEmbedder::new(128, 512);
let embedding = embedder.embed_tokens("hello world").unwrap();
assert_eq!(embedding.num_tokens(), 2);
assert_eq!(embedding.dim(), 128);
Required Methods§
fn embed_tokens(&self, text: &str) -> Result<MultiVectorEmbedding>
fn token_dimension(&self) -> usize
Get the token embedding dimension.
fn max_tokens(&self) -> usize
Get the maximum tokens per document.
Provided Methods§
fn embed_tokens_batch(&self, texts: &[&str]) -> Result<Vec<MultiVectorEmbedding>>
Batch embed multiple texts.
The default implementation calls embed_tokens sequentially.
Implementations may override for more efficient batching.
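The sequential default described above can be sketched as follows. This is an illustration, not the crate's source: `SketchEmbedder`, `TokenVectors`, and `WhitespaceEmbedder` are hypothetical stand-ins, and `String` replaces the crate's error type, which is not shown on this page.

```rust
/// Hypothetical stand-in for `MultiVectorEmbedding`: one vector per token.
type TokenVectors = Vec<Vec<f32>>;

/// Simplified stand-in for the trait, keeping only the two embedding methods.
trait SketchEmbedder {
    fn embed_tokens(&self, text: &str) -> Result<TokenVectors, String>;

    /// Provided method: embed each text in order and stop at the first error.
    /// Collecting into `Result<Vec<_>, _>` gives the short-circuit behavior.
    fn embed_tokens_batch(&self, texts: &[&str]) -> Result<Vec<TokenVectors>, String> {
        texts.iter().map(|t| self.embed_tokens(t)).collect()
    }
}

/// Toy embedder: one zero vector per whitespace-separated token.
struct WhitespaceEmbedder {
    dim: usize,
}

impl SketchEmbedder for WhitespaceEmbedder {
    fn embed_tokens(&self, text: &str) -> Result<TokenVectors, String> {
        if text.trim().is_empty() {
            return Err("empty input".to_string());
        }
        Ok(text.split_whitespace().map(|_| vec![0.0; self.dim]).collect())
    }
}
```

An implementation backed by a real model would typically override `embed_tokens_batch` to pad the inputs and run them through a single forward pass, keeping the same signature.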
Dyn Compatibility§
This trait is dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".
Implementations on Foreign Types§
impl<E: MultiVectorEmbedder + ?Sized> MultiVectorEmbedder for Box<E>
Trait implementation for boxed embedders.
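As a rough sketch of why the `Box<E>` implementation and dyn compatibility matter together: a forwarding impl lets `Box<dyn Trait>` be passed to code that is generic over the trait. The `DemoEmbedder` trait, `Fixed` type, and `describe` function below are hypothetical and cover only two of the trait's methods; the crate's actual impl body is not shown on this page.

```rust
/// Simplified stand-in for the trait, with two of its accessor methods.
trait DemoEmbedder {
    fn token_dimension(&self) -> usize;
    fn model_id(&self) -> &str;
}

/// Forwarding impl: a boxed embedder delegates every call to its contents.
/// `?Sized` is what allows `E` to be `dyn DemoEmbedder`.
impl<E: DemoEmbedder + ?Sized> DemoEmbedder for Box<E> {
    fn token_dimension(&self) -> usize {
        (**self).token_dimension()
    }
    fn model_id(&self) -> &str {
        (**self).model_id()
    }
}

/// Toy embedder with a fixed dimension.
struct Fixed {
    dim: usize,
}

impl DemoEmbedder for Fixed {
    fn token_dimension(&self) -> usize {
        self.dim
    }
    fn model_id(&self) -> &str {
        "fixed-demo"
    }
}

/// Generic consumer: accepts concrete embedders and boxed trait objects alike.
fn describe<E: DemoEmbedder>(embedder: &E) -> String {
    format!("{} ({} dims)", embedder.model_id(), embedder.token_dimension())
}
```

With this in place, a value of type `Box<dyn DemoEmbedder>` chosen at runtime can be handed to `describe` without the caller knowing the concrete type.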