Module matryoshka


Matryoshka Embedding Support

Implements support for Matryoshka Representation Learning (MRL) embeddings, which let a single embedding model produce representations at multiple dimensionalities. The key insight is that an MRL-trained model concentrates the most important information in the leading components, so the first d dimensions of a full embedding form a usable d-dimensional embedding on their own.
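
To make the prefix property concrete, here is a minimal sketch of truncation followed by L2 re-normalization. The function name `truncate_and_normalize` is hypothetical and is not taken from this module's API. Re-normalizing after truncation is an assumption that suits cosine-similarity search; this page does not state whether the module does so.

```rust
/// Hypothetical helper (not part of this module's API): keep the first
/// `dim` components of `embedding` and rescale them to unit L2 norm.
/// Returns `None` when `dim` is zero or exceeds the original length.
fn truncate_and_normalize(embedding: &[f32], dim: usize) -> Option<Vec<f32>> {
    if dim == 0 || dim > embedding.len() {
        return None;
    }
    let prefix = &embedding[..dim];
    // L2 norm of the retained prefix only.
    let norm = prefix.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        // An all-zero prefix cannot be normalized; return it unchanged.
        return Some(prefix.to_vec());
    }
    Some(prefix.iter().map(|x| x / norm).collect())
}
```

For example, truncating `[3.0, 4.0, 12.0]` to two dimensions keeps `[3.0, 4.0]`, whose norm is 5, so the components are rescaled by 1/5.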

References:

  • Kusupati et al. (2022): “Matryoshka Representation Learning”
  • Li et al. (2024): “2D Matryoshka Sentence Embeddings”

Features:

  • Truncation to any dimension up to the original dimensionality
  • Adaptive dimension selection based on accuracy/speed tradeoff
  • Cascaded search: coarse filtering at low dim, refinement at high dim
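
The cascaded search idea can be sketched as a two-stage brute-force ranking: score every document on a short prefix, keep a shortlist, then re-score only the shortlist at full dimensionality. Everything below (`prefix_dot`, `prefix_cosine`, `cascaded_search`, and the `coarse_dim`, `shortlist`, and `k` parameters) is hypothetical and illustrates the technique only. It is not the `MatryoshkaIndex` implementation, which may use a different similarity measure or index structure.

```rust
/// Dot product over the first `dim` components of two vectors.
fn prefix_dot(a: &[f32], b: &[f32], dim: usize) -> f32 {
    a.iter().zip(b.iter()).take(dim).map(|(x, y)| x * y).sum()
}

/// Cosine similarity restricted to the first `dim` components.
/// Returns 0.0 when either prefix has zero norm.
fn prefix_cosine(a: &[f32], b: &[f32], dim: usize) -> f32 {
    let denom = (prefix_dot(a, a, dim) * prefix_dot(b, b, dim)).sqrt();
    if denom == 0.0 {
        0.0
    } else {
        prefix_dot(a, b, dim) / denom
    }
}

/// Hypothetical two-stage search returning the indices of the top `k` documents.
fn cascaded_search(
    query: &[f32],
    docs: &[Vec<f32>],
    coarse_dim: usize,
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    // Stage 1: rank every document using only the first `coarse_dim` components.
    let mut coarse: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, prefix_cosine(query, d, coarse_dim)))
        .collect();
    coarse.sort_by(|a, b| b.1.total_cmp(&a.1));
    coarse.truncate(shortlist);

    // Stage 2: re-score only the shortlist at the query's full dimensionality.
    let full_dim = query.len();
    let mut refined: Vec<(usize, f32)> = coarse
        .into_iter()
        .map(|(i, _)| (i, prefix_cosine(query, &docs[i], full_dim)))
        .collect();
    refined.sort_by(|a, b| b.1.total_cmp(&a.1));
    refined.into_iter().take(k).map(|(i, _)| i).collect()
}
```

The shortlist size governs the accuracy/speed tradeoff: a document that scores poorly on the prefix is dropped in stage 1 even if its full-dimension similarity would have been high, so a larger shortlist recovers more of the exact ranking at the cost of more full-dimension comparisons.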

Structs§

AdaptiveDimensionSelector
Adaptive dimension selector based on query characteristics.
MatryoshkaConfig
Configuration for Matryoshka embedding handling.
MatryoshkaEmbedding
A Matryoshka embedding that can be truncated to different dimensions.
MatryoshkaIndex
Index that supports Matryoshka embeddings with cascaded search.
MatryoshkaStats
Statistics for Matryoshka index operations.