pub struct SparseVectorConfig {
    pub index_size: IndexSize,
    pub weight_quantization: WeightQuantization,
    pub weight_threshold: f32,
    pub block_size: usize,
    pub posting_list_pruning: Option<f32>,
    pub query_config: Option<SparseQueryConfig>,
}
Configuration for sparse vector storage
Research-validated optimizations for learned sparse retrieval (SPLADE, uniCOIL, etc.):
- Weight threshold (0.01-0.05): Removes ~30-50% of postings with minimal nDCG impact
- Posting list pruning (0.1): Keeps top 10% per dimension, 50-70% index reduction, <1% nDCG loss
- Query pruning (top 10-20 dims): 30-50% latency reduction, <2% nDCG loss
- UInt8 quantization: 4x compression, 1-2% nDCG loss (optimal trade-off)
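The query pruning item above can be illustrated with a minimal sketch. It assumes a sparse query is a list of (dimension, weight) pairs; `prune_query` is a hypothetical helper written for illustration and is not part of this crate's API.

```rust
/// Hypothetical helper: keep only the `k` highest-weighted
/// (dimension, weight) pairs of a sparse query.
fn prune_query(mut query: Vec<(u32, f32)>, k: usize) -> Vec<(u32, f32)> {
    // Sort by weight, descending. `total_cmp` gives a total order on f32,
    // so the sort cannot panic on NaN.
    query.sort_by(|a, b| b.1.total_cmp(&a.1));
    // Drop everything after the first `k` entries (no-op if len <= k).
    query.truncate(k);
    query
}
```

Fewer query dimensions means fewer posting lists to traverse, which is where the latency reduction would come from.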
Fields
index_size: IndexSize
Size of dimension/term indices
weight_quantization: WeightQuantization
Quantization for weights (see WeightQuantization docs for trade-offs)
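As a rough illustration of where the 4x figure for UInt8 comes from (one byte per weight instead of four), here is a sketch of linear quantization with a per-list scale. The scheme and the function names `quantize_u8` and `dequantize_u8` are assumptions for illustration; this page does not document how the crate actually quantizes weights.

```rust
/// Hypothetical sketch: map non-negative f32 weights to u8 so that the
/// largest weight in the list maps to 255. Returns the codes and the scale.
fn quantize_u8(weights: &[f32]) -> (Vec<u8>, f32) {
    let max = weights.iter().copied().fold(0.0_f32, f32::max);
    // Avoid dividing by zero for an empty or all-zero list.
    let scale = if max > 0.0 { max / 255.0 } else { 1.0 };
    let codes: Vec<u8> = weights
        .iter()
        .map(|&w| (w / scale).round().clamp(0.0, 255.0) as u8)
        .collect();
    (codes, scale)
}

/// Recover approximate weights from the u8 codes.
fn dequantize_u8(codes: &[u8], scale: f32) -> Vec<f32> {
    codes.iter().map(|&c| c as f32 * scale).collect()
}
```

With this scheme the round-trip error per weight is bounded by roughly half the scale, which is one way to think about the small effectiveness loss quoted above.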
weight_threshold: f32
Minimum weight threshold: weights below this value are not indexed
Research recommendation (Guo et al., 2022; SPLADE v2):
- 0.01-0.05 for SPLADE models removes ~30-50% of postings
- Minimal impact on nDCG@10 (<1% loss)
- Major reduction in index size and query latency
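A minimal sketch of thresholding at index time, assuming a document's sparse vector is a slice of (dimension, weight) pairs. `apply_weight_threshold` is a hypothetical name used for illustration only.

```rust
/// Hypothetical helper: drop every entry whose weight is below `threshold`.
/// Entries with weight equal to the threshold are kept.
fn apply_weight_threshold(entries: &[(u32, f32)], threshold: f32) -> Vec<(u32, f32)> {
    entries
        .iter()
        .copied()
        .filter(|&(_, weight)| weight >= threshold)
        .collect()
}
```

Learned sparse models tend to emit many near-zero weights, so even a small threshold can remove a large share of postings.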
block_size: usize
Block size for posting lists (must be a power of 2; default 128 for SIMD). Larger blocks give better compression; smaller blocks give faster seeks.
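The power-of-2 requirement can be checked with the standard library. The sketch below is illustrative only: `validate_block_size` and `block_count` are hypothetical helpers, and whether the crate validates the value itself is not stated here.

```rust
/// Hypothetical check: accept only power-of-two block sizes.
/// Note that `0usize.is_power_of_two()` is false, so zero is rejected too.
fn validate_block_size(size: usize) -> Result<usize, String> {
    if size.is_power_of_two() {
        Ok(size)
    } else {
        Err(format!("block size {size} is not a power of 2"))
    }
}

/// Number of blocks needed to hold `len` postings (ceiling division).
/// Assumes `block_size` is non-zero.
fn block_count(len: usize, block_size: usize) -> usize {
    (len + block_size - 1) / block_size
}
```

For example, a list of 1000 postings needs 8 blocks at size 128 but 16 blocks at size 64, which shows the seek-granularity versus per-block-overhead trade-off.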
posting_list_pruning: Option<f32>
Static pruning: fraction of postings to keep per inverted list (SEISMIC-style). Lists are sorted by weight descending and truncated to the top fraction.
Research recommendation (SPLADE v2, Formal et al., 2021):
- None = keep all postings (default, exact)
- Some(0.1) = keep top 10% of postings per dimension
- 50-70% index size reduction
- <1% nDCG@10 loss
- Exploits “concentration of importance” in learned representations
Applied only during initial segment build, not during merge.
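The sort-and-truncate step described above might look like the following sketch. It assumes a posting list is a vector of (doc_id, weight) pairs and rounds the kept count up, so a non-empty list keeps at least one posting for any positive fraction; both the helper name `prune_posting_list` and the rounding rule are assumptions, not documented behavior.

```rust
/// Hypothetical sketch of static posting list pruning.
/// `None` keeps every posting; `Some(f)` keeps the top fraction `f` by weight.
fn prune_posting_list(
    mut postings: Vec<(u32, f32)>,
    pruning: Option<f32>,
) -> Vec<(u32, f32)> {
    let fraction = match pruning {
        Some(f) => f,
        None => return postings, // exact mode: nothing is dropped
    };
    // Sort by weight, descending.
    postings.sort_by(|a, b| b.1.total_cmp(&a.1));
    // Assumed rounding rule: round the kept count up.
    let keep = (postings.len() as f32 * fraction).ceil() as usize;
    postings.truncate(keep.min(postings.len()));
    postings
}
```

A real index would likely restore document-id order after truncation so that lists can be intersected efficiently; that step is omitted here.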
query_config: Option<SparseQueryConfig>
Query-time configuration (tokenizer, weighting)
Implementations
impl SparseVectorConfig
pub fn splade() -> Self
SPLADE-optimized config with research-validated defaults
Optimized for SPLADE, uniCOIL, and similar learned sparse retrieval models. Based on research findings from:
- Pati (2025): UInt8 quantization = 4x compression, 1-2% nDCG loss
- Formal et al. (2021): SPLADE v2 posting list pruning
- Qiao et al. (2023): Query dimension pruning and approximate search
- Guo et al. (2022): Weight thresholding for efficiency
Expected performance vs. full precision baseline:
- Index size: ~15-25% of original (combined effect of all optimizations)
- Query latency: 40-60% faster
- Effectiveness: 2-4% nDCG@10 loss (typically acceptable for production)
Vocabulary: ~30K dimensions (fits in u16)
pub fn compact() -> Self
Compact config: Maximum compression (experimental)
Uses aggressive UInt4 quantization for smallest possible index size. Expected trade-offs:
- Index size: ~10-15% of Float32 baseline
- Effectiveness: ~3-5% nDCG@10 loss
Recommended for: Memory-constrained environments, cache-heavy workloads
pub fn full_precision() -> Self
Full precision config: No compression, baseline effectiveness
Use for: Research baselines, when effectiveness is critical
pub fn conservative() -> Self
Conservative config: Mild optimizations, minimal effectiveness loss
Balances compression and effectiveness with conservative defaults. Expected trade-offs:
- Index size: ~40-50% of Float32 baseline
- Query latency: ~20-30% faster
- Effectiveness: <1% nDCG@10 loss
Recommended for: Production deployments prioritizing effectiveness
pub fn with_weight_threshold(self, threshold: f32) -> Self
Set weight threshold (builder pattern)
pub fn with_pruning(self, fraction: f32) -> Self
Set posting list pruning fraction (builder pattern), e.g., 0.1 = keep top 10% of postings per dimension
pub fn bytes_per_entry(&self) -> f32
Bytes per entry (index + weight)
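The `f32` return type suggests that an entry can occupy a fractional number of bytes, as it would with 4-bit weights. The sketch below shows that arithmetic with stand-in enums; `IndexWidth` and `WeightWidth` and their variants are hypothetical and do not mirror the real `IndexSize` and `WeightQuantization` definitions, which are not shown on this page.

```rust
/// Stand-in for the index width (hypothetical, for illustration only).
#[derive(Clone, Copy)]
enum IndexWidth {
    U16,
    U32,
}

/// Stand-in for the weight encoding (hypothetical, for illustration only).
#[derive(Clone, Copy)]
enum WeightWidth {
    Float32,
    UInt8,
    UInt4,
}

/// Bytes used by one posting entry: index bytes plus weight bytes.
fn bytes_per_entry(index: IndexWidth, weight: WeightWidth) -> f32 {
    let index_bytes = match index {
        IndexWidth::U16 => 2.0,
        IndexWidth::U32 => 4.0,
    };
    let weight_bytes = match weight {
        WeightWidth::Float32 => 4.0,
        WeightWidth::UInt8 => 1.0,
        WeightWidth::UInt4 => 0.5, // two weights packed per byte
    };
    index_bytes + weight_bytes
}
```

Under these assumptions, a 16-bit index with 8-bit weights costs 3 bytes per entry versus 8 bytes for a 32-bit index with 32-bit float weights.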
pub fn from_byte(b: u8) -> Option<Self>
Deserialize config from a single byte. Note: weight_threshold, block_size, and query_config are not serialized in the byte.
pub fn with_block_size(self, size: usize) -> Self
Set block size (builder pattern). Must be a power of 2; recommended: 64, 128, 256.
pub fn with_query_config(self, config: SparseQueryConfig) -> Self
Set query configuration (builder pattern)
Trait Implementations
impl Clone for SparseVectorConfig
fn clone(&self) -> SparseVectorConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for SparseVectorConfig
impl Default for SparseVectorConfig
impl<'de> Deserialize<'de> for SparseVectorConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl PartialEq for SparseVectorConfig
impl Serialize for SparseVectorConfig
impl StructuralPartialEq for SparseVectorConfig
Auto Trait Implementations
impl Freeze for SparseVectorConfig
impl RefUnwindSafe for SparseVectorConfig
impl Send for SparseVectorConfig
impl Sync for SparseVectorConfig
impl Unpin for SparseVectorConfig
impl UnwindSafe for SparseVectorConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.