pub struct SparseVectorConfig {
pub format: SparseFormat,
pub index_size: IndexSize,
pub weight_quantization: WeightQuantization,
pub weight_threshold: f32,
pub block_size: usize,
pub bmp_block_size: u32,
pub max_bmp_grid_bytes: u64,
pub bmp_superblock_size: u32,
pub pruning: Option<f32>,
pub query_config: Option<SparseQueryConfig>,
pub dims: Option<u32>,
pub max_weight: Option<f32>,
pub min_terms: usize,
}
Configuration for sparse vector storage
Research-validated optimizations for learned sparse retrieval (SPLADE, uniCOIL, etc.):
- Weight threshold (0.01-0.05): Removes ~30-50% of postings with minimal nDCG impact
- Posting list pruning (0.1): Keeps top 10% per dimension, 50-70% index reduction, <1% nDCG loss
- Query pruning (top 10-20 dims): 30-50% latency reduction, <2% nDCG loss
- UInt8 quantization: 4x compression, 1-2% nDCG loss (optimal trade-off)
Fields

format: SparseFormat
Index format: MaxScore (DAAT) or BMP (BAAT).

index_size: IndexSize
Size of dimension/term indices.

weight_quantization: WeightQuantization
Quantization for weights (see WeightQuantization docs for trade-offs).

weight_threshold: f32
Minimum weight threshold: weights below this value are not indexed.
Research recommendation (Guo et al., 2022; SPLADE v2):
- 0.01-0.05 for SPLADE models removes ~30-50% of postings
- Minimal impact on nDCG@10 (<1% loss)
- Major reduction in index size and query latency
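As a standalone sketch, indexing-time thresholding amounts to filtering each sparse vector's (dimension, weight) pairs. The helper below is hypothetical, not part of this crate; the crate applies the threshold internally when building a segment.

```rust
// Hypothetical sketch of indexing-time weight thresholding.
fn apply_weight_threshold(entries: &[(u32, f32)], threshold: f32) -> Vec<(u32, f32)> {
    entries
        .iter()
        .copied()
        // Entries with weight below the threshold are simply not indexed.
        .filter(|&(_dim, weight)| weight >= threshold)
        .collect()
}

fn main() {
    // A SPLADE-style sparse vector as (dimension, weight) pairs.
    let vector = [(17, 0.8), (42, 0.03), (99, 0.004), (512, 1.2)];
    let kept = apply_weight_threshold(&vector, 0.01);
    println!("{kept:?}"); // the 0.004 entry is dropped
}
```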
block_size: usize
Block size for posting lists (must be a power of 2; default 128 for SIMD). Larger blocks give better compression; smaller blocks give faster seeks. Used by the MaxScore format only.
bmp_block_size: u32
BMP block size: number of consecutive doc_ids per block (must be a power of 2). Default 64. Only used when format = Bmp. Smaller blocks give better pruning granularity; larger blocks give less overhead.
max_bmp_grid_bytes: u64
Maximum BMP grid memory in bytes. If the grid (num_dims × num_blocks) would exceed this, bmp_block_size is automatically increased to cap memory. Default: 256 MB. Set to 0 to disable the cap.
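A minimal sketch of how such a cap can be enforced, assuming one byte per grid cell and doubling to preserve the power-of-2 invariant (the crate's exact cell size and growth strategy may differ):

```rust
// Sketch of capping BMP grid memory by growing the block size.
// Assumption: one byte per grid cell (num_dims × num_blocks cells).
fn capped_block_size(num_docs: u64, num_dims: u64, mut block_size: u64, max_grid_bytes: u64) -> u64 {
    if max_grid_bytes == 0 {
        return block_size; // 0 disables the cap
    }
    loop {
        let num_blocks = num_docs.div_ceil(block_size);
        if num_dims * num_blocks <= max_grid_bytes {
            return block_size;
        }
        block_size *= 2; // doubling keeps the power-of-2 invariant
    }
}

fn main() {
    // 1M docs, a 30522-dim SPLADE vocabulary, and a 256 MB cap:
    let bs = capped_block_size(1_000_000, 30522, 64, 256 * 1024 * 1024);
    println!("{bs}"); // block size grows from 64 until the grid fits
}
```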
bmp_superblock_size: u32
BMP superblock size: number of consecutive blocks grouped for hierarchical pruning (Carlson et al., SIGIR 2025). Must be a power of 2. Default 64. Set to 0 to disable superblock pruning (flat BMP scoring). Only used when format = Bmp.
pruning: Option<f32>
Static pruning: fraction of postings to keep per inverted list (SEISMIC-style). Lists are sorted by weight descending and truncated to the top fraction.
Research recommendation (SPLADE v2, Formal et al., 2021):
- None = keep all postings (default, exact)
- Some(0.1) = keep top 10% of postings per dimension
- 50-70% index size reduction
- <1% nDCG@10 loss
- Exploits “concentration of importance” in learned representations
Applied only during initial segment build, not during merge.
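In spirit, the truncation step looks like the following (hypothetical helper, not the crate's internals):

```rust
// Sketch of SEISMIC-style static pruning of one inverted list:
// sort postings by weight descending, keep the top `fraction`.
fn prune_posting_list(mut postings: Vec<(u32, f32)>, fraction: f32) -> Vec<(u32, f32)> {
    postings.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    // Keep at least one posting so the dimension is never emptied.
    let keep = ((postings.len() as f32 * fraction).ceil() as usize).max(1);
    postings.truncate(keep);
    postings
}

fn main() {
    // Ten postings as (doc_id, weight); fraction 0.1 keeps the single heaviest.
    let list: Vec<(u32, f32)> = (0..10).map(|i| (i, i as f32)).collect();
    println!("{:?}", prune_posting_list(list, 0.1));
}
```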
query_config: Option<SparseQueryConfig>
Query-time configuration (tokenizer, weighting).
dims: Option<u32>
Fixed vocabulary size (number of dimensions) for the BMP format.
When set, all BMP segments use the same grid dimensions (rows = dims), enabling zero-copy block-copy merge. The grid is indexed by dim_id directly (no dim_ids Section C needed).
Required for BMP V12 format. Typical values:
- SPLADE/BERT: 30522 or 105879 (WordPiece / Unigram vocabulary)
- uniCOIL: 30522
- Custom models: set to vocabulary size
If None, BMP builder derives dims from observed data (V10 behavior).
max_weight: Option<f32>
Fixed max weight scale for the BMP format.
When set, all BMP segments use the same quantization scale (max_weight_scale = max_weight), eliminating rescaling during merge.
For SPLADE models: 5.0 (covers the typical weight range 0-5). If None, the BMP builder derives the scale from the data (V10 behavior).
min_terms: usize
Minimum number of postings in a dimension before pruning and weight_threshold filtering are applied. Protects dimensions with very few postings from losing most of their signal.
Default: 4. Set to 0 to always apply pruning/filtering.
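Combined with weight_threshold, the guard behaves roughly like this sketch (hypothetical helper; the crate's internal order of operations may differ):

```rust
// Sketch of the min_terms guard: dimensions with few postings skip filtering.
fn filter_dimension(postings: Vec<(u32, f32)>, threshold: f32, min_terms: usize) -> Vec<(u32, f32)> {
    if postings.len() < min_terms {
        return postings; // too few postings: keep all of the signal
    }
    postings.into_iter().filter(|&(_, w)| w >= threshold).collect()
}

fn main() {
    // Two low-weight postings survive because the dimension is below min_terms.
    let small = vec![(1, 0.001), (2, 0.002)];
    println!("{:?}", filter_dimension(small, 0.01, 4));
}
```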
Implementations
impl SparseVectorConfig

pub fn splade() -> Self
SPLADE-optimized config with research-validated defaults
Optimized for SPLADE, uniCOIL, and similar learned sparse retrieval models. Based on research findings from:
- Pati (2025): UInt8 quantization = 4x compression, 1-2% nDCG loss
- Formal et al. (2021): SPLADE v2 posting list pruning
- Qiao et al. (2023): Query dimension pruning and approximate search
- Guo et al. (2022): Weight thresholding for efficiency
Expected performance vs. full precision baseline:
- Index size: ~15-25% of original (combined effect of all optimizations)
- Query latency: 40-60% faster
- Effectiveness: 2-4% nDCG@10 loss (typically acceptable for production)
Vocabulary: ~30K dimensions (fits in u16)
pub fn splade_bmp() -> Self
SPLADE-optimized config with BMP (Block-Max Pruning) format
Same optimization settings as splade() but uses the BMP block-at-a-time
format (Mallia, Suel & Tonellotto, SIGIR 2024) instead of MaxScore.
BMP divides the document space into fixed-size blocks and processes them
in decreasing upper-bound order, enabling aggressive early termination.
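A toy sketch of the block-at-a-time idea, with made-up inputs (the real index derives per-block upper bounds from a quantized grid and returns top-k, not top-1):

```rust
// Toy BMP-style top-1 search: visit blocks in decreasing upper-bound order
// and stop once no remaining bound can beat the best score found so far.
fn bmp_top1(upper_bounds: &[f32], exact_scores: &[f32]) -> f32 {
    let mut order: Vec<usize> = (0..upper_bounds.len()).collect();
    order.sort_by(|&a, &b| upper_bounds[b].partial_cmp(&upper_bounds[a]).unwrap());

    let mut best = f32::NEG_INFINITY;
    for block in order {
        if upper_bounds[block] <= best {
            break; // early termination: remaining bounds only decrease
        }
        best = best.max(exact_scores[block]); // fully score this block
    }
    best
}

fn main() {
    // Block 1 has the highest bound; after scoring it, block 2's bound (5.0)
    // cannot beat 9.0, so blocks 2 and 0 are skipped.
    println!("{}", bmp_top1(&[3.0, 10.0, 5.0], &[2.5, 9.0, 4.0]));
}
```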
pub fn compact() -> Self
Compact config: Maximum compression (experimental)
Uses aggressive UInt4 quantization for smallest possible index size. Expected trade-offs:
- Index size: ~10-15% of Float32 baseline
- Effectiveness: ~3-5% nDCG@10 loss
Recommended for: Memory-constrained environments, cache-heavy workloads
pub fn full_precision() -> Self
Full precision config: No compression, baseline effectiveness
Use for: Research baselines, when effectiveness is critical
pub fn conservative() -> Self
Conservative config: Mild optimizations, minimal effectiveness loss
Balances compression and effectiveness with conservative defaults. Expected trade-offs:
- Index size: ~40-50% of Float32 baseline
- Query latency: ~20-30% faster
- Effectiveness: <1% nDCG@10 loss
Recommended for: Production deployments prioritizing effectiveness
pub fn with_weight_threshold(self, threshold: f32) -> Self
Set weight threshold (builder pattern)
pub fn with_pruning(self, fraction: f32) -> Self
Set posting list pruning fraction (builder pattern), e.g. 0.1 = keep the top 10% of postings per dimension.
pub fn bytes_per_entry(&self) -> f32
Bytes per entry (index + weight)
pub fn to_byte(&self) -> u8
Serialize config to a single byte.
Layout: bits 7-4 = IndexSize, bit 3 = format (0=MaxScore, 1=BMP), bits 2-0 = WeightQuantization
pub fn from_byte(b: u8) -> Option<Self>
Deserialize config from a single byte.
Note: weight_threshold, block_size, bmp_block_size, and query_config are not serialized in the byte — they come from the schema.
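The documented bit layout can be sketched in isolation (the discriminant values used here are placeholders; the crate's enums define the real ones):

```rust
// Sketch of the to_byte/from_byte layout:
// bits 7-4 = IndexSize, bit 3 = format (0 = MaxScore, 1 = BMP),
// bits 2-0 = WeightQuantization.
fn pack(index_size: u8, is_bmp: bool, quantization: u8) -> u8 {
    debug_assert!(index_size < 16 && quantization < 8);
    (index_size << 4) | ((is_bmp as u8) << 3) | quantization
}

fn unpack(b: u8) -> (u8, bool, u8) {
    (b >> 4, (b >> 3) & 1 == 1, b & 0b111)
}

fn main() {
    let b = pack(2, true, 5);
    println!("{:08b} -> {:?}", b, unpack(b)); // round-trips the three fields
}
```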
pub fn with_block_size(self, size: usize) -> Self
Set block size (builder pattern). Must be a power of 2; recommended: 64, 128, or 256.
pub fn with_query_config(self, config: SparseQueryConfig) -> Self
Set query configuration (builder pattern)
Trait Implementations
impl Clone for SparseVectorConfig
fn clone(&self) -> SparseVectorConfig
fn clone_from(&mut self, source: &Self)

impl Debug for SparseVectorConfig

impl Default for SparseVectorConfig

impl<'de> Deserialize<'de> for SparseVectorConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,

impl PartialEq for SparseVectorConfig

impl Serialize for SparseVectorConfig

impl StructuralPartialEq for SparseVectorConfig
Auto Trait Implementations
impl Freeze for SparseVectorConfig
impl RefUnwindSafe for SparseVectorConfig
impl Send for SparseVectorConfig
impl Sync for SparseVectorConfig
impl Unpin for SparseVectorConfig
impl UnsafeUnpin for SparseVectorConfig
impl UnwindSafe for SparseVectorConfig