Module ivf


IVF-PQ (Inverted File Index with Product Quantization)

Memory-efficient approximate nearest neighbor (ANN) search for large-scale datasets (1M+ vectors).

§Algorithm Overview

  1. Clustering (IVF): Partition the vectors into clusters using k-means
  2. Product Quantization (PQ): Split each vector into sub-vectors and replace each sub-vector with a short code
  3. Search: Find the clusters nearest the query (nprobe of them), then search only the compressed vectors in those clusters
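The steps above can be sketched in plain Rust. This is a minimal illustration, not the implementation in this module: the function names (`sq_dist`, `nearest`, `probe_clusters`, `pq_encode`) are hypothetical, the centroids and codebooks are assumed to be already trained by k-means, and each codebook is assumed to hold at most 256 codewords so that one code fits in a `u8`.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid (or codeword) closest to `v`.
fn nearest(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_d = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = sq_dist(v, c);
        if d < best_d {
            best = i;
            best_d = d;
        }
    }
    best
}

/// IVF step at query time: the `nprobe` clusters whose centroids
/// are closest to the query, nearest first.
fn probe_clusters(query: &[f32], centroids: &[Vec<f32>], nprobe: usize) -> Vec<usize> {
    let mut order: Vec<usize> = (0..centroids.len()).collect();
    order.sort_by(|&a, &b| {
        sq_dist(query, &centroids[a]).total_cmp(&sq_dist(query, &centroids[b]))
    });
    order.truncate(nprobe);
    order
}

/// PQ step: split `v` into `codebooks.len()` equal sub-vectors and keep,
/// for each one, only the index of its nearest codeword.
/// Assumes `v.len()` is a non-zero multiple of `codebooks.len()`.
fn pq_encode(v: &[f32], codebooks: &[Vec<Vec<f32>>]) -> Vec<u8> {
    let sub_dim = v.len() / codebooks.len();
    v.chunks(sub_dim)
        .zip(codebooks)
        .map(|(sub, book)| nearest(sub, book) as u8)
        .collect()
}
```

At build time, `nearest` assigns each vector to a cluster and `pq_encode` compresses it; at query time, `probe_clusters` limits the scan to a few clusters, which is the trade-off between recall and speed that nprobe controls.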

§Benefits

  • Memory: A 768D vector (3,072 bytes as f32) compresses to 64-96 bytes of PQ codes, roughly 32-48x smaller before codebook and ID overhead
  • Speed: Search only relevant partitions (nprobe parameter)
  • Scalability: Handles 1M+ vectors efficiently
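The memory figure can be checked with a little arithmetic. The sketch below assumes 4-byte `f32` components and 8-bit PQ codes (one byte per sub-vector); the helper names are hypothetical, and the cost of codebooks, cluster centroids, and stored IDs is deliberately left out.

```rust
/// Bytes needed for one uncompressed vector of `dim` f32 components.
fn raw_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Bytes of PQ codes for one vector: `bits_per_code` bits for each
/// sub-vector, rounded up to a whole number of bytes.
fn pq_code_bytes(nsubvectors: usize, bits_per_code: usize) -> usize {
    (nsubvectors * bits_per_code + 7) / 8
}

/// Ratio of raw size to compressed code size for a single vector.
fn compression_ratio(dim: usize, nsubvectors: usize, bits_per_code: usize) -> f64 {
    raw_bytes(dim) as f64 / pq_code_bytes(nsubvectors, bits_per_code) as f64
}
```

Under these assumptions a 768D vector occupies 3,072 bytes raw, and 64 or 96 one-byte codes give per-vector ratios of 48 and 32. At 1M vectors that is about 3 GB of raw floats against 64-96 MB of codes.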

§Example

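use oxify_vector::ivf::{IvfPqIndex, IvfPqConfig};
use std::collections::HashMap;

// 256 clusters, 64 sub-vectors per vector (12 dimensions each at 768D),
// and 16 clusters probed per query.
let config = IvfPqConfig::default()
    .with_nclusters(256)
    .with_nsubvectors(64)
    .with_nprobe(16);

let mut index = IvfPqIndex::new(config);

// Map each document ID to its 768-dimensional vector.
// Two vectors keep the example short; a real dataset is far larger.
let mut vectors = HashMap::new();
vectors.insert("doc1".to_string(), vec![0.1; 768]);
vectors.insert("doc2".to_string(), vec![0.2; 768]);

// Build the index from the vectors.
index.build(&vectors)?;

// Search for the top 10 matches for the query vector.
let query = vec![0.15; 768];
let results = index.search(&query, 10)?;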

Structs§

IvfPqConfig
IVF-PQ configuration
IvfPqIndex
IVF-PQ index for memory-efficient ANN search
IvfPqStats
IVF-PQ index statistics