next-plaid 0.1.0

CPU-based PLAID implementation for multi-vector search using ndarray
next-plaid

A CPU-based Rust implementation of the PLAID algorithm for efficient multi-vector search (late interaction retrieval).
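In late interaction retrieval, a query and a document are each represented by one embedding per token, and relevance is conventionally scored with the MaxSim operator: for every query token, take its best-matching document token and sum those maxima. The following is a minimal standalone sketch of that scoring rule, not the crate's internal code; it uses plain `Vec`s instead of ndarray, and the helper names `dot` and `maxsim` are hypothetical.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim (illustrative helper, not part of next-plaid):
/// for each query token, take the highest similarity to any document token,
/// then sum those maxima over all query tokens.
/// Assumes `doc` is non-empty; an empty document yields negative infinity.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

fn main() {
    // Two query tokens and two toy documents, all with dim = 2.
    let query: Vec<Vec<f32>> = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let doc_a: Vec<Vec<f32>> = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let doc_b: Vec<Vec<f32>> = vec![vec![0.5, 0.5]];

    println!("doc_a score: {}", maxsim(&query, &doc_a));
    println!("doc_b score: {}", maxsim(&query, &doc_b));
}
```

Scoring every document this way is expensive, which is why engines in the PLAID family first narrow the candidate set and only then compute full scores.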

Features

  • Pure Rust: No Python or GPU dependencies required
  • CPU Optimized: Uses ndarray with rayon for parallel processing
  • BLAS Acceleration: Optional Accelerate (macOS) or OpenBLAS backends for 3.6x faster indexing
  • Metadata Filtering: Optional SQLite-based metadata storage for filtered search

Installation

[dependencies]
next-plaid = "0.1"

For NPY file support (required for index persistence):

[dependencies]
next-plaid = { version = "0.1", features = ["npy"] }

BLAS Acceleration (Recommended)

macOS (Apple Accelerate framework):

[dependencies]
next-plaid = { version = "0.1", features = ["npy", "accelerate"] }

Linux (OpenBLAS):

[dependencies]
next-plaid = { version = "0.1", features = ["npy", "openblas"] }

Note: The openblas feature requires the system OpenBLAS library (on Ubuntu: apt install libopenblas-dev).

Metadata Filtering (Optional)

[dependencies]
next-plaid = { version = "0.1", features = ["npy", "filtering"] }

Quick Start

Creating an Index

use next_plaid::{Index, IndexConfig};
use ndarray::Array2;

// Your document embeddings: one [num_tokens, dim] array per document.
// `load_embeddings` is a placeholder for your own loading code.
let embeddings: Vec<Array2<f32>> = load_embeddings();

// Create index with automatic centroid computation
let config = IndexConfig::default();
let index = Index::create_with_kmeans(&embeddings, "path/to/index", &config)?;

Searching

use next_plaid::{Index, SearchParameters};
use ndarray::Array2;

// Load the index
let index = Index::load("path/to/index")?;

// Search parameters
let params = SearchParameters {
    batch_size: 128,
    n_full_scores: 1024,
    top_k: 10,
    n_ivf_probe: 32,
};

// Single query (`get_query_embeddings` is a placeholder for your own encoder)
let query: Array2<f32> = get_query_embeddings();
let result = index.search(&query, &params, None)?;

println!("Top results: {:?}", result.passage_ids);
println!("Scores: {:?}", result.scores);

Filtered Search with Metadata

use next_plaid::{Index, SearchParameters, filtering};
use serde_json::json;

// Create metadata database
let metadata = vec![
    json!({"title": "Doc 1", "category": "science", "year": 2023}),
    json!({"title": "Doc 2", "category": "history", "year": 2022}),
];
filtering::create("path/to/index", &metadata)?;

// Query metadata to get document subset
let subset = filtering::where_condition(
    "path/to/index",
    "category = ? AND year >= ?",
    &[json!("science"), json!(2023)],
)?;

// Search only within the filtered subset
// (`index`, `query`, and `params` are defined as in the Searching example)
let result = index.search(&query, &params, Some(&subset))?;
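Conceptually, filtered search is a two-step process: a metadata predicate selects a set of document IDs, and ranking is then restricted to that set. The standalone sketch below illustrates the idea with the same two documents as above. It is not how next-plaid implements filtering; the `Meta` struct and the helpers `matching_ids` and `top_k_in_subset` are hypothetical, and it assumes a document's ID is its position in the metadata list.

```rust
use std::collections::HashSet;

// Hypothetical metadata record mirroring the JSON fields used above.
struct Meta {
    category: &'static str,
    year: u32,
}

/// IDs of documents matching `category = ? AND year >= ?`.
/// Assumption: a document's ID is its index in `meta`.
fn matching_ids(meta: &[Meta], category: &str, min_year: u32) -> HashSet<usize> {
    meta.iter()
        .enumerate()
        .filter(|(_, m)| m.category == category && m.year >= min_year)
        .map(|(id, _)| id)
        .collect()
}

/// Keep only candidates whose ID is in `subset`, then return the
/// `top_k` highest-scoring ones in descending score order.
fn top_k_in_subset(
    scored: &[(usize, f32)],
    subset: &HashSet<usize>,
    top_k: usize,
) -> Vec<(usize, f32)> {
    let mut kept: Vec<(usize, f32)> = scored
        .iter()
        .copied()
        .filter(|(id, _)| subset.contains(id))
        .collect();
    kept.sort_by(|a, b| b.1.total_cmp(&a.1));
    kept.truncate(top_k);
    kept
}

fn main() {
    let meta = [
        Meta { category: "science", year: 2023 },
        Meta { category: "history", year: 2022 },
    ];
    let subset = matching_ids(&meta, "science", 2023);

    // Made-up (document ID, score) pairs standing in for an unfiltered ranking.
    let scored: [(usize, f32); 2] = [(0, 0.71), (1, 0.93)];
    println!("{:?}", top_k_in_subset(&scored, &subset, 10));
}
```

Document 1 scores higher in this toy ranking but is excluded because it does not satisfy the predicate, so only document 0 is returned.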

Performance

SciFact Benchmark (5,183 documents, 1.2M tokens)

Operation             next-plaid  FastPlaid  Speedup
Index + Update        12.19s      19.46s     1.6x faster
Search (300 queries)  16.38s      85.85s     5.2x faster
Total                 28.57s      105.31s    3.7x faster

Memory Usage

Operation  next-plaid  FastPlaid  Savings
Peak       480 MB      3,361 MB   86% less
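The Speedup and Savings columns follow directly from the raw measurements: speedup is the FastPlaid time divided by the next-plaid time, and savings is one minus the ratio of peak memory. A short sketch that recomputes them from the numbers in the tables above (the helper names `speedup` and `savings_percent` are illustrative):

```rust
/// How many times faster `new_s` is than `baseline_s` (both in seconds).
fn speedup(baseline_s: f64, new_s: f64) -> f64 {
    baseline_s / new_s
}

/// Percentage reduction of `new` relative to `baseline`.
fn savings_percent(baseline: f64, new: f64) -> f64 {
    (1.0 - new / baseline) * 100.0
}

fn main() {
    // (operation, next-plaid seconds, FastPlaid seconds) from the benchmark table.
    let rows = [
        ("Index + Update", 12.19, 19.46),
        ("Search (300 queries)", 16.38, 85.85),
        ("Total", 28.57, 105.31),
    ];
    for (name, next_plaid_s, fast_plaid_s) in rows {
        println!("{}: {:.1}x faster", name, speedup(fast_plaid_s, next_plaid_s));
    }

    // Peak memory in MB: FastPlaid 3,361 vs next-plaid 480.
    println!("Peak memory: {:.0}% less", savings_percent(3361.0, 480.0));
}
```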

Feature Flags

Feature     Description
npy         NPY file support for index persistence
filtering   SQLite-based metadata filtering
accelerate  macOS Accelerate framework for BLAS
openblas    OpenBLAS for BLAS acceleration

License

Apache-2.0