iqdb-flat 0.4.0

Brute-force exact nearest-neighbor search and recall ground truth - part of the iQDB family.
Documentation
  • Exact search — scans every vector, computes the true distance, returns the top-k; always correct, never approximate
  • Ground truth — the reference every approximate index (HNSW, IVF) is measured against via recall@k
  • Deterministic — one smaller-is-nearer ordering across all five metrics, with a stable insertion-order tiebreaker and NaN-safe selection
  • Fast where it counts — SIMD distance via iqdb-distance, a bounded-heap O(n log k) top-k, amortized O(1) insert/delete, and an optional rayon parallel scan
  • Right for small data — the obvious choice under ~10k vectors, where graph or partition overhead is not justified

Installation

[dependencies]
iqdb-flat = "0.4"

# Optional rayon-backed parallel scan for large in-memory corpora:
# iqdb-flat = { version = "0.4", features = ["parallel"] }

Quick Start

use std::sync::Arc;

use iqdb_flat::{FlatConfig, FlatIndex};
use iqdb_index::{Index, IndexCore};
use iqdb_types::{DistanceMetric, SearchParams, VectorId};

fn main() -> iqdb_types::Result<()> {
    // A 2-D Euclidean index. `FlatConfig` is a unit struct — nothing to tune.
    let mut idx = FlatIndex::new(2, DistanceMetric::Euclidean, FlatConfig)?;

    idx.insert(VectorId::from(1u64), Arc::<[f32]>::from(&[0.0, 0.0][..]), None)?;
    idx.insert(VectorId::from(2u64), Arc::<[f32]>::from(&[3.0, 4.0][..]), None)?;
    idx.insert(VectorId::from(3u64), Arc::<[f32]>::from(&[1.0, 0.0][..]), None)?;

    let hits = idx.search(&[0.0, 0.0], &SearchParams::new(2, DistanceMetric::Euclidean))?;

    assert_eq!(hits.len(), 2);
    assert_eq!(hits[0].id, VectorId::U64(1)); // distance 0.0
    assert_eq!(hits[1].id, VectorId::U64(3)); // distance 1.0
    Ok(())
}

Filtered search restricts the scan to rows whose metadata matches, evaluated before distance work so a selective filter skips proportionally:

# use iqdb_flat::{FlatConfig, FlatIndex};
# use iqdb_index::{Index, IndexCore};
use iqdb_types::{DistanceMetric, Filter, SearchParams, Value};

# fn demo(idx: &FlatIndex) -> iqdb_types::Result<()> {
let params = SearchParams {
    filter: Some(Filter::eq("lang", Value::String("rust".into()))),
    ..SearchParams::new(10, DistanceMetric::Euclidean)
};
let hits = idx.search(&[0.0, 0.0], &params)?;
# Ok(())
# }

The complete surface — every method, parameter, error, and more examples — is in docs/API.md.

Performance

  • Distance is delegated to iqdb-distance, which dispatches to SIMD kernels (AVX2/NEON) where the target allows; flat never reimplements a metric.
  • Top-k uses a bounded max-heap of size k keyed by (distance, sequence)O(n log k), NaN-safe via f32::total_cmp.
  • Insert / delete are amortized O(1): a HashMap id→position map for duplicate checks, swap_remove for deletion, and a monotonic sequence stamp so reordering never disturbs the tiebreaker.
  • No-filter search allocates a fixed number of buffers independent of corpus size (locked in by tests/no_alloc.rs).
  • parallel feature adds a rayon chunked scan that is byte-identical to the sequential baseline (tests/parallel_equivalence.rs); small corpora short-circuit to sequential.

Status

v0.4.0 is feature-complete: exact search, top-k, the full Index / IndexCore trait implementation, the optional parallel scan, and metadata pre-filtering all ship and are covered by unit, property, differential, and scale tests. Remaining work to 1.0 is API finalisation, polish, and final benchmarks per the ROADMAP.

Where It Fits

iqdb-flat is a Phase-3 index. It builds on:

  • iqdb-types — vectors, ids, metadata
  • iqdb-distance — the distance kernels
  • iqdb-index — implements the Index trait
  • iqdb-eval — uses it to generate ground truth

It is unblocked once iqdb-index exists; no external dependency.

Contributing

See dev/DIRECTIVES.md for engineering standards and the definition of done. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.