iqdb-flat 0.5.0

Brute-force exact nearest-neighbor search and recall ground truth - part of the iQDB family.
Documentation
  • Exact search — scans every vector, computes the true distance, returns the top-k; always correct, never approximate
  • Ground truth — the reference every approximate index (HNSW, IVF) is measured against via recall@k
  • Deterministic — one smaller-is-nearer ordering across all five metrics, with a stable insertion-order tiebreaker and NaN-safe selection
  • Fast where it counts — SIMD distance via iqdb-distance, a bounded-heap O(n log k) top-k, amortized O(1) insert/delete, and an optional rayon parallel scan
  • Right for small data — the obvious choice under ~10k vectors, where graph or partition overhead is not justified

Installation

[dependencies]
iqdb-flat = "0.5"

# Optional rayon-backed parallel scan for large in-memory corpora:
# iqdb-flat = { version = "0.5", features = ["parallel"] }

Quick Start

use std::sync::Arc;

use iqdb_flat::{FlatConfig, FlatIndex};
use iqdb_index::{Index, IndexCore};
use iqdb_types::{DistanceMetric, SearchParams, VectorId};

fn main() -> iqdb_types::Result<()> {
    // A 2-D Euclidean index. `FlatConfig` is a unit struct — nothing to tune.
    let mut idx = FlatIndex::new(2, DistanceMetric::Euclidean, FlatConfig)?;

    idx.insert(VectorId::from(1u64), Arc::<[f32]>::from(&[0.0, 0.0][..]), None)?;
    idx.insert(VectorId::from(2u64), Arc::<[f32]>::from(&[3.0, 4.0][..]), None)?;
    idx.insert(VectorId::from(3u64), Arc::<[f32]>::from(&[1.0, 0.0][..]), None)?;

    let hits = idx.search(&[0.0, 0.0], &SearchParams::new(2, DistanceMetric::Euclidean))?;

    assert_eq!(hits.len(), 2);
    assert_eq!(hits[0].id, VectorId::U64(1)); // distance 0.0
    assert_eq!(hits[1].id, VectorId::U64(3)); // distance 1.0
    Ok(())
}

Filtered search restricts the scan to rows whose metadata matches, evaluated before distance work so a selective filter skips proportionally:

# use iqdb_flat::{FlatConfig, FlatIndex};
# use iqdb_index::{Index, IndexCore};
use iqdb_types::{DistanceMetric, Filter, SearchParams, Value};

# fn demo(idx: &FlatIndex) -> iqdb_types::Result<()> {
let params = SearchParams {
    filter: Some(Filter::eq("lang", Value::String("rust".into()))),
    ..SearchParams::new(10, DistanceMetric::Euclidean)
};
let hits = idx.search(&[0.0, 0.0], &params)?;
# Ok(())
# }

The complete surface — every method, parameter, error, and more examples — is in docs/API.md.

Performance

  • Distance is delegated to iqdb-distance, which dispatches to SIMD kernels (AVX2/NEON) where the target allows; flat never reimplements a metric.
  • Top-k uses a bounded max-heap of size k keyed by (distance, sequence)O(n log k), NaN-safe via f32::total_cmp.
  • Insert / delete are amortized O(1): a HashMap id→position map for duplicate checks, swap_remove for deletion, and a monotonic sequence stamp so reordering never disturbs the tiebreaker.
  • No-filter search allocates a fixed number of buffers independent of corpus size (locked in by tests/no_alloc.rs).
  • parallel feature adds a rayon chunked scan that is byte-identical to the sequential baseline (tests/parallel_equivalence.rs); small corpora short-circuit to sequential.

Status

v0.5.0 is feature-complete with the public API frozen: exact search, top-k, the full Index / IndexCore trait implementation, the optional parallel scan, and metadata pre-filtering all ship and are covered by unit, property, differential, and scale tests — including a bit-for-bit large-scan oracle check at N = 20,000. The committed surface is recorded in the ROADMAP; only additive, non-breaking changes are made through 1.x. Remaining work to 1.0 is integration against real consumers, polish, and final benchmarks.

Where It Fits

iqdb-flat is a Phase-3 index. It builds on:

  • iqdb-types — vectors, ids, metadata
  • iqdb-distance — the distance kernels
  • iqdb-index — implements the Index trait
  • iqdb-eval — uses it to generate ground truth

It is unblocked once iqdb-index exists; no external dependency.

Standards

Built to the iQDB Rust standard. See REPS.md (Rust Efficiency & Performance Standards) and dev/DIRECTIVES.md for the engineering law and the definition of done. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.