iqdb-filter 0.3.0

Canonical metadata-filter evaluator for vector search, with validate-on-construction and infallible per-row evaluation. Part of the iQDB family.
  • Canonical evaluator — one implementation of Filter semantics, shared by every metadata-aware index so query results never diverge
  • Validate once, evaluate many — FilterEvaluator::new checks the filter (depth, In cardinality) a single time; evaluate is then infallible and runs per-row inside the search loop
  • Closed-world semantics — a leaf over an absent field is false, type mismatches are false, NaN orderings are false, and Not of a false leaf is true (the "records without this field" idiom)
  • DoS-hardened — iterative validation that can't stack-overflow, with bounded depth and In width; the library never panics on adversarial input
  • Scan helpers — prefilter / postfilter apply the evaluator as lazy, allocation-free iterator adapters over a stream of (key, metadata) pairs
  • Strategy selection — a structural selectivity estimate drives an automatic PreFilter / PostFilter choice, with a tunable threshold
  • First-party only — depends solely on iqdb-types, so it is unblocked today
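
The closed-world rules listed above can be sketched in a few lines. This is a minimal illustration, not the crate's implementation: the Value, Metadata, and Filter types below are simplified stand-ins written for this example (the real ones live in iqdb-types), and the free function evaluate is hypothetical.

```rust
use std::collections::HashMap;

// Stand-in types for illustration only; the real ones live in iqdb-types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Int(i64),
    Float(f64),
}

type Metadata = HashMap<String, Value>;

enum Filter {
    Eq(String, Value),
    Gt(String, Value),
    Not(Box<Filter>),
    And(Vec<Filter>),
}

// Closed-world evaluation: any leaf that cannot be decided is false.
fn evaluate(filter: &Filter, meta: Option<&Metadata>) -> bool {
    match filter {
        // No metadata or no such field -> false. Different variants never
        // compare equal, so a type mismatch is false as well.
        Filter::Eq(field, want) => meta
            .and_then(|m| m.get(field))
            .is_some_and(|got| got == want),
        Filter::Gt(field, want) => match (meta.and_then(|m| m.get(field)), want) {
            (Some(Value::Int(a)), Value::Int(b)) => a > b,
            // `>` on f64 is already false when either side is NaN.
            (Some(Value::Float(a)), Value::Float(b)) => a > b,
            // Missing field or mismatched types -> false.
            _ => false,
        },
        // Negating a false leaf yields true: the "records without this field" idiom.
        Filter::Not(inner) => !evaluate(inner, meta),
        Filter::And(children) => children.iter().all(|c| evaluate(c, meta)),
    }
}
```

In this sketch, evaluate(&Filter::Not(Box::new(Filter::Eq("author".into(), Value::Bool(true)))), None) is true, because the inner leaf is false for a record with no metadata. The recursion is acceptable here only on the assumption that depth was bounded during validation.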

Installation

[dependencies]
iqdb-filter = "0.3"

Quick start

Build an evaluator once, then test it against each record's metadata:

use iqdb_filter::FilterEvaluator;
use iqdb_types::{Filter, Metadata, Value};

// published == true AND year > 2000
let filter = Filter::and(vec![
    Filter::eq("published", Value::Bool(true)),
    Filter::gt("year", Value::Int(2000)),
]);
let evaluator = FilterEvaluator::new(filter).expect("valid filter");

let meta: Metadata = [
    ("published".to_string(), Value::Bool(true)),
    ("year".to_string(), Value::Int(2026)),
]
.into_iter()
.collect();

assert!(evaluator.evaluate(Some(&meta)));
assert!(!evaluator.evaluate(None)); // no metadata -> every leaf is false

The Not / absent-field idiom selects records that lack a field, or carry it with a non-matching value:

use iqdb_filter::FilterEvaluator;
use iqdb_types::{Filter, Value};

// "records that are not authored by ada" — including records with no author.
let evaluator =
    FilterEvaluator::new(Filter::not(Filter::eq("author", Value::String("ada".into()))))
        .expect("valid filter");

assert!(evaluator.evaluate(None));

Validation rejects pathological filters up front, using the public caps MAX_FILTER_DEPTH and MAX_IN_VALUES:

use iqdb_filter::{FilterEvaluator, MAX_IN_VALUES};
use iqdb_types::{Filter, IqdbError, Value};

// An `In` set wider than the cap is refused before it can slow a query.
let huge = vec![Value::Int(0); MAX_IN_VALUES + 1];
let err = FilterEvaluator::new(Filter::is_in("tag", huge)).unwrap_err();
assert_eq!(err, IqdbError::InvalidFilter);

Apply a strategy with the scan helpers, or let the selectivity estimate pick one:

use iqdb_filter::{FilterEvaluator, FilterStrategy, choose_strategy};
use iqdb_types::{Filter, Metadata, Value};

let evaluator = FilterEvaluator::new(Filter::eq("lang", Value::String("rust".into())))
    .expect("valid filter");

// `prefilter` keeps the keys of matching candidates, lazily, before scoring.
let rust: Metadata = [("lang".to_string(), Value::String("rust".into()))]
    .into_iter()
    .collect();
let go: Metadata = [("lang".to_string(), Value::String("go".into()))]
    .into_iter()
    .collect();
let rows = [(0_usize, Some(&rust)), (1, Some(&go))];
let kept: Vec<usize> = evaluator.prefilter(rows).collect();
assert_eq!(kept, [0]);

// An equality predicate is narrow, so the selector recommends pre-filtering.
assert_eq!(choose_strategy(&evaluator), FilterStrategy::PreFilter);
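
To make the idea of a structural estimate concrete, here is a rough sketch of how a selector could turn filter shape into a strategy. Everything in it is an assumption made for illustration: the Filter and Strategy stand-ins, the per-leaf guesses (0.1 for equality, 0.5 for a range), the independence assumption for And / Or, and the estimate and choose functions. The crate's estimate_selectivity and StrategySelector may work differently.

```rust
// Hypothetical stand-ins; the crate's real types, constants, and thresholds
// may differ. Leaves carry no payload because only the shape matters here.
enum Filter {
    Eq,    // field == value
    Range, // field > value, field < value, ...
    Not(Box<Filter>),
    And(Vec<Filter>),
    Or(Vec<Filter>),
}

#[derive(Debug, PartialEq)]
enum Strategy {
    PreFilter,
    PostFilter,
}

// Estimated fraction of rows that pass, from filter shape alone (no data statistics).
fn estimate(filter: &Filter) -> f64 {
    match filter {
        Filter::Eq => 0.1,    // assumed guess: equality is narrow
        Filter::Range => 0.5, // assumed guess: a range keeps about half
        Filter::Not(inner) => 1.0 - estimate(inner),
        // Treating children as independent, AND multiplies pass rates...
        Filter::And(children) => children.iter().map(estimate).product(),
        // ...and OR multiplies fail rates.
        Filter::Or(children) => {
            1.0 - children.iter().map(|c| 1.0 - estimate(c)).product::<f64>()
        }
    }
}

// Narrow filters are applied before scoring; broad ones after.
fn choose(filter: &Filter, threshold: f64) -> Strategy {
    if estimate(filter) <= threshold {
        Strategy::PreFilter
    } else {
        Strategy::PostFilter
    }
}
```

With a threshold of 0.25, this sketch picks PreFilter for a bare equality leaf and PostFilter for a bare range. The usual reasoning is that when few rows are expected to pass, filtering first shrinks the candidate set before scoring, while a broad filter discards little and can be applied to the scored results instead.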

Errors

FilterEvaluator::new returns iqdb_types::Result; the only failure is IqdbError::InvalidFilter, returned when a filter exceeds MAX_FILTER_DEPTH nesting or carries an In node wider than MAX_IN_VALUES. After a filter is validated, evaluate is infallible and never panics — including on records with no metadata, type mismatches, and NaN values.
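
The two checks described above (nesting depth and In width) can be done without recursion, so that a hostile, deeply nested filter cannot exhaust the call stack during validation. The following is a sketch under stated assumptions: the Filter enum, the InvalidFilter struct, the validate function, and the cap values 32 and 1024 are all hypothetical stand-ins, not the crate's real definitions of MAX_FILTER_DEPTH or MAX_IN_VALUES.

```rust
// Hypothetical stand-ins: the real Filter type and the real cap values
// live in iqdb-types / iqdb-filter and may differ.
const MAX_DEPTH: usize = 32;
const MAX_IN: usize = 1024;

enum Filter {
    Eq(String, i64),
    In(String, Vec<i64>),
    Not(Box<Filter>),
    And(Vec<Filter>),
    Or(Vec<Filter>),
}

#[derive(Debug, PartialEq)]
struct InvalidFilter;

// Walk the tree with an explicit work list instead of recursion, so a deeply
// nested filter consumes heap, not call stack.
fn validate(root: &Filter) -> Result<(), InvalidFilter> {
    // Each entry pairs a node with its depth; the root is at depth 1.
    let mut pending = vec![(root, 1_usize)];
    while let Some((node, depth)) = pending.pop() {
        if depth > MAX_DEPTH {
            return Err(InvalidFilter);
        }
        match node {
            Filter::Eq(..) => {}
            Filter::In(_, values) => {
                if values.len() > MAX_IN {
                    return Err(InvalidFilter);
                }
            }
            Filter::Not(inner) => pending.push((&**inner, depth + 1)),
            Filter::And(children) | Filter::Or(children) => {
                pending.extend(children.iter().map(|c| (c, depth + 1)));
            }
        }
    }
    Ok(())
}
```

Because every accepted filter has bounded depth, a later per-row evaluator can recurse safely, which is one way the "validate once, then evaluate infallibly" split can be made to hold.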

Status

v0.3.0: the evaluator plus strategy selection. On top of the canonical FilterEvaluator (validate-on-construction, infallible allocation-free per-row evaluate), this release adds the prefilter / postfilter scan helpers, a structural estimate_selectivity, and a tunable selector (choose_strategy / StrategySelector) that resolves PreFilter vs PostFilter.

Semantics and the new surface are pinned by unit, integration, and property tests, verified across the CI matrix (Linux, macOS, Windows) on stable and the 1.87 MSRV.

The optional inverted MetadataIndex, index-backed selectivity, and InFilter pushdown remain deferred until the first approximate-index consumer lands; see the ROADMAP for the rationale. The full surface is documented in docs/API.md.

Where It Fits

iqdb-filter sits just above the types crate and is consumed by the index layer:

  • iqdb-types — the Filter, Metadata, and Value types it evaluates
  • iqdb-flat / iqdb-hnsw / iqdb-ivf — delegate here for metadata filtering
  • iqdb — exposes filtered search to users

Its only first-party dependency is iqdb-types, so it is unblocked today.

Standards

Built to the iQDB Rust standard. See REPS.md (Rust Efficiency & Performance Standards) and dev/DIRECTIVES.md for the engineering law and the definition of done. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.