//! # iqdb-filter
//!
//! Canonical [`iqdb_types::Filter`] evaluator for the HiveDB **iqdb**
//! vector-database spine. One place that decides what `Filter` means; every
//! index that supports metadata filtering delegates to it.
//!
//! ## Why this lives outside the index crates
//!
//! Filtering used to be inlined in `iqdb-flat`. The moment a second index
//! (HNSW, IVF) starts honouring filters, two copies of the semantics would
//! drift; the `Neq(absent)` / `Not(Eq(absent))` rule is exactly the kind of
//! subtlety that splits between implementations and produces query-result
//! bugs nobody can attribute. Extracting the evaluator pins one set of
//! semantics across every consumer.
//!
//! ## Public surface
//!
//! - [`FilterEvaluator`]: `new(filter) -> Result<Self, IqdbError>` validates
//!   the filter once (depth, `In` cardinality); `evaluate(metadata) -> bool`
//!   is infallible on a validated filter. [`FilterEvaluator::prefilter`] and
//!   [`FilterEvaluator::postfilter`] apply it as lazy, allocation-free scan
//!   adapters over a stream of `(key, metadata)` pairs.
//! - [`estimate_selectivity`]: a best-effort, structural estimate of the
//!   fraction of records a validated filter passes, in `[0.0, 1.0]`.
//! - [`choose_strategy`] / [`StrategySelector`]: pick a concrete
//!   [`FilterStrategy`] from the selectivity estimate. The free function uses
//!   the [`DEFAULT_PREFILTER_THRESHOLD`]; the selector is the Tier-2 builder
//!   for tuning it.
//! - [`FilterStrategy`]: vocabulary for how an index applies a filter
//!   relative to its distance scan. The selector resolves `Auto` down to
//!   `PreFilter` / `PostFilter`; `InFilter` waits on a graph-index consumer.
//! - [`MAX_FILTER_DEPTH`] / [`MAX_IN_VALUES`]: documented validation caps,
//!   `pub const` so callers can quote them in error messages or higher-level
//!   validation.
//!
//! ## Null and absent-field semantics
//!
//! The evaluator implements the **closed-world** rule pinned by
//! [`iqdb_types::Filter`]: every leaf comparison (`Eq`, `Neq`, `Lt`, `Lte`,
//! `Gt`, `Gte`, `In`) over a field absent from the record's metadata
//! evaluates to `false`. Type mismatches between a stored value and a literal
//! also evaluate to `false`. `Value::Float(NaN)` under any ordered comparison
//! evaluates to `false` (IEEE-754 unordered). `Not` over a `false` leaf is
//! `true`, which is the idiom for "records without this field, or with a
//! non-matching value."
//!
//! `Neq` and `Not(Eq)` are therefore **not** interchangeable: on an absent
//! field, `Neq(absent) → false` while `Not(Eq(absent)) → true`. The pair is
//! pinned by the conformance tests in `tests/conformance.rs`.
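//!
//! A minimal, self-contained sketch of that asymmetry follows. `Expr` and
//! `eval` are hypothetical stand-ins written for this illustration (integer
//! values only); they are not the `iqdb_types` or `iqdb-filter` API, and they
//! assume nothing beyond the closed-world rule as stated above.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Hypothetical stand-in for a filter tree; illustration only.
//! enum Expr {
//!     Eq(&'static str, i64),
//!     Neq(&'static str, i64),
//!     Not(Box<Expr>),
//! }
//!
//! fn eval(expr: &Expr, meta: &HashMap<&'static str, i64>) -> bool {
//!     match expr {
//!         // A leaf over an absent field is `false`, whatever the operator.
//!         Expr::Eq(field, want) => meta.get(field).is_some_and(|got| got == want),
//!         Expr::Neq(field, want) => meta.get(field).is_some_and(|got| got != want),
//!         // `Not` only inverts its child, so it turns that `false` into `true`.
//!         Expr::Not(inner) => !eval(inner, meta),
//!     }
//! }
//!
//! // Field absent: the two spellings disagree.
//! let empty: HashMap<&'static str, i64> = HashMap::new();
//! assert!(!eval(&Expr::Neq("year", 2000), &empty));
//! assert!(eval(&Expr::Not(Box::new(Expr::Eq("year", 2000))), &empty));
//!
//! // Field present with a different value: they agree.
//! let mut meta: HashMap<&'static str, i64> = HashMap::new();
//! meta.insert("year", 1999);
//! assert!(eval(&Expr::Neq("year", 2000), &meta));
//! assert!(eval(&Expr::Not(Box::new(Expr::Eq("year", 2000))), &meta));
//! ```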
//!
//! ## DoS hardening
//!
//! Construction is the validation gate. The validation walk in `new` is
//! iterative (an explicit stack, not recursion), so `new` cannot itself
//! overflow the stack on adversarial input. After construction every filter's
//! depth is bounded by [`MAX_FILTER_DEPTH`], so the recursive
//! [`FilterEvaluator::evaluate`] hot path runs with a bounded call stack.
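//!
//! The shape of such a construction-time walk can be sketched as follows.
//! This is an illustration under stated assumptions, not the crate's
//! implementation: `Node`, `within_depth`, and the convention that the root
//! sits at depth 1 are hypothetical, and the cap is passed as a parameter
//! rather than read from [`MAX_FILTER_DEPTH`].
//!
//! ```rust
//! // Hypothetical stand-in for a filter tree; illustration only.
//! enum Node {
//!     Leaf,
//!     Not(Box<Node>),
//!     And(Vec<Node>),
//! }
//!
//! /// Returns `true` if no path from the root is deeper than `max_depth`.
//! fn within_depth(root: &Node, max_depth: usize) -> bool {
//!     // Explicit work stack of (node, depth) pairs instead of call frames,
//!     // so the walk's own stack usage does not grow with the input's nesting.
//!     let mut stack = vec![(root, 1usize)];
//!     while let Some((node, depth)) = stack.pop() {
//!         if depth > max_depth {
//!             return false;
//!         }
//!         match node {
//!             Node::Leaf => {}
//!             Node::Not(inner) => stack.push((inner.as_ref(), depth + 1)),
//!             Node::And(children) => {
//!                 stack.extend(children.iter().map(|child| (child, depth + 1)));
//!             }
//!         }
//!     }
//!     true
//! }
//!
//! // Not -> And -> Leaf is three levels deep under this convention.
//! let tree = Node::Not(Box::new(Node::And(vec![Node::Leaf, Node::Leaf])));
//! assert!(within_depth(&tree, 3));
//! assert!(!within_depth(&tree, 2));
//! ```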
//!
//! ## Example
//!
//! ```
//! use iqdb_filter::FilterEvaluator;
//! use iqdb_types::{Filter, Metadata, Value};
//!
//! # fn main() -> iqdb_types::Result<()> {
//! let filter = Filter::and(vec![
//!     Filter::eq("published", Value::Bool(true)),
//!     Filter::gt("year", Value::Int(2000)),
//! ]);
//! let evaluator = FilterEvaluator::new(filter)?;
//!
//! let meta: Metadata = [
//!     ("published".to_string(), Value::Bool(true)),
//!     ("year".to_string(), Value::Int(2026)),
//! ]
//! .into_iter()
//! .collect();
//!
//! assert!(evaluator.evaluate(Some(&meta)));
//! assert!(!evaluator.evaluate(None));
//! # Ok(())
//! # }
//! ```

#![cfg_attr(docsrs, feature(doc_cfg))]
#![deny(warnings)]
#![deny(missing_docs)]
#![deny(unsafe_op_in_unsafe_fn)]
#![deny(unused_must_use)]
#![deny(unused_results)]
#![deny(clippy::unwrap_used)]
#![deny(clippy::expect_used)]
#![deny(clippy::todo)]
#![deny(clippy::unimplemented)]
#![deny(clippy::print_stdout)]
#![deny(clippy::print_stderr)]
#![deny(clippy::dbg_macro)]
#![deny(clippy::unreachable)]
#![deny(clippy::undocumented_unsafe_blocks)]
#![forbid(unsafe_code)]

mod eval;
mod evaluator;
mod selectivity;
mod strategy;

pub use crate::evaluator::{FilterEvaluator, MAX_FILTER_DEPTH, MAX_IN_VALUES};
pub use crate::selectivity::estimate_selectivity;
pub use crate::strategy::{
    DEFAULT_PREFILTER_THRESHOLD, FilterStrategy, StrategySelector, choose_strategy,
};

/// The version of this crate, taken from `Cargo.toml` at compile time.
///
/// Exposed so a consumer can report the exact `iqdb-filter` build it links
/// against, which is useful in diagnostics and version-skew checks across
/// the iqdb crate family.
///
/// # Examples
///
/// ```
/// // Carries a `major.minor.patch` SemVer core.
/// let version = iqdb_filter::VERSION;
/// assert_eq!(version.split('.').count(), 3);
/// assert!(version.split('.').all(|part| !part.is_empty()));
/// ```
pub const VERSION: &str = env!("CARGO_PKG_VERSION");