iqdb_filter/lib.rs
1//! # iqdb-filter
2//!
3//! Canonical [`iqdb_types::Filter`] evaluator for the HiveDB **iqdb**
4//! vector-database spine. One place that decides what `Filter` means; every
5//! index that supports metadata filtering delegates to it.
6//!
7//! ## Why this lives outside the index crates
8//!
9//! Filtering used to be inlined in `iqdb-flat`. The moment a second index
10//! (HNSW, IVF) starts honouring filters, two copies of the semantics would
11//! drift — the `Neq(absent)` / `Not(Eq(absent))` rule is exactly the kind of
12//! subtlety that splits between implementations and produces query-result
13//! bugs nobody can attribute. Extracting the evaluator pins one set of
14//! semantics across every consumer.
15//!
16//! ## Public surface
17//!
18//! - [`FilterEvaluator`] — `new(filter) -> Result<Self, IqdbError>` validates
19//! the filter once (depth, `In` cardinality); `evaluate(metadata) -> bool`
20//! is infallible on a validated filter. [`FilterEvaluator::prefilter`] and
21//! [`FilterEvaluator::postfilter`] apply it as lazy, allocation-free scan
22//! adapters over a stream of `(key, metadata)` pairs.
23//! - [`MetadataIndex`] — an opt-in, per-field inverted index that resolves a
24//! selective `Eq` / `In` predicate to a candidate key set (a superset of the
25//! true matches), and backs a sharper, count-based selectivity estimate.
26//! - [`estimate_selectivity`] — a best-effort, structural estimate of the
27//! fraction of records a validated filter passes, in `[0.0, 1.0]`; the
28//! index-backed counterpart is [`MetadataIndex::estimate_selectivity`].
29//! - [`choose_strategy`] / [`StrategySelector`] — pick a concrete
30//! [`FilterStrategy`] from the selectivity estimate. The free function uses
31//! the [`DEFAULT_PREFILTER_THRESHOLD`]; the selector is the Tier-2 builder
32//! for tuning it.
33//! - [`FilterStrategy`] — vocabulary for how an index applies a filter
34//! relative to its distance scan. The selector resolves `Auto` down to
35//! `PreFilter` / `PostFilter`; `InFilter` waits on a graph-index consumer.
36//! - [`MAX_FILTER_DEPTH`] / [`MAX_IN_VALUES`] — documented validation caps,
37//! `pub const` so callers can quote them in error messages or higher-level
38//! validation.
39//!
40//! ## Null and absent-field semantics
41//!
42//! The evaluator implements the **closed-world** rule pinned by
43//! [`iqdb_types::Filter`]: every leaf comparison (`Eq`, `Neq`, `Lt`, `Lte`,
44//! `Gt`, `Gte`, `In`) over a field absent from the record's metadata
45//! evaluates to `false`. Type mismatches between a stored value and a literal
46//! also evaluate to `false`. `Value::Float(NaN)` under any ordered comparison
47//! evaluates to `false` (IEEE-754 unordered). `Not` over a `false` leaf is
48//! `true`, which is the idiom for "records without this field, or with a
49//! non-matching value."
50//!
51//! `Neq(absent) → false` and `Not(Eq(absent)) → true` are therefore **not**
52//! interchangeable. The pair is pinned by the conformance tests in
53//! `tests/conformance.rs`.
54//!
55//! ## DoS hardening
56//!
57//! Construction is the validation gate. The walk is iterative (an explicit
58//! stack, not recursion), so `new` cannot itself stack-overflow on
59//! adversarial input. After construction every filter is bounded by
60//! [`MAX_FILTER_DEPTH`], so the recursive [`FilterEvaluator::evaluate`] hot
61//! path runs with a bounded call stack.
62//!
63//! ## Example
64//!
65//! ```
66//! use iqdb_filter::FilterEvaluator;
67//! use iqdb_types::{Filter, Metadata, Value};
68//!
69//! # fn main() -> iqdb_types::Result<()> {
70//! let filter = Filter::and(vec![
71//! Filter::eq("published", Value::Bool(true)),
72//! Filter::gt("year", Value::Int(2000)),
73//! ]);
74//! let evaluator = FilterEvaluator::new(filter)?;
75//!
76//! let meta: Metadata = [
77//! ("published".to_string(), Value::Bool(true)),
78//! ("year".to_string(), Value::Int(2026)),
79//! ]
80//! .into_iter()
81//! .collect();
82//!
83//! assert!(evaluator.evaluate(Some(&meta)));
84//! assert!(!evaluator.evaluate(None));
85//! # Ok(())
86//! # }
87//! ```
88
89#![cfg_attr(docsrs, feature(doc_cfg))]
90#![deny(warnings)]
91#![deny(missing_docs)]
92#![deny(unsafe_op_in_unsafe_fn)]
93#![deny(unused_must_use)]
94#![deny(unused_results)]
95#![deny(clippy::unwrap_used)]
96#![deny(clippy::expect_used)]
97#![deny(clippy::todo)]
98#![deny(clippy::unimplemented)]
99#![deny(clippy::print_stdout)]
100#![deny(clippy::print_stderr)]
101#![deny(clippy::dbg_macro)]
102#![deny(clippy::unreachable)]
103#![deny(clippy::undocumented_unsafe_blocks)]
104#![forbid(unsafe_code)]
105
106mod eval;
107mod evaluator;
108mod index;
109mod selectivity;
110mod strategy;
111
112pub use crate::evaluator::{FilterEvaluator, MAX_FILTER_DEPTH, MAX_IN_VALUES};
113pub use crate::index::MetadataIndex;
114pub use crate::selectivity::estimate_selectivity;
115pub use crate::strategy::{
116 DEFAULT_PREFILTER_THRESHOLD, FilterStrategy, StrategySelector, choose_strategy,
117};
118
119/// The version of this crate, taken from `Cargo.toml` at compile time.
120///
121/// Exposed so a consumer can report the exact `iqdb-filter` build it links
122/// against — useful in diagnostics and version-skew checks across the iqdb
123/// crate family.
124///
125/// # Examples
126///
127/// ```
128/// // Carries a `major.minor.patch` SemVer core.
129/// let version = iqdb_filter::VERSION;
130/// assert_eq!(version.split('.').count(), 3);
131/// assert!(version.split('.').all(|part| !part.is_empty()));
132/// ```
133pub const VERSION: &str = env!("CARGO_PKG_VERSION");