<h1 align="center">
<img width="99" alt="Rust logo" src="https://raw.githubusercontent.com/jamesgober/rust-collection/72baabd71f00e14aa9184efcb16fa3deddda3a0a/assets/rust-logo.svg">
<br>
<b>iqdb-filter</b>
<br>
<sub><sup>iQDB HYBRID FILTERING</sup></sub>
</h1>
<div align="center">
<a href="https://crates.io/crates/iqdb-filter"><img alt="Crates.io" src="https://img.shields.io/crates/v/iqdb-filter"></a>
<a href="https://crates.io/crates/iqdb-filter"><img alt="Downloads" src="https://img.shields.io/crates/d/iqdb-filter?color=%230099ff"></a>
<a href="https://docs.rs/iqdb-filter"><img alt="docs.rs" src="https://img.shields.io/docsrs/iqdb-filter"></a>
<a href="https://github.com/jamesgober/iqdb-filter/actions"><img alt="CI" src="https://github.com/jamesgober/iqdb-filter/actions/workflows/ci.yml/badge.svg"></a>
<a href="https://github.com/rust-lang/rfcs/blob/master/text/2495-min-rust-version.md"><img alt="MSRV" src="https://img.shields.io/badge/MSRV-1.87%2B-blue"></a>
</div>
<br>
<div align="left">
<p>
<strong>iqdb-filter</strong> is the metadata-filtering layer for the iQDB vector-database spine. It is the one place that decides what a <code>Filter</code> means: every index that honours metadata filters delegates here, so the semantics can never drift between implementations.
</p>
<p>
It evaluates the <code>Filter</code> expression language defined in <a href="https://crates.io/crates/iqdb-types"><code>iqdb-types</code></a> against a record's metadata, with strict, predictable closed-world rules and validation that bounds every filter before it runs.
</p>
<br>
<hr>
<p>
<strong>MSRV is 1.87+</strong> (Rust 2024 edition). Validate once, evaluate per-row. No panics on hostile input. ~19 ns to evaluate a compound predicate.
</p>
<blockquote>
<strong>Status: stable (1.0).</strong> The public API is committed under SemVer for the 1.x series — no breaking changes until 2.0. See <a href="./CHANGELOG.md"><code>CHANGELOG.md</code></a>.
</blockquote>
</div>
<hr>
<br>
<h2>What it does</h2>
- **Canonical evaluator** — one implementation of `Filter` semantics, shared by every metadata-aware index so query results never diverge
- **Validate once, evaluate many** — `FilterEvaluator::new` checks the filter (depth, `In` cardinality) a single time; `evaluate` is then infallible and runs per-row inside the search loop
- **Closed-world semantics** — a leaf over an absent field is `false`, type mismatches are `false`, `NaN` orderings are `false`, and `Not` of a `false` leaf is `true` (the "records without this field" idiom)
- **DoS-hardened** — iterative validation that can't stack-overflow, with bounded depth and `In` width; the library never panics on adversarial input
- **Scan helpers** — `prefilter` / `postfilter` apply the evaluator as lazy, allocation-free iterator adapters over a stream of `(key, metadata)` pairs
- **Strategy selection** — a selectivity estimate drives an automatic `PreFilter` / `PostFilter` choice, with a tunable threshold
- **Inverted index** — an opt-in, per-field `MetadataIndex` resolves selective `Eq` / `In` predicates to a candidate set (a superset of true matches) and backs a sharper, count-based selectivity estimate
- **First-party only** — depends solely on `iqdb-types`, so it is unblocked today
<br>
## Installation
```toml
[dependencies]
iqdb-filter = "1.0"
```
<br>
## Quick start
Build an evaluator once, then test it against each record's metadata:
```rust
use iqdb_filter::FilterEvaluator;
use iqdb_types::{Filter, Metadata, Value};
// published == true AND year > 2000
let filter = Filter::and(vec![
Filter::eq("published", Value::Bool(true)),
Filter::gt("year", Value::Int(2000)),
]);
let evaluator = FilterEvaluator::new(filter).expect("valid filter");
let meta: Metadata = [
("published".to_string(), Value::Bool(true)),
("year".to_string(), Value::Int(2026)),
]
.into_iter()
.collect();
assert!(evaluator.evaluate(Some(&meta)));
assert!(!evaluator.evaluate(None)); // no metadata -> every leaf is false
```
The `Not` / absent-field idiom selects records that *lack* a field, or carry it with a non-matching value:
```rust
use iqdb_filter::FilterEvaluator;
use iqdb_types::{Filter, Value};
// "records that are not authored by ada" — including records with no author.
let evaluator =
FilterEvaluator::new(Filter::not(Filter::eq("author", Value::String("ada".into()))))
.expect("valid filter");
assert!(evaluator.evaluate(None));
```
Validation rejects pathological filters up front — bounded by the public caps:
```rust
use iqdb_filter::{FilterEvaluator, MAX_IN_VALUES};
use iqdb_types::{Filter, IqdbError, Value};
// An `In` set wider than the cap is refused before it can slow a query.
let huge = vec![Value::Int(0); MAX_IN_VALUES + 1];
let err = FilterEvaluator::new(Filter::is_in("tag", huge)).unwrap_err();
assert_eq!(err, IqdbError::InvalidFilter);
```
Apply a strategy with the scan helpers, or let the selectivity estimate pick one:
```rust
use iqdb_filter::{FilterEvaluator, FilterStrategy, choose_strategy};
use iqdb_types::{Filter, Metadata, Value};
let evaluator = FilterEvaluator::new(Filter::eq("lang", Value::String("rust".into())))
.expect("valid filter");
// `prefilter` keeps the keys of matching candidates, lazily, before scoring.
let rust: Metadata = [("lang".to_string(), Value::String("rust".into()))]
.into_iter()
.collect();
let go: Metadata = [("lang".to_string(), Value::String("go".into()))]
.into_iter()
.collect();
let rows = [(0_usize, Some(&rust)), (1, Some(&go))];
let kept: Vec<usize> = evaluator.prefilter(rows).collect();
assert_eq!(kept, [0]);
// An equality predicate is narrow, so the selector recommends pre-filtering.
assert_eq!(choose_strategy(&evaluator), FilterStrategy::PreFilter);
```
For repeated queries, build an opt-in `MetadataIndex` so a selective predicate resolves to a candidate set instead of scanning every row:
```rust
use iqdb_filter::{FilterEvaluator, MetadataIndex};
use iqdb_types::{Filter, Metadata, Value};
let rows = [
(0_usize, [("lang".to_string(), Value::String("rust".into()))].into_iter().collect::<Metadata>()),
(1, [("lang".to_string(), Value::String("go".into()))].into_iter().collect::<Metadata>()),
(2, [("lang".to_string(), Value::String("rust".into()))].into_iter().collect::<Metadata>()),
];
// Index only the `lang` field.
.expect("valid filter");
// `candidates` returns a superset of true matches; confirm with `evaluate`.
let mut hits: Vec<usize> = match index.candidates(&evaluator) {
Some(cands) => cands,
None => (0..rows.len()).collect(), // unbounded predicate -> full scan
};
hits.sort_unstable();
assert_eq!(hits, [0, 2]);
```
<br>
## Errors
`FilterEvaluator::new` returns `iqdb_types::Result`; the only failure is
`IqdbError::InvalidFilter`, returned when a filter exceeds `MAX_FILTER_DEPTH`
nesting or carries an `In` node wider than `MAX_IN_VALUES`. After a filter is
validated, `evaluate` is infallible and never panics — including on records
with no metadata, type mismatches, and `NaN` values.
<br>
## Status
<code>v1.0.0</code> — **stable.** The full surface — the canonical
`FilterEvaluator` (validate-on-construction, infallible allocation-free per-row
`evaluate`), the `prefilter` / `postfilter` scan helpers, `estimate_selectivity`
+ the selector (`choose_strategy` / `StrategySelector`), and the opt-in per-field
`MetadataIndex` — is committed under SemVer for the 1.x series: no breaking
changes until 2.0. It is exercised by unit, integration, and property tests, a
consumer-simulation suite that builds a filtered top-`k` searcher on the public
API alone, and fuzz targets that drive the no-panic and superset contracts over
unbounded input; all verified across the CI matrix (Linux, macOS, Windows) on
stable and the 1.87 MSRV. The one remaining feature, `InFilter` pushdown into
graph traversal, is additive (`FilterStrategy` is `#[non_exhaustive]`) and will
ship in a later 1.x release when an approximate-index consumer drives it (see the
<a href="./dev/ROADMAP.md"><code>ROADMAP</code></a>). The full surface is
documented in <a href="./docs/API.md"><code>docs/API.md</code></a>.
<hr>
<br>
## Where It Fits
`iqdb-filter` sits just above the types crate and is consumed by the index layer:
- `iqdb-types` — the `Filter`, `Metadata`, and `Value` types it evaluates
- `iqdb-flat` / `iqdb-hnsw` / `iqdb-ivf` — delegate here for metadata filtering
- `iqdb` — exposes filtered search to users
Its only first-party dependency is `iqdb-types`, so it is unblocked today.
<br>
## Standards
Built to the iQDB Rust standard. See <a href="./REPS.md"><code>REPS.md</code></a> (Rust Efficiency & Performance Standards) and <a href="./dev/DIRECTIVES.md"><code>dev/DIRECTIVES.md</code></a> for the engineering law and the definition of done. Before a PR: `cargo fmt --all`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-features` must be clean.
<br>
<div id="license">
<h2>License</h2>
<p>Licensed under either of</p>
<ul>
<li><b>Apache License, Version 2.0</b> — <a href="./LICENSE-APACHE">LICENSE-APACHE</a></li>
<li><b>MIT License</b> — <a href="./LICENSE-MIT">LICENSE-MIT</a></li>
</ul>
<p>at your option.</p>
</div>
<div align="center">
<h2></h2>
<sup>COPYRIGHT <small>©</small> 2026 <strong>JAMES GOBER.</strong></sup>
</div>