Skip to main content

Crate triblespace_search

Crate triblespace_search 

Source
Expand description

Content-addressed BM25 + HNSW indexes on top of triblespace piles. See docs/DESIGN.md for the full design rationale.

Two canonical blob types, loaded zero-copy via anybytes with bit-packed bodies via jerky:

bm25::BM25Builder::build goes direct-to-succinct (sorts keys into a CompressedUniverse first, then accumulates per-term postings in universe-code order — no remap pass). hnsw::HNSWBuilder::build also returns the succinct form directly (delegating through today’s SuccinctHNSWIndex::from_naive internally — the naive intermediate is a necessary buffer because HNSW levels are only revealed incrementally). Naive reference implementations live under testing — see testing::BM25Index, testing::HNSWIndex, and testing::FlatIndex for oracles + benchmarks. Reach them via BM25Builder::build_naive() / HNSWBuilder::build_naive() / FlatBuilder::build().

Both indexes are rebuilt-and-replaced (no mutation); the caller persists the resulting handle wherever appropriate (branch metadata, commit metadata, a plain trible, or an in-memory cache).

§Query surface

Two constraint shapes plug into find! / and! / pattern!. Both follow the same rule: scoring is not a bound variable. The constraint filters on a fixed score_floor parameter; callers recompute the precise score afterwards if they need it for ranking.

§Quickstart

use triblespace_core::find;
use triblespace_core::id::Id;

use triblespace_search::bm25::BM25Builder;
use triblespace_search::succinct::SuccinctBM25Index;
use triblespace_search::tokens::hash_tokens;

// 1. Build an in-memory index.
let mut b: BM25Builder = BM25Builder::new();
b.insert(Id::new([1; 16]).unwrap(), hash_tokens("the quick brown fox"));
b.insert(Id::new([2; 16]).unwrap(), hash_tokens("the lazy brown dog"));
b.insert(Id::new([3; 16]).unwrap(), hash_tokens("quick silver fox"));

// 2. Build a succinct BM25 index in a single pass.
let idx: SuccinctBM25Index = b.build();

// 3. Filter through the engine — constraint binds `doc`
//    only; `score_floor = 0.0` means "any matching doc".
let terms = hash_tokens("fox");
let docs: Vec<(Id,)> = find!(
    (doc: Id),
    idx.matches(doc, &terms, 0.0)
).collect();
assert_eq!(docs.len(), 2);

See the examples/ directory for runnable walkthroughs: compose_bm25_and_pattern / multi_term_bm25_search (BM25 + pattern joins), compose_hnsw_and_pattern (vector similarity + pattern), hybrid_search (all three composed in one find!), and phrase_search for the typed-tokenizer pattern.

Modules§

bm25
BM25-style lexical / associative retrieval.
constraint
Triblespace query-engine integration.
hnsw
Approximate nearest-neighbour search over caller-supplied embeddings.
ring
Fixed-predicate 2-ring for unlabeled graphs.
schemas
Inline and blob encodings minted for triblespace-search.
succinct
Jerky-backed succinct building blocks for the index blobs.
testing
Reference implementations for tests and benchmarks.
tokens
Opt-in helpers for turning strings into the 32-byte triblespace Inlines that bm25::BM25Index uses as term ids.