Expand description
Approximate nearest neighbors via random projection forest.
Builds n_trees binary trees by recursively splitting the data at
random hyperplanes. At query time, each tree nominates a candidate
leaf; the union of candidates is re-scored exactly to produce the
final k-NN list.
Deterministic for a given seed (uses [SplitMix64]).
Complexity:
- Build: O(N · d · trees · log N)
- Query: O(trees · log N · d + |candidates| · d)
Designed for cosine similarity (all vectors are L2-normalized
internally). Reusable across UMAP graph construction,
GraphModularity scoring, and downstream consumers (globetrot
similarity lookup).