pub fn search_semantic(
query_embedding: &[f32],
chunk_embeddings: &Array2<f32>,
top_k: usize,
selector: Option<&[usize]>,
) -> Vec<(usize, f32)>
Pure semantic search: rank every chunk by dot product against the query embedding, then take the top-k after applying the optional selector mask.
Math: scores = chunk_embeddings @ query_embedding; top-k by select_nth_unstable_by, then sort the survivors.
chunk_embeddings is row-major [n_chunks, hidden_dim]. With the
cpu-accelerate feature enabled, ndarray’s .dot() dispatches to
Accelerate’s cblas_sgemv, which is vendor-tuned and close to
memory-bandwidth-bound: 1M chunks at 256 dim in f32 is ~1 GB read
per query, so at ~250 GB/s the theoretical floor is ~4 ms. The
earlier scalar pointer-chasing path took 583 ms per query
(profile: samply v1, 2026-05-21).
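The ~4 ms floor can be reproduced with a few lines of arithmetic. The helper below is a hypothetical illustration, not part of this crate; it assumes the matrix is streamed from memory exactly once per query and that the quoted bandwidth is sustained.

```rust
/// Hypothetical helper: lower bound, in milliseconds, on the time needed to
/// stream an `n_chunks x hidden_dim` f32 matrix once at the given bandwidth.
/// Assumes decimal gigabytes (1 GB = 1e9 bytes) and ignores the query vector.
pub fn bandwidth_floor_ms(n_chunks: usize, hidden_dim: usize, bandwidth_gb_per_s: f64) -> f64 {
    // Total bytes read: one f32 (4 bytes) per matrix element.
    let bytes = (n_chunks * hidden_dim * std::mem::size_of::<f32>()) as f64;
    // bytes / (bytes per second) gives seconds; scale to milliseconds.
    bytes / (bandwidth_gb_per_s * 1e9) * 1e3
}
```

Calling `bandwidth_floor_ms(1_000_000, 256, 250.0)` gives roughly 4.1 ms, which matches the figure quoted above.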
Top-k uses select_nth_unstable_by (O(N) average) instead of a
full sort (O(N log N)). At 1M chunks, selecting the top 100 costs
~1M ops versus ~20M, since log2(1M) is about 20.
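The scoring and selection steps described above can be sketched without ndarray. This is a minimal illustration under stated assumptions, not the crate's implementation: `top_k_by_dot` is a hypothetical name, the embeddings are passed as a flat row-major `&[f32]` instead of `Array2<f32>`, and the selector is assumed to be a list of chunk indices that are allowed to appear in the result.

```rust
/// Hypothetical stand-in for the search described above.
/// `chunks` is a flat row-major buffer of n_chunks * query.len() values.
/// `selector`, if present, is ASSUMED to list the permitted chunk indices.
pub fn top_k_by_dot(
    query: &[f32],
    chunks: &[f32],
    top_k: usize,
    selector: Option<&[usize]>,
) -> Vec<(usize, f32)> {
    let dim = query.len();
    if dim == 0 || top_k == 0 {
        return Vec::new();
    }

    // scores = chunk_embeddings @ query_embedding, one dot product per row.
    let scores: Vec<f32> = chunks
        .chunks_exact(dim)
        .map(|row| row.iter().zip(query).map(|(a, b)| a * b).sum::<f32>())
        .collect();

    // Apply the optional selector; out-of-range indices are skipped.
    let mut scored: Vec<(usize, f32)> = match selector {
        Some(ids) => ids
            .iter()
            .copied()
            .filter(|&i| i < scores.len())
            .map(|i| (i, scores[i]))
            .collect(),
        None => scores.iter().copied().enumerate().collect(),
    };

    // Descending by score; total_cmp gives a total order even if NaN appears.
    let desc = |a: &(usize, f32), b: &(usize, f32)| b.1.total_cmp(&a.1);

    // O(N) average partition: the top_k best land in the first top_k slots.
    if scored.len() > top_k {
        scored.select_nth_unstable_by(top_k - 1, desc);
        scored.truncate(top_k);
    }
    // Only the survivors are sorted, so this costs O(k log k).
    scored.sort_unstable_by(desc);
    scored
}
```

The partition step leaves the best `top_k` entries in arbitrary order, which is why the final sort over the survivors is still needed. Ties between equal scores may be returned in either order because both steps are unstable.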