Module hybrid

Hybrid search: RRF fusion of semantic + BM25, then boosts and rerank.

Port of ~/src/semble/src/semble/search.py. Three entry points:

  • search_semantic — cosine similarity over the dense index.
  • search_bm25 — BM25 scoring (re-exported from the bm25 module).
  • search_hybrid — fuses both ranked lists via Reciprocal Rank Fusion (k=60), over-fetching top_k * 5 candidates, then applies ripvec’s boost_multi_chunk_files + apply_query_boost + the penalty-aware rerank_topk.
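As a rough illustration of the fusion step in search_hybrid, the sketch below combines two ranked lists of chunk ids with Reciprocal Rank Fusion at k = 60. The function name `rrf_fuse`, the `usize` chunk ids, the 1-based ranks, and the `alpha` / `1 - alpha` weighting are all assumptions made for this example; the crate's actual signature and weighting formula may differ, and the boost and rerank stages are not shown.

```rust
use std::collections::HashMap;

/// Smoothing constant from the module docs (k = 60).
const RRF_K: f32 = 60.0;

/// Fuse two ranked lists of chunk ids with alpha-weighted RRF.
/// ASSUMPTION: `alpha` weights the semantic list and `1 - alpha` the BM25
/// list; this is an illustrative formula, not the crate's confirmed one.
fn rrf_fuse(semantic: &[usize], bm25: &[usize], alpha: f32, top_k: usize) -> Vec<(usize, f32)> {
    let mut scores: HashMap<usize, f32> = HashMap::new();
    for (rank, &id) in semantic.iter().enumerate() {
        // Ranks are treated as 1-based, so the best hit contributes alpha / (k + 1).
        *scores.entry(id).or_insert(0.0) += alpha / (RRF_K + rank as f32 + 1.0);
    }
    for (rank, &id) in bm25.iter().enumerate() {
        *scores.entry(id).or_insert(0.0) += (1.0 - alpha) / (RRF_K + rank as f32 + 1.0);
    }
    let mut fused: Vec<(usize, f32)> = scores.into_iter().collect();
    // Sort by descending score; break ties by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused.truncate(top_k);
    fused
}
```

Under these assumptions, a caller that wants `top_k` final results would pass in the top `top_k * 5` ids from each ranker and ask for `top_k * 5` fused candidates, leaving the final cut to the boost and rerank stages.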

Constants

RRF_K
Reciprocal Rank Fusion smoothing constant. Matches Python _RRF_K = 60 from search.py:11.
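For orientation, the usual form of Reciprocal Rank Fusion is written out below, assuming 1-based ranks (the port may index ranks differently). Here $R$ is the set of ranked lists and $\operatorname{rank}_r(d)$ is the position of chunk $d$ in list $r$; both symbols are introduced for this illustration only.

$$
\operatorname{RRF}(d) = \sum_{r \in R} \frac{1}{k + \operatorname{rank}_r(d)}, \qquad k = 60
$$

With $k = 60$, rank 1 contributes $1/61 \approx 0.01639$, rank 2 contributes $1/62 \approx 0.01613$, and rank 10 contributes $1/70 \approx 0.01429$. The constant flattens the gap between neighboring ranks, so a chunk that appears in both lists tends to outscore one that tops only a single list.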

Functions

search_hybrid
Hybrid search: alpha-weighted RRF fusion of semantic + BM25, followed by file-coherence + query boosts and the penalty-aware reranker. Mirrors search.py:search_hybrid.
search_semantic
Pure semantic search: scores every chunk by dot product against the query embedding (equivalent to cosine similarity when the embeddings are L2-normalized), applies the optional selector mask, then returns the top-k results.
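The semantic path described for search_semantic can be sketched as follows. The function name `top_k_dot`, the `Vec<f32>` embedding layout, and the `Option<&[bool]>` shape of the selector mask are hypothetical choices for this example and are not taken from the crate.

```rust
/// Score chunks by dot product with the query and return the best `top_k`.
/// ASSUMPTION: the selector mask is a per-chunk boolean slice where `true`
/// means "eligible"; the real crate may represent it differently.
fn top_k_dot(
    query: &[f32],
    chunks: &[Vec<f32>],
    mask: Option<&[bool]>,
    top_k: usize,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        // No mask means every chunk is eligible; entries missing from a
        // short mask are treated as excluded.
        .filter(|(i, _)| mask.map_or(true, |m| m.get(*i).copied().unwrap_or(false)))
        .map(|(i, emb)| {
            let dot: f32 = emb.iter().zip(query).map(|(a, b)| a * b).sum();
            (i, dot)
        })
        .collect();
    // Highest score first; ties broken by chunk index for determinism.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    scored.truncate(top_k);
    scored
}
```

This sketch filters before scoring so that masked-out chunks never occupy a top-k slot. Sorting the full score list is the simplest approach; a bounded heap would avoid the full sort on large indexes.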