ripvec-core 1.0.4

Semantic code + document search engine. Cacheless static-embedding + cross-encoder rerank by default; optional ModernBERT/BGE transformer engines with GPU backends. Tree-sitter chunking, hybrid BM25 + PageRank, composable ranking layers.
//! Engine-agnostic searchable-index trait.
//!
//! Both [`HybridIndex`](crate::hybrid::HybridIndex) (transformer
//! engines) and [`RipvecIndex`](crate::encoder::ripvec::index::RipvecIndex)
//! (the cacheless ripvec engine) expose the same operational surface
//! to downstream consumers: a slice of chunks, the embedding row for
//! a chunk by index, and a search method that returns `(chunk_idx,
//! score)` pairs sorted by descending score.
//!
//! Naming the surface as a trait lets LSP / MCP code (navigation,
//! symbols, hover, references, calls) take `&dyn SearchableIndex`
//! instead of `&HybridIndex` and work transparently across engines.
//! Without it, every engine swap requires touching every LSP module.

use crate::chunk::CodeChunk;
use crate::hybrid::SearchMode;

/// Engine-agnostic searchable index.
///
/// Implementations: [`HybridIndex`](crate::hybrid::HybridIndex) for
/// transformer engines,
/// [`RipvecIndex`](crate::encoder::ripvec::index::RipvecIndex) for
/// the ripvec engine.
pub trait SearchableIndex: Send + Sync {
    /// Borrow the indexed chunks.
    fn chunks(&self) -> &[CodeChunk];

    /// Search by text query.
    ///
    /// Returns `(chunk_idx, score)` pairs sorted by descending score.
    /// Scores are normalized to `[0, 1]` regardless of mode, so callers
    /// can apply a single threshold consistently.
    fn search(&self, query_text: &str, top_k: usize, mode: SearchMode) -> Vec<(usize, f32)>;

    /// Search by similarity to an existing chunk's embedding.
    ///
    /// The caller passes the index of the chunk whose embedding should
    /// be used as the query vector. This is the canonical
    /// `goto_definition` pattern: the LSP layer identifies the chunk at
    /// the cursor, then asks the index for structurally similar chunks
    /// elsewhere.
    ///
    /// If `chunk_idx` is out of range, or the engine cannot provide
    /// an embedding for it (keyword-only mode, or the embedding row is
    /// not stored), implementations fall back to a text-only search
    /// for `query_text` via [`Self::search`].
    fn search_from_chunk(
        &self,
        chunk_idx: usize,
        query_text: &str,
        top_k: usize,
        mode: SearchMode,
    ) -> Vec<(usize, f32)>;
}
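
// Illustrative sketch, not part of the crate: a toy implementation of the
// trait above plus a consumer that takes `&dyn SearchableIndex`. It shows
// the documented contract (scores in `[0, 1]`, descending order, fallback
// to text search) under stated assumptions. `CodeChunk` and `SearchMode`
// are redefined as stand-ins because their real fields and variants are
// not shown here; `ToyIndex`, `sample_index`, `related_names`, `rank`, and
// `cosine` are hypothetical names. Skipping the query chunk in
// `search_from_chunk` is likewise an assumption drawn from the word
// "elsewhere" in the doc comment.

```rust
// Stand-ins for `crate::chunk::CodeChunk` and `crate::hybrid::SearchMode`.
// The real definitions are not shown in this file; these shapes are assumed.
#[derive(Debug, Clone)]
pub struct CodeChunk {
    pub name: String,
    pub content: String,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchMode {
    Hybrid,
    Keyword,
}

// Same signatures as the trait documented above, repeated so the sketch compiles alone.
pub trait SearchableIndex: Send + Sync {
    fn chunks(&self) -> &[CodeChunk];
    fn search(&self, query_text: &str, top_k: usize, mode: SearchMode) -> Vec<(usize, f32)>;
    fn search_from_chunk(
        &self,
        chunk_idx: usize,
        query_text: &str,
        top_k: usize,
        mode: SearchMode,
    ) -> Vec<(usize, f32)>;
}

/// Hypothetical toy engine: one embedding row per chunk.
pub struct ToyIndex {
    pub chunks: Vec<CodeChunk>,
    pub embeddings: Vec<Vec<f32>>,
}

/// Sort by descending score (stable for ties) and keep the best `top_k`.
fn rank(mut scored: Vec<(usize, f32)>, top_k: usize) -> Vec<(usize, f32)> {
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}

/// Cosine similarity clamped to `[0, 1]`; zero-length vectors score 0.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { (dot / denom).clamp(0.0, 1.0) }
}

impl SearchableIndex for ToyIndex {
    fn chunks(&self) -> &[CodeChunk] {
        &self.chunks
    }

    // Text search: score is the fraction of query terms found in the chunk.
    fn search(&self, query_text: &str, top_k: usize, _mode: SearchMode) -> Vec<(usize, f32)> {
        let terms: Vec<&str> = query_text.split_whitespace().collect();
        if terms.is_empty() {
            return Vec::new();
        }
        let scored: Vec<(usize, f32)> = self.chunks.iter().enumerate().map(|(i, c)| {
            let hits = terms.iter().filter(|t| c.content.contains(**t)).count();
            (i, hits as f32 / terms.len() as f32)
        }).collect();
        rank(scored, top_k)
    }

    fn search_from_chunk(
        &self,
        chunk_idx: usize,
        query_text: &str,
        top_k: usize,
        mode: SearchMode,
    ) -> Vec<(usize, f32)> {
        // Fall back to text search when no embedding is usable.
        let query_vec = match self.embeddings.get(chunk_idx) {
            Some(v) if mode != SearchMode::Keyword => v,
            _ => return self.search(query_text, top_k, mode),
        };
        let scored: Vec<(usize, f32)> = self.embeddings.iter().enumerate()
            .filter(|(i, _)| *i != chunk_idx) // assumption: skip the query chunk
            .map(|(i, v)| (i, cosine(query_vec, v)))
            .collect();
        rank(scored, top_k)
    }
}

/// Engine-agnostic consumer: works with any `SearchableIndex`.
pub fn related_names(index: &dyn SearchableIndex, chunk_idx: usize, query: &str, threshold: f32) -> Vec<String> {
    index.search_from_chunk(chunk_idx, query, 10, SearchMode::Hybrid)
        .into_iter()
        .filter(|(_, score)| *score >= threshold)
        .map(|(i, _)| index.chunks()[i].name.clone())
        .collect()
}

/// Three made-up chunks with two-dimensional embeddings.
pub fn sample_index() -> ToyIndex {
    let chunk = |name: &str, content: &str| CodeChunk { name: name.into(), content: content.into() };
    ToyIndex {
        chunks: vec![
            chunk("parse_config", "fn parse_config(path: &str) -> Config"),
            chunk("load_config", "fn load_config(path: &str) -> Config"),
            chunk("render", "fn render(frame: &Frame)"),
        ],
        embeddings: vec![vec![1.0, 0.0], vec![0.9, 0.1], vec![0.0, 1.0]],
    }
}
```

// Because `related_names` only sees `&dyn SearchableIndex`, swapping
// `ToyIndex` for another engine would not change the consumer, and the
// single `threshold` works in every mode since scores stay in `[0, 1]`.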