Trait VectorEncoder 

pub trait VectorEncoder: Send + Sync {
    // Required methods
    fn embed_root(
        &self,
        root: &Path,
        cfg: &SearchConfig,
        profiler: &Profiler,
    ) -> Result<(Vec<CodeChunk>, Vec<Vec<f32>>)>;
    fn hidden_dim(&self) -> usize;
    fn identity(&self) -> &str;
}
Trait that abstracts text/chunks → embedding vectors.

The implementation owns its full pipeline (walk, chunk, encode).

§Object safety

The trait is object-safe, so dyn VectorEncoder is constructible. Every method takes &self and returns a concrete type, and the trait declares no associated types or generic methods.
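As a minimal sketch of what object safety allows, the block below redeclares the trait with the same shape and returns an encoder as a trait object. CodeChunk, SearchConfig, Profiler, and Result are hypothetical stand-ins written only so the sketch compiles on its own, and ZeroEncoder and make_encoder are invented for illustration; none of them are the crate's real definitions.

```rust
use std::path::Path;

// Hypothetical stand-ins for the crate's types, so the sketch compiles alone.
pub struct CodeChunk {
    pub text: String,
}
pub struct SearchConfig;
pub struct Profiler;
pub type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// Same shape as the trait documented on this page.
pub trait VectorEncoder: Send + Sync {
    fn embed_root(
        &self,
        root: &Path,
        cfg: &SearchConfig,
        profiler: &Profiler,
    ) -> Result<(Vec<CodeChunk>, Vec<Vec<f32>>)>;
    fn hidden_dim(&self) -> usize;
    fn identity(&self) -> &str;
}

// Hypothetical encoder: emits one zero vector and never touches the filesystem.
pub struct ZeroEncoder {
    dim: usize,
}

impl VectorEncoder for ZeroEncoder {
    fn embed_root(
        &self,
        root: &Path,
        _cfg: &SearchConfig,
        _profiler: &Profiler,
    ) -> Result<(Vec<CodeChunk>, Vec<Vec<f32>>)> {
        let chunk = CodeChunk {
            text: root.display().to_string(),
        };
        Ok((vec![chunk], vec![vec![0.0; self.dim]]))
    }

    fn hidden_dim(&self) -> usize {
        self.dim
    }

    fn identity(&self) -> &str {
        "example/zero-encoder"
    }
}

// Callers hold the encoder behind a trait object and never name the concrete type.
pub fn make_encoder(dim: usize) -> Box<dyn VectorEncoder> {
    Box::new(ZeroEncoder { dim })
}
```

Because the caller only sees Box<dyn VectorEncoder>, a different engine could be swapped in without changing any code that consumes the encoder.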

§Thread safety

Send + Sync is required because the encoder is shared across the indexing pipeline’s rayon and channel-based workers.
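The Send + Sync supertrait bounds are what let one encoder be shared between worker threads. The sketch below shows this with plain std::thread rather than rayon or channels, to stay dependency-free. The trait is a trimmed stand-in (embed_root is left out for brevity), and FixedEncoder and dims_seen_by_workers are hypothetical names.

```rust
use std::sync::Arc;
use std::thread;

// Trimmed stand-in for the trait: embed_root is omitted to keep the sketch short.
pub trait VectorEncoder: Send + Sync {
    fn hidden_dim(&self) -> usize;
    fn identity(&self) -> &str;
}

// Hypothetical encoder used only for this illustration.
pub struct FixedEncoder;

impl VectorEncoder for FixedEncoder {
    fn hidden_dim(&self) -> usize {
        256
    }

    fn identity(&self) -> &str {
        "example/fixed"
    }
}

// Spawn `workers` threads that each read from the same shared encoder.
pub fn dims_seen_by_workers(encoder: Arc<dyn VectorEncoder>, workers: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let enc = Arc::clone(&encoder);
            // Moving `enc` into the thread compiles only because the trait
            // object is Send + Sync through the supertrait bounds.
            thread::spawn(move || enc.hidden_dim())
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

Removing Send + Sync from the stand-in trait makes the thread::spawn call fail to compile, which is the guarantee the bound exists to provide.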

Required Methods§

fn embed_root(&self, root: &Path, cfg: &SearchConfig, profiler: &Profiler) -> Result<(Vec<CodeChunk>, Vec<Vec<f32>>)>

Walk root, chunk every supported file, and embed every chunk.

Returns the chunks and their embeddings as two parallel vectors: the chunk at index i has its embedding at index i of the embeddings vector. The ripvec engine uses an AST-merge chunker and projects its chunks onto CodeChunk.

cfg carries pipeline tuning (walk filters, etc.).
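One way a caller might consume the parallel ordering of the returned pair is to zip the two vectors after checking that their lengths agree. This is only a sketch: CodeChunk here is a hypothetical stand-in with made-up fields, and pair_up is not part of the crate.

```rust
// Hypothetical stand-in for the crate's chunk type; the real fields may differ.
pub struct CodeChunk {
    pub path: String,
    pub text: String,
}

// Pair each chunk with the embedding at the same index.
// Refuses input where the two vectors disagree in length, since the
// index correspondence would be meaningless in that case.
pub fn pair_up(
    chunks: Vec<CodeChunk>,
    embeddings: Vec<Vec<f32>>,
) -> Result<Vec<(CodeChunk, Vec<f32>)>, String> {
    if chunks.len() != embeddings.len() {
        return Err(format!(
            "{} chunks but {} embeddings",
            chunks.len(),
            embeddings.len()
        ));
    }
    Ok(chunks.into_iter().zip(embeddings).collect())
}
```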

§Errors

Returns an error if file walking, chunking, or inference fails.

fn hidden_dim(&self) -> usize

Hidden dimension of the emitted embeddings.

Used by SearchIndex for the embedding matrix shape and by the cache layer to refuse dimension-mismatched loads.
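To illustrate both uses of the dimension, the sketch below flattens per-chunk vectors into a row-major buffer of shape (rows, dim) and rejects a cached dimension that differs from the encoder's. It makes no claim about how SearchIndex or the cache layer are actually written; to_row_major and check_cached_dim are hypothetical helpers.

```rust
// Flatten per-chunk vectors into one row-major buffer of shape (rows, dim).
// Returns the buffer together with the row count.
pub fn to_row_major(embeddings: &[Vec<f32>], dim: usize) -> Result<(Vec<f32>, usize), String> {
    let mut flat = Vec::with_capacity(embeddings.len() * dim);
    for (i, row) in embeddings.iter().enumerate() {
        if row.len() != dim {
            return Err(format!(
                "row {} has length {}, expected {}",
                i,
                row.len(),
                dim
            ));
        }
        flat.extend_from_slice(row);
    }
    Ok((flat, embeddings.len()))
}

// Hypothetical cache-side check: refuse a stored matrix whose dimension
// does not match what the current encoder reports from hidden_dim().
pub fn check_cached_dim(cached_dim: usize, encoder_dim: usize) -> Result<(), String> {
    if cached_dim == encoder_dim {
        Ok(())
    } else {
        Err(format!(
            "cached dimension {} does not match encoder dimension {}",
            cached_dim, encoder_dim
        ))
    }
}
```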

fn identity(&self) -> &str

Stable identifier used as the cache-manifest key.

For the ripvec engine, this is the Model2Vec repo string (e.g. "minishlab/potion-code-16M"). It is also consulted for logging and diagnostics.
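A manifest keyed by the identity string could look roughly like the sketch below. The real cache format is not shown on this page, so Manifest, CacheEntry, their fields, and the example identity strings are all hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical manifest entry; the real cache layout is an assumption here.
pub struct CacheEntry {
    pub hidden_dim: usize,
    pub file: String,
}

// Hypothetical manifest keyed by the string returned from identity().
pub struct Manifest {
    entries: HashMap<String, CacheEntry>,
}

impl Manifest {
    pub fn new() -> Self {
        Manifest {
            entries: HashMap::new(),
        }
    }

    // Store or overwrite the entry for one encoder identity.
    pub fn record(&mut self, identity: &str, entry: CacheEntry) {
        self.entries.insert(identity.to_string(), entry);
    }

    // Look up the entry for an encoder; a different identity never matches,
    // so embeddings from one model are not served to another.
    pub fn lookup(&self, identity: &str) -> Option<&CacheEntry> {
        self.entries.get(identity)
    }
}
```

This is why the identifier needs to be stable: if it changed between runs for the same model, every lookup would miss and the cache would be rebuilt each time.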

Implementors§