pub struct StaticEncoder { /* private fields */ }
CPU-only static encoder.
Owns a loaded StaticEmbedModel plus identity metadata. The
embedder is constructed by main.rs::load_pipeline via
StaticEncoder::from_pretrained, passing either a local path
containing the Model2Vec files or (planned) an HF repo ID.
Implementations§
impl StaticEncoder
pub fn encode_query(&self, query: &str) -> Vec<f32>
Encode a query string into a single embedding row.
Used by RipvecIndex::search for hybrid/semantic dispatch.
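How a single query row might be compared against stored chunk embeddings can be sketched as below. This is only an illustration: it assumes cosine similarity as the ranking metric, which the text above does not confirm, and cosine and rank_rows are hypothetical helper names, not part of RipvecIndex.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical helper: indices of `rows`, ordered from most to least
/// similar to `query` (the Vec<f32> an encoder would return for a query).
fn rank_rows(query: &[f32], rows: &[Vec<f32>]) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = rows
        .iter()
        .enumerate()
        .map(|(i, row)| (i, cosine(query, row)))
        .collect();
    // Sort by descending score; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().map(|(i, _)| i).collect()
}
```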
pub fn from_pretrained(model_repo: &str) -> Result<Self>
Load a model by HuggingFace repo ID or local path.
Two acceptance shapes:
- Local path: if model_repo names an existing directory, load directly from it. Used by the parity test fixture path (/tmp/potion-base-32M) and any user pre-staging files.
- HuggingFace repo ID: otherwise treat as org/repo, download config.json / tokenizer.json / model.safetensors via hf-hub into ~/.cache/huggingface/hub/, and load from there. Matches load_classic_cpu / load_modernbert_cpu's behaviour so the user-facing API is consistent: bare --model ripvec with no --model-repo flag works.
§Errors
Propagates the underlying I/O, download, or parse error if the files cannot be obtained or the safetensors layout is unrecognized.
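The local-path versus repo-ID decision described above might look roughly like the following. It is a minimal sketch using only the standard library; ModelSource and classify_source are hypothetical names introduced for illustration, and the actual download and load steps are left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical classification of the `model_repo` argument.
#[derive(Debug, PartialEq)]
enum ModelSource {
    /// `model_repo` names an existing directory on disk.
    LocalDir(PathBuf),
    /// Anything else is treated as an `org/repo` identifier.
    HubRepo(String),
}

/// Decide which acceptance shape applies: an existing directory wins,
/// otherwise the string is passed through as a repo ID.
fn classify_source(model_repo: &str) -> ModelSource {
    let path = Path::new(model_repo);
    if path.is_dir() {
        ModelSource::LocalDir(path.to_path_buf())
    } else {
        ModelSource::HubRepo(model_repo.to_string())
    }
}
```

One consequence of this ordering is that a local directory whose relative path happens to look like org/repo takes precedence over a hub repo of the same name.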
Trait Implementations§
impl VectorEncoder for StaticEncoder
fn embed_root( &self, root: &Path, cfg: &SearchConfig, profiler: &Profiler, ) -> Result<(Vec<CodeChunk>, Vec<Vec<f32>>)>
Three-stage bounded-queue pipeline:
- Chunk producer: rayon par_iter over the file list. Each file is read, parsed by tree-sitter (or line-merged on fallback), and emitted as (CodeChunk, String) pairs into a bounded channel of capacity PIPELINE_BATCH_SIZE * 8.
- Batch accumulator: a single scoped thread drains the chunk channel, packs PIPELINE_BATCH_SIZE pairs per batch, and forwards into a bounded channel of capacity PIPELINE_RING_SIZE.
- Encode worker: a single scoped thread receives batches and calls StaticEmbedModel::encode_batch, whose internal par_iter lights up rayon for the pool_ids kernel.
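The three stages can be sketched with standard-library primitives alone. This is an illustration under stated assumptions, not the crate's implementation: the producer here is a single thread rather than a rayon par_iter, a "chunk" is simply one line of text, fake_encode_batch stands in for StaticEmbedModel::encode_batch, and the two constant values are made up because the text above does not give them.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Assumed values for illustration only; the real constants are not stated.
const PIPELINE_BATCH_SIZE: usize = 4;
const PIPELINE_RING_SIZE: usize = 2;

/// Stand-in encoder: maps each chunk to a one-element "embedding" (its length).
fn fake_encode_batch(batch: &[String]) -> Vec<Vec<f32>> {
    batch.iter().map(|s| vec![s.len() as f32]).collect()
}

/// Runs producer -> accumulator -> encoder over two bounded channels.
fn run_pipeline(files: Vec<String>) -> Vec<Vec<f32>> {
    let (chunk_tx, chunk_rx) = sync_channel::<String>(PIPELINE_BATCH_SIZE * 8);
    let (batch_tx, batch_rx) = sync_channel::<Vec<String>>(PIPELINE_RING_SIZE);

    thread::scope(|s| {
        // Stage 1: chunk producer. `send` blocks once the channel is full,
        // which is what bounds in-flight memory.
        s.spawn(move || {
            for file in &files {
                for line in file.lines() {
                    if chunk_tx.send(line.to_string()).is_err() {
                        return; // receiver gone; stop producing
                    }
                }
            }
            // chunk_tx is dropped here, which closes the chunk channel.
        });

        // Stage 2: batch accumulator.
        s.spawn(move || {
            let mut batch = Vec::with_capacity(PIPELINE_BATCH_SIZE);
            for chunk in chunk_rx {
                batch.push(chunk);
                if batch.len() == PIPELINE_BATCH_SIZE {
                    let full = std::mem::replace(
                        &mut batch,
                        Vec::with_capacity(PIPELINE_BATCH_SIZE),
                    );
                    if batch_tx.send(full).is_err() {
                        return;
                    }
                }
            }
            // Flush the final partial batch, if any.
            if !batch.is_empty() {
                let _ = batch_tx.send(batch);
            }
        });

        // Stage 3: encode worker; collects embeddings in arrival order.
        let encoder = s.spawn(move || {
            let mut out = Vec::new();
            for batch in batch_rx {
                out.extend(fake_encode_batch(&batch));
            }
            out
        });
        encoder.join().expect("encode worker panicked")
    })
}
```

Dropping each sender when its stage finishes is what lets the downstream loop terminate, so no explicit end-of-stream marker is needed.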
Why this shape:
- The previous "chunk all, then embed all" implementation held the entire Vec<String> of chunk contents in memory between phases. On the linux corpus that was ~400 MB peak. The bounded queues cap in-flight memory at PIPELINE_BATCH_SIZE * 8 + PIPELINE_RING_SIZE * PIPELINE_BATCH_SIZE chunks regardless of corpus size, which is under 15 MB.
- The chunk phase (13s on linux) is hidden inside the embed phase (70s) instead of serializing before it. Pre-pipeline profile showed user-time at 394s on 82s wall = 4.8x parallelism on 12 cores; pipeline lets idle cores chew on chunking while embed runs.
- Mirrors embed::embed_all_streaming's shape so the two pipelines (BERT + semble) share architectural conventions.
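The two figures quoted above follow from simple arithmetic, sketched here. The function names are hypothetical, and the batch and ring sizes used in any call are assumptions, since the actual constant values are not given.

```rust
/// Upper bound on chunks held in the two bounded queues at once:
/// the chunk channel (batch_size * 8) plus the batch channel
/// (ring_size batches of batch_size chunks each).
fn max_in_flight_chunks(batch_size: usize, ring_size: usize) -> usize {
    batch_size * 8 + ring_size * batch_size
}

/// Effective parallelism: CPU user time divided by wall-clock time.
fn effective_parallelism(user_secs: f64, wall_secs: f64) -> f64 {
    user_secs / wall_secs
}
```

With the profile numbers from the text, effective_parallelism(394.0, 82.0) is roughly 4.8. With an assumed batch size of 1024 and ring size of 4, the bound is 12288 chunks; the resulting byte count depends on average chunk size, which the text does not state.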
Hidden dimension of the emitted embeddings.
Auto Trait Implementations§
impl !Freeze for StaticEncoder
impl RefUnwindSafe for StaticEncoder
impl Send for StaticEncoder
impl Sync for StaticEncoder
impl Unpin for StaticEncoder
impl UnsafeUnpin for StaticEncoder
impl UnwindSafe for StaticEncoder
Blanket Implementations§
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
The archived version of the pointer metadata for this type.
fn pointer_metadata(_: &<T as ArchivePointee>::ArchivedMetadata) -> <T as Pointee>::Metadata
Converts some archived metadata to the pointer metadata for itself.
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Downcast for T
where
    T: Any,
fn into_any(self: Box<T>) -> Box<dyn Any>
Converts Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>, which can then be downcast into Box<dyn ConcreteType> where ConcreteType implements Trait.
fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
Converts Rc<Trait> (where Trait: Downcast) to Rc<Any>, which can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.
fn as_any(&self) -> &(dyn Any + 'static)
Converts &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any's vtable from &Trait's.
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
Converts &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any's vtable from &mut Trait's.
impl<T> DowncastSend for T
impl<T> DowncastSync for T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
Returns the layout of the type.
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
Returns whether the given value has been niched.
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.