Struct LocalProvider 

pub struct LocalProvider(/* private fields */);
Available on crate feature semantic only.

Local ONNX embedding provider.

Cheap to clone (Arc under the hood). All inference is serialized behind an internal mutex, so the same provider can be shared safely across threads. For high-throughput scenarios, use one LocalProvider per worker so that workers do not contend for the same lock.
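
The sharing model described above can be illustrated with a small stand-in. The sketch below does not use txtfp: `Session` and `SharedEmbedder` are hypothetical names that only mimic the Arc-plus-mutex layout the description implies, and `run_shared` is an illustrative helper. It shows that clones of such a handle all take the same lock.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the ONNX session; it only counts calls.
struct Session {
    calls: usize,
}

/// Hypothetical stand-in for a provider: cloning copies the `Arc`, not the session.
#[derive(Clone)]
struct SharedEmbedder(Arc<Mutex<Session>>);

impl SharedEmbedder {
    fn new() -> Self {
        SharedEmbedder(Arc::new(Mutex::new(Session { calls: 0 })))
    }

    /// Pretend inference: every call takes the same lock, so calls run one at a time.
    fn embed(&self, input: &str) -> usize {
        let mut session = self.0.lock().unwrap();
        session.calls += 1;
        input.len()
    }

    /// Total number of calls seen by the shared session.
    fn calls(&self) -> usize {
        self.0.lock().unwrap().calls
    }
}

/// Run `workers` threads that all share clones of one embedder.
fn run_shared(workers: usize, per_worker: usize) -> usize {
    let embedder = SharedEmbedder::new();
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let local = embedder.clone(); // cheap: bumps a reference count
            thread::spawn(move || {
                for _ in 0..per_worker {
                    local.embed("hello");
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    embedder.calls()
}
```

In this stand-in, cloning adds no parallelism because every clone shares one lock. If the real provider behaves the same way, a separately constructed provider per worker would avoid the contention, at the cost of loading the model once per worker.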

Implementations§

impl LocalProvider

pub fn builder() -> LocalProviderBuilder

Start a LocalProviderBuilder.

Use the builder when you need a non-default pooling strategy, custom prefixes, custom max sequence length, or a hand-tuned thread count.

pub fn from_pretrained(model_id: &str) -> Result<Self>

Construct from a Hugging Face Hub model id.

Downloads the ONNX model and tokenizer on first use via hf-hub and caches them in the user’s HF cache directory. The pooling strategy and the query/document prefixes are looked up from internal tables (BGE → Cls + query prefix, E5 → Mean + query/passage prefixes, …).

§Arguments
  • model_id — HF identifier such as "BAAI/bge-small-en-v1.5".
§Errors

Returns crate::Error::Config if the repo lacks an ONNX file or tokenizer.json; crate::Error::Onnx for ort load failures; crate::Error::Tokenizer for tokenizer parse failures.

§Example
use txtfp::semantic::LocalProvider;
let p = LocalProvider::from_pretrained("BAAI/bge-small-en-v1.5")?;
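
The internal lookup tables are not shown on this page. As a rough illustration of the idea, the sketch below maps a model id to a pooling strategy and prefixes using only the values quoted in these docs. `ModelDefaults`, `defaults_for`, the local `Pooling` enum, and the substring matching rule are all assumptions made for the example, not the crate's implementation.

```rust
/// Hypothetical mirror of the crate's pooling enum (only the two variants named in these docs).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pooling {
    Cls,
    Mean,
}

/// Per-model defaults: pooling plus optional query and document prefixes.
#[derive(Debug, PartialEq)]
struct ModelDefaults {
    pooling: Pooling,
    query_prefix: Option<&'static str>,
    document_prefix: Option<&'static str>,
}

/// Assumption: model families are recognized by a substring of the lowercased id.
fn defaults_for(model_id: &str) -> Option<ModelDefaults> {
    let id = model_id.to_ascii_lowercase();
    if id.contains("bge") {
        Some(ModelDefaults {
            pooling: Pooling::Cls,
            query_prefix: Some("Represent this sentence for searching relevant passages: "),
            document_prefix: None,
        })
    } else if id.contains("e5") {
        Some(ModelDefaults {
            pooling: Pooling::Mean,
            query_prefix: Some("query: "),
            document_prefix: Some("passage: "),
        })
    } else {
        None // unknown family: how the real crate handles this is not documented here
    }
}
```

For example, `defaults_for("BAAI/bge-small-en-v1.5")` yields `Pooling::Cls` with a query prefix and no document prefix.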

pub fn from_onnx( onnx_path: &Path, tokenizer_path: &Path, pooling: Pooling, ) -> Result<Self>

Construct from explicit ONNX + tokenizer paths.

Use this for self-hosted models or air-gapped deployments where downloading from the HF Hub is not an option.

§Arguments
  • onnx_path — path to a .onnx graph file.
  • tokenizer_path — path to a tokenizer.json (HF tokenizer format).
  • pooling — output pooling strategy (BGE → Pooling::Cls, E5/MiniLM → Pooling::Mean).
§Errors

Returns crate::Error::Onnx if the model fails to load, or crate::Error::Tokenizer if the tokenizer JSON is invalid.

§Example
use std::path::Path;
use txtfp::semantic::{LocalProvider, Pooling};

let p = LocalProvider::from_onnx(
    Path::new("/srv/models/embedder.onnx"),
    Path::new("/srv/models/tokenizer.json"),
    Pooling::Cls,
)?;
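
Because from_onnx requires an explicit pooling choice, it may help to see what the two strategies compute. The sketch below is a generic illustration of CLS and mean pooling over per-token hidden states, not the crate's code. The function names and the `Vec<Vec<f32>>` layout (one row per token) are assumptions for the example.

```rust
/// CLS pooling: use the hidden state of the first token as the sentence vector.
fn cls_pool(hidden: &[Vec<f32>]) -> Option<Vec<f32>> {
    hidden.first().cloned()
}

/// Mean pooling: average the hidden states of non-padding tokens.
/// `mask[i]` is 1 for a real token and 0 for padding.
fn mean_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Option<Vec<f32>> {
    let dim = hidden.first()?.len();
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (row, &m) in hidden.iter().zip(mask) {
        if m == 1 {
            for (s, v) in sum.iter_mut().zip(row) {
                *s += *v;
            }
            count += 1.0;
        }
    }
    if count == 0.0 {
        return None; // every position was padding
    }
    Some(sum.into_iter().map(|s| s / count).collect())
}
```

Choosing the wrong strategy for a model still produces a vector of the right size, so the mismatch typically shows up as poor retrieval quality and not as an error.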

pub fn embed_document(&self, input: &str) -> Result<Embedding>

Embed input as a document.

Prepends the model’s document prefix (e.g. "passage: " for E5) when applicable, then runs tokenize + inference + pooling. The returned Embedding carries model_id = Some(...) so downstream comparisons can detect cross-model leaks.

§Errors

See LocalProvider::embed for the error variants.
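
To show why tagging an embedding with its model id is useful, here is a minimal sketch of a comparison that rejects cross-model pairs. The `Embedding` struct below is a hypothetical stand-in (the real type's fields are not shown on this page), and `checked_cosine` is an illustrative helper, not a function of the crate.

```rust
/// Hypothetical stand-in for the crate's `Embedding`.
struct Embedding {
    vector: Vec<f32>,
    model_id: Option<String>,
}

/// Cosine similarity that refuses to compare vectors tagged with different models.
fn checked_cosine(a: &Embedding, b: &Embedding) -> Result<f32, String> {
    // Only reject when both sides carry a tag and the tags differ.
    if let (Some(ma), Some(mb)) = (&a.model_id, &b.model_id) {
        if ma != mb {
            return Err(format!("model mismatch: {} vs {}", ma, mb));
        }
    }
    if a.vector.len() != b.vector.len() {
        return Err("dimension mismatch".to_string());
    }
    let dot: f32 = a.vector.iter().zip(&b.vector).map(|(x, y)| x * y).sum();
    let na: f32 = a.vector.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.vector.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return Err("zero-norm vector".to_string());
    }
    Ok(dot / (na * nb))
}
```

Two vectors from different models can have the same dimension, so a length check alone would not catch the mix-up. The model tag does.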

pub fn embed_query(&self, input: &str) -> Result<Embedding>

Embed input as a query.

Prepends the model’s query prefix (e.g. "Represent this sentence for searching relevant passages: " for bge-*, "query: " for e5-*). For models that don’t use asymmetric encoding, this is identical to embed_document.

Use embed_query for the search side of a retrieval pipeline and embed_document for the corpus side.
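
The query/corpus split can be sketched as a small ranking routine. The code below is self-contained and does not call txtfp: `rank` takes two caller-supplied closures standing in for embed_query and embed_document, and `toy_embed` is a deliberately trivial embedder used only to make the example runnable. With the real provider the two closures would wrap the corresponding methods and handle their Result values.

```rust
/// Rank `corpus` against `query`, returning document indices by descending dot product.
/// `embed_query` and `embed_document` are stand-ins for the two provider methods.
fn rank<Q, D>(query: &str, corpus: &[&str], embed_query: Q, embed_document: D) -> Vec<usize>
where
    Q: Fn(&str) -> Vec<f32>,
    D: Fn(&str) -> Vec<f32>,
{
    let q = embed_query(query); // search side
    let mut scored: Vec<(usize, f32)> = corpus
        .iter()
        .enumerate()
        .map(|(i, doc)| {
            let d = embed_document(*doc); // corpus side
            let dot: f32 = q.iter().zip(&d).map(|(a, b)| a * b).sum();
            (i, dot)
        })
        .collect();
    // Highest score first; the sort is stable, so ties keep corpus order.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().map(|(i, _)| i).collect()
}

/// Toy embedder for demonstration only: counts 'a' and 'b' characters.
fn toy_embed(text: &str) -> Vec<f32> {
    let a = text.chars().filter(|c| *c == 'a').count() as f32;
    let b = text.chars().filter(|c| *c == 'b').count() as f32;
    vec![a, b]
}
```

In a real pipeline the corpus embeddings would normally be computed once and stored, with only the query embedded at search time.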

Trait Implementations§

impl Clone for LocalProvider

fn clone(&self) -> LocalProvider

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for LocalProvider

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl EmbeddingProvider for LocalProvider

type Input = str

The kind of input this provider consumes.

fn embed(&self, input: &str) -> Result<Embedding>

Compute an embedding for input.

fn model_id(&self) -> &str

The model identifier this provider produces.

fn dimension(&self) -> usize

The output dimensionality.
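
Code that should work with any provider can be written against a trait of this shape. The sketch below defines its own local trait with the associated type and three methods listed above. It is a guess at the shape inferred from those signatures, with `String` as a placeholder error type, and the real trait's bounds may differ. `ToyProvider` and `embed_checked` are hypothetical names for illustration.

```rust
/// Hypothetical stand-in for the crate's `Embedding`.
struct Embedding {
    vector: Vec<f32>,
}

/// Local sketch of a provider trait; not the crate's definition.
trait EmbeddingProvider {
    type Input: ?Sized;
    fn embed(&self, input: &Self::Input) -> Result<Embedding, String>;
    fn model_id(&self) -> &str;
    fn dimension(&self) -> usize;
}

/// Toy provider: maps text to a two-element vector of simple statistics.
struct ToyProvider;

impl EmbeddingProvider for ToyProvider {
    type Input = str;

    fn embed(&self, input: &str) -> Result<Embedding, String> {
        if input.is_empty() {
            return Err("empty input".to_string());
        }
        let len = input.len() as f32;
        let spaces = input.bytes().filter(|b| *b == b' ').count() as f32;
        Ok(Embedding { vector: vec![len, spaces] })
    }

    fn model_id(&self) -> &str {
        "toy/v0"
    }

    fn dimension(&self) -> usize {
        2
    }
}

/// Generic caller: embeds and checks the vector length against `dimension()`.
fn embed_checked<P>(provider: &P, input: &P::Input) -> Result<Vec<f32>, String>
where
    P: EmbeddingProvider,
{
    let e = provider.embed(input)?;
    if e.vector.len() != provider.dimension() {
        return Err(format!("{} returned wrong dimension", provider.model_id()));
    }
    Ok(e.vector)
}
```

Writing callers against the trait keeps them independent of whether embeddings come from a local ONNX model or some other backend.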
