pub struct SentenceEncoder { /* private fields */ }
Projects sequences of token embeddings to a single sentence-level vector.
Internally the encoder applies:
- Pooling: aggregate token embeddings with the chosen strategy.
- Projection: a learnable embedding_dim × projection_dim linear layer (bias included) maps the pooled vector to the output space.
- Optional L2 normalisation to unit length.
Weights are initialised from a deterministic LCG seeded by seed.
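The three stages above can be sketched in a few dozen lines. This is a minimal illustration and not the crate's implementation: `SketchEncoder` and `Lcg` are hypothetical names, mean pooling stands in for whichever PoolingStrategy is selected, and the LCG constants (Knuth's MMIX multiplier and increment) and the [-0.5, 0.5) weight range are assumptions.

```rust
/// Minimal linear congruential generator. The constants are an assumption
/// (Knuth's MMIX values); the crate's actual LCG may differ.
struct Lcg(u64);

impl Lcg {
    /// Advance the state and return a value in [-0.5, 0.5).
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the top 53 bits for a uniform value in [0, 1), then centre it.
        (self.0 >> 11) as f64 / (1u64 << 53) as f64 - 0.5
    }
}

/// Hypothetical stand-in for the encoder: mean pooling, linear projection
/// with bias, optional L2 normalisation.
struct SketchEncoder {
    weights: Vec<Vec<f64>>, // projection_dim rows, embedding_dim columns
    bias: Vec<f64>,         // projection_dim entries
    normalize: bool,
}

impl SketchEncoder {
    fn new(embedding_dim: usize, projection_dim: usize, seed: u64, normalize: bool) -> Self {
        let mut rng = Lcg(seed);
        let weights: Vec<Vec<f64>> = (0..projection_dim)
            .map(|_| (0..embedding_dim).map(|_| rng.next_f64()).collect::<Vec<f64>>())
            .collect();
        let bias: Vec<f64> = (0..projection_dim).map(|_| rng.next_f64()).collect();
        Self { weights, bias, normalize }
    }

    /// Assumes `tokens` is non-empty and every row has `embedding_dim` entries.
    fn encode(&self, tokens: &[Vec<f64>]) -> Vec<f64> {
        let dim = tokens[0].len();
        // 1. Pooling: average each coordinate over all tokens.
        let mut pooled = vec![0.0; dim];
        for t in tokens {
            for (p, x) in pooled.iter_mut().zip(t) {
                *p += x / tokens.len() as f64;
            }
        }
        // 2. Projection: out = W * pooled + b.
        let mut out: Vec<f64> = self
            .weights
            .iter()
            .zip(&self.bias)
            .map(|(row, b)| row.iter().zip(&pooled).map(|(w, p)| w * p).sum::<f64>() + b)
            .collect();
        // 3. Optional L2 normalisation; a zero vector is left unchanged.
        if self.normalize {
            let norm = out.iter().map(|v| v * v).sum::<f64>().sqrt();
            if norm > 0.0 {
                out.iter_mut().for_each(|v| *v /= norm);
            }
        }
        out
    }
}
```

Because the weights depend only on the seed and the two dimensions, two sketch encoders built with the same arguments produce identical outputs for the same input.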
Implementations§
impl SentenceEncoder
pub fn new( embedding_dim: usize, projection_dim: usize, pooling: PoolingStrategy, seed: u64, ) -> Self
Create a new SentenceEncoder with LCG-initialised weights.
§Parameters
- embedding_dim: dimensionality of token embeddings fed to encode.
- projection_dim: output dimensionality of sentence embeddings.
- pooling: pooling strategy.
- seed: deterministic PRNG seed.
pub fn with_normalize(self, normalize: bool) -> Self
Enable or disable L2 normalisation of output embeddings.
pub fn encode(&self, token_embeddings: &[Vec<f64>]) -> Result<Vec<f64>>
Encode a sequence of token embeddings into a single sentence vector.
Returns a Vec<f64> of length projection_dim.
§Errors
Returns an error when token_embeddings is empty or any token
embedding has a dimension other than embedding_dim.
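The two error conditions amount to a shape check on the input. A minimal sketch of such a check follows; `validate_tokens` is a hypothetical helper, and the `String` error is a placeholder for the crate's own error type, which this page does not show.

```rust
/// Hypothetical validation helper mirroring the documented error conditions.
/// The `String` error type is a placeholder for the crate's own error type.
fn validate_tokens(token_embeddings: &[Vec<f64>], embedding_dim: usize) -> Result<(), String> {
    // Condition 1: the sequence must contain at least one token.
    if token_embeddings.is_empty() {
        return Err("token_embeddings is empty".to_string());
    }
    // Condition 2: every token embedding must have exactly `embedding_dim` entries.
    for (i, t) in token_embeddings.iter().enumerate() {
        if t.len() != embedding_dim {
            return Err(format!(
                "token {i} has dimension {}, expected {embedding_dim}",
                t.len()
            ));
        }
    }
    Ok(())
}
```

Callers that build token embeddings dynamically may want to run an equivalent check early so that a shape mismatch is reported close to where the data is produced.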
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64
Cosine similarity between two sentence embeddings.
Returns a value in [-1, 1]. Returns 0.0 when either vector has
zero norm.
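A standalone sketch of the computation described here, assuming the usual definition dot(a, b) / (‖a‖ · ‖b‖). Two details are assumptions rather than documented behaviour: inputs of different lengths are compared over the shorter length, and the result is clamped so rounding error cannot push it outside [-1, 1].

```rust
/// Cosine similarity sketch: dot(a, b) / (|a| * |b|), or 0.0 for a zero-norm input.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    // `zip` stops at the shorter slice (an assumption for mismatched lengths).
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    // Clamp to absorb floating-point rounding just outside [-1, 1].
    (dot / (norm_a * norm_b)).clamp(-1.0, 1.0)
}
```

For embeddings that are already unit length (for example, produced with normalisation enabled), the denominator is 1 and the similarity reduces to a plain dot product.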
pub fn normalize(v: &mut [f64])
L2-normalise a vector in place. A zero-norm vector is left unchanged.
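A minimal sketch of in-place L2 normalisation with the zero-norm guard described above. It is written as a free function for illustration and is not the crate's source.

```rust
/// Scale `v` to unit Euclidean length in place; leave a zero vector untouched.
fn normalize(v: &mut [f64]) {
    let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    // Skipping the division avoids turning a zero vector into NaNs.
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}
```

For instance, the vector (3, 4) has norm 5, so it becomes approximately (0.6, 0.8).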
pub fn projection_dim(&self) -> usize
The output (projection) dimension.
pub fn embedding_dim(&self) -> usize
The input (token embedding) dimension.
Trait Implementations§
Auto Trait Implementations§
impl Freeze for SentenceEncoder
impl RefUnwindSafe for SentenceEncoder
impl Send for SentenceEncoder
impl Sync for SentenceEncoder
impl Unpin for SentenceEncoder
impl UnsafeUnpin for SentenceEncoder
impl UnwindSafe for SentenceEncoder
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.