Struct SentenceEncoder 

pub struct SentenceEncoder { /* private fields */ }

Projects a sequence of token embeddings to a single sentence-level vector.

Internally the encoder applies:

  1. Pooling — aggregate token embeddings with the chosen strategy.
  2. Projection — a learnable embedding_dim × projection_dim linear layer (bias included) maps the pooled vector to the output space.
  3. Optional L2 normalisation to unit length.

Weights are initialised from a deterministic linear congruential generator (LCG) seeded by seed, so a given seed always produces the same weights.
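
The three steps above can be sketched as standalone functions. This is a minimal illustration, not the crate's implementation: it assumes mean pooling, a weight layout of embedding_dim rows by projection_dim columns, and the hypothetical helper names mean_pool, project, l2_normalize, and encode_sketch.

```rust
/// Mean pooling: average each dimension across all tokens.
/// Assumes `tokens` is non-empty and every row has the same length.
fn mean_pool(tokens: &[Vec<f64>]) -> Vec<f64> {
    let dim = tokens[0].len();
    let mut pooled = vec![0.0; dim];
    for token in tokens {
        for (acc, x) in pooled.iter_mut().zip(token) {
            *acc += *x;
        }
    }
    let n = tokens.len() as f64;
    pooled.iter_mut().for_each(|x| *x /= n);
    pooled
}

/// Linear projection: out[j] = bias[j] + sum over i of pooled[i] * weights[i][j].
/// The row-major layout of `weights` is an assumption made for this sketch.
fn project(pooled: &[f64], weights: &[Vec<f64>], bias: &[f64]) -> Vec<f64> {
    let mut out = bias.to_vec();
    for (x, row) in pooled.iter().zip(weights) {
        for (o, w) in out.iter_mut().zip(row) {
            *o += x * w;
        }
    }
    out
}

/// Scale `v` to unit L2 length; a zero-norm vector is left unchanged.
fn l2_normalize(v: &mut [f64]) {
    let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
}

/// Pool, project, then optionally normalise.
fn encode_sketch(
    tokens: &[Vec<f64>],
    weights: &[Vec<f64>],
    bias: &[f64],
    normalize: bool,
) -> Vec<f64> {
    let pooled = mean_pool(tokens);
    let mut out = project(&pooled, weights, bias);
    if normalize {
        l2_normalize(&mut out);
    }
    out
}
```

With normalisation enabled, the output has unit length, so the dot product of two outputs equals their cosine similarity.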

Implementations§

impl SentenceEncoder

pub fn new( embedding_dim: usize, projection_dim: usize, pooling: PoolingStrategy, seed: u64, ) -> Self

Create a new SentenceEncoder with LCG-initialised weights.

§Parameters
  • embedding_dim — dimensionality of token embeddings fed to encode.
  • projection_dim — output dimensionality of sentence embeddings.
  • pooling — pooling strategy.
  • seed — deterministic PRNG seed.
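
How a seeded LCG can produce reproducible weights is sketched below. The multiplier and increment are Knuth's MMIX constants and the 1/sqrt(embedding_dim) scaling is a common convention; both are assumptions for illustration, since this page does not document the crate's actual constants or scaling. The names Lcg and init_weights are hypothetical.

```rust
/// Minimal 64-bit linear congruential generator.
/// Constants are Knuth's MMIX values, used here only as an example.
struct Lcg {
    state: u64,
}

impl Lcg {
    fn new(seed: u64) -> Self {
        Lcg { state: seed }
    }

    /// Advance the state and return a float in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits so the value is exactly representable as f64.
        (self.state >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Build an embedding_dim x projection_dim matrix with entries in
/// [-scale, scale), where scale = 1 / sqrt(embedding_dim) (assumed scaling).
fn init_weights(embedding_dim: usize, projection_dim: usize, seed: u64) -> Vec<Vec<f64>> {
    let mut rng = Lcg::new(seed);
    let scale = 1.0 / (embedding_dim as f64).sqrt();
    let mut weights = Vec::with_capacity(embedding_dim);
    for _ in 0..embedding_dim {
        let row: Vec<f64> = (0..projection_dim)
            .map(|_| (rng.next_f64() * 2.0 - 1.0) * scale)
            .collect();
        weights.push(row);
    }
    weights
}
```

Because the generator has no hidden state beyond the seed, two calls with the same arguments return identical matrices, which is what makes seeded construction reproducible.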

pub fn with_normalize(self, normalize: bool) -> Self

Enable or disable L2 normalisation of output embeddings.

pub fn encode(&self, token_embeddings: &[Vec<f64>]) -> Result<Vec<f64>>

Encode a sequence of token embeddings into a single sentence vector.

Returns a Vec<f64> of length projection_dim.

§Errors

Returns an error when token_embeddings is empty or any token embedding has a dimension other than embedding_dim.
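
The two error conditions can be illustrated with a small validation routine. This is a sketch only: the EncodeError type and validate_tokens function are hypothetical, and the crate's real error type is not shown on this page.

```rust
/// Hypothetical error type covering the two documented failure cases.
#[derive(Debug, PartialEq)]
enum EncodeError {
    /// No token embeddings were supplied.
    EmptyInput,
    /// The token at `index` has `found` components instead of `expected`.
    DimensionMismatch {
        index: usize,
        expected: usize,
        found: usize,
    },
}

/// Check the input before pooling: it must be non-empty and every token
/// embedding must have exactly `embedding_dim` components.
fn validate_tokens(tokens: &[Vec<f64>], embedding_dim: usize) -> Result<(), EncodeError> {
    if tokens.is_empty() {
        return Err(EncodeError::EmptyInput);
    }
    for (index, token) in tokens.iter().enumerate() {
        if token.len() != embedding_dim {
            return Err(EncodeError::DimensionMismatch {
                index,
                expected: embedding_dim,
                found: token.len(),
            });
        }
    }
    Ok(())
}
```

Reporting the index of the first offending token makes a malformed batch easier to debug than a bare "dimension mismatch" message.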

pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64

Cosine similarity between two sentence embeddings.

Returns a value in [-1, 1]. Returns 0.0 when either vector has zero norm.
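
The documented behaviour, including the zero-norm case, can be sketched as follows. The function name cosine_sketch is hypothetical, and the final clamp is an assumption added to guard against floating-point rounding; the crate may handle this differently.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns 0.0 if either vector has zero norm.
/// `zip` stops at the shorter slice, so equal lengths are assumed.
fn cosine_sketch(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    // Rounding can push the ratio slightly outside [-1, 1]; clamp it back.
    (dot / (norm_a * norm_b)).clamp(-1.0, 1.0)
}
```

Orthogonal vectors score 0, parallel vectors score 1, and opposite vectors score -1, regardless of magnitude.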

pub fn similarity_matrix( &self, sentences: &[Vec<Vec<f64>>], ) -> Result<Vec<Vec<f64>>>

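Encode multiple sentences and return the n × n cosine-similarity matrix, where n is the number of sentences and entry (i, j) is the cosine similarity between the embeddings of sentences i and j.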

Each element of sentences is a Vec<Vec<f64>> (token embeddings for one sentence).

§Errors

Propagates any error from encode.
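
The matrix construction can be sketched on vectors that are already encoded, which leaves out the encode step and its error handling. The names cosine and similarity_matrix_sketch are hypothetical, and filling only the upper triangle is a design choice of this sketch rather than a documented property of the crate.

```rust
/// Cosine similarity with the zero-norm convention (returns 0.0).
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Build the symmetric n x n matrix of pairwise cosine similarities.
fn similarity_matrix_sketch(embeddings: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = embeddings.len();
    let mut matrix = vec![vec![0.0; n]; n];
    for i in 0..n {
        // Cosine similarity is symmetric, so compute j >= i and mirror it.
        for j in i..n {
            let s = cosine(&embeddings[i], &embeddings[j]);
            matrix[i][j] = s;
            matrix[j][i] = s;
        }
    }
    matrix
}
```

Mirroring the upper triangle roughly halves the number of cosine evaluations compared with computing every (i, j) pair independently.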

pub fn normalize(v: &mut [f64])

L2-normalise a vector in place. A zero-norm vector is left unchanged.

pub fn projection_dim(&self) -> usize

The output (projection) dimension.

pub fn embedding_dim(&self) -> usize

The input (token embedding) dimension.

Trait Implementations§

impl Debug for SentenceEncoder

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of the pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a value with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self.

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset SS (and can be converted to it).

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V