pub struct PcaProjection { /* private fields */ }
Corpus-fitted projection via spherical PCA.
Finds the 3 principal directions of maximum angular variance in the embedding space, then projects new embeddings onto them. This preserves angular (cosine similarity) relationships as faithfully as possible in 3 dimensions.
Fitting: O(N·n·k·iters) where N=corpus size, n=dimension, k=3. Projection: O(n) per embedding.
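The stated fitting cost, O(N·n·k·iters), is what an iterative eigen-solver that never forms the n×n covariance matrix would give. The sketch below shows one such scheme, power iteration with deflation on mean-centered rows. It is an assumption about the general technique, not the crate's confirmed implementation: the names `dot` and `top_components` are hypothetical, plain `Vec<f64>` rows stand in for `Embedding`, and the spherical normalization step is omitted.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Top-`k` principal directions of `rows` via power iteration with deflation.
/// Each iteration costs O(N·n), so the total is O(N·n·k·iters).
/// Hypothetical helper: assumes `rows` is non-empty and all rows share a length.
fn top_components(rows: &[Vec<f64>], k: usize, iters: usize) -> Vec<Vec<f64>> {
    let n = rows[0].len();
    let count = rows.len() as f64;

    // Center the data on the mean row.
    let mut mean = vec![0.0; n];
    for r in rows {
        for (m, x) in mean.iter_mut().zip(r) {
            *m += x / count;
        }
    }
    let centered: Vec<Vec<f64>> = rows
        .iter()
        .map(|r| r.iter().zip(&mean).map(|(x, m)| x - m).collect())
        .collect();

    let mut components: Vec<Vec<f64>> = Vec::with_capacity(k);
    for c in 0..k {
        // Deterministic start vector that differs per component.
        let mut v: Vec<f64> = (0..n).map(|i| 1.0 + ((i + c) % 7) as f64).collect();
        for _ in 0..iters {
            // w = Xᵀ(X v), accumulated row by row without forming the covariance matrix.
            let mut w = vec![0.0; n];
            for r in &centered {
                let s = dot(r, &v);
                for (wi, x) in w.iter_mut().zip(r) {
                    *wi += s * x;
                }
            }
            // Deflate: remove the parts along components already found.
            for p in &components {
                let s = dot(&w, p);
                for (wi, pi) in w.iter_mut().zip(p) {
                    *wi -= s * pi;
                }
            }
            let norm = dot(&w, &w).sqrt();
            if norm == 0.0 {
                break; // no variance left; this component is then arbitrary
            }
            v = w.iter().map(|x| x / norm).collect();
        }
        components.push(v);
    }
    components
}
```

Projecting a new embedding onto the result is then three dot products, which matches the O(n) per-embedding cost quoted above.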
Implementations
impl PcaProjection
pub fn fit(embeddings: &[Embedding], radial: RadialStrategy) -> Result<Self, ProjectionError>
Fit the top-3 principal components on embeddings.
Returns ProjectionError::EmptyCorpus if the slice is empty,
ProjectionError::DimensionTooLow if dim < 3, and
ProjectionError::InconsistentDimension if any row’s
dimensionality disagrees with the first. Previously these paths
panicked via assert!, which surfaced as a PanicException in
Python / WASM bindings.
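The three failure cases can be pictured as up-front validation that returns an error instead of asserting. The following is a minimal sketch under stated assumptions: `FitError` and `validate` are hypothetical stand-ins (not the crate's `ProjectionError` or any function it exports), rows are plain `Vec<f64>`, and the order of the checks is a guess.

```rust
/// Hypothetical stand-in for the three `fit` error variants named above.
#[derive(Debug, PartialEq)]
enum FitError {
    EmptyCorpus,
    DimensionTooLow,
    InconsistentDimension,
}

/// Validate a corpus up front and return its dimensionality.
/// Returning `Err` rather than panicking lets bindings map the failure
/// to an ordinary exception.
fn validate(rows: &[Vec<f64>]) -> Result<usize, FitError> {
    // An empty slice has no first row to take the dimension from.
    let first = rows.first().ok_or(FitError::EmptyCorpus)?;
    let dim = first.len();
    // Three principal components need at least three input dimensions.
    if dim < 3 {
        return Err(FitError::DimensionTooLow);
    }
    // Every row must agree with the first row's dimensionality.
    if rows.iter().any(|r| r.len() != dim) {
        return Err(FitError::InconsistentDimension);
    }
    Ok(dim)
}
```

Callers can then match on the variant or propagate it with `?`, rather than catching a panic at the language boundary.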
pub fn fit_default(embeddings: &[Embedding]) -> Result<Self, ProjectionError>
pub fn fit_weighted(embeddings: &[Embedding], weights: &[f64], radial: RadialStrategy) -> Result<Self, ProjectionError>
Fit the top-3 principal components with per-sample weights.
Weighted PCA finds the top eigenvectors of the weighted
covariance matrix Σ wᵢ (xᵢ − μ_w)(xᵢ − μ_w)ᵀ / Σ wᵢ, where
μ_w = Σ wᵢ xᵢ / Σ wᵢ. With uniform weights this collapses to
the same answer as Self::fit.
The intended use is rebalancing covariance estimates over
imbalanced corpora. Setting wᵢ = 1 / sqrt(|category(i)|) gives
a category of size m total covariance mass m · (1/√m) = √m,
compressing category influence from linear to square-root in its
size. For exactly equal per-category mass use
wᵢ = 1 / |category(i)|; the square-root compromise keeps large
categories’ internal variance structure from being washed out
entirely while still letting small categories register.
Returns the same error variants as Self::fit, plus
ProjectionError::SliceLengthMismatch when weights.len() != embeddings.len(). Negative weights are treated as zero.
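To make the weighting arithmetic concrete, the sketch below builds inverse-square-root category weights and a weighted mean μ_w. It is illustrative only: `inv_sqrt_weights` and `weighted_mean` are hypothetical helpers, not part of this crate, categories are assumed to be string labels, and at least one weight is assumed to be positive.

```rust
use std::collections::HashMap;

/// Inverse-square-root category weights: wᵢ = 1 / sqrt(|category(i)|).
fn inv_sqrt_weights(categories: &[&str]) -> Vec<f64> {
    // Count how many samples fall in each category.
    let mut sizes: HashMap<&str, usize> = HashMap::new();
    for c in categories {
        *sizes.entry(*c).or_insert(0) += 1;
    }
    categories
        .iter()
        .map(|c| 1.0 / (sizes[c] as f64).sqrt())
        .collect()
}

/// Weighted mean μ_w = Σ wᵢ xᵢ / Σ wᵢ, with negative weights clamped to zero.
/// Assumes equal-length rows and a positive total weight.
fn weighted_mean(rows: &[Vec<f64>], weights: &[f64]) -> Vec<f64> {
    let n = rows[0].len();
    let mut mean = vec![0.0; n];
    let mut total = 0.0;
    for (r, w) in rows.iter().zip(weights) {
        let w = (*w).max(0.0); // negative weights contribute nothing
        total += w;
        for (m, x) in mean.iter_mut().zip(r) {
            *m += w * x;
        }
    }
    mean.iter().map(|m| m / total).collect()
}
```

With four samples in one category and one in another, the weights are 0.5 and 1.0, so the categories carry total mass 2 = √4 and 1 = √1 respectively, as described above.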
pub fn with_volumetric(self, enabled: bool) -> Self
Enable or disable volumetric mode. When enabled, r comes from the PCA projection magnitude instead of the embedding magnitude, so points distribute through the full 3D volume rather than clustering on the sphere surface.
pub fn explained_variance_ratio(&self) -> f64
The fraction of total variance captured by the top-3 PCA components. A global quality metric for the projection: a higher value means less information is lost.
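As a worked illustration of the metric, the sketch below computes the ratio from a list of covariance eigenvalues. This assumes the conventional definition (sum of the k largest eigenvalues over the sum of all of them); how this crate obtains the total variance internally is not stated here, and the free function below is hypothetical, not the method itself.

```rust
/// Fraction of total variance captured by the `k` largest eigenvalues.
/// Assumes `eigenvalues` are non-negative covariance eigenvalues, in any order.
fn explained_variance_ratio(eigenvalues: &[f64], k: usize) -> f64 {
    let total: f64 = eigenvalues.iter().sum();
    if total <= 0.0 {
        return 0.0; // degenerate corpus: no variance to explain
    }
    let mut sorted = eigenvalues.to_vec();
    sorted.sort_by(|a, b| b.total_cmp(a)); // descending order
    sorted.iter().take(k).sum::<f64>() / total
}
```

For eigenvalues 4, 3, 2, 1 and k = 3, the ratio is (4 + 3 + 2) / 10 = 0.9, meaning a tenth of the variance lies outside the projected subspace.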
Trait Implementations
impl Clone for PcaProjection
fn clone(&self) -> PcaProjection
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl From<PcaProjection> for ConfiguredProjection
fn from(p: PcaProjection) -> Self
impl Projection for PcaProjection
fn project(&self, embedding: &Embedding) -> SphericalPoint
fn project_rich(&self, embedding: &Embedding) -> ProjectedPoint
fn dimensionality(&self) -> usize
Auto Trait Implementations
impl !RefUnwindSafe for PcaProjection
impl !UnwindSafe for PcaProjection
impl Freeze for PcaProjection
impl Send for PcaProjection
impl Sync for PcaProjection
impl Unpin for PcaProjection
impl UnsafeUnpin for PcaProjection
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.