Struct KernelPcaProjection 

pub struct KernelPcaProjection { /* private fields */ }

Corpus-fitted projection via kernel PCA with a Gaussian (RBF) kernel.

§Mathematical background

Standard PCA finds the 3 directions of maximum linear variance. Kernel PCA implicitly maps data into an infinite-dimensional feature space F and performs PCA there; thanks to the kernel trick, only the inner products k(x, y) = ⟨Φ(x), Φ(y)⟩ are ever computed. With the Gaussian kernel k(x, y) = exp(−‖x − y‖²/(2σ²)), every mapped data point Φ(x) lies on the unit hypersphere S in F, since ‖Φ(x)‖² = k(x, x) = 1 for all x. This is a natural fit for SphereQL’s spherical geometry.
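
As a minimal sketch of the Gaussian kernel described above, the function below evaluates k(x, y) on plain slices. The helper name `gaussian_kernel` and its signature are assumptions made for illustration; they are not part of this crate's API, and the crate's internal implementation may differ.

```rust
/// Gaussian (RBF) kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
/// Hypothetical helper for illustration only, not part of the crate's API.
fn gaussian_kernel(x: &[f64], y: &[f64], sigma: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "vectors must have the same dimension");
    assert!(sigma > 0.0, "sigma must be positive");
    // Squared Euclidean distance between the two vectors.
    let sq_dist: f64 = x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum();
    (-sq_dist / (2.0 * sigma * sigma)).exp()
}
```

Because the squared distance from a point to itself is zero, `gaussian_kernel(&x, &x, sigma)` evaluates to 1 for any positive `sigma`, which is the unit-hypersphere property.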

The key advantage over linear PCA: kernel PCA captures non-linear manifold structure (curved clusters, rings, spirals) that linear PCA flattens. For embedding spaces with complex semantic geometry, this can preserve more meaningful neighbourhood relationships.

§Limit behaviour

  • σ → ∞: kernel PCA converges to standard PCA (Hoffmann, Appendix A).
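  • σ → 0: all distinct points become mutually orthogonal in F (the kernel matrix tends to the identity), so no direction carries more variance than any other and PCA is meaningless.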
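
Both limits can be motivated with a short calculation. The following is a sketch under the usual double-centring convention, not a reproduction of Hoffmann's appendix; the symbols H, K̃ and x̄ are introduced here for illustration.

```latex
% Large sigma: first-order expansion of the kernel.
k(x_i, x_j) = \exp\!\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)
            = 1 - \frac{\|x_i\|^2 + \|x_j\|^2 - 2\,x_i^\top x_j}{2\sigma^2}
              + O(\sigma^{-4})

% Double centring with H = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top removes the
% constant term and the terms that depend on i alone or on j alone:
\tilde{K} = H K H
\quad\Longrightarrow\quad
\tilde{K}_{ij} = \frac{1}{\sigma^2}\,(x_i - \bar{x})^\top (x_j - \bar{x})
                 + O(\sigma^{-4}),
\qquad \bar{x} = \frac{1}{n}\sum_{l=1}^{n} x_l

% Small sigma: for distinct points x_i \neq x_j,
\lim_{\sigma \to 0} k(x_i, x_j) = \delta_{ij}
\quad\Longrightarrow\quad
K \to I, \qquad \tilde{K} \to H
```

To leading order the centred kernel matrix for large σ is a scaled copy of the centred Gram matrix, whose eigenvectors are those of linear PCA in its dual form. For small σ the centred matrix tends to H, whose non-zero eigenvalues are all equal, so no component is preferred.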

§Complexity

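  • Fitting: O(n²·d) to build the kernel matrix + O(n²·q·iters) for power iteration on the n×n centred kernel matrix, where n is the number of training points, d the embedding dimension, q the number of components extracted (3 here) and iters the number of iterations per component.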
  • Projection: O(n·d) per embedding (n kernel evaluations).
  • Memory: O(n·d) for training data + O(n) per eigenvector.
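
The O(n²·d) fitting term comes from evaluating one d-dimensional distance per pair of training points. A minimal sketch of that step, followed by centring in feature space, is given below. The function name `centred_kernel_matrix` and the use of `Vec<Vec<f64>>` are assumptions for illustration; the crate's internal representation is not documented here.

```rust
/// Build the n x n Gaussian kernel matrix and centre it in feature space.
/// Hypothetical helper for illustration only, not part of the crate's API.
fn centred_kernel_matrix(data: &[Vec<f64>], sigma: f64) -> Vec<Vec<f64>> {
    let n = data.len();
    let mut k = vec![vec![0.0_f64; n]; n];

    // O(n^2 * d): one d-dimensional squared distance per pair of points.
    for i in 0..n {
        for j in i..n {
            let sq_dist: f64 = data[i]
                .iter()
                .zip(&data[j])
                .map(|(a, b)| (a - b) * (a - b))
                .sum();
            let value = (-sq_dist / (2.0 * sigma * sigma)).exp();
            k[i][j] = value;
            k[j][i] = value; // the kernel matrix is symmetric
        }
    }

    // Double centring: subtract row and column means, add back the grand mean.
    // Row means equal column means because the matrix is symmetric.
    let row_means: Vec<f64> = k
        .iter()
        .map(|row| row.iter().sum::<f64>() / n as f64)
        .collect();
    let total_mean = row_means.iter().sum::<f64>() / n as f64;
    for i in 0..n {
        for j in 0..n {
            k[i][j] = k[i][j] - row_means[i] - row_means[j] + total_mean;
        }
    }
    k
}
```

After centring, every row and column of the result sums to zero (up to round-off), which corresponds to the mapped points having zero mean in F.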

§References

  • Schölkopf, Smola, Müller. “Nonlinear component analysis as a kernel eigenvalue problem.” Neural Computation 10 (1998) 1299–1319.
  • Hoffmann. “Kernel PCA for novelty detection.” Pattern Recognition 40 (2007) 863–874.

Implementations§

impl KernelPcaProjection

pub fn fit(embeddings: &[Embedding], radial: RadialStrategy) -> Self

Fit kernel PCA with automatic σ selection.

σ is set to the median pairwise Euclidean distance on the normalised embeddings divided by √2, so that the kernel value at the median distance is exp(−1) ≈ 0.37. This is a standard heuristic in the kernel methods literature.
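
The heuristic can be sketched as follows. This assumes the input vectors are already normalised and uses all pairs with the conventional even-count median; whether the crate samples pairs or breaks ties differently is not stated here, and the name `median_heuristic_sigma` is hypothetical.

```rust
/// Median pairwise Euclidean distance divided by sqrt(2).
/// Returns `None` when there are fewer than two points.
/// Hypothetical helper for illustration only, not part of the crate's API.
fn median_heuristic_sigma(data: &[Vec<f64>]) -> Option<f64> {
    let n = data.len();
    if n < 2 {
        return None;
    }

    // Collect all n * (n - 1) / 2 pairwise distances.
    let mut dists = Vec::with_capacity(n * (n - 1) / 2);
    for i in 0..n {
        for j in (i + 1)..n {
            let sq: f64 = data[i]
                .iter()
                .zip(&data[j])
                .map(|(a, b)| (a - b) * (a - b))
                .sum();
            dists.push(sq.sqrt());
        }
    }

    // Sort and take the median (mean of the two middle values if even).
    dists.sort_by(|a, b| a.total_cmp(b));
    let mid = dists.len() / 2;
    let median = if dists.len() % 2 == 0 {
        0.5 * (dists[mid - 1] + dists[mid])
    } else {
        dists[mid]
    };

    Some(median / std::f64::consts::SQRT_2)
}
```

With σ = m/√2 for median distance m, the exponent at that distance is −m²/(2σ²) = −1, which gives the kernel value exp(−1) quoted above.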

pub fn fit_with_sigma( embeddings: &[Embedding], sigma: f64, radial: RadialStrategy, ) -> Self

Fit kernel PCA with an explicit kernel width σ.

Use this when you have domain knowledge about the appropriate scale, or when benchmarking different σ values.

pub fn fit_default(embeddings: &[Embedding]) -> Self

Convenience constructor: fits with the default radial strategy and automatic σ selection.

pub fn with_volumetric(self, enabled: bool) -> Self

Enable or disable volumetric mode. When enabled, the radius r comes from the kernel PCA projection magnitude instead of the embedding magnitude.

pub fn sigma(&self) -> f64

The Gaussian kernel width used for this projection.

pub fn num_training_points(&self) -> usize

Number of training points stored (needed for kernel evaluations).

pub fn explained_variance_ratio(&self) -> f64

The fraction of total feature-space variance captured by the top-3 kernel principal components.

Analogous to PcaProjection::explained_variance_ratio() but in the (infinite-dimensional) Gaussian feature space.
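
One common way to compute such a ratio without a full eigendecomposition is to divide the sum of the top eigenvalues by the trace of the centred kernel matrix, since the trace equals the sum of all eigenvalues. The sketch below assumes that convention; the source does not confirm that the crate computes the ratio this way, and `top3_variance_ratio` is a hypothetical name.

```rust
/// Ratio of the top-3 eigenvalues to the trace of the centred kernel matrix.
/// Hypothetical helper for illustration only, not part of the crate's API.
fn top3_variance_ratio(top_eigenvalues: [f64; 3], centred_kernel: &[Vec<f64>]) -> f64 {
    // The trace equals the sum of all eigenvalues, i.e. the total variance.
    let trace: f64 = centred_kernel
        .iter()
        .enumerate()
        .map(|(i, row)| row[i])
        .sum();
    if trace <= 0.0 {
        return 0.0; // degenerate input, e.g. all training points identical
    }
    // Slightly negative eigenvalues can only come from round-off; clamp them.
    let captured: f64 = top_eigenvalues.iter().map(|&l| l.max(0.0)).sum();
    captured / trace
}
```

For a matrix with eigenvalues 4, 3, 2 and 1, the top three capture 9/10 of the total.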

pub fn eigenvalues(&self) -> [f64; 3]

The top-3 eigenvalues of the centred kernel matrix.
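
The complexity notes mention power iteration on the centred kernel matrix. A minimal sketch of that technique with deflation is shown below, assuming a symmetric positive semi-definite input. The function names, the deterministic start vector and the fixed iteration count are assumptions for illustration; the crate's actual solver may differ in all of these.

```rust
/// Top-`q` eigenvalues of a symmetric positive semi-definite matrix,
/// found by power iteration with deflation.
/// Hypothetical helper for illustration only, not part of the crate's API.
fn top_eigenvalues(matrix: &[Vec<f64>], q: usize, iters: usize) -> Vec<f64> {
    let n = matrix.len();
    let mut a: Vec<Vec<f64>> = matrix.to_vec();
    let mut eigenvalues = Vec::with_capacity(q);

    for c in 0..q.min(n) {
        // Deterministic, non-uniform start vector. It must not be orthogonal
        // to the dominant eigenvector; a real implementation might randomise.
        let mut v: Vec<f64> = (0..n).map(|i| 1.0 + ((i + c) % n) as f64).collect();
        normalise(&mut v);

        for _ in 0..iters {
            // One O(n^2) matrix-vector product per iteration.
            let mut w: Vec<f64> = a.iter().map(|row| dot(row, &v)).collect();
            if normalise(&mut w) == 0.0 {
                break; // the remaining matrix is numerically zero
            }
            v = w;
        }

        // Rayleigh quotient v^T A v estimates the eigenvalue.
        let av: Vec<f64> = a.iter().map(|row| dot(row, &v)).collect();
        let lambda = dot(&v, &av);
        eigenvalues.push(lambda);

        // Deflation: remove the found component, A <- A - lambda * v * v^T.
        for i in 0..n {
            for j in 0..n {
                a[i][j] -= lambda * v[i] * v[j];
            }
        }
    }
    eigenvalues
}

/// Dot product of two equally sized slices.
fn dot(x: &[f64], y: &[f64]) -> f64 {
    x.iter().zip(y).map(|(a, b)| a * b).sum()
}

/// Scale `v` to unit length in place and return its original norm.
fn normalise(v: &mut [f64]) -> f64 {
    let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    norm
}
```

Each component costs `iters` matrix-vector products of O(n²), which matches the O(n²·q·iters) term. Convergence slows when consecutive eigenvalues are close, so a fixed iteration count is a simplification.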

Trait Implementations§

impl Clone for KernelPcaProjection

fn clone(&self) -> KernelPcaProjection

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Projection for KernelPcaProjection

fn project(&self, embedding: &Embedding) -> SphericalPoint

fn project_rich(&self, embedding: &Embedding) -> ProjectedPoint

Project with rich metadata: certainty, intensity, projection magnitude.

fn dimensionality(&self) -> usize

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.