pub struct KernelPcaProjection { /* private fields */ }
Corpus-fitted projection via kernel PCA with a Gaussian (RBF) kernel.
§Mathematical background
Standard PCA finds the 3 directions of maximum linear variance. Kernel PCA instead maps the data implicitly into an infinite-dimensional feature space F and performs PCA there, using the kernel trick to work only with inner products k(x, y) = ⟨Φ(x), Φ(y)⟩. With the Gaussian kernel k(x, y) = exp(−‖x−y‖²/(2σ²)), every mapped point Φ(x) lies on the unit hypersphere S in F, since ‖Φ(x)‖² = k(x, x) = 1 for all x. This is a natural fit for SphereQL’s spherical geometry.
The key advantage over linear PCA: kernel PCA captures non-linear manifold structure (curved clusters, rings, spirals) that linear PCA crushes flat. For embedding spaces with complex semantic geometry, this preserves more meaningful neighborhood relationships.
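As a minimal sketch of the Gaussian kernel defined above (the function name `gaussian_kernel` and the use of plain `f64` slices are illustrative assumptions, not the crate's internal representation):

```rust
/// Gaussian (RBF) kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
/// Hypothetical helper for illustration; not part of the crate's public API.
fn gaussian_kernel(x: &[f64], y: &[f64], sigma: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "vectors must have equal length");
    // Squared Euclidean distance between x and y.
    let sq_dist: f64 = x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum();
    (-sq_dist / (2.0 * sigma * sigma)).exp()
}
```

Because the squared distance from a point to itself is zero, `gaussian_kernel(x, x, sigma)` evaluates to 1 for any positive `sigma`, which is the unit-norm property used above.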
§Limit behaviour
- σ → ∞: kernel PCA converges to standard PCA (Hoffmann, Appendix A).
- σ → 0: all points become orthogonal in F; PCA is meaningless.
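Both limits can be motivated by a short calculation. The notation here is introduced for illustration only: X is the n×d matrix whose rows are the training points, K is the kernel matrix, and H = I − (1/n)·11ᵀ is the centring matrix. This is a first-order sketch, not a reproduction of Hoffmann's argument.

```latex
% sigma -> infinity: first-order expansion of the Gaussian kernel
k(x_i, x_j) = 1 - \frac{\|x_i - x_j\|^2}{2\sigma^2} + O(\sigma^{-4})

% Centring removes every term that is constant along a row or a column,
% leaving only the cross term x_i^\top x_j / \sigma^2:
\tilde{K} = H K H = \frac{1}{\sigma^2}\, H X X^\top H + O(\sigma^{-4})

% sigma -> 0 with pairwise distinct points: k(x_i, x_j) -> \delta_{ij}
K \to I, \qquad \tilde{K} \to H
```

In the first limit the centred kernel matrix is, to leading order, a rescaled centred linear Gram matrix, so its leading eigenvectors match those of linear PCA. In the second limit H has the eigenvalue 1 with multiplicity n−1, so no direction is preferred over another.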
§Complexity
- Fitting: O(n²·d) to build the kernel matrix + O(n²·q·iters) for power iteration on the n×n centered kernel matrix.
- Projection: O(n·d) per embedding (n kernel evaluations).
- Memory: O(n·d) for training data + O(n) per eigenvector.
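The O(n²·d) fitting term comes from evaluating the kernel on every pair of training points. A minimal sketch of building and double-centring the kernel matrix follows; the function names and the `Vec<Vec<f64>>` layout are assumptions made for illustration, and the crate's actual implementation may differ.

```rust
/// Build the n x n Gaussian kernel matrix. Each of the n^2 / 2 pairs costs O(d).
fn kernel_matrix(data: &[Vec<f64>], sigma: f64) -> Vec<Vec<f64>> {
    let n = data.len();
    let mut k = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in i..n {
            let sq: f64 = data[i]
                .iter()
                .zip(&data[j])
                .map(|(a, b)| (a - b) * (a - b))
                .sum();
            let v = (-sq / (2.0 * sigma * sigma)).exp();
            // The kernel is symmetric, so fill both triangles at once.
            k[i][j] = v;
            k[j][i] = v;
        }
    }
    k
}

/// Double-centre a symmetric kernel matrix:
/// Kc[i][j] = K[i][j] - row_mean[i] - row_mean[j] + grand_mean.
fn center_kernel(k: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = k.len();
    let nf = n as f64;
    // For a symmetric matrix the column means equal the row means.
    let row_means: Vec<f64> = k.iter().map(|r| r.iter().sum::<f64>() / nf).collect();
    let grand = row_means.iter().sum::<f64>() / nf;
    let mut kc = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            kc[i][j] = k[i][j] - row_means[i] - row_means[j] + grand;
        }
    }
    kc
}
```

After centring, every row and column of the result sums to zero, which corresponds to subtracting the feature-space mean before PCA.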
§References
- Schölkopf, Smola, Müller. “Nonlinear component analysis as a kernel eigenvalue problem.” Neural Computation 10 (1998) 1299–1319.
- Hoffmann. “Kernel PCA for novelty detection.” Pattern Recognition 40 (2007) 863–874.
Implementations§
impl KernelPcaProjection
pub fn fit(
    embeddings: &[Embedding],
    radial: RadialStrategy,
) -> Result<Self, ProjectionError>
Fit kernel PCA with automatic σ selection.
σ is set to the median pairwise Euclidean distance on the normalised embeddings divided by √2, so that the kernel value at the median distance is exp(−1) ≈ 0.37. This is a standard heuristic in the kernel methods literature.
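The median heuristic described above can be sketched as follows. This assumes the inputs are already normalised, uses a hypothetical function name, and averages the two middle values when the number of pairs is even; the crate may handle these details differently.

```rust
/// Median-heuristic kernel width: median pairwise Euclidean distance / sqrt(2).
/// Returns `None` when fewer than two points are supplied.
fn median_heuristic_sigma(data: &[Vec<f64>]) -> Option<f64> {
    let n = data.len();
    if n < 2 {
        return None;
    }
    // Collect all n * (n - 1) / 2 pairwise distances.
    let mut dists = Vec::with_capacity(n * (n - 1) / 2);
    for i in 0..n {
        for j in (i + 1)..n {
            let sq: f64 = data[i]
                .iter()
                .zip(&data[j])
                .map(|(a, b)| (a - b) * (a - b))
                .sum();
            dists.push(sq.sqrt());
        }
    }
    dists.sort_by(|a, b| a.total_cmp(b));
    let m = dists.len();
    let median = if m % 2 == 1 {
        dists[m / 2]
    } else {
        0.5 * (dists[m / 2 - 1] + dists[m / 2])
    };
    Some(median / std::f64::consts::SQRT_2)
}
```

With σ = d_med/√2 the exponent at the median distance is −d_med²/(2σ²) = −1, which gives the kernel value exp(−1) quoted above.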
pub fn fit_with_sigma(
    embeddings: &[Embedding],
    sigma: f64,
    radial: RadialStrategy,
) -> Result<Self, ProjectionError>
Fit kernel PCA with an explicit kernel width σ.
Use this when you have domain knowledge about the appropriate scale, or when benchmarking different σ values. Returns ProjectionError::InvalidSigma if sigma <= 0.0.
pub fn fit_default(embeddings: &[Embedding]) -> Result<Self, ProjectionError>
Convenience constructor: fits with the default radial strategy and automatic σ selection.
pub fn with_volumetric(self, enabled: bool) -> Self
Enable or disable volumetric mode. When enabled, the radial coordinate r comes from the kernel PCA projection magnitude instead of the embedding magnitude.
pub fn num_training_points(&self) -> usize
Number of training points stored (needed for kernel evaluations).
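The training points have to be retained because projecting a new embedding requires one kernel evaluation against each of them. The sketch below follows the textbook out-of-sample formula from Schölkopf et al.; every name and parameter in it is hypothetical, and it is not taken from the crate's source.

```rust
/// Gaussian kernel between two equal-length vectors.
fn gaussian_kernel(x: &[f64], y: &[f64], sigma: f64) -> f64 {
    let sq: f64 = x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum();
    (-sq / (2.0 * sigma * sigma)).exp()
}

/// Project one new point onto pre-computed kernel principal components.
///
/// `row_means[i]` is the mean of row i of the uncentred training kernel matrix,
/// `grand_mean` is the mean of all its entries, and each `alphas[m]` is a unit
/// eigenvector of the centred kernel matrix divided by sqrt(eigenvalue).
fn project_point(
    x: &[f64],
    training: &[Vec<f64>],
    sigma: f64,
    row_means: &[f64],
    grand_mean: f64,
    alphas: &[Vec<f64>],
) -> Vec<f64> {
    // n kernel evaluations of O(d) each: the O(n * d) projection cost.
    let kx: Vec<f64> = training
        .iter()
        .map(|t| gaussian_kernel(x, t, sigma))
        .collect();
    let kx_mean = kx.iter().sum::<f64>() / kx.len() as f64;
    // Centre the kernel vector consistently with the training-time centring.
    let centred: Vec<f64> = kx
        .iter()
        .zip(row_means)
        .map(|(k, r)| k - kx_mean - r + grand_mean)
        .collect();
    // One dot product per retained component.
    alphas
        .iter()
        .map(|a| a.iter().zip(&centred).map(|(ai, ci)| ai * ci).sum::<f64>())
        .collect()
}
```

With three retained components, `alphas` would hold three vectors of length n, matching the O(n) memory per eigenvector listed under Complexity.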
pub fn explained_variance_ratio(&self) -> f64
The fraction of total feature-space variance captured by the top-3 kernel principal components.
Analogous to PcaProjection::explained_variance_ratio() but in the (infinite-dimensional) Gaussian feature space.
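One common way to define this ratio, shown here as an assumption rather than as the crate's confirmed formula, is the sum of the top-3 eigenvalues divided by the trace of the centred kernel matrix, since the trace equals the sum of all its eigenvalues. The function name below is hypothetical.

```rust
/// Ratio of the top-3 eigenvalues to the trace of the centred kernel matrix.
/// Assumed definition for illustration; returns 0.0 for a non-positive trace.
fn explained_variance_ratio_sketch(top: [f64; 3], centred: &[Vec<f64>]) -> f64 {
    // The trace of a symmetric matrix equals the sum of its eigenvalues.
    let trace: f64 = (0..centred.len()).map(|i| centred[i][i]).sum();
    if trace <= 0.0 {
        return 0.0;
    }
    top.iter().sum::<f64>() / trace
}
```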
pub fn eigenvalues(&self) -> [f64; 3]
The top-3 eigenvalues of the centred kernel matrix.
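The Complexity section mentions power iteration on the centred kernel matrix. A minimal sketch of power iteration with deflation for the leading eigenpairs of a symmetric positive semi-definite matrix is given below. The function names, the fixed iteration count and the deterministic start vector are assumptions made for illustration; the crate's convergence criteria and initialisation are not documented here.

```rust
/// Dense matrix-vector product, O(n^2).
fn mat_vec(a: &[Vec<f64>], v: &[f64]) -> Vec<f64> {
    a.iter()
        .map(|row| row.iter().zip(v).map(|(x, y)| x * y).sum::<f64>())
        .collect()
}

/// Euclidean norm of a vector.
fn norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Top-q eigenpairs of a symmetric PSD matrix via power iteration + deflation.
/// Total cost is O(n^2 * q * iters).
fn top_eigenpairs(a: &[Vec<f64>], q: usize, iters: usize) -> Vec<(f64, Vec<f64>)> {
    let n = a.len();
    let mut m: Vec<Vec<f64>> = a.to_vec();
    let mut out = Vec::with_capacity(q);
    for c in 0..q {
        // Deterministic, non-constant start vector (an arbitrary choice).
        let mut v: Vec<f64> = (0..n).map(|i| 1.0 + ((i + c) % 7) as f64).collect();
        let len0 = norm(&v);
        v.iter_mut().for_each(|x| *x /= len0);
        for _ in 0..iters {
            let w = mat_vec(&m, &v);
            let len = norm(&w);
            if len < 1e-15 {
                break; // remaining spectrum is numerically zero
            }
            v = w.iter().map(|x| x / len).collect();
        }
        // Rayleigh quotient gives the eigenvalue estimate for unit v.
        let mv = mat_vec(&m, &v);
        let lambda: f64 = v.iter().zip(&mv).map(|(x, y)| x * y).sum();
        // Deflate: remove the found component so the next one dominates.
        for i in 0..n {
            for j in 0..n {
                m[i][j] -= lambda * v[i] * v[j];
            }
        }
        out.push((lambda, v));
    }
    out
}
```

Calling `top_eigenpairs(&centred, 3, iters)` on a centred kernel matrix would yield three eigenvalues in decreasing order, provided each start vector has a non-zero component along the corresponding eigenvector.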
Trait Implementations§
impl Clone for KernelPcaProjection
fn clone(&self) -> KernelPcaProjection
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl From<KernelPcaProjection> for ConfiguredProjection
fn from(p: KernelPcaProjection) -> Self
impl Projection for KernelPcaProjection
fn project(&self, embedding: &Embedding) -> SphericalPoint
fn project_rich(&self, embedding: &Embedding) -> ProjectedPoint
fn dimensionality(&self) -> usize
Auto Trait Implementations§
impl !RefUnwindSafe for KernelPcaProjection
impl !UnwindSafe for KernelPcaProjection
impl Freeze for KernelPcaProjection
impl Send for KernelPcaProjection
impl Sync for KernelPcaProjection
impl Unpin for KernelPcaProjection
impl UnsafeUnpin for KernelPcaProjection
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.