Enum LinalgKind 

#[non_exhaustive]
#[repr(u16)]
pub enum LinalgKind {
    Cholesky = 0,
    Lu = 1,
    Qr = 2,
    Svd = 3,
    Inverse = 4,
    Eig = 5,
    Solve = 6,
    LeastSquares = 7,
    MatrixExp = 8,
    BatchedQr = 9,
    BatchedSvd = 10,
    Eigh = 11,
    BatchedSvda = 12,
    BatchedOrmqr = 13,
    BatchedQrMaterialize = 14,
    BatchedOrmqrWy = 15,
}

Linear-algebra (dense) op discriminant — covers the cuSOLVER family shipped in Milestone 6.3.

Stored as u16 in crate::KernelSku::op when category == OpCategory::Linalg. Today the four canonical PyTorch / JAX dense linalg ops are wired:

  • Self::Cholesky: A = L · L^T (symmetric positive-definite). Batched via cusolverDnSpotrfBatched / cusolverDnDpotrfBatched.
  • Self::Lu: P · A = L · U. Batched via cusolverDnSgetrfBatched / cusolverDnDgetrfBatched.
  • Self::Qr: A = Q · R. cuSOLVER has no batched variant; 2-D only.
  • Self::Svd: A = U · diag(S) · V^T. cuSOLVER 2-D only.

Dtype coverage is f32 + f64; cuSOLVER’s dense API does not support f16 / bf16 for these factorizations. The variants beyond these four were added in later milestones, as noted on each variant below; MatrixExp is still reserved.
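The u16 storage described above can be sketched with a local stand-in for the enum. Everything in this block is an assumption made for illustration: the enum is a copy of the declaration on this page, and `encode` / `decode` are hypothetical helper names, not functions this crate is known to export.

```rust
/// Local stand-in for the documented enum (same names and discriminants).
#[non_exhaustive]
#[repr(u16)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum LinalgKind {
    Cholesky = 0,
    Lu = 1,
    Qr = 2,
    Svd = 3,
    Inverse = 4,
    Eig = 5,
    Solve = 6,
    LeastSquares = 7,
    MatrixExp = 8,
    BatchedQr = 9,
    BatchedSvd = 10,
    Eigh = 11,
    BatchedSvda = 12,
    BatchedOrmqr = 13,
    BatchedQrMaterialize = 14,
    BatchedOrmqrWy = 15,
}

/// Hypothetical helper: the raw value that would be stored in an `op` field.
pub fn encode(kind: LinalgKind) -> u16 {
    // `#[repr(u16)]` makes this cast yield the declared discriminant.
    kind as u16
}

/// Hypothetical helper: map a raw value back to a variant, rejecting unknowns.
pub fn decode(raw: u16) -> Option<LinalgKind> {
    use LinalgKind::*;
    Some(match raw {
        0 => Cholesky,
        1 => Lu,
        2 => Qr,
        3 => Svd,
        4 => Inverse,
        5 => Eig,
        6 => Solve,
        7 => LeastSquares,
        8 => MatrixExp,
        9 => BatchedQr,
        10 => BatchedSvd,
        11 => Eigh,
        12 => BatchedSvda,
        13 => BatchedOrmqr,
        14 => BatchedQrMaterialize,
        15 => BatchedOrmqrWy,
        _ => return None,
    })
}
```

Decoding through an explicit match, rather than a transmute, keeps out-of-range values from ever becoming an invalid enum value.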

Variants (Non-exhaustive)

This enum is marked as non-exhaustive
Non-exhaustive enums could have additional variants added in future. Therefore, when matching against variants of non-exhaustive enums, an extra wildcard arm must be added to account for any future variants.
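As a minimal sketch of the wildcard-arm requirement, the block below matches on a local copy of the enum. The helper name `is_batched` is hypothetical. Inside the defining crate the wildcard arm is optional; it becomes mandatory only for code in a downstream crate, which is the situation the note above describes.

```rust
/// Local stand-in for the documented enum (same names and discriminants).
#[non_exhaustive]
#[repr(u16)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum LinalgKind {
    Cholesky = 0,
    Lu = 1,
    Qr = 2,
    Svd = 3,
    Inverse = 4,
    Eig = 5,
    Solve = 6,
    LeastSquares = 7,
    MatrixExp = 8,
    BatchedQr = 9,
    BatchedSvd = 10,
    Eigh = 11,
    BatchedSvda = 12,
    BatchedOrmqr = 13,
    BatchedQrMaterialize = 14,
    BatchedOrmqrWy = 15,
}

/// Hypothetical helper: true for the variants whose names start with `Batched`.
pub fn is_batched(kind: LinalgKind) -> bool {
    match kind {
        LinalgKind::BatchedQr
        | LinalgKind::BatchedSvd
        | LinalgKind::BatchedSvda
        | LinalgKind::BatchedOrmqr
        | LinalgKind::BatchedQrMaterialize
        | LinalgKind::BatchedOrmqrWy => true,
        // Wildcard arm: covers every other current variant and any variant
        // added later, so the match keeps compiling when the enum grows.
        _ => false,
    }
}
```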

Cholesky = 0

Cholesky factorization A = L · L^T (lower) or A = U^T · U (upper). Input must be symmetric positive-definite.
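To make the lower-triangular form concrete, here is a small CPU reference sketch of A = L · L^T. It is not the cuSOLVER path this variant dispatches to. The function name `cholesky_lower` and the row-major f64 layout are assumptions chosen for the example.

```rust
/// CPU reference sketch: lower Cholesky factor of a row-major n x n matrix.
/// Returns `None` when a pivot is not strictly positive, i.e. the input is
/// not (numerically) positive-definite. Only the lower triangle of `a` is read.
pub fn cholesky_lower(a: &[f64], n: usize) -> Option<Vec<f64>> {
    assert_eq!(a.len(), n * n, "expected an n x n row-major matrix");
    let mut l = vec![0.0; n * n];
    for j in 0..n {
        // Diagonal entry: L[j][j] = sqrt(A[j][j] - sum_k L[j][k]^2).
        let mut d = a[j * n + j];
        for k in 0..j {
            d -= l[j * n + k] * l[j * n + k];
        }
        if !(d > 0.0) {
            return None; // also rejects NaN
        }
        let ljj = d.sqrt();
        l[j * n + j] = ljj;
        // Entries below the diagonal in column j.
        for i in (j + 1)..n {
            let mut s = a[i * n + j];
            for k in 0..j {
                s -= l[i * n + k] * l[j * n + k];
            }
            l[i * n + j] = s / ljj;
        }
    }
    Some(l)
}
```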

Lu = 1

LU factorization with partial pivoting P · A = L · U. Returns the packed LU factors plus an i32 pivot vector.
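The "packed LU factors plus pivot vector" output can be illustrated with a CPU reference sketch. The name `lu_packed` is hypothetical, the layout is row-major f64, and the pivots here are 0-based row indices; that indexing is an assumption of this example and may not match the convention the wired routine uses.

```rust
/// CPU reference sketch of LU with partial pivoting on a row-major n x n
/// matrix, factored in place. On return, the strict lower triangle holds L
/// (unit diagonal implied) and the upper triangle holds U.
/// `piv[k]` is the 0-based row that was swapped with row k at step k.
/// Returns `None` if an exactly zero pivot is encountered.
pub fn lu_packed(a: &mut [f64], n: usize) -> Option<Vec<i32>> {
    assert_eq!(a.len(), n * n, "expected an n x n row-major matrix");
    let mut piv = vec![0i32; n];
    for k in 0..n {
        // Partial pivoting: pick the largest magnitude in column k at or below row k.
        let mut p = k;
        for i in (k + 1)..n {
            if a[i * n + k].abs() > a[p * n + k].abs() {
                p = i;
            }
        }
        if a[p * n + k] == 0.0 {
            return None;
        }
        piv[k] = p as i32;
        if p != k {
            for j in 0..n {
                a.swap(k * n + j, p * n + j);
            }
        }
        // Eliminate below the pivot, storing each multiplier where the zero would go.
        for i in (k + 1)..n {
            let m = a[i * n + k] / a[k * n + k];
            a[i * n + k] = m;
            for j in (k + 1)..n {
                a[i * n + j] -= m * a[k * n + j];
            }
        }
    }
    Some(piv)
}
```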

Qr = 2

QR factorization A = Q · R. Computes full Q ([M, M]) and the upper-triangular R ([M, N]) via geqrf + ormqr.

Svd = 3

Singular value decomposition A = U · diag(S) · V^T. cuSOLVER 2-D only; full_matrices controls whether U/V^T are full ([M,M] / [N,N]) or thin ([M,K] / [K,N]) where K = min(M, N).
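The effect of full_matrices on the output shapes can be written down as a small helper. `SvdShapes` and `svd_shapes` are hypothetical names introduced for this sketch; only the shape rule itself comes from the description above.

```rust
/// Output shapes for an SVD of an M x N input (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SvdShapes {
    pub u: [usize; 2],
    pub s: usize,
    pub vt: [usize; 2],
}

/// Shape rule as documented: full gives U [M, M] and V^T [N, N];
/// thin gives U [M, K] and V^T [K, N], with K = min(M, N).
pub fn svd_shapes(m: usize, n: usize, full_matrices: bool) -> SvdShapes {
    let k = m.min(n);
    if full_matrices {
        SvdShapes { u: [m, m], s: k, vt: [n, n] }
    } else {
        SvdShapes { u: [m, k], s: k, vt: [k, n] }
    }
}
```

For a tall 5 × 3 input, thin mode yields U as [5, 3] and V^T as [3, 3], while full mode grows U to [5, 5].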

Inverse = 4

Matrix inverse A^{-1} via getrf + getrs over an identity RHS. Wired in Milestone 6.9.

Eig = 5

General (non-symmetric) eigen-decomposition A · v = λ · v. Wired via cusolverDnXgeev in Milestone 6.12. Always emits complex eigenvalues (and optional left / right complex eigenvectors).
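A 2 × 2 case shows why a general eigen-solver reports complex eigenvalues even for real input. The sketch below uses the closed-form characteristic polynomial, not the cusolverDnXgeev path; `eig2x2` is a hypothetical name and each eigenvalue is returned as a (re, im) pair.

```rust
/// Closed-form eigenvalues of a real 2 x 2 matrix, each as (re, im).
/// Solves lambda^2 - tr * lambda + det = 0.
pub fn eig2x2(a: [[f64; 2]; 2]) -> [(f64, f64); 2] {
    let half_tr = 0.5 * (a[0][0] + a[1][1]);
    let det = a[0][0] * a[1][1] - a[0][1] * a[1][0];
    let disc = half_tr * half_tr - det;
    if disc >= 0.0 {
        // Two real roots; the imaginary parts are zero.
        let r = disc.sqrt();
        [(half_tr + r, 0.0), (half_tr - r, 0.0)]
    } else {
        // Complex-conjugate pair.
        let r = (-disc).sqrt();
        [(half_tr, r), (half_tr, -r)]
    }
}
```

A 90-degree rotation matrix is real but has the purely imaginary eigenvalues ±i, so a real-only output type could not represent its spectrum.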

Solve = 6

Linear solve A · X = B via getrf + getrs. Wired in Milestone 6.9.
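The factor-then-substitute structure behind this variant can be sketched on the CPU. This is an illustration under assumptions, not the wired code path: `solve` is a hypothetical name, the layout is row-major f64, and the forward substitution is folded into the elimination loop instead of being a separate getrs-style call.

```rust
/// CPU reference sketch: solve A * X = B for row-major A (n x n) and
/// B (n x nrhs) by LU with partial pivoting followed by back substitution.
/// Returns `None` if an exactly zero pivot is encountered.
pub fn solve(a: &[f64], b: &[f64], n: usize, nrhs: usize) -> Option<Vec<f64>> {
    assert_eq!(a.len(), n * n);
    assert_eq!(b.len(), n * nrhs);
    let mut lu = a.to_vec();
    let mut x = b.to_vec();
    for k in 0..n {
        // Pivot search in column k.
        let mut p = k;
        for i in (k + 1)..n {
            if lu[i * n + k].abs() > lu[p * n + k].abs() {
                p = i;
            }
        }
        if lu[p * n + k] == 0.0 {
            return None;
        }
        if p != k {
            // Swap the same rows in A and in the right-hand sides.
            for j in 0..n {
                lu.swap(k * n + j, p * n + j);
            }
            for j in 0..nrhs {
                x.swap(k * nrhs + j, p * nrhs + j);
            }
        }
        for i in (k + 1)..n {
            let m = lu[i * n + k] / lu[k * n + k];
            lu[i * n + k] = m;
            for j in (k + 1)..n {
                lu[i * n + j] -= m * lu[k * n + j];
            }
            // Forward substitution with L, applied as rows are eliminated.
            for j in 0..nrhs {
                x[i * nrhs + j] -= m * x[k * nrhs + j];
            }
        }
    }
    // Back substitution with U.
    for k in (0..n).rev() {
        for j in 0..nrhs {
            let mut s = x[k * nrhs + j];
            for c in (k + 1)..n {
                s -= lu[k * n + c] * x[c * nrhs + j];
            }
            x[k * nrhs + j] = s / lu[k * n + k];
        }
    }
    Some(x)
}
```

Passing the identity as B gives the matrix inverse, which is the relationship the Inverse variant describes.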

LeastSquares = 7

Least-squares solve min ||A·x - b||² via cuSOLVER’s mixed-precision iterative-refinement _gels routine. Wired in Milestone 6.11.
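For intuition about the objective min ||A·x - b||², here is the smallest useful instance: fitting a line y = slope · x + intercept. Note the swapped-in technique: this sketch solves the 2 × 2 normal equations in closed form, which is not the mixed-precision iterative-refinement routine named above. `fit_line` is a hypothetical name.

```rust
/// Least-squares line fit via the normal equations.
/// Here A has rows [x_i, 1] and b holds the y_i values.
/// Returns `(slope, intercept)`, or `None` when the fit is undetermined.
pub fn fit_line(xs: &[f64], ys: &[f64]) -> Option<(f64, f64)> {
    if xs.len() != ys.len() || xs.len() < 2 {
        return None;
    }
    let n = xs.len() as f64;
    let sx: f64 = xs.iter().sum();
    let sy: f64 = ys.iter().sum();
    let sxx: f64 = xs.iter().map(|x| x * x).sum();
    let sxy: f64 = xs.iter().zip(ys).map(|(x, y)| x * y).sum();
    // Determinant of A^T A; zero when all x values coincide.
    let det = n * sxx - sx * sx;
    if det == 0.0 {
        return None;
    }
    let slope = (n * sxy - sx * sy) / det;
    let intercept = (sy * sxx - sx * sxy) / det;
    Some((slope, intercept))
}
```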

MatrixExp = 8

Reserved — matrix exponential / matrix functions.

BatchedQr = 9

Batched QR factorization A_b = Q_b · R_b via cusolverDn*geqrfBatched. Wired in Milestone 6.11.

BatchedSvd = 10

Batched SVD via Jacobi cusolverDn*gesvdjBatched. Wired in Milestone 6.11.

Eigh = 11

Symmetric / Hermitian eigen-decomposition A · v = λ · v (real eigenvalues). Wired via cusolverDn{S,D}syevd / cusolverDn{C,Z}heevd in Milestone 6.12.

BatchedSvda = 12

Rectangular batched approximate-SVD via cuSOLVER’s gesvdaStridedBatched. Unlike Self::BatchedSvd (which is square-only Jacobi), this routine accepts arbitrary m × n per batch slot, uses element-strides between slots, and reports per-slot residual Frobenius norms to a host array. Wired in Milestone 6.15.

BatchedOrmqr = 13

Bespoke batched-ormqr — applies the implicit Q from a Self::BatchedQr packed output to a batch of matrices C, all slots fused into one CUDA launch. cuSOLVER’s ormqr is non-batched, so in the small-matrix regime where batched-QR is most useful the per-slot launch latency dominates; this bespoke kernel amortizes one launch over the whole batch. Side = Left, op ∈ {N, T} in the trailblazer (Right + complex variants deferred). Wired in Milestone 6.14.
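The reflector-by-reflector application can be sketched as a CPU reference loop. The real kernel is described as a single fused CUDA launch; this block only shows the arithmetic per batch slot. It assumes the LAPACK-style geqrf packing (reflector i stored below the diagonal of column i with an implicit unit diagonal, plus a tau scalar per reflector), column-major storage, and contiguous slots. `batched_ormqr_left` and its parameter names are hypothetical.

```rust
/// CPU reference sketch: for each batch slot, overwrite C with Q * C
/// (`trans == false`) or Q^T * C (`trans == true`), where
/// Q = H_0 * H_1 * ... * H_{k-1} and H_i = I - tau_i * v_i * v_i^T.
/// `packed`: batch slots of m x k column-major geqrf output.
/// `tau`:    batch slots of k scalars.
/// `c`:      batch slots of m x n column-major matrices, updated in place.
pub fn batched_ormqr_left(
    packed: &[f64],
    tau: &[f64],
    c: &mut [f64],
    batch: usize,
    m: usize,
    k: usize,
    n: usize,
    trans: bool,
) {
    assert_eq!(packed.len(), batch * m * k);
    assert_eq!(tau.len(), batch * k);
    assert_eq!(c.len(), batch * m * n);
    for b in 0..batch {
        let a = &packed[b * m * k..(b + 1) * m * k];
        let t = &tau[b * k..(b + 1) * k];
        let cb = &mut c[b * m * n..(b + 1) * m * n];
        for step in 0..k {
            // Q^T * C applies H_0 first; Q * C applies H_{k-1} first.
            let i = if trans { step } else { k - 1 - step };
            for col in 0..n {
                let ccol = &mut cb[col * m..(col + 1) * m];
                // w = tau_i * (v_i^T * c), with v_i[i] = 1 implied.
                let mut w = ccol[i];
                for r in (i + 1)..m {
                    w += a[i * m + r] * ccol[r];
                }
                w *= t[i];
                // c -= w * v_i
                ccol[i] -= w;
                for r in (i + 1)..m {
                    ccol[r] -= w * a[i * m + r];
                }
            }
        }
    }
}
```

Each reflector touches every column of C once, which is the GEMV-like access pattern that makes this form attractive only for small matrices.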

BatchedQrMaterialize = 14

Bespoke “materialize dense Q and R from batched-geqrf packed output”. Tiny upper-triangle-copy kernel for R; identity-stage

BatchedOrmqrWy = 15

WY-blocked batched-ormqr — applies the implicit Q (or Q^T) from a Self::BatchedQr packed output to a batch of matrices C at GEMM-rates by fusing groups of nb consecutive Householder reflectors into a block reflector (I - V·T·V^T) and applying it via three cuBLAS strided-batched GEMMs per block. Sibling to Self::BatchedOrmqr (the reflector-by-reflector GEMV-rates variant); callers pick by problem size — WY wins decisively for M, N > ~16, the reflector kernel wins for tiny inputs. Side = Left, op ∈ {N, T} in the trailblazer. Wired in Milestone 6.17.
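The size-based choice between the two ormqr variants, and the GEMM count implied by the blocking, can be sketched as follows. All names here are hypothetical, and the crossover of 16 is taken from the approximate figure above, not from a tuned constant. The sketch assumes M and N are the dimensions of each C slot and that both must exceed the crossover.

```rust
/// Which batched-ormqr path to use (hypothetical selector type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OrmqrPath {
    /// Reflector-by-reflector kernel (`BatchedOrmqr`).
    Reflector,
    /// WY-blocked GEMM path (`BatchedOrmqrWy`).
    Wy,
}

/// Assumed crossover, from the "M, N > ~16" guidance; not a measured value.
pub const WY_CROSSOVER: usize = 16;

/// Pick the WY path only when both dimensions exceed the crossover.
pub fn pick_ormqr_path(m: usize, n: usize) -> OrmqrPath {
    if m > WY_CROSSOVER && n > WY_CROSSOVER {
        OrmqrPath::Wy
    } else {
        OrmqrPath::Reflector
    }
}

/// Number of strided-batched GEMM calls for k reflectors in blocks of nb:
/// three per block, with a final partial block counted as a block.
pub fn wy_gemm_calls(k: usize, nb: usize) -> usize {
    assert!(nb > 0, "block size must be positive");
    3 * ((k + nb - 1) / nb)
}
```

For example, 10 reflectors with nb = 4 form three blocks (4, 4, 2) and therefore nine GEMM calls, independent of the batch size.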

Trait Implementations

impl Clone for LinalgKind

fn clone(&self) -> LinalgKind

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Copy for LinalgKind

impl Debug for LinalgKind

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Eq for LinalgKind

impl Hash for LinalgKind

fn hash<__H>(&self, state: &mut __H)
where __H: Hasher,

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.

impl PartialEq for LinalgKind

fn eq(&self, other: &LinalgKind) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl StructuralPartialEq for LinalgKind
