pub struct BatchedSolver { /* private fields */ }
Batched matrix factorization engine.
Each thread block handles one matrix. For very small matrices (n <= 16), multiple matrices are packed into a single thread block. All computation is performed in registers and shared memory.
Implementations§
impl BatchedSolver
pub fn new(handle: SolverHandle) -> Self
Creates a new batched solver.
pub fn handle(&self) -> &SolverHandle
Returns a reference to the underlying solver handle.
pub fn handle_mut(&mut self) -> &mut SolverHandle
Returns a mutable reference to the underlying solver handle.
pub fn batched_lu<T: GpuFloat>(
    &mut self,
    matrices: &mut DeviceBuffer<T>,
    pivots: &mut DeviceBuffer<i32>,
    n: usize,
    batch_count: usize,
) -> SolverResult<BatchedResult>
Batched LU factorization: factorize batch_count matrices of size n x n.
Input: matrices[batch_count * n * n] (column-major, contiguous).
Output: in-place LU factors, pivots[batch_count * n].
§Errors
Returns SolverError::DimensionMismatch if buffer sizes are incorrect
or matrix size is out of range.
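To make the buffer layout concrete, here is a minimal CPU-side sketch of what a batched, in-place LU with partial pivoting over this layout could look like. It is not the crate's GPU implementation: `batched_lu_reference` is a hypothetical helper, it works on plain `f64` slices rather than `DeviceBuffer<T>`, and the 0-based pivot indices are an assumption (the documentation above does not state the pivot convention).

```rust
/// Hypothetical CPU reference, not part of the crate.
/// `matrices` holds `batch_count` column-major n x n matrices back to back.
/// `pivots` receives n row indices per matrix (0-based here: an assumption).
/// Returns Err(batch_index) for the first matrix with an exactly zero pivot.
pub fn batched_lu_reference(
    matrices: &mut [f64],
    pivots: &mut [i32],
    n: usize,
    batch_count: usize,
) -> Result<(), usize> {
    assert_eq!(matrices.len(), batch_count * n * n);
    assert_eq!(pivots.len(), batch_count * n);
    for b in 0..batch_count {
        // Slice out matrix b; element (i, j) lives at a[i + j * n].
        let a = &mut matrices[b * n * n..(b + 1) * n * n];
        let piv = &mut pivots[b * n..(b + 1) * n];
        for k in 0..n {
            // Partial pivoting: pick the row with the largest |a[i, k]|, i >= k.
            let mut p = k;
            for i in k + 1..n {
                if a[i + k * n].abs() > a[p + k * n].abs() {
                    p = i;
                }
            }
            piv[k] = p as i32;
            if a[p + k * n] == 0.0 {
                return Err(b);
            }
            if p != k {
                for j in 0..n {
                    a.swap(k + j * n, p + j * n);
                }
            }
            // Store the multipliers of L below the diagonal.
            for i in k + 1..n {
                a[i + k * n] /= a[k + k * n];
            }
            // Rank-1 update of the trailing submatrix.
            for j in k + 1..n {
                let akj = a[k + j * n];
                for i in k + 1..n {
                    a[i + j * n] -= a[i + k * n] * akj;
                }
            }
        }
    }
    Ok(())
}
```

After the call, the strict lower triangle of each matrix holds the multipliers of L (unit diagonal implied) and the upper triangle holds U, which is the usual meaning of "in-place LU factors".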
pub fn batched_qr<T: GpuFloat>(
    &mut self,
    matrices: &mut DeviceBuffer<T>,
    tau: &mut DeviceBuffer<T>,
    m: usize,
    n: usize,
    batch_count: usize,
) -> SolverResult<BatchedResult>
Batched QR factorization.
Output: in-place QR factors, tau[batch_count * min(m, n)] (Householder scalars).
§Arguments
- matrices: contiguous buffer of batch_count matrices, each m x n, column-major.
- tau: output Householder scalars, length batch_count * min(m, n).
- m: number of rows per matrix.
- n: number of columns per matrix.
- batch_count: number of matrices.
§Errors
Returns SolverError::DimensionMismatch if buffer sizes are incorrect.
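Since a wrong buffer length is the documented failure mode, it can help to compute the expected lengths before calling. The sketch below does only that arithmetic; `batched_qr_lengths` and `element_offset` are hypothetical helpers introduced for illustration, not functions of this crate, and the offset formula assumes the matrices are stored back to back with no padding.

```rust
/// Hypothetical helper: expected element counts for a batched QR call.
/// Returns (matrices_len, tau_len), or None if the products overflow usize.
pub fn batched_qr_lengths(m: usize, n: usize, batch_count: usize) -> Option<(usize, usize)> {
    // batch_count matrices of m x n elements each.
    let matrices_len = batch_count.checked_mul(m)?.checked_mul(n)?;
    // One Householder scalar per reflector: min(m, n) per matrix.
    let tau_len = batch_count.checked_mul(m.min(n))?;
    Some((matrices_len, tau_len))
}

/// Hypothetical helper: offset of element (row, col) of matrix `batch`
/// in a contiguous, unpadded, column-major batch of m x n matrices.
pub fn element_offset(batch: usize, row: usize, col: usize, m: usize, n: usize) -> usize {
    batch * m * n + col * m + row
}
```

For example, ten 4 x 3 matrices need 120 matrix elements and 30 Householder scalars; ten 3 x 4 matrices need the same counts because min(m, n) is 3 in both cases.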
pub fn batched_cholesky<T: GpuFloat>(
    &mut self,
    matrices: &mut DeviceBuffer<T>,
    n: usize,
    batch_count: usize,
) -> SolverResult<BatchedResult>
Batched Cholesky factorization (for SPD matrices).
Output: in-place lower triangular Cholesky factors.
§Arguments
- matrices: contiguous buffer of batch_count SPD matrices, each n x n.
- n: matrix dimension.
- batch_count: number of matrices.
§Errors
Returns SolverError::DimensionMismatch if buffer sizes are incorrect.
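As an illustration of "in-place lower triangular Cholesky factors", the following is a minimal CPU-side sketch, not the crate's GPU kernel. `batched_cholesky_reference` is a hypothetical helper on plain `f64` slices. Two assumptions go beyond the text above: the matrices are column-major (as stated for LU and QR), and the strict upper triangle is left untouched.

```rust
/// Hypothetical CPU reference, not part of the crate.
/// Overwrites the lower triangle of each n x n matrix with L, where A = L * L^T.
/// Assumes column-major storage; the strict upper triangle is left as-is.
/// Returns Err(batch_index) for the first matrix that is not positive definite.
pub fn batched_cholesky_reference(
    matrices: &mut [f64],
    n: usize,
    batch_count: usize,
) -> Result<(), usize> {
    assert_eq!(matrices.len(), batch_count * n * n);
    for b in 0..batch_count {
        // Element (i, j) of matrix b lives at a[i + j * n].
        let a = &mut matrices[b * n * n..(b + 1) * n * n];
        for j in 0..n {
            // Diagonal entry: l_jj = sqrt(a_jj - sum_{k<j} l_jk^2).
            let mut d = a[j + j * n];
            for k in 0..j {
                d -= a[j + k * n] * a[j + k * n];
            }
            if d <= 0.0 || d.is_nan() {
                return Err(b);
            }
            let ljj = d.sqrt();
            a[j + j * n] = ljj;
            // Entries below the diagonal in column j.
            for i in j + 1..n {
                let mut s = a[i + j * n];
                for k in 0..j {
                    s -= a[i + k * n] * a[j + k * n];
                }
                a[i + j * n] = s / ljj;
            }
        }
    }
    Ok(())
}
```

How the real API reports a non-SPD input is not specified above; the `Err(batch_index)` return here is only a convention of this sketch.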
pub fn batched_solve<T: GpuFloat>(
    &mut self,
    a_matrices: &mut DeviceBuffer<T>,
    b_matrices: &mut DeviceBuffer<T>,
    n: usize,
    nrhs: usize,
    batch_count: usize,
) -> SolverResult<BatchedResult>
Batched linear solve using LU: solve A_i * X_i = B_i for each i.
First performs batched LU factorization on a_matrices, then uses the
factors to solve the system. Both a_matrices and b_matrices are
modified in-place.
§Arguments
- a_matrices: contiguous batch_count coefficient matrices (n x n each).
- b_matrices: contiguous batch_count RHS matrices (n x nrhs each).
- n: system dimension.
- nrhs: number of right-hand sides.
- batch_count: number of systems to solve.
§Errors
Returns SolverError::DimensionMismatch if dimensions are invalid.
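The factor-then-solve flow described above can be sketched on the CPU as follows. This is a reference illustration under stated assumptions, not the crate's implementation: `batched_solve_reference` is a hypothetical helper on `f64` slices, both buffers are assumed column-major, and row swaps are applied to B during elimination rather than replayed from a stored pivot array.

```rust
/// Hypothetical CPU reference, not part of the crate.
/// Solves A_i * X_i = B_i for each i. On return, `a_matrices` holds the LU
/// factors and `b_matrices` holds the solutions X_i (both column-major).
/// Returns Err(system_index) for the first exactly singular system.
pub fn batched_solve_reference(
    a_matrices: &mut [f64],
    b_matrices: &mut [f64],
    n: usize,
    nrhs: usize,
    batch_count: usize,
) -> Result<(), usize> {
    assert_eq!(a_matrices.len(), batch_count * n * n);
    assert_eq!(b_matrices.len(), batch_count * n * nrhs);
    for s in 0..batch_count {
        let a = &mut a_matrices[s * n * n..(s + 1) * n * n];
        let b = &mut b_matrices[s * n * nrhs..(s + 1) * n * nrhs];
        // LU with partial pivoting; the same row operations are applied to B.
        for k in 0..n {
            let mut p = k;
            for i in k + 1..n {
                if a[i + k * n].abs() > a[p + k * n].abs() {
                    p = i;
                }
            }
            if a[p + k * n] == 0.0 {
                return Err(s);
            }
            if p != k {
                for j in 0..n {
                    a.swap(k + j * n, p + j * n);
                }
                for j in 0..nrhs {
                    b.swap(k + j * n, p + j * n);
                }
            }
            for i in k + 1..n {
                let l = a[i + k * n] / a[k + k * n];
                a[i + k * n] = l; // multiplier of L, stored in place
                for j in k + 1..n {
                    a[i + j * n] -= l * a[k + j * n];
                }
                for j in 0..nrhs {
                    b[i + j * n] -= l * b[k + j * n];
                }
            }
        }
        // Back substitution with the upper triangle U, one RHS column at a time.
        for j in 0..nrhs {
            for i in (0..n).rev() {
                let mut x = b[i + j * n];
                for c in i + 1..n {
                    x -= a[i + c * n] * b[c + j * n];
                }
                b[i + j * n] = x / a[i + i * n];
            }
        }
    }
    Ok(())
}
```

Because both buffers are overwritten, a caller who still needs the original A_i or B_i after the solve has to copy them beforehand; the same applies to the in-place behaviour documented for batched_solve.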