Module sparse_gpu

GPU-ready sparse matrix formats and iterative solvers with data-oriented layouts.

Provides CSR, ELLPACK, Hybrid (ELL+COO), and Block-CSR formats along with iterative solvers (CG, BiCGSTAB, preconditioned CG) suitable for GPU offload.

Structs

BlockCsrMatrix: Block-CSR matrix where every stored entry is a block_size × block_size dense tile.
CsrMatrix: Compressed Sparse Row (CSR) matrix stored as plain f64 arrays.
EllMatrix: ELLPACK-format sparse matrix: rows padded to max_nnz_per_row.
HybridMatrix: Hybrid ELL+COO matrix: regular rows stored in ELL, overflow in COO.
SparseTriplet: Coordinate (COO) format sparse matrix for incremental assembly.
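
To make the layouts above concrete, here is a minimal sketch of assembling CSR arrays from COO triplets, running a row-wise SpMV, and padding the result to ELLPACK. It is an illustration only: the `Csr` struct and the functions `csr_from_triplets`, `spmv`, and `to_ell` are hypothetical names, not this module's `CsrMatrix`, `SparseTriplet`, or `csr_to_ell`, whose fields and signatures are not shown on this page.

```rust
/// Hypothetical minimal CSR container (NOT the crate's `CsrMatrix`).
#[derive(Debug, Clone, PartialEq)]
pub struct Csr {
    pub n_rows: usize,
    pub n_cols: usize,
    pub row_ptr: Vec<usize>, // length n_rows + 1; row i owns row_ptr[i]..row_ptr[i + 1]
    pub col_idx: Vec<usize>, // length nnz
    pub values: Vec<f64>,    // length nnz
}

/// Build CSR from (row, col, value) triplets with a counting sort by row.
/// Assumption: duplicates are kept as separate stored entries, not summed.
pub fn csr_from_triplets(n_rows: usize, n_cols: usize, triplets: &[(usize, usize, f64)]) -> Csr {
    // Count entries per row, then prefix-sum the counts into row offsets.
    let mut row_ptr = vec![0usize; n_rows + 1];
    for &(r, _, _) in triplets {
        row_ptr[r + 1] += 1;
    }
    for i in 0..n_rows {
        row_ptr[i + 1] += row_ptr[i];
    }
    // Scatter each triplet into the next free slot of its row.
    let mut next = row_ptr.clone();
    let mut col_idx = vec![0usize; triplets.len()];
    let mut values = vec![0.0; triplets.len()];
    for &(r, c, v) in triplets {
        let dst = next[r];
        col_idx[dst] = c;
        values[dst] = v;
        next[r] += 1;
    }
    Csr { n_rows, n_cols, row_ptr, col_idx, values }
}

/// y = A x, computed as one independent dot product per row.
pub fn spmv(a: &Csr, x: &[f64]) -> Vec<f64> {
    (0..a.n_rows)
        .map(|i| {
            (a.row_ptr[i]..a.row_ptr[i + 1])
                .map(|k| a.values[k] * x[a.col_idx[k]])
                .sum::<f64>()
        })
        .collect()
}

/// Pad every row to the widest row, ELLPACK style.
/// Returns (width, cols, vals); both arrays are row-major with
/// n_rows * width entries, and padding slots hold column 0 and value 0.0.
pub fn to_ell(a: &Csr) -> (usize, Vec<usize>, Vec<f64>) {
    let width = (0..a.n_rows)
        .map(|i| a.row_ptr[i + 1] - a.row_ptr[i])
        .max()
        .unwrap_or(0);
    let mut cols = vec![0usize; a.n_rows * width];
    let mut vals = vec![0.0; a.n_rows * width];
    for i in 0..a.n_rows {
        for (slot, k) in (a.row_ptr[i]..a.row_ptr[i + 1]).enumerate() {
            cols[i * width + slot] = a.col_idx[k];
            vals[i * width + slot] = a.values[k];
        }
    }
    (width, cols, vals)
}
```

Padding to the widest row is the simplest choice but wastes space when a few rows are much longer than the rest; a hybrid ELL+COO layout caps the ELL width and spills the remaining entries into a COO list.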

Functions

assemble_1d_laplacian: Assemble a 1D Laplacian matrix of size n × n (tridiagonal: 2 on diag, -1 off-diag).
axpy: Compute y + alpha * x (the BLAS-style AXPY operation).
bicgstab_solve: BiCGSTAB solver for general (possibly non-symmetric) systems A x = b.
cg_solve: Conjugate Gradient solver for symmetric positive-definite systems A x = b.
compute_nnz_per_row: Compute the number of non-zeros per row for a CSR matrix.
csr_to_ell: Convert a CSR matrix to ELLPACK format (convenience wrapper).
dot: Dot product of two vectors.
extract_diagonal: Extract the main diagonal of a CSR matrix.
frobenius_norm: Compute the Frobenius norm of a sparse matrix: sqrt(sum(a_ij^2)).
jacobi_preconditioned_cg: Conjugate Gradient with Jacobi (diagonal) preconditioner for A x = b (SPD).
norm2: Euclidean norm of a vector.
optimal_ell_row_width: Choose the ELLPACK row width (max non-zeros per row) to minimize padding waste.
scale_vec: Scale every element of x by s.
simulate_spmv_throughput: Estimate SpMV throughput in GFLOPS given matrix dimensions and nnz count.
sparse_lower_triangular_solve: Forward-substitution solve L x = b where L is lower-triangular (CSR).
sparse_upper_triangular_solve: Back-substitution solve U x = b where U is upper-triangular (CSR).
spmv_segmented: Segmented SpMV: processes each row independently to prepare for GPU-style parallel execution. Functionally identical to CsrMatrix::spmv but structured for row-parallel dispatch.
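
As a rough illustration of how the pieces listed above fit together, the following sketch runs a textbook Conjugate Gradient iteration on the 1D Laplacian (2 on the diagonal, -1 off the diagonal). It does not call this module's `cg_solve` or `assemble_1d_laplacian`, whose signatures are not shown here; `laplacian_1d_apply` and `cg` are hypothetical names, and the operator is applied matrix-free so the example stays self-contained.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Apply the 1D Laplacian stencil (2 on the diagonal, -1 off the diagonal)
/// without storing the matrix. Hypothetical helper, not the crate's API.
pub fn laplacian_1d_apply(x: &[f64]) -> Vec<f64> {
    let n = x.len();
    (0..n)
        .map(|i| {
            let left = if i > 0 { x[i - 1] } else { 0.0 };
            let right = if i + 1 < n { x[i + 1] } else { 0.0 };
            2.0 * x[i] - left - right
        })
        .collect()
}

/// Textbook CG for a symmetric positive-definite operator, starting from x = 0.
/// Stops when ||r|| <= tol * ||b|| or after max_iter iterations.
/// Returns the approximate solution and the number of iterations performed.
pub fn cg<F>(apply_a: F, b: &[f64], tol: f64, max_iter: usize) -> (Vec<f64>, usize)
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    let mut x = vec![0.0; b.len()];
    let mut r = b.to_vec(); // residual b - A x with x = 0
    let mut p = r.clone(); // initial search direction
    let mut rs_old = dot(&r, &r);
    let b_norm = dot(b, b).sqrt().max(f64::MIN_POSITIVE);

    for it in 0..max_iter {
        if rs_old.sqrt() <= tol * b_norm {
            return (x, it);
        }
        let ap = apply_a(&p);
        let alpha = rs_old / dot(&p, &ap);
        for i in 0..x.len() {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        let rs_new = dot(&r, &r);
        let beta = rs_new / rs_old;
        for i in 0..p.len() {
            p[i] = r[i] + beta * p[i];
        }
        rs_old = rs_new;
    }
    (x, max_iter)
}
```

A call such as `cg(laplacian_1d_apply, &b, 1e-10, 100)` solves A x = b for the stencil above. A Jacobi-preconditioned variant would additionally divide the residual by the matrix diagonal (constant 2 for this stencil) before forming each search direction.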
