pub unsafe extern "C" fn cusparseSgebsr2gebsr(
handle: cusparseHandle_t,
dirA: cusparseDirection_t,
mb: c_int,
nb: c_int,
nnzb: c_int,
descrA: cusparseMatDescr_t,
bsrSortedValA: *const f32,
bsrSortedRowPtrA: *const c_int,
bsrSortedColIndA: *const c_int,
rowBlockDimA: c_int,
colBlockDimA: c_int,
descrC: cusparseMatDescr_t,
bsrSortedValC: *mut f32,
bsrSortedRowPtrC: *mut c_int,
bsrSortedColIndC: *mut c_int,
rowBlockDimC: c_int,
colBlockDimC: c_int,
pBuffer: *mut c_void,
) -> cusparseStatus_t
This function converts a sparse matrix in general BSR format that is defined by the three arrays bsrValA, bsrRowPtrA, and bsrColIndA into a sparse matrix in another general BSR format that is defined by arrays bsrValC, bsrRowPtrC, and bsrColIndC.
If rowBlockDimA=1 and colBlockDimA=1, cusparse[S|D|C|Z]gebsr2gebsr() is the same as cusparse[S|D|C|Z]csr2gebsr().
If rowBlockDimC=1 and colBlockDimC=1, cusparse[S|D|C|Z]gebsr2gebsr() is the same as cusparse[S|D|C|Z]gebsr2csr().
A is an m*n sparse matrix where m(=mb*rowBlockDimA) is the number of rows of A, and n(=nb*colBlockDimA) is the number of columns of A. The general BSR format of A contains nnzb(=bsrRowPtrA[mb] - bsrRowPtrA[0]) nonzero blocks. The matrix C is also in general BSR format, with a different block size, rowBlockDimC*colBlockDimC. If m is not a multiple of rowBlockDimC, or n is not a multiple of colBlockDimC, C is padded with zeros. The number of block rows of C is mc(=(m+rowBlockDimC-1)/rowBlockDimC). The number of block columns of C is nc(=(n+colBlockDimC-1)/colBlockDimC). The number of nonzero blocks of C is nnzc.
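The dimension arithmetic above can be checked on the host before any device memory is allocated. The following is a minimal sketch, not part of the bindings: the `GebsrDims` struct and the `gebsr2gebsr_dims` helper are hypothetical names introduced here for illustration, and the sketch assumes all inputs are positive.

```rust
/// Dimensions involved in a GEBSR to GEBSR conversion.
/// Hypothetical helper type, not part of cuSPARSE or these bindings.
#[derive(Debug, PartialEq)]
struct GebsrDims {
    /// Rows of A in scalar elements: mb * rowBlockDimA.
    m: i32,
    /// Columns of A in scalar elements: nb * colBlockDimA.
    n: i32,
    /// Block rows of C: ceil(m / rowBlockDimC).
    mc: i32,
    /// Block columns of C: ceil(n / colBlockDimC).
    nc: i32,
}

/// Computes m, n, mc and nc from the block counts and block sizes,
/// using the same integer ceiling division as the formulas in the text.
fn gebsr2gebsr_dims(
    mb: i32,
    nb: i32,
    row_block_dim_a: i32,
    col_block_dim_a: i32,
    row_block_dim_c: i32,
    col_block_dim_c: i32,
) -> GebsrDims {
    let m = mb * row_block_dim_a;
    let n = nb * col_block_dim_a;
    let mc = (m + row_block_dim_c - 1) / row_block_dim_c;
    let nc = (n + col_block_dim_c - 1) / col_block_dim_c;
    GebsrDims { m, n, mc, nc }
}
```

For example, with mb=2, nb=3 and 2*2 blocks in A, the matrix is 4*6. Converting to 3*4 blocks gives mc=2 and nc=2, so C covers a zero-padded 6*8 region.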
The implementation adopts a two-step approach to do the conversion. First, the user allocates bsrRowPtrC of mc+1 elements and uses function cusparseXgebsr2gebsrNnz to determine the number of nonzero block columns per block row of matrix C. Second, the user gathers nnzc (number of non-zero block columns of matrix C) from either (nnzc=*nnzTotalDevHostPtr) or (nnzc=bsrRowPtrC[mc]-bsrRowPtrC[0]) and allocates bsrValC of nnzc*rowBlockDimC*colBlockDimC elements and bsrColIndC of nnzc integers. Finally, the function cusparse[S|D|C|Z]gebsr2gebsr() is called to complete the conversion.
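To make the counting step and the allocation sizes concrete, here is a host-side reference sketch in plain Rust. It is not how cuSPARSE implements cusparseXgebsr2gebsrNnz, and both function names are hypothetical. It assumes zero-based indexing and treats every stored block of A as fully dense, so a block of C counts as nonzero whenever it overlaps a stored block of A. Whether the library treats explicitly stored zeros the same way is an assumption here.

```rust
use std::collections::BTreeSet;

/// Host-side reference for the counting step (hypothetical helper).
/// Returns a zero-based bsrRowPtrC of mc+1 entries together with nnzc.
fn gebsr2gebsr_nnz_reference(
    mb: usize,
    bsr_row_ptr_a: &[usize],
    bsr_col_ind_a: &[usize],
    row_block_dim_a: usize,
    col_block_dim_a: usize,
    row_block_dim_c: usize,
    col_block_dim_c: usize,
) -> (Vec<usize>, usize) {
    let m = mb * row_block_dim_a;
    let mc = (m + row_block_dim_c - 1) / row_block_dim_c;

    // One set of occupied block columns per block row of C.
    let mut cols: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); mc];

    for ib in 0..mb {
        // Block rows of C overlapped by scalar rows of block row ib of A.
        let first_rc = ib * row_block_dim_a / row_block_dim_c;
        let last_rc = ((ib + 1) * row_block_dim_a - 1) / row_block_dim_c;
        for k in bsr_row_ptr_a[ib]..bsr_row_ptr_a[ib + 1] {
            let jb = bsr_col_ind_a[k];
            // Block columns of C overlapped by block column jb of A.
            let first_cc = jb * col_block_dim_a / col_block_dim_c;
            let last_cc = ((jb + 1) * col_block_dim_a - 1) / col_block_dim_c;
            for rc in first_rc..=last_rc {
                for cc in first_cc..=last_cc {
                    cols[rc].insert(cc);
                }
            }
        }
    }

    // Prefix sum of the per-row counts gives bsrRowPtrC.
    let mut row_ptr_c = Vec::with_capacity(mc + 1);
    let mut running = 0usize;
    row_ptr_c.push(running);
    for set in &cols {
        running += set.len();
        row_ptr_c.push(running);
    }
    let nnzc = row_ptr_c[mc] - row_ptr_c[0];
    (row_ptr_c, nnzc)
}

/// Element counts for bsrValC and bsrColIndC, as stated in the text.
fn gebsr_c_alloc_sizes(
    nnzc: usize,
    row_block_dim_c: usize,
    col_block_dim_c: usize,
) -> (usize, usize) {
    (nnzc * row_block_dim_c * col_block_dim_c, nnzc)
}
```

As a usage example, take a 4*4 block-diagonal matrix stored as two 2*2 blocks (bsrRowPtrA = [0, 1, 2], bsrColIndA = [0, 1]). Converting to 3*3 blocks yields bsrRowPtrC = [0, 2, 4] and nnzc = 4, so bsrValC needs 36 elements and bsrColIndC needs 4.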
The user must call gebsr2gebsr_bufferSize() to know the size of the buffer required by gebsr2gebsr(), allocate the buffer, and pass the buffer pointer to gebsr2gebsr().
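The general procedure is as follows: call gebsr2gebsr_bufferSize() and allocate pBuffer; allocate bsrRowPtrC and call cusparseXgebsr2gebsrNnz; read nnzc and allocate bsrValC and bsrColIndC; then call gebsr2gebsr().
- The routines require no extra storage if pBuffer != NULL.
- The routines support asynchronous execution if the Stream Ordered Memory Allocator is available.
- The routines do not support CUDA graph capture.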