pub unsafe extern "C" fn cusolverDnXgetrf(
handle: cusolverDnHandle_t,
params: cusolverDnParams_t,
m: i64,
n: i64,
dataTypeA: cudaDataType,
A: *mut c_void,
lda: i64,
ipiv: *mut i64,
computeType: cudaDataType,
bufferOnDevice: *mut c_void,
workspaceInBytesOnDevice: size_t,
bufferOnHost: *mut c_void,
workspaceInBytesOnHost: size_t,
info: *mut c_int,
) -> cusolverStatus_t
The helper function cusolverDnXgetrf_bufferSize calculates the sizes needed for the pre-allocated buffers.
cusolverDnXgetrf computes the LU factorization of an $m \times n$ matrix using the generic API interface: $$ P \cdot A = L \cdot U $$
where A is an $m \times n$ matrix, P is a permutation matrix, L is a lower triangular matrix with unit diagonal, and U is an upper triangular matrix.
If the LU factorization fails, i.e. matrix A (U) is singular, the output parameter info = i indicates that U(i,i) = 0.
If output parameter info = -i (less than zero), the i-th parameter is wrong (not counting handle).
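The three cases for info described above can be decoded on the host with a small helper. This is a minimal sketch: the enum GetrfOutcome and the function decode_getrf_info are hypothetical names that are not part of the bindings, and the sketch assumes the info value has already been copied back from the device into a plain i32.

```rust
/// Outcome of an LU factorization, decoded from the `info` output value.
/// Hypothetical helper type; not part of the cuSOLVER bindings.
#[derive(Debug, PartialEq, Eq)]
pub enum GetrfOutcome {
    /// info == 0: the factorization succeeded.
    Success,
    /// info == -i: the i-th parameter is wrong (not counting handle).
    InvalidParameter(u32),
    /// info == i: U(i,i) == 0, so the matrix is singular.
    SingularPivot(u32),
}

/// Classifies an `info` value that was already copied to host memory.
pub fn decode_getrf_info(info: i32) -> GetrfOutcome {
    match info {
        0 => GetrfOutcome::Success,
        i if i < 0 => GetrfOutcome::InvalidParameter(i.unsigned_abs()),
        i => GetrfOutcome::SingularPivot(i as u32),
    }
}
```

Returning an enum instead of the raw integer lets the caller match on the failure mode without repeating the sign convention.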
If ipiv is null, no pivoting is performed. The factorization is A=L*U, which is not numerically stable.
Whether or not the LU factorization succeeds, the output parameter ipiv contains the pivoting sequence: row i is interchanged with row ipiv(i).
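To see what the pivoting sequence means, the interchanges can be replayed on the host to recover the row order of P*A. The sketch below is illustrative only: pivots_to_permutation is a hypothetical helper, it assumes ipiv has been copied to host memory, and it assumes the indices are 1-based as in LAPACK.

```rust
/// Replays the row interchanges recorded in `ipiv` on the identity ordering
/// 0..m. In the result, entry k is the original (0-based) row of A that ends
/// up in row k of P*A.
///
/// Assumption: `ipiv` holds 1-based indices (LAPACK convention).
/// Returns None if a pivot index is out of range.
pub fn pivots_to_permutation(ipiv: &[i64], m: usize) -> Option<Vec<usize>> {
    let mut perm: Vec<usize> = (0..m).collect();
    for (i, &p) in ipiv.iter().enumerate() {
        // Reject indices outside 1..=m before converting to 0-based.
        if i >= m || p < 1 || p as u64 > m as u64 {
            return None;
        }
        // Row i is interchanged with row ipiv(i).
        perm.swap(i, (p - 1) as usize);
    }
    Some(perm)
}
```

The swaps must be applied in order, because each interchange acts on the result of the previous ones.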
The user has to provide device and host workspaces, pointed to by the input parameters bufferOnDevice and bufferOnHost. The input parameters workspaceInBytesOnDevice and workspaceInBytesOnHost are the sizes in bytes of the device and host workspaces, as returned by cusolverDnXgetrf_bufferSize.
The user can combine cusolverDnXgetrf and cusolverDnGetrs to complete a linear solver.
Currently, cusolverDnXgetrf supports two algorithms. To select the legacy implementation, the user has to call cusolverDnSetAdvOptions.
Please visit cuSOLVER Library Samples - Xgetrf for a code example.
Algorithms supported by cusolverDnXgetrf
| Algorithm | Description |
|---|---|
| cusolverAlgMode_t::CUSOLVER_ALG_0 or NULL | Default algorithm. The fastest, requires a large workspace of m*n elements. |
| cusolverAlgMode_t::CUSOLVER_ALG_1 | Legacy implementation |
List of input arguments for cusolverDnXgetrf_bufferSize and cusolverDnXgetrf:
The generic API has two different types: dataTypeA is the data type of the matrix A, and computeType is the compute type of the operation. cusolverDnXgetrf only supports the following four combinations.
Valid combination of data type and compute type
| DataTypeA | ComputeType | Meaning |
|---|---|---|
| CUDA_R_32F | CUDA_R_32F | SGETRF |
| CUDA_R_64F | CUDA_R_64F | DGETRF |
| CUDA_C_32F | CUDA_C_32F | CGETRF |
| CUDA_C_64F | CUDA_C_64F | ZGETRF |
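The table above can be encoded as a lookup that rejects unsupported pairs before any call is made. This is a sketch under stated assumptions: DataType is a hypothetical local stand-in for cudaDataType (with one extra variant, R16F, included only to show a rejected pair), and getrf_routine is not part of the bindings.

```rust
/// Hypothetical stand-in for the subset of `cudaDataType` used here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataType {
    R32F, // stands in for CUDA_R_32F
    R64F, // stands in for CUDA_R_64F
    C32F, // stands in for CUDA_C_32F
    C64F, // stands in for CUDA_C_64F
    R16F, // stands in for CUDA_R_16F; not a supported combination
}

/// Returns the classic routine name for a supported (dataTypeA, computeType)
/// pair, or None if the pair is not one of the four listed combinations.
pub fn getrf_routine(data_type_a: DataType, compute_type: DataType) -> Option<&'static str> {
    use DataType::*;
    match (data_type_a, compute_type) {
        (R32F, R32F) => Some("SGETRF"),
        (R64F, R64F) => Some("DGETRF"),
        (C32F, C32F) => Some("CGETRF"),
        (C64F, C64F) => Some("ZGETRF"),
        _ => None,
    }
}
```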
§Parameters
- handle: Handle to the cuSolverDN library context.
- params: Structure with information collected by cusolverDnSetAdvOptions.
- m: Number of rows of matrix A.
- n: Number of columns of matrix A.
- dataTypeA: Data type of array A.
- A: <type> array of dimension lda * n, with lda not less than max(1,m).
- lda: Leading dimension of the two-dimensional array used to store matrix A.
- ipiv: Array of size at least min(m,n), containing pivot indices.
- computeType: Data type of computation.
- bufferOnDevice: Device workspace. Array of type void of size workspaceInBytesOnDevice bytes.
- workspaceInBytesOnDevice: Size in bytes of bufferOnDevice, returned by cusolverDnXgetrf_bufferSize.
- bufferOnHost: Host workspace. Array of type void of size workspaceInBytesOnHost bytes.
- workspaceInBytesOnHost: Size in bytes of bufferOnHost, returned by cusolverDnXgetrf_bufferSize.
- info: If info = 0, the LU factorization is successful. If info = -i, the i-th parameter is wrong (not counting handle). If info = i, then U(i,i) = 0.
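The relationship between A, lda, m and n is easier to see with a host-side packing routine. The sketch below assumes the column-major storage convention used by LAPACK-style interfaces, works on f64 only, and uses hypothetical helper names (col_major_offset, to_col_major) that are not part of the bindings. Copying the resulting buffer to the device is left out.

```rust
/// Offset of element (i, j), both 0-based, in a column-major array with
/// leading dimension `lda`.
pub fn col_major_offset(i: usize, j: usize, lda: usize) -> usize {
    i + j * lda
}

/// Copies a row-major m x n matrix into a zero-padded column-major buffer
/// of lda * n elements, the shape expected for the array A.
pub fn to_col_major(row_major: &[f64], m: usize, n: usize, lda: usize) -> Vec<f64> {
    assert!(lda >= m.max(1), "lda must be at least max(1, m)");
    assert_eq!(row_major.len(), m * n, "input must hold m * n elements");
    let mut out = vec![0.0; lda * n];
    for i in 0..m {
        for j in 0..n {
            out[col_major_offset(i, j, lda)] = row_major[i * n + j];
        }
    }
    out
}
```

When lda is larger than m, the extra rows at the end of each column are padding and are not part of the matrix.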
§Return value
- cusolverStatus_t::CUSOLVER_STATUS_INTERNAL_ERROR: An internal operation failed.
- cusolverStatus_t::CUSOLVER_STATUS_INVALID_VALUE: Invalid parameters were passed (m, n < 0 or lda < max(1,m)).
- cusolverStatus_t::CUSOLVER_STATUS_NOT_INITIALIZED: The library was not initialized.
- cusolverStatus_t::CUSOLVER_STATUS_SUCCESS: The operation completed successfully.
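The dimension conditions listed for CUSOLVER_STATUS_INVALID_VALUE can be checked on the host before calling into the library. This is a minimal sketch with a hypothetical helper name; it mirrors only the conditions stated above, and the library may reject a call for other reasons.

```rust
/// Returns true when (m, n, lda) pass the documented dimension checks:
/// m and n are non-negative and lda is at least max(1, m).
pub fn getrf_dims_are_valid(m: i64, n: i64, lda: i64) -> bool {
    m >= 0 && n >= 0 && lda >= m.max(1)
}
```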