Function cusolverDnIRSXgesv

pub unsafe extern "C" fn cusolverDnIRSXgesv(
    handle: cusolverDnHandle_t,
    gesv_irs_params: cusolverDnIRSParams_t,
    gesv_irs_infos: cusolverDnIRSInfos_t,
    n: cusolver_int_t,
    nrhs: cusolver_int_t,
    dA: *mut c_void,
    ldda: cusolver_int_t,
    dB: *mut c_void,
    lddb: cusolver_int_t,
    dX: *mut c_void,
    lddx: cusolver_int_t,
    dWorkspace: *mut c_void,
    lwork_bytes: size_t,
    niters: *mut cusolver_int_t,
    d_info: *mut cusolver_int_t,
) -> cusolverStatus_t

This function provides the same functionality as the cusolverDn<T1><T2>gesv() functions, but through a more generic, expert interface that gives the user more control over how the solver is parametrized and returns more information on output. cusolverDnIRSXgesv allows additional control of the solver parameters such as setting:

the main precision (Inputs/Outputs precision) of the solver
the lowest precision to be used internally by the solver
the refinement solver type
the maximum allowed number of iterations in the refinement phase
the tolerance of the refinement solver
the fallback to main precision
and more

through the configuration parameters structure gesv_irs_params and its helper functions. For details on which settings are available and what they mean, refer to the functions in the cuSolverDN Helper Function Section whose names start with cusolverDnIRSParamsxxxx(). In addition, cusolverDnIRSXgesv provides extra information on output, such as the convergence history (e.g., the residual norms) at each iteration and the number of iterations needed to converge. For details on what information can be retrieved and what it means, refer to the functions in the cuSolverDN Helper Function Section whose names start with cusolverDnIRSInfosxxxx().

The return value describes the result of the solve. cusolverStatus_t::CUSOLVER_STATUS_SUCCESS indicates that the function finished successfully; any other value indicates that one of the API arguments is incorrect, that the configuration of the params/infos structures is incorrect, or that the function did not finish successfully. More details about the error can be found by checking the niters and d_info API parameters. See the parameter descriptions below for further details. The user must provide the required workspace, allocated on the device, for cusolverDnIRSXgesv. The number of bytes required can be queried by calling cusolverDnIRSXgesv_bufferSize. Note that any configuration the user wants to set via the params structure should be set before the call to cusolverDnIRSXgesv_bufferSize, so that the returned workspace size matches that configuration.

Tensor Float (TF32), introduced with NVIDIA Ampere architecture GPUs, is the most robust tensor core accelerated compute mode for the iterative refinement solver. It is able to solve the widest range of problems in HPC arising from different applications and provides up to 4X and 5X speedup for real and complex systems, respectively. On Volta and Turing architecture GPUs, half precision tensor core acceleration is recommended. In cases where the iterative refinement solver fails to converge to the desired accuracy (main precision, INOUT data precision), it is recommended to use main precision as internal lowest precision.
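The recommendation above can be sketched as a small selection helper. This is only an illustration: `LowestPrecision` and `recommended_lowest_precision` are hypothetical names, not part of the bindings, and the mapping from compute capability major version to architecture (7.x for Volta/Turing, 8.x and above for Ampere and later) is an assumption layered on the text.

```rust
/// Hypothetical local stand-in for the lowest-precision choice.
/// It is NOT the binding's `cusolverPrecType_t`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LowestPrecision {
    /// Half precision tensor core mode (the `*_16F` values).
    Half,
    /// TensorFloat-32 tensor core mode (the `*_TF32` values).
    Tf32,
}

/// Suggest a lowest precision from the compute capability major version.
/// Returns `None` below 7.x, where the IRS solver is not supported.
pub fn recommended_lowest_precision(cc_major: u32) -> Option<LowestPrecision> {
    match cc_major {
        0..=6 => None,                        // IRS solver needs 7.0 or above
        7 => Some(LowestPrecision::Half),     // Volta / Turing (assumed 7.x)
        _ => Some(LowestPrecision::Tf32),     // Ampere and later (assumed 8.x+)
    }
}
```

If the solver fails to converge with the suggested mode, the text recommends using the main precision as the internal lowest precision instead.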

The following table lists all supported values of the lowest precision for each Inputs/Outputs data type. Note that if the lowest precision matches the Inputs/Outputs data type, then the main precision factorization will be used.

Supported Inputs/Outputs data type and lower precision for the IRS solver

Inputs/Outputs Data Type (e.g., main precision)	Supported values for the lowest precision
`cusolverPrecType_t::CUSOLVER_C_64F`	`CUSOLVER_C_64F, CUSOLVER_C_32F, CUSOLVER_C_16F, CUSOLVER_C_16BF, CUSOLVER_C_TF32`
`cusolverPrecType_t::CUSOLVER_C_32F`	`CUSOLVER_C_32F, CUSOLVER_C_16F, CUSOLVER_C_16BF, CUSOLVER_C_TF32`
`cusolverPrecType_t::CUSOLVER_R_64F`	`CUSOLVER_R_64F, CUSOLVER_R_32F, CUSOLVER_R_16F, CUSOLVER_R_16BF, CUSOLVER_R_TF32`
`cusolverPrecType_t::CUSOLVER_R_32F`	`CUSOLVER_R_32F, CUSOLVER_R_16F, CUSOLVER_R_16BF, CUSOLVER_R_TF32`
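As a rough illustration, the table can be encoded as a lookup that is checked before configuring the params structure. The `Prec` enum below is a hypothetical local mirror of the precision names in the table, not the binding's `cusolverPrecType_t`, and both function names are made up for this sketch.

```rust
/// Hypothetical local mirror of the precision names used in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Prec {
    C64F, C32F, C16F, C16BF, CTF32,
    R64F, R32F, R16F, R16BF, RTF32,
}

/// True if `lowest` is listed in the table for the given main precision.
pub fn is_supported_lowest(main: Prec, lowest: Prec) -> bool {
    use Prec::*;
    match main {
        C64F => matches!(lowest, C64F | C32F | C16F | C16BF | CTF32),
        C32F => matches!(lowest, C32F | C16F | C16BF | CTF32),
        R64F => matches!(lowest, R64F | R32F | R16F | R16BF | RTF32),
        R32F => matches!(lowest, R32F | R16F | R16BF | RTF32),
        // Only the four types above appear as Inputs/Outputs data types.
        _ => false,
    }
}

/// True if the pair is supported and selects the main precision
/// factorization, i.e. the lowest precision equals the main precision.
pub fn uses_main_precision_factorization(main: Prec, lowest: Prec) -> bool {
    main == lowest && is_supported_lowest(main, lowest)
}
```

Note that real and complex precisions never mix in the table, and a 32-bit main precision cannot use a 64-bit lowest precision.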

The cusolverDnIRSXgesv_bufferSize function returns the required workspace buffer size in bytes for the corresponding cusolverDnIRSXgesv() call with the given gesv_irs_params configuration.

§Parameters

handle: Handle to the cusolverDn library context.
gesv_irs_params: Configuration parameters structure, can serve one or more calls to any IRS solver.
gesv_irs_infos: Info structure where information about a particular solve will be stored. The gesv_irs_infos structure corresponds to a particular call; different calls therefore require different gesv_irs_infos structures, otherwise the stored information will be overwritten.
n: Number of rows and columns of square matrix A. Should be non-negative.
nrhs: Number of right hand sides to solve. Should be non-negative. Note that nrhs is limited to 1 if the selected IRS refinement solver is cusolverIRSRefinement_t::CUSOLVER_IRS_REFINE_GMRES, cusolverIRSRefinement_t::CUSOLVER_IRS_REFINE_GMRES_GMRES, or cusolverIRSRefinement_t::CUSOLVER_IRS_REFINE_CLASSICAL_GMRES.
dA: Matrix A with size n-by-n. Can’t be NULL. On return - will contain the factorization of the matrix A in the main precision (A = P * L * U, where P - permutation matrix defined by vector ipiv, L and U - lower and upper triangular matrices) if the iterative refinement solver was set to cusolverIRSRefinement_t::CUSOLVER_IRS_REFINE_NONE and the lowest precision is equal to the main precision (Inputs/Outputs datatype), or if the iterative refinement solver did not converge and the fallback to main precision was enabled (fallback enabled is the default setting); unchanged otherwise.
ldda: Leading dimension of the two-dimensional array used to store matrix A. ldda >= n.
dB: Set of right hand sides B of size n-by-nrhs. Can’t be NULL.
lddb: Leading dimension of the two-dimensional array used to store the matrix of right hand sides B. lddb >= n.
dX: Set of solution vectors X of size n-by-nrhs. Can’t be NULL.
lddx: Leading dimension of the two-dimensional array used to store the matrix of solution vectors X. lddx >= n.
dWorkspace: Pointer to an allocated workspace in device memory of size lwork_bytes.
lwork_bytes: Size of the allocated device workspace, in bytes. Should be at least the size returned by cusolverDnIRSXgesv_bufferSize.
niters: If niters is:

<0 : iterative refinement has failed; the main precision (Inputs/Outputs precision) factorization has been performed if fallback is enabled.
-1 : taking into account machine parameters, n, and nrhs, it is a priori not worth working in lower precision.
-2 : overflow of an entry when moving from main to lower precision.
-3 : failure during the factorization.
-5 : overflow occurred during computation.
-maxiter : the solver stopped the iterative refinement after reaching the maximum allowed number of iterations.
>0 : niters is the number of iterations the solver performed to reach the convergence criteria.
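The niters codes can be decoded on the host once the value has been copied back. The sketch below is a hypothetical helper, not part of the bindings; it assumes cusolver_int_t is a 32-bit signed integer and that the caller knows the maxiter value it configured. The listed codes are checked before -maxiter, which is an arbitrary choice made here for the case where maxiter happens to be 1, 2, 3, or 5. The source does not describe niters == 0, so it is reported separately.

```rust
/// Hypothetical decoded form of the `niters` output value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NitersOutcome {
    /// > 0: number of iterations performed to reach convergence.
    Converged(i32),
    /// -1: a priori not worth working in lower precision.
    NotWorthLowerPrecision,
    /// -2: overflow of an entry when moving from main to lower precision.
    OverflowOnDowncast,
    /// -3: failure during the factorization.
    FactorizationFailed,
    /// -5: overflow occurred during computation.
    OverflowDuringComputation,
    /// -maxiter: stopped after reaching the maximum allowed iterations.
    MaxIterationsReached,
    /// Any other negative value: refinement failed for an unlisted reason.
    OtherFailure(i32),
    /// 0: not described by the documentation.
    Unspecified,
}

/// Decode `niters` given the `maxiter` that was configured for the solve.
pub fn decode_niters(niters: i32, maxiter: i32) -> NitersOutcome {
    use NitersOutcome::*;
    match niters {
        n if n > 0 => Converged(n),
        0 => Unspecified,
        -1 => NotWorthLowerPrecision,
        -2 => OverflowOnDowncast,
        -3 => FactorizationFailed,
        -5 => OverflowDuringComputation,
        n if maxiter > 0 && n == -maxiter => MaxIterationsReached,
        n => OtherFailure(n),
    }
}
```

For any negative outcome, the main precision factorization has been performed if fallback is enabled, so a negative niters does not by itself mean that no solution was produced.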

§Return value

cusolverStatus_t::CUSOLVER_STATUS_ALLOC_FAILED: CPU memory allocation failed, most likely during the allocation of the residual array that stores the residual norms.
cusolverStatus_t::CUSOLVER_STATUS_ARCH_MISMATCH: The IRS solver supports compute capability 7.0 and above. The lowest precision options CUSOLVER_[CR]_16BF and CUSOLVER_[CR]_TF32 are only available on compute capability 8.0 and above.
cusolverStatus_t::CUSOLVER_STATUS_INTERNAL_ERROR: An internal error occurred; check the d_info and niters arguments for more details.
cusolverStatus_t::CUSOLVER_STATUS_INVALID_VALUE: Invalid parameters were passed, for example:

n<0
ldda<max(1,n)
lddb<max(1,n)
lddx<max(1,n).

cusolverStatus_t::CUSOLVER_STATUS_INVALID_WORKSPACE: lwork_bytes is smaller than the required workspace. This can happen if the user called cusolverDnIRSXgesv_bufferSize and then changed some configuration settings, such as the lowest precision.
cusolverStatus_t::CUSOLVER_STATUS_IRS_INFOS_NOT_INITIALIZED: The information structure gesv_irs_infos was not created.
cusolverStatus_t::CUSOLVER_STATUS_IRS_NOT_SUPPORTED: One of the configuration parameters in the gesv_irs_params structure is not supported. For example, nrhs > 1 while the refinement solver is set to cusolverIRSRefinement_t::CUSOLVER_IRS_REFINE_GMRES.
cusolverStatus_t::CUSOLVER_STATUS_IRS_OUT_OF_RANGE: Numerical error related to niters < 0; see the niters description for more details.
cusolverStatus_t::CUSOLVER_STATUS_IRS_PARAMS_INVALID: One of the configuration parameters in the gesv_irs_params structure is not valid.
cusolverStatus_t::CUSOLVER_STATUS_IRS_PARAMS_INVALID_MAXITER: The maxiter configuration parameter in the gesv_irs_params structure is not valid.
cusolverStatus_t::CUSOLVER_STATUS_IRS_PARAMS_INVALID_PREC: The main and/or the lowest precision configuration parameter in the gesv_irs_params structure is not valid; check the table above for the supported combinations.
cusolverStatus_t::CUSOLVER_STATUS_IRS_PARAMS_INVALID_REFINE: The refinement solver configuration parameter in the gesv_irs_params structure is not valid.
cusolverStatus_t::CUSOLVER_STATUS_IRS_PARAMS_NOT_INITIALIZED: The configuration parameters structure gesv_irs_params was not created.
cusolverStatus_t::CUSOLVER_STATUS_NOT_INITIALIZED: The library was not initialized.
cusolverStatus_t::CUSOLVER_STATUS_SUCCESS: The operation completed successfully.
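The dimension conditions listed under CUSOLVER_STATUS_INVALID_VALUE can be checked on the host before making the unsafe call. The following is a minimal sketch, assuming cusolver_int_t is a 32-bit signed integer; `ArgError` and `check_gesv_dims` are hypothetical names, and the nrhs check follows the parameter description ("Should be non-negative") rather than the list of examples.

```rust
/// Hypothetical error type describing which dimension argument is invalid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArgError {
    NegativeN,
    NegativeNrhs,
    LddaTooSmall,
    LddbTooSmall,
    LddxTooSmall,
}

/// Host-side pre-check of the dimension arguments of cusolverDnIRSXgesv.
/// Each leading dimension must be at least max(1, n).
pub fn check_gesv_dims(
    n: i32,
    nrhs: i32,
    ldda: i32,
    lddb: i32,
    lddx: i32,
) -> Result<(), ArgError> {
    if n < 0 {
        return Err(ArgError::NegativeN);
    }
    if nrhs < 0 {
        return Err(ArgError::NegativeNrhs);
    }
    let min_ld = n.max(1); // max(1, n)
    if ldda < min_ld {
        return Err(ArgError::LddaTooSmall);
    }
    if lddb < min_ld {
        return Err(ArgError::LddbTooSmall);
    }
    if lddx < min_ld {
        return Err(ArgError::LddxTooSmall);
    }
    Ok(())
}
```

A check like this does not replace inspecting the returned cusolverStatus_t; it only catches the listed cases early and names the offending argument.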
