PLASMA
Parallel Linear Algebra Software for Multicore Architectures
Functions
int | plasma_dsgesv (int n, int nrhs, double *pA, int lda, int *ipiv, double *pB, int ldb, double *pX, int ldx, int *iter) |
void | plasma_omp_dsgesv (plasma_desc_t A, int *ipiv, plasma_desc_t B, plasma_desc_t X, plasma_desc_t As, plasma_desc_t Xs, plasma_desc_t R, double *work, double *Rnorm, double *Xnorm, int *iter, plasma_sequence_t *sequence, plasma_request_t *request) |
int | plasma_zcgesv (int n, int nrhs, plasma_complex64_t *pA, int lda, int *ipiv, plasma_complex64_t *pB, int ldb, plasma_complex64_t *pX, int ldx, int *iter) |
void | plasma_omp_zcgesv (plasma_desc_t A, int *ipiv, plasma_desc_t B, plasma_desc_t X, plasma_desc_t As, plasma_desc_t Xs, plasma_desc_t R, double *work, double *Rnorm, double *Xnorm, int *iter, plasma_sequence_t *sequence, plasma_request_t *request) |
int plasma_dsgesv (int n, int nrhs, double *pA, int lda, int *ipiv, double *pB, int ldb, double *pX, int ldx, int *iter)
Computes the solution to a system of linear equations A * X = B, where A is an n-by-n matrix and X and B are n-by-nrhs matrices.
plasma_dsgesv first factorizes the matrix in single precision using plasma_sgetrf and uses this factorization within an iterative refinement procedure to produce a solution with double-precision normwise backward error quality (see below). If this approach fails, the method falls back to a double-precision factorization and solve.
Iterative refinement does not pay off if the ratio of single-precision to double-precision performance is too small. A reasonable strategy should take the number of right-hand sides and the size of the matrix into account. This might be done with a call to ILAENV in the future. For now, iterative refinement is always attempted.
The iterative refinement process is stopped if iter > itermax or if, for all right-hand sides, Rnorm < sqrt(n)*Xnorm*Anorm*eps*BWDmax, where:
iter is the number of the current iteration in the iterative refinement process,
Rnorm is the infinity-norm of the residual,
Xnorm is the infinity-norm of the solution,
Anorm is the infinity-operator-norm of the matrix A,
eps is the double-precision machine epsilon.
The values itermax and BWDmax are fixed to 30 and 1.0, respectively.
[in] | n | The number of linear equations, i.e., the order of the matrix A. n >= 0. |
[in] | nrhs | The number of right hand sides, i.e., the number of columns of the matrix B. nrhs >= 0. |
[in,out] | pA | The n-by-n matrix A. On exit, contains the LU factors of A. |
[in] | lda | The leading dimension of the array A. lda >= max(1,n). |
[out] | ipiv | The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row ipiv(i). |
[in] | pB | The n-by-nrhs right-hand side matrix B. This matrix remains unchanged. |
[in] | ldb | The leading dimension of the array B. ldb >= max(1,n). |
[out] | pX | If return value = 0, the n-by-nrhs solution matrix X. |
[in] | ldx | The leading dimension of the array X. ldx >= max(1,n). |
[out] | iter | The number of iterations the iterative refinement process needed to converge. If refinement failed, iter is set to -(1+itermax), where itermax = 30. |
PlasmaSuccess | successful exit |
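The array arguments follow the LAPACK convention of column-major storage with a leading dimension, so element (i, j) of A sits at offset i + j*lda. The sketch below shows that indexing and checks the dimension constraints listed in the parameter table. The helper names colmajor_index and check_gesv_dims are hypothetical, and the negative return codes (minus the position of the offending argument) are an assumption modeled on LAPACK's info convention, not a documented PLASMA behavior.

```c
#include <stddef.h>

/* Offset of 0-based element (i, j) in a column-major array
   whose consecutive columns are ld elements apart. */
size_t colmajor_index(int i, int j, int ld)
{
    return (size_t)i + (size_t)j * (size_t)ld;
}

static int imax(int a, int b) { return a > b ? a : b; }

/* Returns 0 if the documented constraints hold, otherwise minus the
   position of the first offending argument (hypothetical convention). */
int check_gesv_dims(int n, int nrhs, int lda, int ldb, int ldx)
{
    if (n < 0)              return -1;   /* n >= 0 */
    if (nrhs < 0)           return -2;   /* nrhs >= 0 */
    if (lda < imax(1, n))   return -4;   /* lda >= max(1,n) */
    if (ldb < imax(1, n))   return -7;   /* ldb >= max(1,n) */
    if (ldx < imax(1, n))   return -9;   /* ldx >= max(1,n) */
    return 0;
}
```

A leading dimension larger than n lets the routine operate on a submatrix of a larger allocation without copying.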
void plasma_omp_dsgesv (plasma_desc_t A, int *ipiv, plasma_desc_t B, plasma_desc_t X, plasma_desc_t As, plasma_desc_t Xs, plasma_desc_t R, double *work, double *Rnorm, double *Xnorm, int *iter, plasma_sequence_t *sequence, plasma_request_t *request)
Solves a general linear system of equations using iterative refinement with the LU factor computed using plasma_sgetrf. Non-blocking tile version of plasma_dsgesv(). Operates on matrices stored by tiles. All matrices are passed through descriptors. All dimensions are taken from the descriptors. Allows for pipelining of operations at runtime.
[in] | A | Descriptor of matrix A. |
[out] | ipiv | The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row ipiv(i). |
[in] | B | Descriptor of matrix B. |
[in,out] | X | Descriptor of matrix X. |
[out] | As | Descriptor of auxiliary matrix A in single precision. |
[out] | Xs | Descriptor of auxiliary matrix X in single precision. |
[out] | R | Descriptor of auxiliary residual matrix R. |
[out] | work | Workspace needed to compute the infinity norm of the matrix A. |
[out] | Rnorm | Workspace needed to store the max value in each of the residual vectors. |
[out] | Xnorm | Workspace needed to store the max value in each of the current solution vectors. |
[out] | iter | The number of iterations the iterative refinement process needed to converge. If refinement failed, iter is set to -(1+itermax), where itermax = 30. |
[in] | sequence | Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). |
[out] | request | Identifies this function call (for exception handling purposes). |
void | Errors are returned by setting sequence->status and request->status to error values. The sequence->status and request->status should never be set to PlasmaSuccess (the initial values) since another async call may be setting a failure value at the same time. |
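The rule above amounts to a "sticky" error status: a call writes to the status fields only when it fails, so it can never overwrite a failure recorded by another call in the same sequence. The following is a minimal sketch of that pattern with hypothetical stand-in types and names (seq_t, req_t, report_error, async_step, and the status values); it does not use the real plasma_sequence_t or plasma_request_t definitions, and the early exit on an already failed sequence is an assumption.

```c
/* Hypothetical status codes; 0 plays the role of the initial success value. */
enum { STATUS_SUCCESS = 0, STATUS_ILLEGAL_VALUE = -3 };

typedef struct { int status; } seq_t;   /* stand-in for a sequence */
typedef struct { int status; } req_t;   /* stand-in for a request  */

/* Records a failure. Success is never written, so an error set by
   another call in the same sequence is not erased. */
void report_error(seq_t *seq, req_t *req, int code)
{
    if (code == STATUS_SUCCESS) return;
    req->status = code;
    seq->status = code;
}

/* Hypothetical asynchronous-style routine following the rule. */
void async_step(int n, seq_t *seq, req_t *req)
{
    if (n < 0) {                                  /* invalid argument */
        report_error(seq, req, STATUS_ILLEGAL_VALUE);
        return;
    }
    if (seq->status != STATUS_SUCCESS) return;    /* assumed: skip work after a failure */
    /* The work would be queued here. On success both status
       fields are deliberately left untouched. */
}
```

The caller initializes both statuses once, issues its calls, and inspects sequence->status only after the whole sequence has completed.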
int plasma_zcgesv (int n, int nrhs, plasma_complex64_t *pA, int lda, int *ipiv, plasma_complex64_t *pB, int ldb, plasma_complex64_t *pX, int ldx, int *iter)
Computes the solution to a system of linear equations A * X = B, where A is an n-by-n matrix and X and B are n-by-nrhs matrices.
plasma_zcgesv first factorizes the matrix in single-precision complex arithmetic (COMPLEX) using plasma_cgetrf and uses this factorization within an iterative refinement procedure to produce a solution with COMPLEX*16 normwise backward error quality (see below). If this approach fails, the method falls back to a COMPLEX*16 factorization and solve.
Iterative refinement does not pay off if the ratio of COMPLEX performance to COMPLEX*16 performance is too small. A reasonable strategy should take the number of right-hand sides and the size of the matrix into account. This might be done with a call to ILAENV in the future. For now, iterative refinement is always attempted.
The iterative refinement process is stopped if iter > itermax or if, for all right-hand sides, Rnorm < sqrt(n)*Xnorm*Anorm*eps*BWDmax, where:
iter is the number of the current iteration in the iterative refinement process,
Rnorm is the infinity-norm of the residual,
Xnorm is the infinity-norm of the solution,
Anorm is the infinity-operator-norm of the matrix A,
eps is the double-precision machine epsilon.
The values itermax and BWDmax are fixed to 30 and 1.0, respectively.
[in] | n | The number of linear equations, i.e., the order of the matrix A. n >= 0. |
[in] | nrhs | The number of right hand sides, i.e., the number of columns of the matrix B. nrhs >= 0. |
[in,out] | pA | The n-by-n matrix A. On exit, contains the LU factors of A. |
[in] | lda | The leading dimension of the array A. lda >= max(1,n). |
[out] | ipiv | The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row ipiv(i). |
[in] | pB | The n-by-nrhs right-hand side matrix B. This matrix remains unchanged. |
[in] | ldb | The leading dimension of the array B. ldb >= max(1,n). |
[out] | pX | If return value = 0, the n-by-nrhs solution matrix X. |
[in] | ldx | The leading dimension of the array X. ldx >= max(1,n). |
[out] | iter | The number of iterations the iterative refinement process needed to converge. If refinement failed, iter is set to -(1+itermax), where itermax = 30. |
PlasmaSuccess | successful exit |
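After a call returns, iter tells the caller which path produced the solution. The helper below is a small sketch of that interpretation using only the values stated above (a nonnegative count on convergence, -(1+itermax) on failure with itermax = 30). The names classify_iter and refine_outcome_t are hypothetical, and any other value is reported as not covered by this description.

```c
#define ITERMAX 30   /* itermax as stated in the parameter description */

typedef enum {
    REFINE_CONVERGED,   /* refinement met the stopping criterion */
    REFINE_FAILED,      /* iter == -(1 + itermax): refinement did not converge */
    REFINE_OTHER        /* value not described in this section */
} refine_outcome_t;

refine_outcome_t classify_iter(int iter)
{
    if (iter >= 0 && iter <= ITERMAX) return REFINE_CONVERGED;
    if (iter == -(1 + ITERMAX))       return REFINE_FAILED;
    return REFINE_OTHER;
}
```

Logging this outcome over many solves is a cheap way to see whether the mixed-precision path is paying off for a given class of matrices.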
void plasma_omp_zcgesv (plasma_desc_t A, int *ipiv, plasma_desc_t B, plasma_desc_t X, plasma_desc_t As, plasma_desc_t Xs, plasma_desc_t R, double *work, double *Rnorm, double *Xnorm, int *iter, plasma_sequence_t *sequence, plasma_request_t *request)
Solves a general linear system of equations using iterative refinement with the LU factor computed using plasma_cgetrf. Non-blocking tile version of plasma_zcgesv(). Operates on matrices stored by tiles. All matrices are passed through descriptors. All dimensions are taken from the descriptors. Allows for pipelining of operations at runtime.
[in] | A | Descriptor of matrix A. |
[out] | ipiv | The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row ipiv(i). |
[in] | B | Descriptor of matrix B. |
[in,out] | X | Descriptor of matrix X. |
[out] | As | Descriptor of auxiliary matrix A in single complex precision. |
[out] | Xs | Descriptor of auxiliary matrix X in single complex precision. |
[out] | R | Descriptor of auxiliary residual matrix R. |
[out] | work | Workspace needed to compute the infinity norm of the matrix A. |
[out] | Rnorm | Workspace needed to store the max value in each of the residual vectors. |
[out] | Xnorm | Workspace needed to store the max value in each of the current solution vectors. |
[out] | iter | The number of iterations the iterative refinement process needed to converge. If refinement failed, iter is set to -(1+itermax), where itermax = 30. |
[in] | sequence | Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). |
[out] | request | Identifies this function call (for exception handling purposes). |
void | Errors are returned by setting sequence->status and request->status to error values. The sequence->status and request->status should never be set to PlasmaSuccess (the initial values) since another async call may be setting a failure value at the same time. |