PLASMA
Parallel Linear Algebra Software for Multicore Architectures
gesv: Solves Ax = b using LU factorization (driver)

Functions

int plasma_dsgesv (int n, int nrhs, double *pA, int lda, int *ipiv, double *pB, int ldb, double *pX, int ldx, int *iter)
 
void plasma_omp_dsgesv (plasma_desc_t A, int *ipiv, plasma_desc_t B, plasma_desc_t X, plasma_desc_t As, plasma_desc_t Xs, plasma_desc_t R, double *work, double *Rnorm, double *Xnorm, int *iter, plasma_sequence_t *sequence, plasma_request_t *request)
 
int plasma_zcgesv (int n, int nrhs, plasma_complex64_t *pA, int lda, int *ipiv, plasma_complex64_t *pB, int ldb, plasma_complex64_t *pX, int ldx, int *iter)
 
void plasma_omp_zcgesv (plasma_desc_t A, int *ipiv, plasma_desc_t B, plasma_desc_t X, plasma_desc_t As, plasma_desc_t Xs, plasma_desc_t R, double *work, double *Rnorm, double *Xnorm, int *iter, plasma_sequence_t *sequence, plasma_request_t *request)
 

Function Documentation

int plasma_dsgesv ( int  n,
int  nrhs,
double *  pA,
int  lda,
int *  ipiv,
double *  pB,
int  ldb,
double *  pX,
int  ldx,
int *  iter 
)

Computes the solution to a system of linear equations A * X = B, where A is an n-by-n matrix and X and B are n-by-nrhs matrices.

plasma_dsgesv first factorizes the matrix in single precision using plasma_sgetrf and uses this factorization within an iterative refinement procedure to produce a solution with double-precision normwise backward error quality (see below). If this approach fails, the method falls back to a double-precision factorization and solve.

Iterative refinement is not a winning strategy if the ratio of single-precision performance to double-precision performance is too small. A reasonable strategy should take the number of right-hand sides and the size of the matrix into account. This might be done with a call to ILAENV in the future. For now, iterative refinement is always attempted.

The iterative refinement process is stopped if iter > itermax or if, for all right-hand sides, Rnorm < sqrt(n)*Xnorm*Anorm*eps*BWDmax, where:

  • iter is the number of the current iteration in the iterative refinement process
  • Rnorm is the Infinity-norm of the residual
  • Xnorm is the Infinity-norm of the solution
  • Anorm is the Infinity-operator-norm of the matrix A
  • eps is the machine epsilon returned by DLAMCH('Epsilon')

The values itermax and BWDmax are fixed to 30 and 1.0D+00 respectively.

Parameters
[in] n: The number of linear equations, i.e., the order of the matrix A. n >= 0.
[in] nrhs: The number of right-hand sides, i.e., the number of columns of the matrix B. nrhs >= 0.
[in,out] pA: The n-by-n matrix A. On exit, contains the LU factors of A.
[in] lda: The leading dimension of the array A. lda >= max(1,n).
[out] ipiv: The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row ipiv(i).
[in] pB: The n-by-nrhs right-hand side matrix B. This matrix remains unchanged.
[in] ldb: The leading dimension of the array B. ldb >= max(1,n).
[out] pX: If return value = 0, the n-by-nrhs solution matrix X.
[in] ldx: The leading dimension of the array X. ldx >= max(1,n).
[out] iter: The number of iterations the iterative refinement process needed to converge. If refinement failed, it is set to -(1+itermax), where itermax = 30.
Return values
PlasmaSuccess: successful exit
See also
plasma_omp_dsgesv
plasma_dsgesv
plasma_dgesv
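
The leading-dimension constraints above follow the LAPACK column-major convention, in which element (i, j) of a matrix is stored at offset i + j*ld. The sketch below shows one way to prepare such a buffer from row-major data. The helper names min_ld, cm_index and to_column_major are hypothetical and not part of PLASMA.

```c
#include <stddef.h>
#include <stdlib.h>

/* Smallest leading dimension allowed by the constraints above: max(1, n). */
static int min_ld(int n) { return n > 1 ? n : 1; }

/* Offset of element (i, j), both 0-based, in a column-major array. */
static size_t cm_index(int i, int j, int ld)
{
    return (size_t)i + (size_t)j * (size_t)ld;
}

/* Copy a row-major n-by-ncols matrix into a newly allocated, zero-filled
 * column-major buffer with leading dimension ld. The caller frees the
 * result. Returns NULL if ld is too small or allocation fails. */
static double *to_column_major(int n, int ncols, const double *row_major, int ld)
{
    if (ld < min_ld(n))
        return NULL;
    size_t count = (size_t)ld * (size_t)(ncols > 0 ? ncols : 1);
    double *cm = calloc(count, sizeof *cm);
    if (cm == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < ncols; j++)
            cm[cm_index(i, j, ld)] = row_major[(size_t)i * (size_t)ncols + (size_t)j];
    return cm;
}
```

Buffers prepared this way would then be passed as in the signature above: plasma_dsgesv(n, nrhs, pA, lda, ipiv, pB, ldb, pX, ldx, &iter).
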
void plasma_omp_dsgesv ( plasma_desc_t  A,
int *  ipiv,
plasma_desc_t  B,
plasma_desc_t  X,
plasma_desc_t  As,
plasma_desc_t  Xs,
plasma_desc_t  R,
double *  work,
double *  Rnorm,
double *  Xnorm,
int *  iter,
plasma_sequence_t *  sequence,
plasma_request_t *  request 
)

Solves a general linear system of equations using iterative refinement with the LU factor computed using plasma_sgetrf. Non-blocking tile version of plasma_dsgesv(). Operates on matrices stored by tiles. All matrices are passed through descriptors. All dimensions are taken from the descriptors. Allows for pipelining of operations at runtime.

Parameters
[in] A: Descriptor of matrix A.
[out] ipiv: The pivot indices; for 1 <= i <= min(m,n), row i of the matrix was interchanged with row ipiv(i).
[in] B: Descriptor of matrix B.
[in,out] X: Descriptor of matrix X.
[out] As: Descriptor of auxiliary matrix A in single precision.
[out] Xs: Descriptor of auxiliary matrix X in single precision.
[out] R: Descriptor of auxiliary remainder matrix R.
[out] work: Workspace needed to compute the infinity norm of the matrix A.
[out] Rnorm: Workspace needed to store the max value in each of the residual vectors.
[out] Xnorm: Workspace needed to store the max value in each of the current solution vectors.
[out] iter: The number of iterations the iterative refinement process needed to converge. If refinement failed, it is set to -(1+itermax), where itermax = 30.
[in] sequence: Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes).
[out] request: Identifies this function call (for exception handling purposes).
Return values
void: Errors are returned by setting sequence->status and request->status to error values. Neither sequence->status nor request->status should ever be set to PLASMA_SUCCESS (the initial value), since another asynchronous call may be setting a failure value at the same time.
See also
plasma_dsgesv
plasma_omp_dsgesv
plasma_omp_dgesv
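
The Rnorm and Xnorm workspaces are described above as holding the max value of each residual or solution vector. On a plain column-major array (not the tile layout that the descriptors refer to), that per-column maximum could be computed as in the sketch below. column_max_abs is a hypothetical helper, and the output length of one entry per column is an assumption inferred from the parameter descriptions.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: for each column j of the n-by-nrhs column-major
 * matrix M (leading dimension ldm), store max_i |M(i,j)| in colmax[j].
 * This is the infinity-norm of each column vector. */
static void column_max_abs(int n, int nrhs, const double *M, int ldm,
                           double *colmax)
{
    for (int j = 0; j < nrhs; j++) {
        double best = 0.0;
        for (int i = 0; i < n; i++) {
            double v = fabs(M[(size_t)i + (size_t)j * (size_t)ldm]);
            if (v > best)
                best = v;
        }
        colmax[j] = best;
    }
}
```
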
int plasma_zcgesv ( int  n,
int  nrhs,
plasma_complex64_t *  pA,
int  lda,
int *  ipiv,
plasma_complex64_t *  pB,
int  ldb,
plasma_complex64_t *  pX,
int  ldx,
int *  iter 
)

Computes the solution to a system of linear equations A * X = B, where A is an n-by-n matrix and X and B are n-by-nrhs matrices.

plasma_zcgesv first factorizes the matrix in single-precision complex arithmetic using plasma_cgetrf and uses this factorization within an iterative refinement procedure to produce a solution with COMPLEX*16 normwise backward error quality (see below). If this approach fails, the method falls back to a COMPLEX*16 factorization and solve.

Iterative refinement is not a winning strategy if the ratio of COMPLEX performance to COMPLEX*16 performance is too small. A reasonable strategy should take the number of right-hand sides and the size of the matrix into account. This might be done with a call to ILAENV in the future. For now, iterative refinement is always attempted.

The iterative refinement process is stopped if iter > itermax or if, for all right-hand sides, Rnorm < sqrt(n)*Xnorm*Anorm*eps*BWDmax, where:

  • iter is the number of the current iteration in the iterative refinement process
  • Rnorm is the Infinity-norm of the residual
  • Xnorm is the Infinity-norm of the solution
  • Anorm is the Infinity-operator-norm of the matrix A
  • eps is the machine epsilon returned by DLAMCH('Epsilon')

The values itermax and BWDmax are fixed to 30 and 1.0D+00 respectively.

Parameters
[in] n: The number of linear equations, i.e., the order of the matrix A. n >= 0.
[in] nrhs: The number of right-hand sides, i.e., the number of columns of the matrix B. nrhs >= 0.
[in,out] pA: The n-by-n matrix A. On exit, contains the LU factors of A.
[in] lda: The leading dimension of the array A. lda >= max(1,n).
[out] ipiv: The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row ipiv(i).
[in] pB: The n-by-nrhs right-hand side matrix B. This matrix remains unchanged.
[in] ldb: The leading dimension of the array B. ldb >= max(1,n).
[out] pX: If return value = 0, the n-by-nrhs solution matrix X.
[in] ldx: The leading dimension of the array X. ldx >= max(1,n).
[out] iter: The number of iterations the iterative refinement process needed to converge. If refinement failed, it is set to -(1+itermax), where itermax = 30.
Return values
PlasmaSuccess: successful exit
See also
plasma_omp_zcgesv
plasma_dsgesv
plasma_zgesv
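
The ipiv description above records one row interchange per index. The sketch below replays those interchanges on a column-major right-hand side matrix. It assumes that ipiv holds 1-based row numbers (as the 1 <= i notation suggests and as in LAPACK) stored in an ordinary 0-based C array; apply_row_interchanges is a hypothetical helper, not a PLASMA routine.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: for i = 1..n in order, swap row i of the n-by-nrhs
 * column-major matrix B with row ipiv(i). ipiv values are assumed 1-based. */
static void apply_row_interchanges(int n, int nrhs, const int *ipiv,
                                   double complex *B, int ldb)
{
    for (int i = 0; i < n; i++) {
        int p = ipiv[i] - 1; /* convert the 1-based pivot row to 0-based */
        if (p == i)
            continue;
        for (int j = 0; j < nrhs; j++) {
            size_t col = (size_t)j * (size_t)ldb;
            double complex t = B[(size_t)i + col];
            B[(size_t)i + col] = B[(size_t)p + col];
            B[(size_t)p + col] = t;
        }
    }
}
```

Because the interchanges are applied sequentially, ipiv is not a permutation vector: the same row number may appear more than once.
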
void plasma_omp_zcgesv ( plasma_desc_t  A,
int *  ipiv,
plasma_desc_t  B,
plasma_desc_t  X,
plasma_desc_t  As,
plasma_desc_t  Xs,
plasma_desc_t  R,
double *  work,
double *  Rnorm,
double *  Xnorm,
int *  iter,
plasma_sequence_t *  sequence,
plasma_request_t *  request 
)

Solves a general linear system of equations using iterative refinement with the LU factor computed using plasma_cgetrf. Non-blocking tile version of plasma_zcgesv(). Operates on matrices stored by tiles. All matrices are passed through descriptors. All dimensions are taken from the descriptors. Allows for pipelining of operations at runtime.

Parameters
[in] A: Descriptor of matrix A.
[out] ipiv: The pivot indices; for 1 <= i <= min(m,n), row i of the matrix was interchanged with row ipiv(i).
[in] B: Descriptor of matrix B.
[in,out] X: Descriptor of matrix X.
[out] As: Descriptor of auxiliary matrix A in single-precision complex.
[out] Xs: Descriptor of auxiliary matrix X in single-precision complex.
[out] R: Descriptor of auxiliary remainder matrix R.
[out] work: Workspace needed to compute the infinity norm of the matrix A.
[out] Rnorm: Workspace needed to store the max value in each of the residual vectors.
[out] Xnorm: Workspace needed to store the max value in each of the current solution vectors.
[out] iter: The number of iterations the iterative refinement process needed to converge. If refinement failed, it is set to -(1+itermax), where itermax = 30.
[in] sequence: Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes).
[out] request: Identifies this function call (for exception handling purposes).
Return values
void: Errors are returned by setting sequence->status and request->status to error values. Neither sequence->status nor request->status should ever be set to PLASMA_SUCCESS (the initial value), since another asynchronous call may be setting a failure value at the same time.
See also
plasma_zcgesv
plasma_omp_dsgesv
plasma_omp_zgesv
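
The error-reporting rule in the return-value note above (write failures only, never success) can be illustrated with a small sketch. The types and names here (sketch_status_t, SKETCH_SUCCESS, SKETCH_ERROR, sketch_report, sketch_failed) are hypothetical stand-ins, not the PLASMA sequence and request types, and the code shows only the write-once-on-failure logic in a single thread, without any synchronization.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the status fields of a sequence and a request. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERROR = -1 };
typedef struct { int status; } sketch_status_t;

/* Record an outcome. Success is never written, so a failure recorded
 * earlier by another call is not erased. */
static void sketch_report(sketch_status_t *sequence, sketch_status_t *request,
                          int code)
{
    if (code == SKETCH_SUCCESS)
        return;
    sequence->status = code;
    request->status = code;
}

/* True once any call has reported a failure. */
static bool sketch_failed(const sketch_status_t *s)
{
    return s->status != SKETCH_SUCCESS;
}
```

With this pattern, a caller checks the status once after all calls in the sequence have completed, and any single failure remains visible.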