SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
|
Solve \(AX = B\). More...
Functions | |
template<typename scalar_t > | |
int64_t | slate::posv (HermitianMatrix< scalar_t > &A, Matrix< scalar_t > &B, Options const &opts) |
Distributed parallel Cholesky factorization and solve. | |
template<typename scalar_hi , typename scalar_lo > | |
int64_t | slate::posv_mixed (HermitianMatrix< scalar_hi > &A, Matrix< scalar_hi > &B, Matrix< scalar_hi > &X, int &iter, Options const &opts) |
Distributed parallel iterative-refinement Cholesky factorization and solve. | |
template<typename scalar_hi , typename scalar_lo > | |
int64_t | slate::posv_mixed_gmres (HermitianMatrix< scalar_hi > &A, Matrix< scalar_hi > &B, Matrix< scalar_hi > &X, int &iter, Options const &opts) |
Distributed parallel GMRES-IR Cholesky factorization and solve. | |
Solve \(AX = B\).
int64_t slate::posv | ( | HermitianMatrix< scalar_t > & | A, |
Matrix< scalar_t > & | B, | ||
Options const & | opts | ||
) |
Distributed parallel Cholesky factorization and solve.
Computes the solution to a system of linear equations
\[ A X = B, \]
where \(A\) is an n-by-n Hermitian positive definite matrix and \(X\) and \(B\) are n-by-nrhs matrices. The Cholesky decomposition is used to factor \(A\) as
\[ A = L L^H, \]
if \(A\) is stored lower, where \(L\) is a lower triangular matrix, or
\[ A = U^H U, \]
if \(A\) is stored upper, where \(U\) is an upper triangular matrix. The factored form of \(A\) is then used to solve the system of equations \(A X = B\).
Complexity (in real): \(\approx \frac{1}{3} n^{3} + 2 n^{2} r\) flops.
scalar_t | One of float, double, std::complex<float>, std::complex<double>. |
[in,out] | A | On entry, the n-by-n Hermitian positive definite matrix \(A\). On exit, if return value = 0, overwritten by the factor \(U\) or \(L\) from the Cholesky factorization \(A = U^H U\) or \(A = L L^H\). If scalar_t is real, \(A\) can be a SymmetricMatrix object. |
[in,out] | B | On entry, the n-by-nrhs right hand side matrix \(B\). On exit, if return value = 0, the n-by-nrhs solution matrix \(X\). |
[in] | opts | Additional options, as map of name = value pairs. Possible options:
|
int64_t slate::posv_mixed | ( | HermitianMatrix< scalar_hi > & | A, |
Matrix< scalar_hi > & | B, | ||
Matrix< scalar_hi > & | X, | ||
int & | iter, | ||
Options const & | opts | ||
) |
Distributed parallel iterative-refinement Cholesky factorization and solve.
Computes the solution to a system of linear equations
\[ A X = B, \]
where \(A\) is an n-by-n Hermitian positive definite matrix and \(X\) and \(B\) are n-by-nrhs matrices.
posv_mixed first factorizes the matrix using potrf in low precision (single) and uses this factorization within an iterative refinement procedure to produce a solution with high precision (double) normwise backward error quality (see below). If the approach fails, the method falls back to a high precision (double) factorization and solve.
The iterative refinement is not going to be a winning strategy if the ratio of low-precision performance over high-precision performance is too small. A reasonable strategy should take the number of right-hand sides and the size of the matrix into account. This might be automated in the future. Up to now, we always try iterative refinement.
The iterative refinement process is stopped if iter > itermax or for all the RHS, \(1 \le j \le nrhs\), we have: \(\norm{r_j}_{inf} < tol \norm{x_j}_{inf} \norm{A}_{inf},\) where:
scalar_hi | One of double, std::complex<double>. |
scalar_lo | One of float, std::complex<float>. |
[in,out] | A | On entry, the n-by-n Hermitian positive definite matrix \(A\). On exit, if iterative refinement has been successfully used (return value = 0 and iter >= 0, see description below), then \(A\) is unchanged. If high precision (double) factorization has been used (return value = 0 and iter < 0, see description below), then the array \(A\) contains the factor \(U\) or \(L\) from the Cholesky factorization \(A = U^H U\) or \(A = L L^H\). If scalar_t is real, \(A\) can be a SymmetricMatrix. |
[in] | B | On entry, the n-by-nrhs right hand side matrix \(B\). |
[out] | X | On exit, if return value = 0, the n-by-nrhs solution matrix \(X\). |
[out] | iter | > 0: The number of the iterations the iterative refinement process needed for convergence. < 0: Iterative refinement failed; it falls back to a double precision factorization and solve. -3: single precision matrix was not positive definite in potrf. -(itermax+1): iterative refinement failed to converge in itermax iterations. |
[in] | opts | Additional options, as map of name = value pairs. Possible options:
|
int64_t slate::posv_mixed_gmres | ( | HermitianMatrix< scalar_hi > & | A, |
Matrix< scalar_hi > & | B, | ||
Matrix< scalar_hi > & | X, | ||
int & | iter, | ||
Options const & | opts | ||
) |
Distributed parallel GMRES-IR Cholesky factorization and solve.
Computes the solution to a system of linear equations
\[ A X = B, \]
where \(A\) is an n-by-n Hermitian positive definite matrix and \(X\) and \(B\) are n-by-nrhs matrices.
posv_mixed_gmres first factorizes the matrix using potrf in low precision (single) and uses this factorization within a GMRES-IR procedure to produce a solution with high precision (double) normwise backward error quality (see below). If the approach fails, the method falls back to a high precision (double) factorization and solve.
GMRES-IR is not going to be a winning strategy if the ratio of low-precision performance over high-precision performance is too small. A reasonable strategy should take the number of right-hand sides and the the size of the matrix into account. This might be automated in the future. Up to now, we always try iterative refinement.
GMRES-IR process is stopped if iter > itermax or for all the RHS, \(1 \le j \le nrhs\), we have: \(\norm{r_j}_{inf} < tol \norm{x_j}_{inf} \norm{A}_{inf},\) where:
scalar_hi | One of double, std::complex<double>. |
scalar_lo | One of float, std::complex<float>. |
[in,out] | A | On entry, the n-by-n Hermitian positive definite matrix \(A\). On exit, if iterative refinement has been successfully used (return value = 0 and iter >= 0, see description below), then \(A\) is unchanged. If high precision (double) factorization has been used (return value = 0 and iter < 0, see description below), then the array \(A\) contains the factors \(L\) or \(U\) from the Cholesky factorization \(A = U^H U\) or \(A = L L^H\). |
[in] | B | On entry, the n-by-nrhs right hand side matrix \(B\). |
[out] | X | On exit, if return value = 0, the n-by-nrhs solution matrix \(X\). |
[out] | iter | |
[out] | iter | > 0: The number of the iterations the iterative refinement process needed for convergence. < 0: Iterative refinement failed; it falls back to a double precision factorization and solve. -3: single precision matrix was exactly singular in getrf. -(itermax+1): iterative refinement failed to converge in itermax iterations. |
[in] | opts | Additional options, as map of name = value pairs. Possible options:
|