SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
Loading...
Searching...
No Matches

Factor, forward and back solve. More...

Functions

template<typename scalar_t >
int64_t slate::pbtrf (HermitianBandMatrix< scalar_t > &A, Options const &opts)
 Distributed parallel band Cholesky factorization.
 
template<typename scalar_t >
void slate::pbtrs (HermitianBandMatrix< scalar_t > &A, Matrix< scalar_t > &B, Options const &opts)
 Distributed parallel Cholesky solve.
 

Detailed Description

Factor, forward and back solve.

Function Documentation

◆ pbtrf()

template<typename scalar_t >
int64_t slate::pbtrf ( HermitianBandMatrix< scalar_t > &  A,
Options const &  opts 
)

Distributed parallel band Cholesky factorization.

Computes the Cholesky factorization of a Hermitian positive definite band matrix \(A\).

The factorization has the form

\[ A = L L^H, \]

if \(A\) is stored lower, where \(L\) is a lower triangular band matrix, or

\[ A = U^H U, \]

if \(A\) is stored upper, where \(U\) is an upper triangular band matrix.

Template Parameters
scalar_tOne of float, double, std::complex<float>, std::complex<double>.
Parameters
[in,out]AOn entry, the Hermitian band matrix \(A\) to be factored. Tiles outside the bandwidth do not need to exist. For tiles that are partially outside the bandwidth, data outside the bandwidth should be explicitly set to zero. On exit, the factor \(L\) or \(U\) from the factorization \(A = L L^H\) or \(A = U^H U\).
[in]optsAdditional options, as map of name = value pairs. Possible options:
  • Option::Lookahead: Number of panels to overlap with matrix updates. lookahead >= 0. Default 1.
  • Option::Target: Implementation to target. Possible values:
    • HostTask: OpenMP tasks on CPU host [default].
    • HostNest: nested OpenMP parallel for loop on CPU host.
    • HostBatch: batched BLAS on CPU host.
    • Devices: batched BLAS on GPU device.
Returns
0: successful exit
i > 0: the leading minor of order \(i\) of \(A\) is not positive definite, so the factorization could not be completed.

◆ pbtrs()

template<typename scalar_t >
void slate::pbtrs ( HermitianBandMatrix< scalar_t > &  A,
Matrix< scalar_t > &  B,
Options const &  opts 
)

Distributed parallel Cholesky solve.

Solves a system of linear equations

\[ A X = B \]

with a Hermitian positive definite hermitian matrix \(A\) using the Cholesky factorization \(A = U^H U\) or \(A = L L^H\) computed by potrf.

Template Parameters
scalar_tOne of float, double, std::complex<float>, std::complex<double>.
Parameters
[in]AThe n-by-n triangular factor \(U\) or \(L\) from the Cholesky factorization \(A = U^H U\) or \(A = L L^H\), computed by potrf. If scalar_t is real, \(A\) can be a SymmetricMatrix object.
[in,out]BOn entry, the n-by-nrhs right hand side matrix \(B\). On exit, the n-by-nrhs solution matrix \(X\).
[in]optsAdditional options, as map of name = value pairs. Possible options:
  • Option::Lookahead: Number of panels to overlap with matrix updates. lookahead >= 0. Default 1.
  • Option::Target: Implementation to target. Possible values:
    • HostTask: OpenMP tasks on CPU host [default].
    • HostNest: nested OpenMP parallel for loop on CPU host.
    • HostBatch: batched BLAS on CPU host.
    • Devices: batched BLAS on GPU device.