SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
Loading...
Searching...
No Matches
trsm: Triangular solve matrix

\(C = A^{-1} B\) or \(C = B A^{-1}\) where \(A\) is triangular More...

Functions

template<typename scalar_t >
void slate::trsm (blas::Side side, scalar_t alpha, TriangularMatrix< scalar_t > &A, Matrix< scalar_t > &B, Options const &opts)
 Distributed parallel triangular matrix-matrix solve.
 
template<typename scalar_t >
void slate::trsmA (blas::Side side, scalar_t alpha, TriangularMatrix< scalar_t > &A, Matrix< scalar_t > &B, Options const &opts)
 Distributed parallel triangular matrix-matrix solve.
 
template<typename scalar_t >
void slate::trsmB (blas::Side side, scalar_t alpha, TriangularMatrix< scalar_t > &A, Matrix< scalar_t > &B, Options const &opts)
 Distributed parallel triangular matrix-matrix solve.
 

Detailed Description

\(C = A^{-1} B\) or \(C = B A^{-1}\) where \(A\) is triangular

Function Documentation

◆ trsm()

template<typename scalar_t >
void slate::trsm ( blas::Side  side,
scalar_t  alpha,
TriangularMatrix< scalar_t > &  A,
Matrix< scalar_t > &  B,
Options const &  opts 
)

Distributed parallel triangular matrix-matrix solve.

Solves one of the triangular matrix equations

\[ A X = \alpha B, \]

or

\[ X A = \alpha B, \]

where alpha is a scalar, B is an m-by-n matrix and A is a unit or non-unit, upper or lower triangular matrix. The matrix X overwrites B. The matrices can be transposed or conjugate-transposed beforehand, e.g.,

auto AT = slate::transpose( A );
slate::trsm( Side::Left, alpha, AT, B );

Complexity (in real): \(m^{2} n\) flops.

Template Parameters
scalar_tOne of float, double, std::complex<float>, std::complex<double>.
Parameters
[in]sideWhether A appears on the left or on the right of X:
  • Side::Left: solve \(A X = \alpha B\)
  • Side::Right: solve \(X A = \alpha B\)
[in]alphaThe scalar alpha.
[in]A
  • If side = left, the m-by-m triangular matrix A;
  • if side = right, the n-by-n triangular matrix A.
[in,out]BOn entry, the m-by-n matrix B. On exit, overwritten by the result X.
[in]optsAdditional options, as map of name = value pairs. Possible options:
  • Option::Lookahead: Number of panels to overlap with matrix updates. lookahead >= 0. Default 1.
  • Option::MethodTrsm: Select the right routine to call. Possible values:
    • Auto: let the routine decides [default]
    • trsmA: select trsmA routine
    • trsmB: select trsmB routine
  • Option::Target: Implementation to target. Possible values:
    • HostTask: OpenMP tasks on CPU host [default].
    • HostNest: nested OpenMP parallel for loop on CPU host.
    • HostBatch: batched BLAS on CPU host.
    • Devices: batched BLAS on GPU device.

◆ trsmA()

template<typename scalar_t >
void slate::trsmA ( blas::Side  side,
scalar_t  alpha,
TriangularMatrix< scalar_t > &  A,
Matrix< scalar_t > &  B,
Options const &  opts 
)

Distributed parallel triangular matrix-matrix solve.

Solves one of the triangular matrix equations

\[ A X = \alpha B, \]

or

\[ X A = \alpha B, \]

where alpha is a scalar, B is an m-by-n matrix and A is a unit or non-unit, upper or lower triangular matrix. The matrix X overwrites B. The matrices can be transposed or conjugate-transposed beforehand, e.g.,

auto AT = slate::transpose( A );
slate::trsmA( Side::Left, alpha, AT, B );

Note: The original trsm computes the solution where B is located. The trsmA is a variant of trsm where the computation is performed where A is located using temporary tiles to represent B, followed by a reduction to get the result into the given B. This approach is well suited in the case of a few right-hand side since it would require less communication.

Template Parameters
scalar_tOne of float, double, std::complex<float>, std::complex<double>.
Parameters
[in]sideWhether A appears on the left or on the right of X:
  • Side::Left: solve \(A X = \alpha B\)
  • Side::Right: solve \(X A = \alpha B\)
[in]alphaThe scalar alpha.
[in]A
  • If side = left, the m-by-m triangular matrix A;
  • if side = right, the n-by-n triangular matrix A.
[in,out]BOn entry, the m-by-n matrix B. On exit, overwritten by the result X.
[in]optsAdditional options, as map of name = value pairs. Possible options:
  • Option::Lookahead: Number of panels to overlap with matrix updates. lookahead >= 0. Default 1.
  • Option::Target: Implementation to target. Possible values:
    • HostTask: OpenMP tasks on CPU host [default].
    • HostNest: nested OpenMP parallel for loop on CPU host.
    • HostBatch: batched BLAS on CPU host.
    • Devices: batched BLAS on GPU device.

◆ trsmB()

template<typename scalar_t >
void slate::trsmB ( blas::Side  side,
scalar_t  alpha,
TriangularMatrix< scalar_t > &  A,
Matrix< scalar_t > &  B,
Options const &  opts 
)

Distributed parallel triangular matrix-matrix solve.

Solves one of the triangular matrix equations

\[ A X = \alpha B, \]

or

\[ X A = \alpha B, \]

where alpha is a scalar, B is an m-by-n matrix and A is a unit or non-unit, upper or lower triangular matrix. The matrix X overwrites B. The matrices can be transposed or conjugate-transposed beforehand, e.g.,

auto AT = slate::transpose( A );
slate::trsm( Side::Left, alpha, AT, B );
Template Parameters
scalar_tOne of float, double, std::complex<float>, std::complex<double>.
Parameters
[in]sideWhether A appears on the left or on the right of X:
  • Side::Left: solve \(A X = \alpha B\)
  • Side::Right: solve \(X A = \alpha B\)
[in]alphaThe scalar alpha.
[in]A
  • If side = left, the m-by-m triangular matrix A;
  • if side = right, the n-by-n triangular matrix A.
[in,out]BOn entry, the m-by-n matrix B. On exit, overwritten by the result X.
[in]optsAdditional options, as map of name = value pairs. Possible options:
  • Option::Lookahead: Number of panels to overlap with matrix updates. lookahead >= 0. Default 1.
  • Option::Target: Implementation to target. Possible values:
    • HostTask: OpenMP tasks on CPU host [default].
    • HostNest: nested OpenMP parallel for loop on CPU host.
    • HostBatch: batched BLAS on CPU host.
    • Devices: batched BLAS on GPU device.