SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
Functions
template<Target target, typename scalar_t>
void slate::impl::gemmA(scalar_t alpha, Matrix< scalar_t > &A, Matrix< scalar_t > &B, scalar_t beta, Matrix< scalar_t > &C, Options const &opts)
void slate::impl::gemmA(
    scalar_t alpha,
    Matrix< scalar_t > & A,
    Matrix< scalar_t > & B,
    scalar_t beta,
    Matrix< scalar_t > & C,
    Options const & opts
)
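To make the data movement of an A-stationary multiply concrete, here is a minimal serial C++ sketch of C = alpha*A*B + beta*C in which each simulated rank multiplies only the entries of A it owns, and the per-rank partial results for C are then summed. This is not SLATE code and does not use the SLATE API: `Dense`, `owner_of_A`, `gemm_a_stationary`, and the 2D cyclic ownership rule on a p-by-q grid are hypothetical names and assumptions chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major matrix; a stand-in for a distributed tile matrix (assumption).
struct Dense {
    std::size_t rows, cols;
    std::vector<double> data;
    Dense(std::size_t r, std::size_t c, double v = 0.0)
        : rows(r), cols(c), data(r * c, v) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Hypothetical ownership rule: A(i,k) lives on rank (i % p, k % q) of a p-by-q grid.
std::size_t owner_of_A(std::size_t i, std::size_t k, std::size_t p, std::size_t q) {
    return (i % p) * q + (k % q);
}

// Computes C = alpha*A*B + beta*C while keeping A "in place".
void gemm_a_stationary(double alpha, const Dense& A, const Dense& B,
                       double beta, Dense& C, std::size_t p, std::size_t q) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    assert(p > 0 && q > 0);
    const std::size_t nranks = p * q;

    // Step 1: every rank accumulates partial products from the A entries it owns.
    // Reading B(k,j) here models B having been sent to the owner of A(i,k).
    std::vector<Dense> partial(nranks, Dense(C.rows, C.cols));
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = 0; k < A.cols; ++k) {
            const std::size_t rank = owner_of_A(i, k, p, q);
            const double a = A(i, k);
            for (std::size_t j = 0; j < B.cols; ++j)
                partial[rank](i, j) += a * B(k, j);
        }
    }

    // Step 2: reduce the partial results across ranks and apply alpha and beta.
    for (std::size_t i = 0; i < C.rows; ++i) {
        for (std::size_t j = 0; j < C.cols; ++j) {
            double sum = 0.0;
            for (std::size_t r = 0; r < nranks; ++r)
                sum += partial[r](i, j);
            C(i, j) = alpha * sum + beta * C(i, j);
        }
    }
}
```

In this sketch no entry of A is ever copied between ranks; only B is read remotely and only C is combined, which is why the approach pays off when A is the largest operand.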
Distributed parallel general matrix-matrix multiplication. Designed for situations where A is larger than B or C, so the algorithm does not move A, instead moving B to the location of A and reducing the C matrix. Generic implementation for any target. Dependencies enforce the following behavior: