Functions
template<Target target, typename scalar_t >
void	slate::impl::gemmA (scalar_t alpha, Matrix< scalar_t > &A, Matrix< scalar_t > &B, scalar_t beta, Matrix< scalar_t > &C, Options const &opts)

Detailed Description

Function Documentation

◆ gemmA()

template<Target target, typename scalar_t>
void slate::impl::gemmA(
    scalar_t alpha,
    Matrix<scalar_t>& A,
    Matrix<scalar_t>& B,
    scalar_t beta,
    Matrix<scalar_t>& C,
    Options const& opts)

Distributed parallel general matrix-matrix multiplication, computing C = alpha A B + beta C. Designed for the case where A is larger than B or C: the algorithm leaves A in place, moves B to the location of A, and reduces the partial results into the C matrix. Generic implementation for any target. Dependencies enforce the following behavior:

bcast communications are serialized,
gemm operations are serialized,
bcasts can get ahead of gemms by the value of lookahead.

ColMajor layout is assumed.
