PLASMA
Parallel Linear Algebra Software for Multicore Architectures

\( C = \alpha A B^H + conjg( \alpha ) B A^H + \beta C \) where \( C \) is Hermitian

Functions

int plasma_cher2k (plasma_enum_t uplo, plasma_enum_t trans, int n, int k, plasma_complex32_t alpha, plasma_complex32_t *pA, int lda, plasma_complex32_t *pB, int ldb, float beta, plasma_complex32_t *pC, int ldc)
 
void plasma_omp_cher2k (plasma_enum_t uplo, plasma_enum_t trans, plasma_complex32_t alpha, plasma_desc_t A, plasma_desc_t B, float beta, plasma_desc_t C, plasma_sequence_t *sequence, plasma_request_t *request)
 
int plasma_zher2k (plasma_enum_t uplo, plasma_enum_t trans, int n, int k, plasma_complex64_t alpha, plasma_complex64_t *pA, int lda, plasma_complex64_t *pB, int ldb, double beta, plasma_complex64_t *pC, int ldc)
 
void plasma_omp_zher2k (plasma_enum_t uplo, plasma_enum_t trans, plasma_complex64_t alpha, plasma_desc_t A, plasma_desc_t B, double beta, plasma_desc_t C, plasma_sequence_t *sequence, plasma_request_t *request)
 

Detailed Description

\( C = \alpha A B^H + conjg( \alpha ) B A^H + \beta C \) where \( C \) is Hermitian

Function Documentation

int plasma_cher2k ( plasma_enum_t  uplo,
plasma_enum_t  trans,
int  n,
int  k,
plasma_complex32_t  alpha,
plasma_complex32_t *  pA,
int  lda,
plasma_complex32_t *  pB,
int  ldb,
float  beta,
plasma_complex32_t *  pC,
int  ldc 
)

Performs one of the Hermitian rank 2k operations

\[ C = \alpha A \times B^H + conjg( \alpha ) B \times A^H + \beta C, \]

or

\[ C = \alpha A^H \times B + conjg( \alpha ) B^H \times A + \beta C, \]

where alpha is a complex scalar, beta is a real scalar, C is an n-by-n Hermitian matrix, and A and B are n-by-k matrices in the first case and k-by-n matrices in the second case.
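The first form of the operation can be written out entry by entry. The sketch below is a naive single-precision reference, assuming column-major storage; `ref_cher2k_notrans` and its `upper` flag are hypothetical names used only for illustration and are not part of PLASMA, which uses a tiled, parallel algorithm instead.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical reference routine (not a PLASMA function).
 * Evaluates C = alpha*A*B^H + conj(alpha)*B*A^H + beta*C for the
 * no-transpose case, with A and B stored column-major as n-by-k arrays.
 * Only the triangle selected by `upper` is read and written. */
static void ref_cher2k_notrans(int upper, int n, int k,
                               float complex alpha,
                               const float complex *A, int lda,
                               const float complex *B, int ldb,
                               float beta,
                               float complex *C, int ldc)
{
    assert(n >= 0 && k >= 0);
    for (int j = 0; j < n; j++) {
        int ibeg = upper ? 0 : j;
        int iend = upper ? j : n - 1;
        for (int i = ibeg; i <= iend; i++) {
            float complex sum = 0.0f;
            for (int l = 0; l < k; l++) {
                /* (A*B^H)(i,j) = sum_l A(i,l)*conj(B(j,l)), and likewise
                 * for B*A^H with the roles of A and B swapped. */
                sum += alpha * A[i + l*lda] * conjf(B[j + l*ldb])
                     + conjf(alpha) * B[i + l*ldb] * conjf(A[j + l*lda]);
            }
            float complex cij = beta * C[i + j*ldc] + sum;
            /* The diagonal of a Hermitian matrix is real. */
            C[i + j*ldc] = (i == j) ? crealf(cij) : cij;
        }
    }
}
```

Because the two products are conjugate transposes of each other, their sum is Hermitian for any complex alpha, which is why beta must be real and only one triangle needs to be computed.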

Parameters
[in]uplo
  • PlasmaUpper: Upper triangle of C is stored;
  • PlasmaLower: Lower triangle of C is stored.
[in]trans
  • PlasmaNoTrans:

    \[ C = \alpha A \times B^H + conjg( \alpha ) B \times A^H + \beta C; \]

  • PlasmaConjTrans:

    \[ C = \alpha A^H \times B + conjg( \alpha ) B^H \times A + \beta C. \]

[in]n: The order of the matrix C. n >= 0.
[in]k: If trans = PlasmaNoTrans, number of columns of the A and B matrices; if trans = PlasmaConjTrans, number of rows of the A and B matrices.
[in]alpha: The scalar alpha.
[in]pA: An lda-by-ka matrix. If trans = PlasmaNoTrans, ka = k; if trans = PlasmaConjTrans, ka = n.
[in]lda: The leading dimension of the array A. If trans = PlasmaNoTrans, lda >= max(1, n); if trans = PlasmaConjTrans, lda >= max(1, k).
[in]pB: An ldb-by-kb matrix. If trans = PlasmaNoTrans, kb = k; if trans = PlasmaConjTrans, kb = n.
[in]ldb: The leading dimension of the array B. If trans = PlasmaNoTrans, ldb >= max(1, n); if trans = PlasmaConjTrans, ldb >= max(1, k).
[in]beta: The scalar beta.
[in,out]pC: An ldc-by-n matrix. On exit, the uplo part of the matrix is overwritten by the uplo part of the updated matrix.
[in]ldc: The leading dimension of the array C. ldc >= max(1, n).
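The dimension constraints above can be checked before calling the routine. The following is a minimal sketch; `her2k_dims_ok` and `her2k_ab_elems` are hypothetical helpers, not PLASMA functions, and the requirement k >= 0 is an assumption carried over from the BLAS convention rather than something stated in the parameter list.

```c
#include <assert.h>
#include <stddef.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/* Hypothetical helper: returns 1 if the sizes satisfy the documented
 * constraints, 0 otherwise. `conj_trans` is nonzero for PlasmaConjTrans. */
static int her2k_dims_ok(int conj_trans, int n, int k,
                         int lda, int ldb, int ldc)
{
    if (n < 0 || k < 0)      /* k >= 0 is assumed, following BLAS */
        return 0;
    int rows_ab = conj_trans ? k : n;   /* rows of A and B as stored */
    return lda >= max_int(1, rows_ab)
        && ldb >= max_int(1, rows_ab)
        && ldc >= max_int(1, n);
}

/* Hypothetical helper: number of elements in the ld-by-ka (or ld-by-kb)
 * array holding A (or B), where ka = kb = k for PlasmaNoTrans and n for
 * PlasmaConjTrans. */
static size_t her2k_ab_elems(int conj_trans, int n, int k, int ld)
{
    assert(ld >= 1);
    int cols = conj_trans ? n : k;
    return (size_t)ld * (size_t)cols;
}
```

For example, with n = 4 and k = 2, the no-transpose case needs lda >= 4 and two columns, while the conjugate-transpose case needs lda >= 2 and four columns.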
Return values
PlasmaSuccess: successful exit
See also
plasma_omp_cher2k
plasma_zher2k
void plasma_omp_cher2k ( plasma_enum_t  uplo,
plasma_enum_t  trans,
plasma_complex32_t  alpha,
plasma_desc_t  A,
plasma_desc_t  B,
float  beta,
plasma_desc_t  C,
plasma_sequence_t *  sequence,
plasma_request_t *  request 
)

Performs a Hermitian rank 2k update. Non-blocking tile version of plasma_cher2k(); it may return before the computation is finished. Operates on matrices stored by tiles. All matrices are passed through descriptors, and all dimensions are taken from the descriptors. Allows for pipelining of operations at runtime.

Parameters
[in]uplo
  • PlasmaUpper: Upper triangle of C is stored;
  • PlasmaLower: Lower triangle of C is stored.
[in]trans
  • PlasmaNoTrans:

    \[ C = \alpha A \times B^H + conjg( \alpha ) B \times A^H + \beta C; \]

  • PlasmaConjTrans:

    \[ C = \alpha A^H \times B + conjg( \alpha ) B^H \times A + \beta C. \]

[in]alpha: The scalar alpha.
[in]A: Descriptor of matrix A.
[in]B: Descriptor of matrix B.
[in]beta: The scalar beta.
[in,out]C: Descriptor of matrix C.
[in]sequence: Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). Check the sequence->status for errors.
[out]request: Identifies this function call (for exception handling purposes).
Return values
void: Errors are reported by setting sequence->status and request->status to error values. Neither status should ever be reset to PlasmaSuccess (the initial value), since another asynchronous call may be setting a failure value at the same time.
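The status convention can be illustrated with stand-in types. Everything in this sketch (`demo_sequence_t`, `demo_request_t`, `demo_fail`, `demo_async_op`, and the `DEMO_*` codes) is hypothetical and serial; PLASMA's real types and error codes are defined in its headers and may differ. The point is only that a status is written on failure and never written back to success.

```c
#include <assert.h>

/* Stand-in status codes and types, for illustration only. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_ILLEGAL_VALUE = -1 };
typedef struct { int status; } demo_sequence_t;
typedef struct { int status; } demo_request_t;

/* Record a failure on both the sequence and the request.
 * There is deliberately no counterpart that stores DEMO_SUCCESS. */
static void demo_fail(demo_sequence_t *seq, demo_request_t *req, int err)
{
    assert(err != DEMO_SUCCESS);
    seq->status = err;
    req->status = err;
}

/* Hypothetical async-style routine returning void. */
static void demo_async_op(int n, demo_sequence_t *seq, demo_request_t *req)
{
    /* An earlier call in the same sequence already failed: skip the work
     * and mark this request with the sequence's error. */
    if (seq->status != DEMO_SUCCESS) {
        demo_fail(seq, req, seq->status);
        return;
    }
    if (n < 0) {
        demo_fail(seq, req, DEMO_ERR_ILLEGAL_VALUE);
        return;
    }
    /* Work would be queued here. On success the statuses are left
     * untouched, so a failure recorded by another call is never erased. */
}
```

A caller initializes both statuses to the success value once, issues its calls, and inspects the sequence status after the sequence has completed.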
See also
plasma_cher2k
plasma_omp_zher2k
int plasma_zher2k ( plasma_enum_t  uplo,
plasma_enum_t  trans,
int  n,
int  k,
plasma_complex64_t  alpha,
plasma_complex64_t *  pA,
int  lda,
plasma_complex64_t *  pB,
int  ldb,
double  beta,
plasma_complex64_t *  pC,
int  ldc 
)

Performs one of the Hermitian rank 2k operations

\[ C = \alpha A \times B^H + conjg( \alpha ) B \times A^H + \beta C, \]

or

\[ C = \alpha A^H \times B + conjg( \alpha ) B^H \times A + \beta C, \]

where alpha is a complex scalar, beta is a real scalar, C is an n-by-n Hermitian matrix, and A and B are n-by-k matrices in the first case and k-by-n matrices in the second case.
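The second form, where A and B are k-by-n, differs only in which index is conjugated. The sketch below is a naive double-precision reference, assuming column-major storage; `ref_zher2k_conjtrans` and its `upper` flag are hypothetical names for illustration and do not describe how PLASMA computes the update.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical reference routine (not a PLASMA function).
 * Evaluates C = alpha*A^H*B + conj(alpha)*B^H*A + beta*C for the
 * conjugate-transpose case, with A and B stored column-major as
 * k-by-n arrays. Only the triangle selected by `upper` is touched. */
static void ref_zher2k_conjtrans(int upper, int n, int k,
                                 double complex alpha,
                                 const double complex *A, int lda,
                                 const double complex *B, int ldb,
                                 double beta,
                                 double complex *C, int ldc)
{
    assert(n >= 0 && k >= 0);
    for (int j = 0; j < n; j++) {
        int ibeg = upper ? 0 : j;
        int iend = upper ? j : n - 1;
        for (int i = ibeg; i <= iend; i++) {
            double complex sum = 0.0;
            for (int l = 0; l < k; l++) {
                /* (A^H*B)(i,j) = sum_l conj(A(l,i))*B(l,j), and likewise
                 * for B^H*A with the roles of A and B swapped. */
                sum += alpha * conj(A[l + i*lda]) * B[l + j*ldb]
                     + conj(alpha) * conj(B[l + i*ldb]) * A[l + j*lda];
            }
            double complex cij = beta * C[i + j*ldc] + sum;
            /* The diagonal of a Hermitian matrix is real. */
            C[i + j*ldc] = (i == j) ? creal(cij) : cij;
        }
    }
}
```

With a purely imaginary alpha the two terms cancel or reinforce on the diagonal but always leave it real, which makes a small hand-checked case a convenient sanity test.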

Parameters
[in]uplo
  • PlasmaUpper: Upper triangle of C is stored;
  • PlasmaLower: Lower triangle of C is stored.
[in]trans
  • PlasmaNoTrans:

    \[ C = \alpha A \times B^H + conjg( \alpha ) B \times A^H + \beta C; \]

  • PlasmaConjTrans:

    \[ C = \alpha A^H \times B + conjg( \alpha ) B^H \times A + \beta C. \]

[in]n: The order of the matrix C. n >= 0.
[in]k: If trans = PlasmaNoTrans, number of columns of the A and B matrices; if trans = PlasmaConjTrans, number of rows of the A and B matrices.
[in]alpha: The scalar alpha.
[in]pA: An lda-by-ka matrix. If trans = PlasmaNoTrans, ka = k; if trans = PlasmaConjTrans, ka = n.
[in]lda: The leading dimension of the array A. If trans = PlasmaNoTrans, lda >= max(1, n); if trans = PlasmaConjTrans, lda >= max(1, k).
[in]pB: An ldb-by-kb matrix. If trans = PlasmaNoTrans, kb = k; if trans = PlasmaConjTrans, kb = n.
[in]ldb: The leading dimension of the array B. If trans = PlasmaNoTrans, ldb >= max(1, n); if trans = PlasmaConjTrans, ldb >= max(1, k).
[in]beta: The scalar beta.
[in,out]pC: An ldc-by-n matrix. On exit, the uplo part of the matrix is overwritten by the uplo part of the updated matrix.
[in]ldc: The leading dimension of the array C. ldc >= max(1, n).
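Since only the uplo part of C is updated, the other triangle still holds whatever it held on entry. When the full matrix is needed afterwards, it can be rebuilt from the stored triangle. This is a minimal sketch assuming column-major storage; `hermitian_fill` is a hypothetical helper and not a PLASMA function.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical helper: overwrites the triangle that was NOT updated with
 * the conjugate transpose of the stored one, so that C(i,j) = conj(C(j,i))
 * holds for the whole n-by-n column-major array. `upper` is nonzero when
 * the upper triangle is the stored one. */
static void hermitian_fill(int upper, int n, double complex *C, int ldc)
{
    assert(n >= 0 && ldc >= (n > 1 ? n : 1));
    for (int j = 0; j < n; j++) {
        for (int i = j + 1; i < n; i++) {
            /* (i, j) is strictly lower; (j, i) is its strictly upper mirror. */
            if (upper)
                C[i + j*ldc] = conj(C[j + i*ldc]);
            else
                C[j + i*ldc] = conj(C[i + j*ldc]);
        }
    }
}
```

The diagonal is left alone, as it already belongs to the stored triangle.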
Return values
PlasmaSuccess: successful exit
See also
plasma_omp_zher2k
plasma_cher2k
void plasma_omp_zher2k ( plasma_enum_t  uplo,
plasma_enum_t  trans,
plasma_complex64_t  alpha,
plasma_desc_t  A,
plasma_desc_t  B,
double  beta,
plasma_desc_t  C,
plasma_sequence_t *  sequence,
plasma_request_t *  request 
)

Performs a Hermitian rank 2k update. Non-blocking tile version of plasma_zher2k(); it may return before the computation is finished. Operates on matrices stored by tiles. All matrices are passed through descriptors, and all dimensions are taken from the descriptors. Allows for pipelining of operations at runtime.
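To give a feel for what tile storage implies, the sketch below computes how an order-n matrix splits into square tiles and how many tiles of C a triangular update touches. It assumes a generic layout with a single tile size `nb`; the helper names are hypothetical, and the fields of PLASMA's actual plasma_desc_t are not described here.

```c
#include <assert.h>

/* Hypothetical helper: number of tile rows (or columns) covering n
 * elements with tiles of size nb. */
static int tile_count(int n, int nb)
{
    assert(n >= 0 && nb > 0);
    return (n + nb - 1) / nb;
}

/* Hypothetical helper: number of elements covered by tile index t.
 * Every tile is full except possibly the last one. */
static int tile_extent(int n, int nb, int t)
{
    int nt = tile_count(n, nb);
    assert(t >= 0 && t < nt);
    return (t == nt - 1) ? n - t * nb : nb;
}

/* Hypothetical helper: tiles of C on or below (or on or above) the tile
 * diagonal, i.e. the tiles an update of one triangle has to visit. */
static int triangle_tiles(int n, int nb)
{
    int nt = tile_count(n, nb);
    return nt * (nt + 1) / 2;
}
```

For n = 10 and nb = 4 this gives three tile rows, a last tile of extent 2, and six tiles of C in the stored triangle, each of which can be updated independently of the others.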

Parameters
[in]uplo
  • PlasmaUpper: Upper triangle of C is stored;
  • PlasmaLower: Lower triangle of C is stored.
[in]trans
  • PlasmaNoTrans:

    \[ C = \alpha A \times B^H + conjg( \alpha ) B \times A^H + \beta C; \]

  • PlasmaConjTrans:

    \[ C = \alpha A^H \times B + conjg( \alpha ) B^H \times A + \beta C. \]

[in]alpha: The scalar alpha.
[in]A: Descriptor of matrix A.
[in]B: Descriptor of matrix B.
[in]beta: The scalar beta.
[in,out]C: Descriptor of matrix C.
[in]sequence: Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). Check the sequence->status for errors.
[out]request: Identifies this function call (for exception handling purposes).
Return values
void: Errors are reported by setting sequence->status and request->status to error values. Neither status should ever be reset to PlasmaSuccess (the initial value), since another asynchronous call may be setting a failure value at the same time.
See also
plasma_zher2k
plasma_omp_cher2k