Initialize/finalize | |
PLASMA descriptor | |
▼Utilities | |
Map LAPACK <=> PLASMA constants | |
▼Matrix layout conversion | |
cm2ccrb: Converts column-major (CM) to tiled (CCRB) | |
ccrb2cm: Converts tiled (CCRB) to column-major (CM) | |
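The two layout conversions listed above can be pictured with a small sketch. It assumes square `nb`-by-`nb` tiles, matrix dimensions that are exact multiples of `nb`, tiles ordered column by column, and column-major storage inside each tile. The names `cm_to_tiled` and `tiled_to_cm` are hypothetical helpers for illustration, not the PLASMA interface, and edge tiles are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the PLASMA API): copy an m-by-n column-major
   matrix A with leading dimension lda into contiguous nb-by-nb tiles in T.
   Tiles are ordered column by column; entries inside a tile are
   column-major. Assumes nb divides both m and n. */
void cm_to_tiled(int m, int n, int nb, const double *A, int lda, double *T)
{
    assert(nb > 0 && m % nb == 0 && n % nb == 0 && lda >= m);
    int mt = m / nb; /* number of tile rows */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            size_t tile    = (size_t)(j / nb) * mt + (size_t)(i / nb);
            size_t in_tile = (size_t)(j % nb) * nb + (size_t)(i % nb);
            T[tile * nb * nb + in_tile] = A[i + (size_t)j * lda];
        }
    }
}

/* Inverse of cm_to_tiled under the same assumptions. */
void tiled_to_cm(int m, int n, int nb, const double *T, double *A, int lda)
{
    assert(nb > 0 && m % nb == 0 && n % nb == 0 && lda >= m);
    int mt = m / nb;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            size_t tile    = (size_t)(j / nb) * mt + (size_t)(i / nb);
            size_t in_tile = (size_t)(j % nb) * nb + (size_t)(i % nb);
            A[i + (size_t)j * lda] = T[tile * nb * nb + in_tile];
        }
    }
}
```

Storing each tile contiguously means a kernel that works on one tile touches one compact block of memory, which is the usual motivation for a tiled layout.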
▼Linear system solvers | Solves \( Ax = b \) |
►General matrices: LU | Solves \( Ax = b \) using LU factorization for general matrices |
gesv: Solves Ax = b using LU factorization (driver) | |
getrf: LU factorization | |
getrs: LU forward and back solves | |
getri: LU inverse | |
gerfs: Refine solution | |
►Auxiliary routines | |
getf2: LU panel factorization | |
laswp: Swap rows | |
►General matrices: least squares | Solves \( Ax \approx b \) where \( A \) is rectangular |
gels: Least squares solve of Ax = b using QR or LQ factorization (driver) | |
gelqs: Back solve using LQ factorization of A | |
geqrs: Back solve using QR factorization of A | |
►Symmetric/Hermitian positive definite: Cholesky | Solves \( Ax = b \) using Cholesky factorization for SPD/HPD matrices |
posv: Solves Ax = b using Cholesky factorization (driver) | |
potrf: Cholesky factorization | |
potrs: Cholesky forward and back solves | |
potri: Cholesky inverse | |
porfs: Refine solution | |
►Auxiliary routines | |
potf2: Cholesky panel factorization | |
lauum: Multiplies triangular matrices; used in potri | |
►Symmetric/Hermitian indefinite | Solves \( Ax = b \) using indefinite factorization for symmetric/Hermitian matrices |
sy/hesv: Solves Ax = b using symmetric indefinite factorization (driver) | |
sy/hetrf: Symmetric indefinite factorization | |
sy/hetrs: Symmetric indefinite forward and back solves | |
sy/hetri: Symmetric indefinite inverse | |
sy/herfs: Refine solution | |
Auxiliary routines | |
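The driver/factor/solve split in the LU group above (gesv = getrf followed by getrs) can be sketched in plain C. This is a minimal unblocked illustration on column-major storage. The names `lu_factor` and `lu_solve` are hypothetical and do not reproduce the PLASMA or LAPACK signatures, and pivots are stored 0-based here, unlike LAPACK.

```c
#include <assert.h>

/* Hypothetical sketch of the getrf step: in-place LU with partial pivoting,
   P A = L U, on an n-by-n column-major matrix. ipiv[k] is the 0-based row
   swapped with row k. Returns 0 on success, or k+1 if the pivot in column k
   is exactly zero. */
int lu_factor(int n, double *A, int lda, int *ipiv)
{
    assert(n >= 0 && lda >= (n > 0 ? n : 1));
    for (int k = 0; k < n; ++k) {
        /* find the entry of largest magnitude in column k, rows k..n-1 */
        int p = k;
        double best = A[k + k * lda] < 0 ? -A[k + k * lda] : A[k + k * lda];
        for (int i = k + 1; i < n; ++i) {
            double v = A[i + k * lda] < 0 ? -A[i + k * lda] : A[i + k * lda];
            if (v > best) { best = v; p = i; }
        }
        ipiv[k] = p;
        if (best == 0.0) return k + 1;
        if (p != k) { /* swap full rows k and p, as laswp would */
            for (int j = 0; j < n; ++j) {
                double t = A[k + j * lda];
                A[k + j * lda] = A[p + j * lda];
                A[p + j * lda] = t;
            }
        }
        for (int i = k + 1; i < n; ++i) {
            A[i + k * lda] /= A[k + k * lda];          /* multiplier of L */
            for (int j = k + 1; j < n; ++j)            /* trailing update */
                A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
        }
    }
    return 0;
}

/* Hypothetical sketch of the getrs step for one right-hand side:
   overwrite b with the solution of A x = b using the factors above. */
void lu_solve(int n, const double *LU, int lda, const int *ipiv, double *b)
{
    for (int k = 0; k < n; ++k) {                      /* b := P b */
        if (ipiv[k] != k) {
            double t = b[k]; b[k] = b[ipiv[k]]; b[ipiv[k]] = t;
        }
    }
    for (int i = 0; i < n; ++i)                        /* forward: L y = P b */
        for (int j = 0; j < i; ++j)
            b[i] -= LU[i + j * lda] * b[j];
    for (int i = n - 1; i >= 0; --i) {                 /* back: U x = y */
        for (int j = i + 1; j < n; ++j)
            b[i] -= LU[i + j * lda] * b[j];
        b[i] /= LU[i + i * lda];
    }
}
```

Keeping the factorization separate from the solve lets one factorization be reused for several right-hand sides, which is why the driver is listed alongside its two building blocks.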
▼Orthogonal/unitary factorizations | Factor \( A \) using \( QR, RQ, QL, LQ \) |
►QR factorization | Factor \( A = QR \) |
geqrf: QR factorization | |
or/unmqr: Multiplies by Q from QR factorization | |
or/ungqr: Generates Q from QR factorization | |
►Auxiliary routines | |
geqr2: QR panel factorization | |
►RQ factorization | Factor \( A = RQ \) |
gerqf: RQ factorization | |
or/unmrq: Multiplies by Q from RQ factorization | |
or/ungrq: Generates Q from RQ factorization | |
►QL factorization | Factor \( A = QL \) |
geqlf: QL factorization | |
or/unmql: Multiplies by Q from QL factorization | |
or/ungql: Generates Q from QL factorization | |
►LQ factorization | Factor \( A = LQ \) |
gelqf: LQ factorization | |
or/unmlq: Multiplies by Q from LQ factorization | |
or/unglq: Generates Q from LQ factorization | |
▼Eigenvalues | Solves \( Ax = \lambda x \) |
►Non-symmetric eigenvalues | Solves \( Ax = \lambda x \) where \( A \) is general |
geev: Non-symmetric eigenvalues (driver) | |
gehrd: Hessenberg reduction | |
or/unmhr: Multiplies by Q from Hessenberg reduction | |
or/unghr: Generates Q from Hessenberg reduction | |
Auxiliary routines | |
►Symmetric/Hermitian eigenvalues | Solves \( Ax = \lambda x \) where \( A \) is symmetric/Hermitian |
sy/heev: Solves using QR iteration (driver) | |
sy/heevd: Solves using divide-and-conquer (driver) | |
sy/heevr: Solves using MRRR (driver) | |
sy/hetrd: Tridiagonal reduction | |
or/unmtr: Multiplies by Q from tridiagonal reduction | |
or/ungtr: Generates Q from tridiagonal reduction | |
►Auxiliary routines | |
hegst: Reduction of generalized problem to standard form | |
►Generalized Symmetric/Hermitian eigenvalues | Solves \( Ax = \lambda B x \), \( ABx = \lambda x \), or \( BAx = \lambda x \) where \( A, B \) are symmetric/Hermitian and \( B \) is positive definite |
sy/hegv: Solves using QR iteration (driver) | |
sy/hegvd: Solves using divide-and-conquer (driver) | |
sy/hegvr: Solves using MRRR (driver) | |
►Auxiliary routines | |
hegst: Reduction of generalized problem to standard form | |
▼Singular Value Decomposition (SVD) | Factor \( A = U \Sigma V^T \) |
gesvd: SVD using QR iteration | |
gesdd: SVD using divide-and-conquer | |
gebrd: Bidiagonal reduction | |
or/unmbr: Multiplies by Q or P from bidiagonal reduction | |
or/ungbr: Generates Q or P from bidiagonal reduction | |
Auxiliary routines | |
▼PLASMA BLAS and Auxiliary (parallel) | BLAS and Auxiliary functions. Standard BLAS and LAPACK auxiliary routines are grouped by amount of work into Level 1, 2, 3 |
►Level 1: vector operations, O(n) work | Vector operations that perform \( O(n) \) work on \( O(n) \) data. These are memory bound, since every operation requires a memory read or write |
asum: Sum vector | \( \sum_i |x_i| \) |
axpy: Add vectors | \( y = \alpha x + y \) |
copy: Copy vector | \( y = x \) |
dot: Dot (inner) product | \( x^T y \) or \( x^H y \) |
iamax: Find max element | \( \text{argmax}_i\; |x_i| \) |
iamin: Find min element | \( \text{argmin}_i\; |x_i| \) |
nrm2: Vector 2 norm | \( ||x||_2 \) |
rot: Apply Givens rotation | |
rotg: Generate Givens rotation | |
scal: Scale vector | \( x = \alpha x \) |
swap: Swap vectors | \( x \leftrightarrow y \) |
►Level 2: matrix-vector operations, O(n^2) work | Matrix operations that perform \( O(n^2) \) work on \( O(n^2) \) data. These are memory bound, since every operation requires a memory read or write |
geadd: Add matrices | \( B = \alpha A + \beta B \) |
gemv: General matrix-vector multiply | \( y = \alpha Ax + \beta y \) |
ger: General matrix rank 1 update | \( A = \alpha xy^T + A \) |
hemv: Hermitian matrix-vector multiply | \( y = \alpha Ax + \beta y \) |
her: Hermitian rank 1 update | \( A = \alpha xx^H + A \) |
her2: Hermitian rank 2 update | \( A = \alpha xy^H + \bar{\alpha} yx^H + A \) |
symv: Symmetric matrix-vector multiply | \( y = \alpha Ax + \beta y \) |
syr: Symmetric rank 1 update | \( A = \alpha xx^T + A \) |
syr2: Symmetric rank 2 update | \( A = \alpha xy^T + \alpha yx^T + A \) |
trmv: Triangular matrix-vector multiply | \( x = Ax \) |
trsv: Triangular matrix-vector solve | \( x = op(A^{-1})\; b \) |
lacpy: Copy matrix | \( B = A \) |
lascl: Scale matrix by scalar | \( A = \alpha A \) |
lascl2: Scale matrix by diagonal | \( A = D A \) |
laset: Set matrix to constants | \( A_{ij} = \) diag if \( i=j \); \( A_{ij} = \) offdiag otherwise |
►Level 3: matrix-matrix operations, O(n^3) work | Matrix-matrix operations that perform \( O(n^3) \) work on \( O(n^2) \) data. These benefit from cache reuse, since many operations can be performed for every read from main memory |
gemm: General matrix multiply: C = AB + C | \( C = \alpha \;op(A) \;op(B) + \beta C \) |
hemm: Hermitian matrix multiply | \( C = \alpha A B + \beta C \) or \( C = \alpha B A + \beta C \) where \( A \) is Hermitian |
herk: Hermitian rank k update | \( C = \alpha A A^H + \beta C \) where \( C \) is Hermitian |
her2k: Hermitian rank 2k update | \( C = \alpha A B^H + \bar{\alpha} B A^H + \beta C \) where \( C \) is Hermitian |
symm: Symmetric matrix multiply | \( C = \alpha A B + \beta C \) or \( C = \alpha B A + \beta C \) where \( A \) is symmetric |
syrk: Symmetric rank k update | \( C = \alpha A A^T + \beta C \) where \( C \) is symmetric |
syr2k: Symmetric rank 2k update | \( C = \alpha A B^T + \alpha B A^T + \beta C \) where \( C \) is symmetric |
trmm: Triangular matrix multiply | \( B = \alpha \;op(A)\; B \) or \( B = \alpha B \;op(A) \) where \( A \) is triangular |
trsm: Triangular solve matrix | \( C = op(A)^{-1} B \) or \( C = B \;op(A)^{-1} \) where \( A \) is triangular |
trtri: Triangular inverse; used in getri, potri | \( A = A^{-1} \) where \( A \) is triangular |
►Householder reflectors | |
larf: Apply Householder reflector to general matrix | |
larfy: Apply Householder reflector to symmetric/Hermitian matrix | |
larfg: Generate Householder reflector | |
larfb: Apply block of Householder reflectors (Level 3) | |
►Precision conversion | |
_lag2_: Converts general matrix between single and double | |
_lat2_: Converts triangular matrix between single and double | |
►Matrix norms | |
lange: General matrix norm | 1, Frobenius, or Infinity norm; or largest element |
lansy/he: Symmetric/Hermitian matrix norm | 1, Frobenius, or Infinity norm; or largest element |
lantr: Triangular matrix norm | 1, Frobenius, or Infinity norm; or largest element |
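As an illustration of the norms named in the lange entry above, the following sketch computes the largest element, the 1-norm (maximum absolute column sum), and the infinity norm (maximum absolute row sum) of a column-major matrix. The name `ref_lange` and the `norm_kind` enum are hypothetical and do not match the PLASMA interface. The Frobenius norm is left out only to keep the sketch free of math-library calls.

```c
#include <assert.h>

/* Hypothetical selector, not a PLASMA constant. */
enum norm_kind { NORM_MAX, NORM_ONE, NORM_INF };

/* Hypothetical reference routine: norm of an m-by-n column-major matrix. */
double ref_lange(enum norm_kind kind, int m, int n, const double *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 0 ? m : 1));
    double result = 0.0;
    if (kind == NORM_MAX) {
        /* largest entry in absolute value */
        for (int j = 0; j < n; ++j)
            for (int i = 0; i < m; ++i) {
                double v = A[i + j * lda] < 0 ? -A[i + j * lda] : A[i + j * lda];
                if (v > result) result = v;
            }
    } else if (kind == NORM_ONE) {
        /* maximum absolute column sum */
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            for (int i = 0; i < m; ++i)
                sum += A[i + j * lda] < 0 ? -A[i + j * lda] : A[i + j * lda];
            if (sum > result) result = sum;
        }
    } else {
        /* maximum absolute row sum */
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int j = 0; j < n; ++j)
                sum += A[i + j * lda] < 0 ? -A[i + j * lda] : A[i + j * lda];
            if (sum > result) result = sum;
        }
    }
    return result;
}
```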
▼Core BLAS and Auxiliary (single core) | Core BLAS and Auxiliary functions. Standard BLAS and LAPACK auxiliary routines are grouped by amount of work into Level 1, 2, 3 |
►Level 0: element operations, O(1) work | Operations on single elements |
cabs1: Complex 1-norm absolute value | \( |\text{Re}(\alpha)| + |\text{Im}(\alpha)| \) |
►Level 1: vector operations, O(n) work | Vector operations that perform \( O(n) \) work on \( O(n) \) data. These are memory bound, since every operation requires a memory read or write |
asum: Sum vector | \( \sum_i |x_i| \) |
axpy: Add vectors | \( y = \alpha x + y \) |
copy: Copy vector | \( y = x \) |
dot: Dot (inner) product | \( x^T y \) or \( x^H y \) |
iamax: Find max element | \( \text{argmax}_i\; |x_i| \) |
iamin: Find min element | \( \text{argmin}_i\; |x_i| \) |
nrm2: Vector 2 norm | \( ||x||_2 \) |
rot: Apply Givens rotation | |
rotg: Generate Givens rotation | |
scal: Scale vector | \( x = \alpha x \) |
swap: Swap vectors | \( x \leftrightarrow y \) |
►Level 2: matrix-vector operations, O(n^2) work | Matrix operations that perform \( O(n^2) \) work on \( O(n^2) \) data. These are memory bound, since every operation requires a memory read or write |
geadd: Add matrices | \( B = \alpha A + \beta B \) |
gemv: General matrix-vector multiply | \( y = \alpha Ax + \beta y \) |
ger: General matrix rank 1 update | \( A = \alpha xy^T + A \) |
hemv: Hermitian matrix-vector multiply | \( y = \alpha Ax + \beta y \) |
her: Hermitian rank 1 update | \( A = \alpha xx^H + A \) |
her2: Hermitian rank 2 update | \( A = \alpha xy^H + \bar{\alpha} yx^H + A \) |
symv: Symmetric matrix-vector multiply | \( y = \alpha Ax + \beta y \) |
syr: Symmetric rank 1 update | \( A = \alpha xx^T + A \) |
syr2: Symmetric rank 2 update | \( A = \alpha xy^T + \alpha yx^T + A \) |
trmv: Triangular matrix-vector multiply | \( x = Ax \) |
trsv: Triangular matrix-vector solve | \( x = op(A^{-1})\; b \) |
lacpy: Copy matrix | \( B = A \) |
lascl: Scale matrix by scalar | \( A = \alpha A \) |
lascl2: Scale matrix by diagonal | \( A = D A \) |
laset: Set matrix to constants | \( A_{ij} = \) diag if \( i=j \); \( A_{ij} = \) offdiag otherwise |
►Level 3: matrix-matrix operations, O(n^3) work | Matrix-matrix operations that perform \( O(n^3) \) work on \( O(n^2) \) data. These benefit from cache reuse, since many operations can be performed for every read from main memory |
gemm: General matrix multiply: C = AB + C | \( C = \alpha \;op(A) \;op(B) + \beta C \) |
hemm: Hermitian matrix multiply | \( C = \alpha A B + \beta C \) or \( C = \alpha B A + \beta C \) where \( A \) is Hermitian |
herk: Hermitian rank k update | \( C = \alpha A A^H + \beta C \) where \( C \) is Hermitian |
her2k: Hermitian rank 2k update | \( C = \alpha A B^H + \bar{\alpha} B A^H + \beta C \) where \( C \) is Hermitian |
symm: Symmetric matrix multiply | \( C = \alpha A B + \beta C \) or \( C = \alpha B A + \beta C \) where \( A \) is symmetric |
syrk: Symmetric rank k update | \( C = \alpha A A^T + \beta C \) where \( C \) is symmetric |
syr2k: Symmetric rank 2k update | \( C = \alpha A B^T + \alpha B A^T + \beta C \) where \( C \) is symmetric |
trmm: Triangular matrix multiply | \( B = \alpha \;op(A)\; B \) or \( B = \alpha B \;op(A) \) where \( A \) is triangular |
trsm: Triangular solve matrix | \( C = op(A)^{-1} B \) or \( C = B \;op(A)^{-1} \) where \( A \) is triangular |
trtri: Triangular inverse; used in getri, potri | \( A = A^{-1} \) where \( A \) is triangular |
►Householder reflectors | |
larf: Apply Householder reflector to general matrix | |
larfy: Apply Householder reflector to symmetric/Hermitian matrix | |
larfg: Generate Householder reflector | |
larfb: Apply block of Householder reflectors (Level 3) | |
►Precision conversion | |
_lag2_: Converts general matrix between single and double | |
_lat2_: Converts triangular matrix between single and double | |
►Matrix norms | |
lange: General matrix norm | 1, Frobenius, or Infinity norm; or largest element |
lanhe: Hermitian matrix norm | 1, Frobenius, or Infinity norm; or largest element |
lansy: Symmetric matrix norm | 1, Frobenius, or Infinity norm; or largest element |
lantr: Triangular matrix norm | 1, Frobenius, or Infinity norm; or largest element |
►Linear system solvers | |
potrf: Cholesky factorization | |
geqrt: QR factorization of a tile | |
tsqrt: QR factorization of a rectangular matrix of two tiles | |
unmqr: Apply Householder reflectors from QR to a tile | |
tsmqr: Apply Householder reflectors from QR to a rectangular matrix of two tiles | |
gelqt: LQ factorization of a tile | |
tslqt: LQ factorization of a rectangular matrix of two tiles | |
unmlq: Apply Householder reflectors from LQ to a tile | |
tsmlq: Apply Householder reflectors from LQ to a rectangular matrix of two tiles | |
pamm: Updates a matrix using two tiles | |
parfb: Apply Householder reflectors to a rectangular matrix of two tiles | |
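The Level 3 gemm formula listed above, \( C = \alpha \;op(A) \;op(B) + \beta C \), can be written out as a reference triple loop. This is a sketch for real matrices in column-major storage, where \( op(X) \) is either \( X \) or \( X^T \). The name `ref_gemm` and its integer transpose flags are hypothetical and do not match the PLASMA or BLAS calling sequence.

```c
#include <assert.h>

/* Hypothetical reference routine: C = alpha * op(A) * op(B) + beta * C.
   C is m-by-n, op(A) is m-by-k, op(B) is k-by-n, all column-major.
   transa / transb: 0 means op(X) = X, nonzero means op(X) = X^T. */
void ref_gemm(int transa, int transb, int m, int n, int k,
              double alpha, const double *A, int lda,
              const double *B, int ldb,
              double beta, double *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int l = 0; l < k; ++l) {
                double a = transa ? A[l + i * lda] : A[i + l * lda];
                double b = transb ? B[j + l * ldb] : B[l + j * ldb];
                sum += a * b;
            }
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}
```

The loop performs about \( 2mnk \) floating-point operations on \( mk + kn + mn \) matrix entries, which is the work-to-data ratio that lets Level 3 routines reuse data held in cache.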