SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
|
Functions | |
template<typename scalar_t > | |
void | slate::tile::tplqt (int64_t l, Tile< scalar_t > A1, Tile< scalar_t > A2, Tile< scalar_t > T) |
Compute the triangular-pentagonal LQ factorization of 2 tiles, A1 and A2. | |
template<typename scalar_t > | |
void | slate::tile::tpmlqt (Side side, Op op, int64_t l, Tile< scalar_t > V2, Tile< scalar_t > T, Tile< scalar_t > C1, Tile< scalar_t > C2) |
Multiply the matrix C by the unitary matrix Q obtained from a "triangular-pentagonal" block reflector H. | |
void slate::tile::tplqt | ( | int64_t | l, |
Tile< scalar_t > | A1, | ||
Tile< scalar_t > | A2, | ||
Tile< scalar_t > | T | ||
) |
Compute the triangular-pentagonal LQ factorization of 2 tiles, A1 and A2.
On exit, the pentagonal tile A2 has been eliminated.
[in] | l | The number of columns of the lower trapezoidal part of A2. min( k, n ) >= l >= 0. See Further Details. |
[in,out] | A1 | On entry, the k-by-k lower triangular matrix A1, in an k-by-k1 tile, where k1 >= k. |
[in,out] | A2 | On entry, the k-by-n lower pentagonal matrix A2, in an k-by-n2 tile, where n2 >= n. On exit, the rows represent the Householder reflectors. The left k-by-(n-l) portion is rectangular, the right k-by-l portion is lower trapezoidal. |
[out] | T | Array of size ib-by-k, where ib is the internal blocking to use, 1 <= ib <= k, in an ib3-by-k3 tile, ib3 >= ib and k3 >= k. On exit, stores a sequence of ib-by-ib upper triangular T matrices representing the block Householder reflectors. See Further Details. |
Let A be the k-by-(k + n) matrix
A = [ A1 A2 ]
where A1 is k-by-k lower triangular, and A2 is k-by-n lower pentagonal.
For all cases, A1 is lower triangular. Example with k = 4, where "." represent non-zeros.
A1 = [ . | 0 ] [ . . | 0 ] [ . . . | 0 ] [ . . . . | 0 ] k-by-k lower if k1 > k, triangular all-zero columns are ignored
Depending on n, k, l, there are several cases for A2. Currently, SLATE supports only cases 1, 2, and 3. It assumes n = min( A2.mb, A2.nb ), and l = n or l = 0.
Case 1: n = k = l, A2 is lower triangular. Example with n = k = l = 4.
A1 = [ . ] A2 = [ . | 0 ] [ . . ] [ . . | 0 ] [ . . . ] [ . . . | 0 ] [ . . . . ] [ . . . . | 0 ] k-by-l lower if n2 > n,
Case 2: n < k and l = min( n, k ) = n, A2 is lower trapezoidal (tall). Example with n = l = 3, k = 4.
A1 = [ . ] A2 = [ . ] <== k-by-l lower trapezoidal [ . . ] [ . . ] [ . . . ] [ . . . ] [ . . . . ] [ . . . ]
Case 3: l = 0, A2 is just the rectangular portion. Currently unused in SLATE, but should work. Example with n = 3, l = 0, k = 4.
A1 = [ . ] A2 = [ . . . ] <== k-by-n rectangular [ . . ] [ . . . ] [ . . . ] [ . . . ] [ . . . . ] [ . . . ]
Case 4: n > k and l = k, A2 is upper trapezoidal (wide). Currently unsupported in SLATE; would require explicitly passing n. Example with n = 6, l = k = 4.
A1 = [ . ] A2 = [ . . | . ] [ . . ] [ . . | . . ] [ . . . ] [ . . | . . . ] [ . . . . ] [ . . | . . . . ] k-by-(n - l) k-by-l lower
Case 5: 0 < l < min( n, k ), A2 is lower pentagonal. Currently unsupported in SLATE; would require explicitly passing n. Example with n = 5, l = 3, k = 4.
A1 = [ . ] A2 = [ . . | . ] [ . . ] [ . . | . . ] [ . . . ] [ . . | . . . ] [ . . . . ] [ . . | . . . ] k-by-k lower k-by-(n - l) k-by-l lower triangular rectangular trapezoidal
After factoring, the vector representing the elementary reflector H(i) is in the i-th row of the k-by-(k+n) matrix V:
V = [ I V2 ]
where I is the k-by-k identity and V2 is k-by-n pentagonal, same form as A2.
Thus, all of the information needed for V is contained in V2, which has the same form as A2 and overwrites A2 on exit.
The number of blocks is r = ceiling( k/ib ), where each block is of order ib except for the last block, which is of order rb = k - (r-1)*ib. For each of the r blocks, an upper triangular block reflector factor is computed: T1, T2, ..., Tr. The ib-by-ib (and rb-by-rb for the last block) T's are stored in the ib-by-k matrix T as
T = [ T1 T2 ... Tr ]
Note: compared to LAPACK, A is renamed here => A1, B => A2, W => V, V => V2, and m => k. This makes k match k in tpmlqt.
void slate::tile::tpmlqt | ( | Side | side, |
Op | op, | ||
int64_t | l, | ||
Tile< scalar_t > | V2, | ||
Tile< scalar_t > | T, | ||
Tile< scalar_t > | C1, | ||
Tile< scalar_t > | C2 | ||
) |
Multiply the matrix C by the unitary matrix Q obtained from a "triangular-pentagonal" block reflector H.
C consists of two tiles, C1 and C2.
If side == Left:
V = [ I V2 ] C = [ C1 ] <== k-by-n k-by-k k-by-m [ C2 ] <== m-by-n
and on exit, \(C = op(Q) C\). l, k are the same in tplqt; m = tplqt's n; n here is different.
If side == Right:
C = [ C1 C2 ] V = [ I V2 ] m-by-k m-by-n k-by-k k-by-n
and on exit, \(C = C op(Q)\). l, k, n are the same in tplqt; m here is different.
Q is a product of block reflectors,
Q = \prod_{j = r, ..., 1} I - Vj^H Tj^H Vj
where r is the number of blocks, Tj is the j-th block of T, and Vj is the j-th block row of V, with internal blocking size ib.
See Further Details in tplqt.
[in] | side |
|
[in] | op |
|
[in] | l | The number of columns of the lower trapezoidal part of V2.
|
[in] | V2 | The k-by-N, lower pentagonal matrix V2, in an k-by-N2 tile, where N2 >= N.
|
[in] | T | The upper triangular factors of the block reflectors as returned by tplqt, stored as an ib-by-k tile. |
[in,out] | C1 |
|
[in,out] | C2 | The m-by-n matrix C2.
|
Note: compared to LAPACK, A is renamed here => C1, B => C2, V => V2.