SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
Loading...
Searching...
No Matches

Functions

template<typename scalar_t >
void slate::tile::tplqt (int64_t l, Tile< scalar_t > A1, Tile< scalar_t > A2, Tile< scalar_t > T)
 Compute the triangular-pentagonal LQ factorization of 2 tiles, A1 and A2.
 
template<typename scalar_t >
void slate::tile::tpmlqt (Side side, Op op, int64_t l, Tile< scalar_t > V2, Tile< scalar_t > T, Tile< scalar_t > C1, Tile< scalar_t > C2)
 Multiply the matrix C by the unitary matrix Q obtained from a "triangular-pentagonal" block reflector H.
 

Detailed Description

Function Documentation

◆ tplqt()

template<typename scalar_t >
void slate::tile::tplqt ( int64_t  l,
Tile< scalar_t >  A1,
Tile< scalar_t >  A2,
Tile< scalar_t >  T 
)

Compute the triangular-pentagonal LQ factorization of 2 tiles, A1 and A2.

On exit, the pentagonal tile A2 has been eliminated.

Parameters
[in]lThe number of columns of the lower trapezoidal part of A2. min( k, n ) >= l >= 0. See Further Details.
[in,out]A1On entry, the k-by-k lower triangular matrix A1, in an k-by-k1 tile, where k1 >= k.
[in,out]A2On entry, the k-by-n lower pentagonal matrix A2, in an k-by-n2 tile, where n2 >= n. On exit, the rows represent the Householder reflectors. The left k-by-(n-l) portion is rectangular, the right k-by-l portion is lower trapezoidal.
[out]TArray of size ib-by-k, where ib is the internal blocking to use, 1 <= ib <= k, in an ib3-by-k3 tile, ib3 >= ib and k3 >= k. On exit, stores a sequence of ib-by-ib upper triangular T matrices representing the block Householder reflectors. See Further Details.

Further Details

Let A be the k-by-(k + n) matrix

A = [ A1  A2 ]

where A1 is k-by-k lower triangular, and A2 is k-by-n lower pentagonal.

For all cases, A1 is lower triangular. Example with k = 4, where "." represent non-zeros.

A1 = [ .        |  0 ]
     [ . .      |  0 ]
     [ . . .    |  0 ]
     [ . . . .  |  0 ]
  k-by-k lower    if k1 > k,
    triangular    all-zero columns are ignored

Depending on n, k, l, there are several cases for A2. Currently, SLATE supports only cases 1, 2, and 3. It assumes n = min( A2.mb, A2.nb ), and l = n or l = 0.

Case 1: n = k = l, A2 is lower triangular. Example with n = k = l = 4.

A1 = [ .       ]      A2 = [ .        |  0 ]
     [ . .     ]           [ . .      |  0 ]
     [ . . .   ]           [ . . .    |  0 ]
     [ . . . . ]           [ . . . .  |  0 ]
                        k-by-l lower     if n2 > n, 

Case 2: n < k and l = min( n, k ) = n, A2 is lower trapezoidal (tall). Example with n = l = 3, k = 4.

A1 = [ .       ]      A2 = [ .     ]  <== k-by-l lower trapezoidal
     [ . .     ]           [ . .   ]
     [ . . .   ]           [ . . . ]
     [ . . . . ]           [ . . . ]

Case 3: l = 0, A2 is just the rectangular portion. Currently unused in SLATE, but should work. Example with n = 3, l = 0, k = 4.

A1 = [ .       ]      A2 = [ . . . ]  <== k-by-n rectangular
     [ . .     ]           [ . . . ]
     [ . . .   ]           [ . . . ]
     [ . . . . ]           [ . . . ]

Case 4: n > k and l = k, A2 is upper trapezoidal (wide). Currently unsupported in SLATE; would require explicitly passing n. Example with n = 6, l = k = 4.

A1 = [ .       ]      A2 = [ . . | .       ]
     [ . .     ]           [ . . | . .     ]
     [ . . .   ]           [ . . | . . .   ]
     [ . . . . ]           [ . . | . . . . ]
                    k-by-(n - l)   k-by-l lower 

Case 5: 0 < l < min( n, k ), A2 is lower pentagonal. Currently unsupported in SLATE; would require explicitly passing n. Example with n = 5, l = 3, k = 4.

A1 = [ .       ]      A2 = [ . . | .     ]
     [ . .     ]           [ . . | . .   ]
     [ . . .   ]           [ . . | . . . ]
     [ . . . . ]           [ . . | . . . ]
    k-by-k lower    k-by-(n - l)   k-by-l lower
      triangular     rectangular   trapezoidal

After factoring, the vector representing the elementary reflector H(i) is in the i-th row of the k-by-(k+n) matrix V:

V = [ I  V2 ]

where I is the k-by-k identity and V2 is k-by-n pentagonal, same form as A2.

Thus, all of the information needed for V is contained in V2, which has the same form as A2 and overwrites A2 on exit.

The number of blocks is r = ceiling( k/ib ), where each block is of order ib except for the last block, which is of order rb = k - (r-1)*ib. For each of the r blocks, an upper triangular block reflector factor is computed: T1, T2, ..., Tr. The ib-by-ib (and rb-by-rb for the last block) T's are stored in the ib-by-k matrix T as

T = [ T1 T2 ... Tr ]

Note: compared to LAPACK, A is renamed here => A1, B => A2, W => V, V => V2, and m => k. This makes k match k in tpmlqt.

◆ tpmlqt()

template<typename scalar_t >
void slate::tile::tpmlqt ( Side  side,
Op  op,
int64_t  l,
Tile< scalar_t >  V2,
Tile< scalar_t >  T,
Tile< scalar_t >  C1,
Tile< scalar_t >  C2 
)

Multiply the matrix C by the unitary matrix Q obtained from a "triangular-pentagonal" block reflector H.

C consists of two tiles, C1 and C2.

If side == Left:

V = [ I  V2 ]      C = [ C1 ]  <== k-by-n
 k-by-k  k-by-m        [ C2 ]  <== m-by-n

and on exit, \(C = op(Q) C\). l, k are the same in tplqt; m = tplqt's n; n here is different.

If side == Right:

C = [ C1  C2 ]      V = [ I  V2 ]
  m-by-k  m-by-n     k-by-k  k-by-n

and on exit, \(C = C op(Q)\). l, k, n are the same in tplqt; m here is different.

Q is a product of block reflectors,

Q = \prod_{j = r, ..., 1} I - Vj^H Tj^H Vj

where r is the number of blocks, Tj is the j-th block of T, and Vj is the j-th block row of V, with internal blocking size ib.

See Further Details in tplqt.

Parameters
[in]side
  • Side::Left: Multiply from the left: \(C = op(Q) C\).
  • Side::Right: Multiply from the right: \(C = C op(Q)\).
[in]op
  • Op::NoTrans: Multiply by \(op(Q) = Q\).
  • Op::Trans: Multiply by \(op(Q) = Q^T\) (only in real case).
  • Op::ConjTrans: Multiply by \(op(Q) = Q^H\).
[in]lThe number of columns of the lower trapezoidal part of V2.
  • If side = left, min( m, k ) >= l >= 0.
  • If side = right, min( n, k ) >= l >= 0.
[in]V2The k-by-N, lower pentagonal matrix V2, in an k-by-N2 tile, where N2 >= N.
  • If side == Left, N = m.
  • If side == Right, N = n. The i-th row must contain the vector which defines the elementary reflector H(i), for i = 1, 2, ..., k, as returned by tplqt in A2. The left k-by-(N-l) portion is rectangular, the right k-by-l portion is lower trapezoidal. See Further Details in tplqt.
[in]TThe upper triangular factors of the block reflectors as returned by tplqt, stored as an ib-by-k tile.
[in,out]C1
  • If side == Left, the k-by-n matrix C1, in a k1-by-n tile, k1 >= k.
  • If side == Right, the m-by-k matrix C1, in an m-by-k1 tile, k1 >= k. On exit, C1 is overwritten by the corresponding block of \(op(Q) C\) or \(C op(Q)\).
[in,out]C2The m-by-n matrix C2.
  • If side == Left, C2 is in an m2-by-n tile, m2 >= m.
  • If side == Right, C2 is in an m-by-n2 tile, n2 >= n. On exit, C2 is overwritten by the corresponding block of \(op(Q) C\) or \(C op(Q)\).

Note: compared to LAPACK, A is renamed here => C1, B => C2, V => V2.