Module block_transposed

Block-transposed matrix types with configurable packing.

This module provides block-transposed matrix types — BlockTransposed (owned), BlockTransposedRef (shared view), and BlockTransposedMut (mutable view) — where groups of GROUP rows are stored in transposed form to enable efficient SIMD processing. An optional packing factor PACK interleaves adjacent columns within each group, which can be used to feed SIMD instructions that operate on packed pairs (e.g. vpmaddwd with PACK = 2).

§Layout

§PACK = 1 (standard block-transpose)

Given a logical matrix with rows a, b, c, d, e (each with K columns) and GROUP = 3:

           Group Size (3)
           <---------->

           +----------+    ^
           | a0 b0 c0 |    |
           | a1 b1 c1 |    |
           | a2 b2 c2 |    | Block Size (K)
 Block 0   | ...      |    |
 (Full)    | aK bK cK |    |
           +----------+    v
           +----------+
           | d0 e0 XX |
 Block 1   | d1 e1 XX |
 (Partial) | ...      |
           | dK eK XX |
           +----------+

   XX = row padding    aK, bK, ... = last column of each row (index K - 1)
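
The diagram above implies a simple address calculation for PACK = 1. The following is a minimal sketch of that arithmetic, not this module's API: the function name `offset_pack1` and its parameters are hypothetical, and it assumes blocks are stored back to back with each block padded to a full `group` rows.

```rust
/// Hypothetical helper (not part of this module): linear offset of the logical
/// element `(row, col)` in a PACK = 1 block-transposed buffer.
///
/// Assumes blocks are contiguous and every block holds `group * ncols`
/// elements, including row padding in the last (partial) block.
pub fn offset_pack1(row: usize, col: usize, group: usize, ncols: usize) -> usize {
    assert!(group > 0, "group must be non-zero");
    assert!(col < ncols, "column out of bounds");

    let block = row / group; // which block the logical row lives in
    let lane = row % group; // position of the row inside its block
    let block_len = group * ncols; // elements per block, padding included

    // Within a block, each physical row holds one column for all `group` rows.
    block * block_len + col * group + lane
}
```

With `group = 3` and `ncols = 4`, element `a1` (row 0, column 1) lands at offset 3, directly after `a0 b0 c0`, and `d0` (row 3, column 0) lands at offset 12, the start of block 1.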

§PACK = 2 (super-packed)

With GROUP = 4, PACK = 2, and a logical matrix with rows a, b, c, d, e, f (each with 5 columns — odd, to show padding), adjacent column-pairs are interleaved per row within each group panel:

             GROUP × PACK (4 × 2 = 8)
             <----------------------------->

             +-----------------------------+    ^
             | a0 a1  b0 b1  c0 c1  d0 d1  |    |  col-pair (0, 1)
             | a2 a3  b2 b3  c2 c3  d2 d3  |    |  col-pair (2, 3)
   Block 0   | a4 __  b4 __  c4 __  d4 __  |    |  col-pair (4, pad)
   (Full)    +-----------------------------+    v
             +-----------------------------+
             | e0 e1  f0 f1  XX XX  XX XX  |       col-pair (0, 1)
   Block 1   | e2 e3  f2 f3  XX XX  XX XX  |       col-pair (2, 3)
   (Partial) | e4 __  f4 __  XX XX  XX XX  |       col-pair (4, pad)
             +-----------------------------+

   __ = zero (column padding)    XX = zero (row padding)
   padded_ncols = 6  (5 rounded up to next multiple of PACK)
   Block Size  = padded_ncols / PACK = 3 physical rows per block

Each physical row of a block holds one column-pair across all GROUP rows. For example, the first physical row stores columns (0, 1) for rows a, b, c, d interleaved as [a0, a1, b0, b1, c0, c1, d0, d1].

Because ncols = 5 is odd (not a multiple of PACK = 2), the last column-pair (4, pad) is zero-padded: [a4, 0, b4, 0, c4, 0, d4, 0].
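
To make the packed layout concrete, here is a rough sketch that copies a row-major matrix into a zero-padded buffer following the diagram. The function `pack_block_transposed`, the `i16` element type, and the row-major input are all assumptions made for illustration; the module's own constructors may work differently.

```rust
/// Hypothetical helper (not part of this module): copies a row-major
/// `nrows x ncols` matrix into a block-transposed buffer with the given
/// GROUP and PACK. Row and column padding is left as zero.
pub fn pack_block_transposed<const GROUP: usize, const PACK: usize>(
    src: &[i16],
    nrows: usize,
    ncols: usize,
) -> Vec<i16> {
    assert!(GROUP > 0 && PACK > 0 && GROUP % PACK == 0);
    assert_eq!(src.len(), nrows * ncols);

    // Round the column count up to the next multiple of PACK.
    let padded_ncols = (ncols + PACK - 1) / PACK * PACK;
    // Round the row count up to a whole number of blocks.
    let nblocks = (nrows + GROUP - 1) / GROUP;
    let block_len = GROUP * padded_ncols;

    let mut dst = vec![0i16; nblocks * block_len];
    for row in 0..nrows {
        for col in 0..ncols {
            let block = row / GROUP; // block containing this logical row
            let lane = row % GROUP; // row position inside the block
            let physical_row = col / PACK; // which column group
            let offset = block * block_len
                + physical_row * GROUP * PACK // skip earlier physical rows
                + lane * PACK // skip earlier rows in this physical row
                + col % PACK; // position inside the packed group
            dst[offset] = src[row * ncols + col];
        }
    }
    dst
}
```

For the 6 × 5 example above with `GROUP = 4` and `PACK = 2`, the first eight elements of the result are `a0 a1 b0 b1 c0 c1 d0 d1`, and elements 16 through 23 are `a4 0 b4 0 c4 0 d4 0`.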

§Constraints

  • GROUP > 0
  • PACK > 0
  • GROUP % PACK == 0
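
One way such constraints can be enforced for const generic parameters is an associated constant that panics during compile-time evaluation. The sketch below illustrates that pattern only; `LayoutParams` and `is_valid_layout` are hypothetical names, and the source does not say how this module performs its checks.

```rust
/// Hypothetical predicate mirroring the three constraints listed above.
pub const fn is_valid_layout(group: usize, pack: usize) -> bool {
    // `pack > 0` is tested before the remainder to avoid dividing by zero.
    group > 0 && pack > 0 && group % pack == 0
}

/// Hypothetical marker type standing in for a GROUP/PACK-parameterised matrix.
pub struct LayoutParams<const GROUP: usize, const PACK: usize>;

impl<const GROUP: usize, const PACK: usize> LayoutParams<GROUP, PACK> {
    /// Fails compilation for any instantiation that violates the constraints,
    /// provided the constant is referenced somewhere (see `new`).
    const VALID: () = assert!(
        is_valid_layout(GROUP, PACK),
        "GROUP and PACK must be non-zero and GROUP must be a multiple of PACK"
    );

    pub fn new() -> Self {
        let () = Self::VALID; // referencing the constant forces its evaluation
        LayoutParams
    }
}
```

Under this sketch, `LayoutParams::<4, 2>::new()` compiles, while `LayoutParams::<3, 2>::new()` would be rejected at compile time because 3 is not a multiple of 2.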

Structs§

BlockTransposed
An owning block-transposed matrix.
BlockTransposedMut
A mutable view of a block-transposed matrix.
BlockTransposedRef
A shared (immutable) view of a block-transposed matrix.
Row
An immutable view of a single logical row in a block-transposed matrix.
RowIter
Iterator over the elements of a Row.
RowMut
A mutable view of a single logical row in a block-transposed matrix.