pub enum TensorMapFormat {
    Tiled {
        tile_size: Vec<u32>,
    },
    Im2col {
        pixel_box_lower_corner: Vec<i32>,
        pixel_box_upper_corner: Vec<i32>,
        channels_per_pixel: u32,
        pixels_per_column: u32,
    },
    Im2colWide {
        pixel_box_lower_corner_width: i32,
        pixel_box_upper_corner_width: i32,
        channels_per_pixel: u32,
        pixels_per_column: u32,
    },
}
Format of [TensorMap]
Variants§
Tiled
Simple tiling
Fields
tile_size: Vec<u32>
Tile size that’s loaded from memory in each copy operation. Must have rank elements, i.e. one per tensor dimension.
In matmul, for example, this might be batch x m x k, or whatever the stage size is.
If a dimension isn’t present in the tile, it should just be set to 1.
For CUDA, each dimension must be a power of two and <= 256.
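The constraints above can be checked before a tensor map is built. The following is a minimal sketch, not part of the documented API: the enum is a local stand-in that mirrors only the Tiled variant, and `is_valid_cuda_tile` and `matmul_stage_tile` are hypothetical helper names introduced for illustration.

```rust
/// Local stand-in mirroring the `Tiled` variant shown above; in real code
/// the enum would come from the crate that defines `TensorMapFormat`.
#[derive(Clone, Debug, PartialEq)]
pub enum TensorMapFormat {
    Tiled { tile_size: Vec<u32> },
}

/// Hypothetical helper (an assumption, not a documented function): checks
/// the constraints listed for `tile_size` when targeting CUDA, namely one
/// entry per tensor dimension, each a power of two and at most 256.
pub fn is_valid_cuda_tile(tile_size: &[u32], rank: usize) -> bool {
    tile_size.len() == rank
        && tile_size.iter().all(|&d| d.is_power_of_two() && d <= 256)
}

/// Hypothetical helper: builds a tiled format for a rank-3 `batch x m x k`
/// tensor. The batch dimension is not part of the tile, so it is set to 1.
pub fn matmul_stage_tile(m: u32, k: u32) -> TensorMapFormat {
    TensorMapFormat::Tiled {
        tile_size: vec![1, m, k],
    }
}
```

Under these assumptions, a `[1, 64, 32]` tile passes the check for a rank-3 tensor, while `[1, 64, 48]` (48 is not a power of two), `[1, 512, 32]` (512 exceeds 256), and `[64, 32]` (wrong number of elements) do not.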
Im2col
Im2col indexing. Loads a “column” (not the same column as im2col) of pixels into shared
memory, with a certain offset (kernel position). The corners bound the region that pixels are
loaded from at offset 0, i.e. at the top left corner of the kernel. The kernel offset is added
to the corner offsets, so a (-1, -1) corner will stop the bounding box at (1, 1) for kernel
offset (2, 2).
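The offset arithmetic described above can be sketched as a plain element-wise addition. `offset_corner` is a hypothetical helper used only for illustration; how the hardware or library applies the offset internally is not specified here.

```rust
/// Hypothetical helper (an assumption, not a documented function): adds a
/// kernel offset to a pixel-box corner, one spatial dimension at a time.
pub fn offset_corner(corner: &[i32], kernel_offset: &[i32]) -> Vec<i32> {
    assert_eq!(
        corner.len(),
        kernel_offset.len(),
        "corner and kernel offset must cover the same spatial dimensions"
    );
    corner
        .iter()
        .zip(kernel_offset)
        .map(|(c, o)| c + o)
        .collect()
}
```

With a corner of (-1, -1) and a kernel offset of (2, 2), this yields (1, 1), matching the example in the description; at offset (0, 0) the corner is unchanged.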
Fields
pixel_box_lower_corner: Vec<i32>
Pixel box lower corner. This is the logical upper left corner in the input tensor,
when offset is 0. The length of this value should equal the number of spatial dimensions of
the input tensor (i.e. h, w for an NHWC tensor). Should normally be set to -padding.
pixel_box_upper_corner: Vec<i32>
Pixel box upper corner. This is the logical lower right corner in the input tensor,
when offset is 0. The length of this value should equal the number of spatial dimensions of
the input tensor (i.e. h, w for an NHWC tensor). Should normally be set to
padding - (kernel_size - 1) (where kernel_size accounts for dilation). This is not
equal to padding, it’s equal to the bounding box for the top left corner of the kernel.
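As a worked illustration of the two corner fields, the sketch below derives both from convolution parameters. It rests on assumptions not stated in the source: the upper-corner formula is read as `padding - (kernel_size - 1)`, the dilated kernel size is taken to be `(kernel_size - 1) * dilation + 1`, and `pixel_box_corners` is a hypothetical helper name.

```rust
/// Hypothetical helper (an assumption, not a documented function): derives
/// the im2col pixel-box corners for each spatial dimension.
///
/// Assumed formulas:
///   lower = -padding
///   upper = padding - (effective_kernel - 1)
/// where effective_kernel = (kernel_size - 1) * dilation + 1.
pub fn pixel_box_corners(
    padding: &[i32],
    kernel_size: &[i32],
    dilation: &[i32],
) -> (Vec<i32>, Vec<i32>) {
    assert_eq!(padding.len(), kernel_size.len());
    assert_eq!(padding.len(), dilation.len());

    // Lower corner: the logical upper left corner at kernel offset 0.
    let lower: Vec<i32> = padding.iter().map(|p| -p).collect();

    // Upper corner: bound for the top left corner of the kernel.
    let upper: Vec<i32> = padding
        .iter()
        .zip(kernel_size)
        .zip(dilation)
        .map(|((p, k), d)| {
            let effective_kernel = (k - 1) * d + 1;
            p - (effective_kernel - 1)
        })
        .collect();

    (lower, upper)
}
```

Under these assumptions, a 3 x 3 kernel with padding 1 and dilation 1 gives a lower corner of (-1, -1) and an upper corner of (-1, -1), which shows why the upper corner is not simply equal to the padding.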
Im2colWide
Wide im2col
Trait Implementations§
impl Clone for TensorMapFormat
fn clone(&self) -> TensorMapFormat
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.