Packed CUDA backend for GPU-only tile-based FDTD simulation.
This backend keeps ALL tile data in a single packed GPU buffer, enabling GPU-only halo exchange with ZERO host transfers during simulation steps.
§Key Features
- Zero host transfers: Halo exchange happens entirely on GPU
- Two kernel launches per step: Exchange halos + FDTD compute
- Massive parallelism: All tiles computed in single kernel launch
- Precomputed halo routing: Copy indices computed once at init
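The precomputed halo routing can be illustrated with a host-side sketch that builds the copy index pairs once. This is not the backend's actual code: the function name `build_halo_routes`, the row-major tile order, the one-cell halo, and the choice to exchange only the four edges (no corners) are all assumptions made for illustration.

```rust
/// Hypothetical helper: build (src, dst) index pairs into the packed buffer.
/// Assumes row-major tile order, row-major cells within a tile, and a
/// one-cell halo around each tile. Corner cells are not exchanged here.
pub fn build_halo_routes(tiles_x: usize, tiles_y: usize, tile_size: usize) -> Vec<(u32, u32)> {
    let w = tile_size + 2; // padded tile width
    let stride = w * w; // floats per padded tile
    // Flat index of cell (x, y) in tile (tx, ty).
    let idx = |tx: usize, ty: usize, x: usize, y: usize| -> u32 {
        ((ty * tiles_x + tx) * stride + y * w + x) as u32
    };
    let mut routes = Vec::new();
    for ty in 0..tiles_y {
        for tx in 0..tiles_x {
            for i in 1..=tile_size {
                if tx > 0 {
                    // Left halo column <- last interior column of the left neighbor.
                    routes.push((idx(tx - 1, ty, tile_size, i), idx(tx, ty, 0, i)));
                }
                if tx + 1 < tiles_x {
                    // Right halo column <- first interior column of the right neighbor.
                    routes.push((idx(tx + 1, ty, 1, i), idx(tx, ty, w - 1, i)));
                }
                if ty > 0 {
                    // Top halo row <- last interior row of the neighbor above.
                    routes.push((idx(tx, ty - 1, i, tile_size), idx(tx, ty, i, 0)));
                }
                if ty + 1 < tiles_y {
                    // Bottom halo row <- first interior row of the neighbor below.
                    routes.push((idx(tx, ty + 1, i, 1), idx(tx, ty, i, w - 1)));
                }
            }
        }
    }
    routes
}
```

Under these assumptions, the resulting pairs would be uploaded to the GPU once at init, and the exchange kernel would perform one `buffer[dst] = buffer[src]` copy per pair on every step.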
§Memory Layout
All tiles packed contiguously in GPU memory:
[Tile(0,0) 18×18][Tile(1,0) 18×18][Tile(0,1) 18×18]...
Each tile is (tile_size + 2)² floats (18×18 = 324 floats for a 16×16 tile).
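The offset arithmetic implied by this layout can be sketched as follows. The names `tile_stride` and `packed_index` are hypothetical, and the row-major cell order inside each tile is an assumption; only the `(tile_size + 2)²` tile size comes from the description above.

```rust
/// Floats per tile, including a one-cell halo on every side.
pub fn tile_stride(tile_size: usize) -> usize {
    (tile_size + 2) * (tile_size + 2)
}

/// Flat index of cell (x, y) in the tile stored at position `slot` of the
/// packed buffer. Coordinates include the halo, so x and y range over
/// 0..tile_size + 2. Row-major cell order is assumed.
pub fn packed_index(slot: usize, x: usize, y: usize, tile_size: usize) -> usize {
    let width = tile_size + 2;
    slot * tile_stride(tile_size) + y * width + x
}
```

For a 16×16 tile, `tile_stride(16)` is 324, so the second tile in the buffer starts at float offset 324 and the first interior cell of the first tile sits at index 19.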
Structs§
- CudaPackedBackend - Packed CUDA backend for tile-based FDTD simulation.