Module cuda_packed

ringkernel_wavesim::simulation

Packed CUDA backend for GPU-only tile-based FDTD simulation.

This backend keeps all tile data in a single packed GPU buffer, so halo exchange runs entirely on the GPU with zero host transfers during simulation steps.

§Key Features

- Zero host transfers: Halo exchange happens entirely on the GPU
- Two kernel launches per step: Halo exchange + FDTD compute
- Massive parallelism: All tiles are computed in a single kernel launch
- Precomputed halo routing: Copy indices are computed once at init
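The precomputed halo routing can be pictured with a small CPU-side sketch. It is not the backend's actual code: the names `idx`, `build_halo_routes` and `exchange_halos` are hypothetical, and it assumes row-major cells inside each tile, a one-cell halo, tiles ordered with x varying fastest, edge-only (no corner) exchange, and non-periodic domain boundaries.

```rust
// Hypothetical constants mirroring the 16×16 tile example in this page.
const TILE_SIZE: usize = 16;
const PADDED: usize = TILE_SIZE + 2; // interior plus one halo cell per side
const TILE_FLOATS: usize = PADDED * PADDED;

/// Flat index of padded cell (px, py) in tile (tx, ty).
/// Assumes row-major cells and tiles ordered with x varying fastest.
fn idx(tx: usize, ty: usize, tiles_x: usize, px: usize, py: usize) -> usize {
    (ty * tiles_x + tx) * TILE_FLOATS + py * PADDED + px
}

/// Build the (source, destination) copy pairs once at init.
/// Sources are always interior cells and destinations are always halo cells,
/// so the copies are independent of each other and of their order.
fn build_halo_routes(tiles_x: usize, tiles_y: usize) -> Vec<(usize, usize)> {
    let mut routes = Vec::new();
    for ty in 0..tiles_y {
        for tx in 0..tiles_x {
            for i in 1..=TILE_SIZE {
                if tx + 1 < tiles_x {
                    // My last interior column -> east neighbour's west halo.
                    routes.push((idx(tx, ty, tiles_x, TILE_SIZE, i), idx(tx + 1, ty, tiles_x, 0, i)));
                    // East neighbour's first interior column -> my east halo.
                    routes.push((idx(tx + 1, ty, tiles_x, 1, i), idx(tx, ty, tiles_x, PADDED - 1, i)));
                }
                if ty + 1 < tiles_y {
                    // My last interior row -> south neighbour's north halo.
                    routes.push((idx(tx, ty, tiles_x, i, TILE_SIZE), idx(tx, ty + 1, tiles_x, i, 0)));
                    // South neighbour's first interior row -> my south halo.
                    routes.push((idx(tx, ty + 1, tiles_x, i, 1), idx(tx, ty, tiles_x, i, PADDED - 1)));
                }
            }
        }
    }
    routes
}

/// CPU stand-in for the halo-exchange kernel: one copy per route entry.
/// On the GPU each entry would map naturally to one thread.
fn exchange_halos(buf: &mut [f32], routes: &[(usize, usize)]) {
    for &(src, dst) in routes {
        buf[dst] = buf[src];
    }
}
```

Because every route reads an interior cell and writes a halo cell, no copy depends on another, which is what would let a single kernel launch process the whole table in parallel.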

§Memory Layout

All tiles packed contiguously in GPU memory:

[Tile(0,0) 18×18][Tile(1,0) 18×18][Tile(0,1) 18×18]...

Each tile occupies (tile_size + 2)² floats: the tile_size × tile_size interior plus a one-cell halo border on every side (18×18 = 324 floats for a 16×16 tile).
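As a rough illustration of this layout, the offset arithmetic might look like the following. The helper names `tile_offset` and `cell_index` are hypothetical, and the sketch assumes tiles are ordered with x varying fastest (as the diagram above suggests) and that cells within a tile are stored row-major.

```rust
// Hypothetical constants for the 16×16 tile example.
const TILE_SIZE: usize = 16;
const PADDED: usize = TILE_SIZE + 2; // 18: interior plus one halo cell per side
const TILE_FLOATS: usize = PADDED * PADDED; // 324 floats per tile

/// Offset (in floats) of the first element of tile (tx, ty) in the packed buffer.
/// Assumes tiles are ordered with x varying fastest.
fn tile_offset(tx: usize, ty: usize, tiles_x: usize) -> usize {
    (ty * tiles_x + tx) * TILE_FLOATS
}

/// Flat index of interior cell (lx, ly) of tile (tx, ty).
/// The +1 on each axis skips the halo border; row-major order is an assumption.
fn cell_index(tx: usize, ty: usize, tiles_x: usize, lx: usize, ly: usize) -> usize {
    tile_offset(tx, ty, tiles_x) + (ly + 1) * PADDED + (lx + 1)
}
```

For a grid four tiles wide, `tile_offset(1, 0, 4)` is 324 and `tile_offset(0, 1, 4)` is 1296, and the first interior cell of tile (0,0) sits at index 19, just past the first halo row and the first halo column.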

§Structs

CudaPackedBackend: Packed CUDA backend for tile-based FDTD simulation.