Module cuda_packed


Packed CUDA backend for GPU-only tile-based FDTD simulation.

This backend keeps all tile data in a single packed GPU buffer, so halo exchange runs entirely on the GPU and no host transfers occur during simulation steps.

§Key Features

  • Zero host transfers: Halo exchange happens entirely on GPU
  • Two kernel launches per step: Exchange halos + FDTD compute
  • Massive parallelism: All tiles are computed in a single kernel launch
  • Precomputed halo routing: Copy indices computed once at init
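The precomputed halo routing can be pictured with a small CPU-side sketch. It is not the backend's actual code: the names `HaloCopy`, `build_halo_routes`, and `exchange_halos` are hypothetical, and the sketch assumes padded tiles of (tile_size + 2)² floats stored row-major, tiles ordered with the first coordinate varying fastest, and non-periodic outer boundaries. The route table is built once; each step then only replays the copies.

```rust
/// One precomputed copy: read `src`, write `dst` (indices into the packed buffer).
/// Hypothetical type, used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct HaloCopy {
    src: usize,
    dst: usize,
}

/// Build the list of halo copies once, at initialization.
/// Assumes a one-cell halo and no wrap-around at the domain edges.
fn build_halo_routes(tiles_x: usize, tiles_y: usize, tile_size: usize) -> Vec<HaloCopy> {
    let p = tile_size + 2; // padded tile width
    let tile_len = p * p; // floats per padded tile
    let base = |tx: usize, ty: usize| (ty * tiles_x + tx) * tile_len;
    let mut routes = Vec::new();

    for ty in 0..tiles_y {
        for tx in 0..tiles_x {
            let me = base(tx, ty);
            for i in 1..=tile_size {
                if tx + 1 < tiles_x {
                    // Right neighbor's first interior column -> my right halo column.
                    let nb = base(tx + 1, ty);
                    routes.push(HaloCopy { src: nb + i * p + 1, dst: me + i * p + (p - 1) });
                }
                if tx > 0 {
                    // Left neighbor's last interior column -> my left halo column.
                    let nb = base(tx - 1, ty);
                    routes.push(HaloCopy { src: nb + i * p + tile_size, dst: me + i * p });
                }
                if ty + 1 < tiles_y {
                    // Next neighbor's first interior row -> my last (halo) row.
                    let nb = base(tx, ty + 1);
                    routes.push(HaloCopy { src: nb + p + i, dst: me + (p - 1) * p + i });
                }
                if ty > 0 {
                    // Previous neighbor's last interior row -> my first (halo) row.
                    let nb = base(tx, ty - 1);
                    routes.push(HaloCopy { src: nb + tile_size * p + i, dst: me + i });
                }
            }
        }
    }
    routes
}

/// Replay the precomputed copies. On the GPU this loop would correspond to
/// one kernel launch with one thread per route entry.
fn exchange_halos(buf: &mut [f32], routes: &[HaloCopy]) {
    for r in routes {
        buf[r.dst] = buf[r.src];
    }
}
```

Every source index is an interior cell and every destination is a halo cell, so the copies do not depend on one another and can run in any order. That independence is what lets a single exchange launch service all tiles at once before the compute launch.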

§Memory Layout

All tiles are packed contiguously in GPU memory:

[Tile(0,0) 18×18][Tile(1,0) 18×18][Tile(0,1) 18×18]...

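Each tile occupies (tile_size + 2)² floats: the tile_size × tile_size interior plus a one-cell halo border on every side. A 16×16 tile is therefore stored as 18×18 = 324 floats.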
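The index arithmetic implied by this layout can be sketched as follows. The function names are hypothetical, and two details are assumptions read off the diagram rather than confirmed by the source: tiles are ordered with the first coordinate varying fastest, and cells inside a padded tile are stored row-major.

```rust
/// Halo width on each side of a tile (assumed to be 1, matching the "+ 2").
const HALO: usize = 1;

/// Floats per padded tile: (tile_size + 2)^2.
fn padded_tile_len(tile_size: usize) -> usize {
    let padded = tile_size + 2 * HALO;
    padded * padded
}

/// Offset of the first float of tile (tx, ty) in the packed buffer.
/// `tiles_x` is the number of tiles along the first coordinate.
fn tile_base(tx: usize, ty: usize, tiles_x: usize, tile_size: usize) -> usize {
    (ty * tiles_x + tx) * padded_tile_len(tile_size)
}

/// Packed-buffer index of interior cell (lx, ly) of tile (tx, ty),
/// where lx and ly run from 0 to tile_size - 1.
fn cell_index(
    tx: usize,
    ty: usize,
    lx: usize,
    ly: usize,
    tiles_x: usize,
    tile_size: usize,
) -> usize {
    let padded = tile_size + 2 * HALO;
    // Skip the halo row and halo column to reach the interior.
    tile_base(tx, ty, tiles_x, tile_size) + (ly + HALO) * padded + (lx + HALO)
}
```

With 16×16 tiles and two tiles along the first coordinate, `tile_base(1, 0, 2, 16)` is 324 and `tile_base(0, 1, 2, 16)` is 648, and the first interior cell of tile (0,0) sits at index 19, one halo row plus one halo column into the tile.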

Structs§

CudaPackedBackend
Packed CUDA backend for tile-based FDTD simulation.