RingKernel WaveSim
Interactive 2D acoustic wave propagation simulator showcasing RingKernel's GPU compute and actor model capabilities.
Overview
WaveSim implements a Finite-Difference Time-Domain (FDTD) solver for the 2D wave equation, demonstrating several RingKernel features:
- Tile-based Actor Model: 16×16 cell tiles as actors with K2K messaging for halo exchange
- Multiple Backends: CPU (SoA + SIMD + Rayon), CUDA, and WGPU
- GPU-Only Halo Exchange: Zero host transfers during simulation (CUDA Packed backend)
Performance
| Backend | 256×256 | 512×512 | Notes |
|---|---|---|---|
| CPU SimulationGrid | 35,418 steps/s | 7,229 steps/s | SoA + SIMD + Rayon |
| CPU TileKernelGrid | 1,357 steps/s | — | Actor model with K2K |
| CUDA Packed | 112,837 steps/s | 71,324 steps/s | GPU-only halo exchange |
GPU vs CPU speedup: 3.1x at 256×256, 9.9x at 512×512
See PERFORMANCE.md for detailed analysis.
Quick Start
Interactive GUI
# CPU backend (default)
# With GPU compute (requires wgpu feature)
Click anywhere on the canvas to inject wave impulses.
Benchmarks
# Full benchmark suite (CPU, CUDA, WGPU)
# CUDA Packed backend benchmark (GPU-only halo exchange)
# Verify CUDA correctness against CPU
Architecture
Backends
-
CPU SimulationGrid: Optimized SoA layout with SIMD (f32x8) and Rayon parallelization. Best for small-to-medium grids.
-
CPU TileKernelGrid: 16×16 tile actors demonstrating RingKernel's K2K messaging. Each tile exchanges halo data with neighbors.
-
CUDA Packed (recommended for large grids): All tiles packed in a single GPU buffer. Halo exchange happens entirely on GPU via memory copies—zero host transfers during simulation.
Memory Layout (CUDA Packed)
GPU Buffer: [Tile(0,0)][Tile(1,0)]...[Tile(n,m)]
Each tile: 18×18 floats
┌───┬────────────────┬───┐
│ NW│ North Halo │NE │ ← Row 0 (from neighbor)
├───┼────────────────┼───┤
│ W │ 16×16 Interior│ E │ ← Owned cells
├───┼────────────────┼───┤
│ SW│ South Halo │SE │ ← Row 17 (from neighbor)
└───┴────────────────┴───┘
Simulation Step (CUDA Packed)
1. exchange_all_halos kernel ─┐
(GPU-to-GPU memory copies) │ Zero host transfers
2. apply_boundary_conditions │ Only 2-3 kernel launches
3. fdtd_all_tiles kernel ─┘
4. Swap buffer pointers (host-side, trivial)
Features
| Feature | Description |
|---|---|
cpu |
CPU backend (default) |
cuda |
NVIDIA CUDA backend |
wgpu |
WebGPU cross-platform backend |
simd |
SIMD optimizations (requires nightly) |
all-backends |
Enable all GPU backends |
Files
src/
├── simulation/
│ ├── grid.rs # CPU SimulationGrid (SoA + SIMD)
│ ├── tile_grid.rs # TileKernelGrid (actor model)
│ ├── cuda_packed.rs # CUDA Packed backend
│ ├── cuda_compute.rs # CUDA per-tile backend
│ └── wgpu_compute.rs # WGPU backend
├── shaders/
│ ├── fdtd_packed.cu # CUDA kernels (packed layout)
│ └── fdtd_tile.cu # CUDA kernels (per-tile)
└── bin/
├── wavesim.rs # Interactive GUI
├── benchmark.rs # Quick benchmark
├── full_benchmark.rs # Comprehensive benchmark
├── bench_packed.rs # CUDA Packed benchmark
└── verify_packed.rs # Correctness verification
Physics
The simulation solves the 2D acoustic wave equation:
∂²p/∂t² = c² (∂²p/∂x² + ∂²p/∂y²) - γ·∂p/∂t
Where:
p= pressure fieldc= speed of sound (343 m/s default)γ= damping coefficient
Discretized using central differences (FDTD):
p_new = 2p - p_prev + c²Δt²/Δx² · (p_N + p_S + p_E + p_W - 4p) - γ·(p - p_prev)
License
Apache-2.0