pub struct GpuBenchHarness {
pub warmup: u32,
pub iterations: u32,
pub reports: Vec<GpuBenchReport>,
}
Timing harness for GPU/CPU backend comparison benchmarks.
Fields
warmup: u32
Warm-up iterations (not timed).
iterations: u32
Timed iterations.
reports: Vec<GpuBenchReport>
Collected reports.
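The warm-up/timed split described by these fields can be sketched as follows. This is a minimal illustration, not the harness's actual implementation: `time_mean` is a hypothetical helper, and it assumes the reported figure is a simple mean over the timed iterations.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: run `f` `warmup` times untimed, then `iterations`
/// times timed, and return the mean duration of the timed runs.
fn time_mean<F: FnMut()>(warmup: u32, iterations: u32, mut f: F) -> Duration {
    // Warm-up runs are executed but deliberately not measured.
    for _ in 0..warmup {
        f();
    }
    // Avoid dividing by zero when no timed iterations are requested.
    if iterations == 0 {
        return Duration::ZERO;
    }
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed() / iterations
}
```

Timing the whole loop and dividing once keeps per-iteration clock overhead out of the measurement, at the cost of not recording per-iteration variance.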
Implementations
impl GpuBenchHarness
pub fn available_backends() -> Vec<BackendKind>
Return which backends are available in this build.
CPU is always available. wgpu and CUDA depend on feature flags and device availability (they appear in the list only when initialisation succeeds).
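The "listed only when initialisation succeeds" rule can be sketched like this. Everything here is hypothetical: `Backend` stands in for `BackendKind`, whose variants are not shown on this page, and `Probe` models a fallible device initialisation.

```rust
/// Hypothetical stand-in for `BackendKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Cpu,
    Wgpu,
    Cuda,
}

/// Hypothetical probe: returns Ok(()) if the backend initialised.
type Probe = fn() -> Result<(), String>;

/// CPU is always listed; other backends only if their probe succeeds.
fn available(probes: &[(Backend, Probe)]) -> Vec<Backend> {
    let mut out = vec![Backend::Cpu];
    for (kind, init) in probes {
        if init().is_ok() {
            out.push(*kind);
        }
    }
    out
}
```

In a real build the probe list would itself be gated by feature flags, so a backend compiled out never gets probed at all.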
pub fn bench_sph_density(&mut self, n: usize) -> Vec<GpuBenchReport>
Benchmark SPH density summation for n particles on all available backends.
Each particle’s density is recomputed from scratch on every call to avoid caching effects. FLOPs are estimated as 10 × n² (distance + kernel evaluation for each particle pair).
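To make the quadratic cost concrete, here is a naive all-pairs density summation together with the quoted FLOP estimate. It is a sketch under assumptions: the poly6 kernel, the uniform `mass`, and the smoothing length `h` are illustrative choices, not necessarily what the benchmark uses.

```rust
/// Naive O(n²) SPH density summation (illustrative only).
/// Uses the poly6 kernel W(r) = 315 / (64 π h⁹) · (h² − r²)³ for r < h.
fn sph_density(positions: &[[f64; 3]], mass: f64, h: f64) -> Vec<f64> {
    let h2 = h * h;
    let norm = 315.0 / (64.0 * std::f64::consts::PI * h.powi(9));
    positions
        .iter()
        .map(|pi| {
            let mut rho = 0.0;
            // Every particle visits every particle (including itself).
            for pj in positions {
                let dx = pi[0] - pj[0];
                let dy = pi[1] - pj[1];
                let dz = pi[2] - pj[2];
                let r2 = dx * dx + dy * dy + dz * dz;
                if r2 < h2 {
                    let d = h2 - r2;
                    rho += mass * norm * d * d * d;
                }
            }
            rho
        })
        .collect()
}

/// FLOP estimate quoted in the docs: 10 per particle pair.
fn sph_flops(n: usize) -> f64 {
    10.0 * (n as f64) * (n as f64)
}
```

Because no neighbour list is used, the pair count is exactly n², which is what makes a fixed per-pair FLOP constant a reasonable estimate.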
pub fn bench_lbm_step(&mut self, nx: usize, ny: usize, nz: usize) -> Vec<GpuBenchReport>
Benchmark one LBM BGK step on an nx × ny × nz domain.
FLOPs are estimated as 120 × nc, where nc = nx × ny × nz is the number of cells (19 distribution reads + BGK + streaming).
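A FLOP estimate like this is typically combined with a measured step time to get throughput. The sketch below assumes nc is the cell count nx × ny × nz; both function names are hypothetical and not part of the crate.

```rust
use std::time::Duration;

/// FLOP estimate quoted in the docs: 120 per cell,
/// assuming nc = nx * ny * nz.
fn lbm_flops(nx: usize, ny: usize, nz: usize) -> f64 {
    120.0 * (nx * ny * nz) as f64
}

/// Convert a FLOP count and a mean step time into GFLOP/s.
fn gflops(flops: f64, mean: Duration) -> f64 {
    flops / mean.as_secs_f64() / 1e9
}
```

For example, a 32 × 32 × 32 domain gives 120 × 32768 = 3,932,160 estimated FLOPs per step, so a 1 ms step corresponds to roughly 3.9 GFLOP/s.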
pub fn bench_parallel_scan(&mut self, n: usize) -> GpuBenchReport
Benchmark parallel prefix scan on n f64 elements (CPU Rayon scan).
FLOPs are estimated as 2n (about n adds in the up-sweep + about n adds in the down-sweep).
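The up-sweep/down-sweep structure behind that count can be sketched sequentially. This is the textbook work-efficient (Blelloch) exclusive scan, shown as an assumption about the algorithm's shape; it is not the Rayon implementation the benchmark runs, and it only handles power-of-two lengths.

```rust
/// Sequential sketch of a work-efficient (Blelloch) exclusive scan.
/// Returns the scanned data and the number of additions performed.
/// Assumes `data.len()` is zero or a power of two.
fn blelloch_exclusive_scan(data: &[f64]) -> (Vec<f64>, usize) {
    let n = data.len();
    let mut a = data.to_vec();
    let mut adds = 0;
    if n == 0 {
        return (a, adds);
    }
    assert!(n.is_power_of_two(), "sketch assumes a power-of-two length");

    // Up-sweep (reduce): build partial sums in place.
    let mut stride = 1;
    while stride < n {
        let mut i = 2 * stride - 1;
        while i < n {
            a[i] += a[i - stride];
            adds += 1;
            i += 2 * stride;
        }
        stride *= 2;
    }

    // Down-sweep: clear the root, then push prefixes back down the tree.
    a[n - 1] = 0.0;
    let mut stride = n / 2;
    while stride >= 1 {
        let mut i = 2 * stride - 1;
        while i < n {
            let left = a[i - stride];
            a[i - stride] = a[i];
            a[i] += left;
            adds += 1;
            i += 2 * stride;
        }
        stride /= 2;
    }
    (a, adds)
}
```

Each sweep performs n − 1 additions, so the total is 2(n − 1), which the 2n figure rounds up.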
pub fn run_full_suite(&mut self) -> String
Run the complete GPU benchmark suite and return a formatted summary.
use oxiphysics_gpu::gpu_bench::GpuBenchHarness;
let mut h = GpuBenchHarness::new();
let summary = h.run_full_suite();
assert!(!summary.is_empty());
pub fn cpu_vs_wgpu_comparison(&mut self, n: usize) -> Vec<GpuBenchReport>
Benchmark a CPU inclusive scan against a wgpu copy dispatch for n f64 elements.
Both sides operate on the same data (a ramp over 0.0..n). The wgpu side
dispatches a copy shader rather than a scan: the device works in f32, so
scan parity with the f64 CPU result is a different test. Returns a Vec with
one CPU report and, if an adapter is available, one wgpu report.
If no GPU adapter is present, only the CPU report is returned (no panic).
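Why f32 on the device makes scan parity a separate question can be seen with a small CPU-only experiment. The sketch below is illustrative and uses no wgpu code: it scans the same ramp while accumulating in f64 and in f32, and reports the first index where the results disagree. All function names are hypothetical.

```rust
/// Ramp 0.0, 1.0, ..., (n - 1) as f64.
fn ramp(n: usize) -> Vec<f64> {
    (0..n).map(|i| i as f64).collect()
}

/// Sequential inclusive scan accumulating in f64.
fn inclusive_scan_f64(data: &[f64]) -> Vec<f64> {
    let mut acc = 0.0_f64;
    data.iter().map(|&x| { acc += x; acc }).collect()
}

/// Same scan, but accumulating in f32 as an f32-only shader would.
fn inclusive_scan_f32(data: &[f64]) -> Vec<f32> {
    let mut acc = 0.0_f32;
    data.iter().map(|&x| { acc += x as f32; acc }).collect()
}

/// Index of the first element where the f32 scan differs from the f64 scan.
fn first_mismatch(n: usize) -> Option<usize> {
    let data = ramp(n);
    let a = inclusive_scan_f64(&data);
    let b = inclusive_scan_f32(&data);
    a.iter().zip(&b).position(|(&x, &y)| x != f64::from(y))
}
```

For small n every prefix sum is an integer below 2²⁴ and f32 represents it exactly, so the two scans agree; once the running sum passes 2²⁴, odd sums are no longer representable in f32 and the results diverge.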
use oxiphysics_gpu::gpu_bench::GpuBenchHarness;
let mut h = GpuBenchHarness::new();
let reports = h.cpu_vs_wgpu_comparison(1000);
assert!(!reports.is_empty());
assert_eq!(reports[0].name, "cpu_copy_scan");
pub fn cpu_vs_wgpu_sph(&mut self, n: usize) -> Vec<GpuBenchReport>
Benchmark the SPH density kernel on CPU and wgpu backends side-by-side.
Builds an SPH simulation with n particles arranged in a uniform grid
inside the domain [-10, 10]³. Runs the full SphSimulation::step
(density + pressure + accel + integrate) on both backends and returns
timing reports.
If no wgpu adapter is available, only the CPU report is returned.
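One way to arrange n particles on a uniform grid inside [-10, 10]³ is sketched below. The layout details (cell-centred positions, smallest enclosing cube, x-fastest ordering) are assumptions for illustration; the page does not say how the crate places particles.

```rust
/// Place `n` particles on a uniform grid inside [-half, half]³.
/// Hypothetical layout: cell-centred points on the smallest cube
/// lattice that can hold `n` points.
fn uniform_grid(n: usize, half: f64) -> Vec<[f64; 3]> {
    // Smallest side length such that side³ >= n.
    let mut side = 1usize;
    while side * side * side < n {
        side += 1;
    }
    let spacing = 2.0 * half / side as f64;
    (0..n)
        .map(|i| {
            let (ix, iy, iz) = (i % side, (i / side) % side, i / (side * side));
            // Cell centres keep every particle strictly inside the box.
            let coord = |k: usize| -half + (k as f64 + 0.5) * spacing;
            [coord(ix), coord(iy), coord(iz)]
        })
        .collect()
}
```

With n = 64 and half = 10.0 this gives a 4 × 4 × 4 lattice with spacing 5.0, matching the particle count used in the example.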
Example
let mut h = oxiphysics_gpu::gpu_bench::GpuBenchHarness::new();
let reports = h.cpu_vs_wgpu_sph(64);
assert!(!reports.is_empty());
pub fn cpu_vs_cuda_sph(&mut self, n: usize) -> Vec<GpuBenchReport>
Run SPH density on CPU and (optionally) CUDA; return timing reports.
The CPU path runs the same SphSimulation::step loop as
Self::cpu_vs_wgpu_sph but is tagged with "cuda_sph_density_cpu".
Under the cuda-backend feature, a second report is added when a CUDA
device is available at runtime. If no CUDA driver is present (e.g. on
macOS), only the CPU report is returned and nothing panics.
Example
let mut h = oxiphysics_gpu::gpu_bench::GpuBenchHarness::new();
let reports = h.cpu_vs_cuda_sph(64);
assert!(!reports.is_empty());
assert!(reports[0].name.contains("sph_density"));
assert!(reports[0].mean > std::time::Duration::ZERO);
pub fn print_comparison(&self)
Print a comparison table for all collected reports.
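A comparison table of this kind might be built as follows. This is a sketch under assumptions: `Report` is a hypothetical stand-in that carries only the `name` and `mean` fields seen in the examples on this page, and the column layout and speedup column are invented for illustration.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a report; only `name` and `mean`
/// appear in the documented examples.
struct Report {
    name: String,
    mean: Duration,
}

/// Format one row per report, with speedup relative to the first
/// (baseline) report.
fn comparison_table(reports: &[Report]) -> String {
    let mut out = format!("{:<24} {:>12} {:>8}\n", "name", "mean (us)", "speedup");
    let base = reports.first().map(|r| r.mean.as_secs_f64());
    for r in reports {
        let secs = r.mean.as_secs_f64();
        // Guard against a zero mean to avoid dividing by zero.
        let speedup = match base {
            Some(b) if secs > 0.0 => b / secs,
            _ => f64::NAN,
        };
        out.push_str(&format!(
            "{:<24} {:>12.1} {:>8.2}\n",
            r.name,
            secs * 1e6,
            speedup
        ));
    }
    out
}
```

Returning a String instead of printing directly keeps the formatting testable; a print method can then simply write the result to stdout.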
Trait Implementations
Auto Trait Implementations
impl Freeze for GpuBenchHarness
impl RefUnwindSafe for GpuBenchHarness
impl Send for GpuBenchHarness
impl Sync for GpuBenchHarness
impl Unpin for GpuBenchHarness
impl UnsafeUnpin for GpuBenchHarness
impl UnwindSafe for GpuBenchHarness
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.