pub fn fft_batch<R: Runtime>(
device: &R::Device,
signals: &[Vec<f32>],
) -> Vec<(Vec<f32>, Vec<f32>)>
Computes the Cooley-Tukey radix-2 DIT FFT for a batch of signals in a single GPU pass.
All signals are zero-padded to a common length n: the smallest power of two
that is greater than or equal to the length of the longest signal. The batch
therefore forms a rectangular batch_size × n matrix in GPU memory.
Returns one (real, imag) pair per input signal, each of length n.
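The padding rule above can be illustrated on the CPU. This is a minimal sketch, not the crate's code: `pad_batch` is a hypothetical helper, and the row-major flat layout and the handling of an all-empty batch are assumptions made for illustration.

```rust
/// Hypothetical helper (not part of the crate): pads every signal with zeros
/// to a common power-of-two length `n` and packs the batch into one row-major
/// `batch_size × n` buffer. Returns `(n, flat_buffer)`.
fn pad_batch(signals: &[Vec<f32>]) -> (usize, Vec<f32>) {
    // Length of the longest signal; 0 when the batch is empty.
    let longest = signals.iter().map(Vec::len).max().unwrap_or(0);
    if longest == 0 {
        // Assumption: nothing to transform, so return an empty buffer.
        return (0, Vec::new());
    }

    // Smallest power of two that is >= `longest`.
    let n = longest.next_power_of_two();

    // Zero-initialised buffer, then copy each signal into the start of its row.
    let mut flat = vec![0.0f32; signals.len() * n];
    for (row, signal) in signals.iter().enumerate() {
        let start = row * n;
        flat[start..start + signal.len()].copy_from_slice(signal);
    }
    (n, flat)
}
```

For signals of lengths 3 and 1, this sketch pads both rows to n = 4, so the shorter signal occupies the first element of its row and the remaining three elements are zero.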
§Performance
All batch_size signals are processed simultaneously using a 2-D kernel
dispatch: the Y-dimension of the grid indexes the signal, and the X-dimension
covers the butterfly pairs within a signal. This amortises kernel-launch
overhead over the entire batch.
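The index mapping described above can be emulated on the CPU with two nested loops standing in for the grid. This is only a sketch of a radix-2 DIT FFT under that mapping, not the crate's GPU kernel: `fft_batch_cpu`, its in-place split real/imaginary buffers, and the one-loop-nest-per-stage structure are assumptions made for illustration.

```rust
use std::f32::consts::PI;

/// Hypothetical CPU emulation of the 2-D dispatch: `y` selects the signal and
/// `x` selects one butterfly pair within it. `re` and `im` are row-major
/// `batch_size × n` buffers transformed in place; `n` must be a power of two.
fn fft_batch_cpu(re: &mut [f32], im: &mut [f32], batch_size: usize, n: usize) {
    assert!(n.is_power_of_two());
    assert_eq!(re.len(), batch_size * n);
    assert_eq!(im.len(), batch_size * n);
    let bits = n.trailing_zeros();

    // Decimation in time consumes its input in bit-reversed order.
    for y in 0..batch_size {
        let base = y * n;
        for i in 0..n {
            let j = if bits == 0 { 0 } else { i.reverse_bits() >> (usize::BITS - bits) };
            if i < j {
                re.swap(base + i, base + j);
                im.swap(base + i, base + j);
            }
        }
    }

    // log2(n) stages; each stage plays the role of one kernel dispatch.
    let mut half = 1;
    while half < n {
        for y in 0..batch_size {      // grid Y: which signal
            for x in 0..n / 2 {       // grid X: which butterfly pair
                let k = x % half;     // position inside the butterfly group
                let i = y * n + (x / half) * 2 * half + k;
                let j = i + half;
                // Twiddle factor w = exp(-2*pi*i*k / (2*half)).
                let angle = -PI * k as f32 / half as f32;
                let (wr, wi) = (angle.cos(), angle.sin());
                let tr = wr * re[j] - wi * im[j];
                let ti = wr * im[j] + wi * re[j];
                re[j] = re[i] - tr;
                im[j] = im[i] - ti;
                re[i] += tr;
                im[i] += ti;
            }
        }
        half *= 2;
    }
}
```

Within a stage every (y, x) pair touches a distinct pair of elements, which is why the iterations can run in parallel on a GPU; the sequential loops here only make that mapping visible.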
§Panics
Does not panic. An empty batch returns an empty Vec.
§Example
use cubecl::wgpu::WgpuRuntime;
use gpu_fft::fft::fft_batch;
let signals = vec![vec![1.0f32, 0.0, 0.0, 0.0], vec![0.0, 1.0, 0.0, 0.0]];
let results = fft_batch::<WgpuRuntime>(&Default::default(), &signals);
assert_eq!(results.len(), 2);