pub fn cuda_batch_step<A, S>(
store: &S,
ptx_source: &str,
_module_name: &str,
kernel_name: &str,
block_size: u32,
) -> Result<AccelStepResult, String>
where
A: SoaExtractable,
S: AgentStore<A>,
CUDA batch step over SoA columns.
Uploads agent columns to the GPU on a dedicated cudarc::driver::CudaStream,
launches the named kernel from the provided PTX source, then downloads
results back onto the host and writes them into the agent store.
Supports any number of SoA columns: the launch uses the stream-based
cudarc::driver::CudaStream::launch_builder API and passes
(col_0, col_1, …, col_{k-1}, n) as the kernel arguments, in that order.
Failure surfaces returned as Err(String) include:
- invalid block_size
- CUDA context initialization
- PTX compile/load or kernel lookup
- host/device transfer failures
- kernel launch or stream synchronization failures
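Since every failure arrives as a plain `Err(String)`, callers may want to reject an obviously bad `block_size` before any GPU work starts. The following is a minimal sketch of such a pre-flight check. `validate_block_size` and `MAX_THREADS_PER_BLOCK` are hypothetical names, not part of this crate, and the accepted range (1 to 1024) is an assumption based on the per-block thread limit of current NVIDIA GPUs; the function's actual validation rules are not stated here.

```rust
/// Assumed upper bound: 1024 threads per block on current NVIDIA GPUs.
/// Hypothetical constant, not taken from this crate.
const MAX_THREADS_PER_BLOCK: u32 = 1024;

/// Hypothetical pre-flight check that mirrors the `Err(String)` style of
/// `cuda_batch_step`. The real validation rules may differ.
fn validate_block_size(block_size: u32) -> Result<u32, String> {
    if block_size == 0 {
        return Err("block_size must be greater than zero".to_string());
    }
    if block_size > MAX_THREADS_PER_BLOCK {
        return Err(format!(
            "block_size {} exceeds the assumed limit of {} threads per block",
            block_size, MAX_THREADS_PER_BLOCK
        ));
    }
    Ok(block_size)
}

fn main() {
    // Try one value below, one inside, and one above the assumed range.
    for candidate in [0, 256, 4096] {
        match validate_block_size(candidate) {
            Ok(size) => println!("block_size {} accepted", size),
            Err(msg) => eprintln!("rejected: {}", msg),
        }
    }
}
```

Because the error type is a `String`, the message is the only way to tell failure kinds apart, so it helps to make each one descriptive.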
§Safety requirements for the PTX kernel
- the kernel signature must match the launched argument list
- the kernel must bounds-check against n
- the kernel must not read or write outside the provided column buffers
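The bounds check matters because a launch normally rounds the thread count up to a whole number of blocks, so the last block can contain threads with no agent to process. The sketch below emulates that on the CPU in plain Rust. It is an illustration, not this crate's launch code: `kernel_thread`, `grid_size`, and `emulate_launch` are hypothetical names, the two-column update `xs[i] += vs[i]` is an invented example kernel, and the ceiling-division grid size is an assumption about how the launch is configured.

```rust
/// CPU stand-in for a kernel body, called once per emulated GPU thread.
/// `xs` and `vs` play the role of col_0 and col_1; `n` is the trailing argument.
fn kernel_thread(global_id: usize, xs: &mut [f32], vs: &[f32], n: usize) {
    // Bounds check: threads past the last agent must do nothing.
    if global_id >= n {
        return;
    }
    xs[global_id] += vs[global_id];
}

/// Blocks needed to cover `n` agents (ceiling division).
/// Assumes `block_size` is non-zero.
fn grid_size(n: usize, block_size: usize) -> usize {
    (n + block_size - 1) / block_size
}

/// Runs every thread of every block sequentially.
/// Assumes both columns hold at least `xs.len()` elements.
fn emulate_launch(xs: &mut [f32], vs: &[f32], block_size: usize) {
    let n = xs.len();
    for block in 0..grid_size(n, block_size) {
        for thread in 0..block_size {
            kernel_thread(block * block_size + thread, xs, vs, n);
        }
    }
}

fn main() {
    let mut xs = vec![0.0_f32; 1000];
    let vs = vec![1.5_f32; 1000];
    emulate_launch(&mut xs, &vs, 256);
    // 4 blocks of 256 threads give 1024 thread ids; ids 1000..1024 return early.
    println!("blocks = {}, xs[999] = {}", grid_size(1000, 256), xs[999]);
}
```

Without the early return, the 24 surplus threads in this example would index past the end of the columns. In this emulation that causes a panic; on the GPU it would be an out-of-bounds memory access.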
§Arguments
- store – the agent store to extract from / write back to
- ptx_source – PTX source string (compile your .cu to PTX offline or embed it)
- _module_name – name for the loaded module (unused with cudarc 0.19, kept for source-compatibility with the previous API)
- kernel_name – the __global__ function name inside the PTX
- block_size – CUDA threads per block (e.g. 256)