Function auto_grid_for 

pub fn auto_grid_for(func: &Function, n: usize) -> CudaResult<(Dim3, Dim3)>

Computes occupancy-maximising grid and block dimensions for a 1D problem of n elements.

Queries the CUDA occupancy API to determine the block size that maximises multiprocessor occupancy for the given kernel function, then calculates the grid size needed to cover n work items.

Returns (grid_dim, block_dim) suitable for use with LaunchParams.
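The grid-size step described above is a ceiling division of n by the chosen block size. The sketch below illustrates only that arithmetic. The helper name `blocks_needed` is hypothetical and not part of this crate, and the block size is passed in directly rather than obtained from the occupancy query. How auto_grid_for itself handles n = 0 or a zero block size is not stated here, so the `None` and zero results are assumptions of the sketch.

```rust
/// Hypothetical helper (not part of this crate): the number of blocks of
/// `block_size` threads needed to cover `n` work items in one dimension.
///
/// Returns `None` when `block_size` is zero, because the division is
/// undefined in that case.
pub fn blocks_needed(n: usize, block_size: usize) -> Option<usize> {
    if block_size == 0 {
        return None;
    }
    // Ceiling division written so that it cannot overflow for large `n`:
    // full blocks, plus one partial block if any items are left over.
    let full_blocks = n / block_size;
    let has_remainder = n % block_size != 0;
    Some(full_blocks + usize::from(has_remainder))
}
```

With n = 100_000 and a block size of 256, this gives 391 blocks: 390 full blocks cover 99_840 items, and one more block covers the remaining 160. Because the last block is only partly used, a kernel launched this way typically needs a bounds check on its global index.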

§Errors

Returns a CudaError if the occupancy query fails (e.g., invalid function handle, driver not loaded).

§Examples

use oxicuda_launch::grid::auto_grid_for;

let func = module.get_function("my_kernel")?;
let (grid, block) = auto_grid_for(&func, 100_000)?;