Proc macro for marking GPU kernel functions.

#[warp_kernel] transforms a function into a PTX kernel entry point when compiling for nvptx64. When compiling for the host target it emits nothing, because kernel functions are only compiled for the GPU.

Usage

In your kernel crate (compiled for nvptx64):

use warp_types::prelude::*;
use warp_types_kernel::warp_kernel;

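#[warp_kernel]
pub fn butterfly_reduce(data: *mut i32) {
    // Handle for the full warp, obtained at kernel entry.
    let warp: Warp<All> = Warp::kernel_entry();
    let tid = warp_types::gpu::thread_id_x();
    // Each thread loads its own element; `data` must be valid at index `tid`.
    let mut val = unsafe { *data.add(tid as usize) };

    // Butterfly reduction: each step adds the value held by the lane whose
    // ID differs by the XOR mask, halving the mask from 16 down to 1.
    val += warp.shuffle_xor(PerLane::new(val), 16).get();
    val += warp.shuffle_xor(PerLane::new(val), 8).get();
    val += warp.shuffle_xor(PerLane::new(val), 4).get();
    val += warp.shuffle_xor(PerLane::new(val), 2).get();
    val += warp.shuffle_xor(PerLane::new(val), 1).get();

    // Write the reduced value back to this thread's slot.
    unsafe { *data.add(tid as usize) = val; }
}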

The macro emits:

On nvptx64: #[no_mangle] pub unsafe extern "ptx-kernel" fn butterfly_reduce(...)
On host: nothing (kernel functions are only compiled for GPU)
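
To see what the five shuffle steps in butterfly_reduce compute, the pattern can be modeled on the host in plain Rust. This is a minimal sketch, not part of warp-types: the function names shuffle_xor_model and butterfly_reduce_model are hypothetical, and it assumes a 32-lane warp in which an XOR shuffle lets lane i read the value held by lane i ^ mask.

```rust
/// Hypothetical host-side model of one XOR shuffle over a 32-lane warp:
/// lane `i` receives the value currently held by lane `i ^ mask`.
fn shuffle_xor_model(lanes: &[i32; 32], mask: usize) -> [i32; 32] {
    let mut out = [0i32; 32];
    for lane in 0..32 {
        out[lane] = lanes[lane ^ mask];
    }
    out
}

/// Applies the same five steps as the kernel (masks 16, 8, 4, 2, 1).
/// Each step adds the partner lane's value to every lane's own value.
fn butterfly_reduce_model(mut lanes: [i32; 32]) -> [i32; 32] {
    for mask in [16, 8, 4, 2, 1] {
        // Snapshot the partner values first so all lanes update in lockstep.
        let partner = shuffle_xor_model(&lanes, mask);
        for lane in 0..32 {
            lanes[lane] += partner[lane];
        }
    }
    lanes
}
```

Under these assumptions, every lane ends up holding the sum of all 32 inputs, not just lane 0. For example, passing an array whose lane i holds the value i leaves 496 in every slot.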

warp-types-kernel 0.3.0
