Proc macro for marking GPU kernel functions.

#[warp_kernel] transforms a function into a proper PTX kernel entry point when compiling for nvptx64, and generates a host-side launcher when compiling for the host target.

Usage

In your kernel crate (compiled for nvptx64):

use warp_types::prelude::*;
use warp_types_kernel::warp_kernel;

#[warp_kernel]
pub fn butterfly_reduce(data: *mut i32) {
    let warp: Warp<All> = Warp::kernel_entry();
    let tid = warp_types::gpu::thread_id_x();
    let mut val = unsafe { *data.add(tid as usize) };

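    // Butterfly (XOR) reduction: each step adds the value held by the lane
    // whose index differs in one bit. After all five steps, every lane of a
    // 32-lane warp holds the sum of all 32 inputs.
    val += warp.shuffle_xor(PerLane::new(val), 16).get();
    val += warp.shuffle_xor(PerLane::new(val), 8).get();
    val += warp.shuffle_xor(PerLane::new(val), 4).get();
    val += warp.shuffle_xor(PerLane::new(val), 2).get();
    val += warp.shuffle_xor(PerLane::new(val), 1).get();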

    unsafe { *data.add(tid as usize) = val; }
}
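
To see why five XOR shuffles are enough, the reduction can be modelled on the host with a plain array standing in for the warp. This is only an illustrative sketch, not part of warp-types: WARP_SIZE, shuffle_xor and butterfly_reduce_host are hypothetical names, a 32-lane warp is assumed, and wrapping addition is assumed for overflow.

```rust
/// Assumed warp width (32 lanes); hypothetical constant for this sketch.
const WARP_SIZE: usize = 32;

/// Host-side model of one XOR shuffle step (hypothetical helper):
/// lane `i` receives the value currently held by lane `i ^ mask`.
fn shuffle_xor(lanes: &[i32; WARP_SIZE], mask: usize) -> [i32; WARP_SIZE] {
    let mut out = [0i32; WARP_SIZE];
    for i in 0..WARP_SIZE {
        // For mask < 32 and i < 32, `i ^ mask` is always a valid lane index.
        out[i] = lanes[i ^ mask];
    }
    out
}

/// Simulates the five-step butterfly on an array of per-lane values.
/// Each step doubles the number of inputs folded into every lane's
/// partial sum, so after masks 16, 8, 4, 2, 1 each lane holds the total.
fn butterfly_reduce_host(mut lanes: [i32; WARP_SIZE]) -> [i32; WARP_SIZE] {
    for mask in [16, 8, 4, 2, 1] {
        let partner = shuffle_xor(&lanes, mask);
        for i in 0..WARP_SIZE {
            // Wrapping add is an assumption made for this host model.
            lanes[i] = lanes[i].wrapping_add(partner[i]);
        }
    }
    lanes
}
```

For example, passing an array whose lane i holds the value i returns an array in which every entry is 496 (0 + 1 + ... + 31), which is the result each thread of the kernel would write back to data.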

The macro always emits #[no_mangle] pub unsafe extern "ptx-kernel" fn ... regardless of target. Kernel crates should therefore target nvptx64 exclusively: the extern "ptx-kernel" ABI requires the nightly-only abi_ptx feature and is only meaningful on GPU targets.

warp-types-kernel 0.3.1
