Crate ringkernel_derive


Procedural macros for RingKernel.

This crate provides the following macros:

  • #[derive(RingMessage)] - Implement the RingMessage trait for message types
  • #[derive(PersistentMessage)] - Implement the PersistentMessage trait for GPU kernel dispatch
  • #[derive(GpuType)] - Mark a type as GPU-compatible
  • #[derive(ControlBlockState)] - Implement the EmbeddedState trait
  • #[ring_kernel] - Define a ring kernel handler
  • #[stencil_kernel] - Define a GPU stencil kernel (requires the cuda-codegen feature)
  • #[gpu_kernel] - Define a multi-backend GPU kernel with capability checking

§Example

// MessageId and RingContext are assumed to be in scope; their imports are omitted here.
use ringkernel_derive::{RingMessage, ring_kernel};

#[derive(RingMessage)]
struct AddRequest {
    #[message(id)]
    id: MessageId,
    a: f32,
    b: f32,
}

#[derive(RingMessage)]
struct AddResponse {
    #[message(id)]
    id: MessageId,
    result: f32,
}

#[ring_kernel(id = "adder")]
async fn process(ctx: &mut RingContext, req: AddRequest) -> AddResponse {
    AddResponse {
        id: MessageId::generate(),
        result: req.a + req.b,
    }
}

§Multi-Backend GPU Kernels

The #[gpu_kernel] macro enables multi-backend code generation with capability checking:

use ringkernel_derive::gpu_kernel;

// Generate code for CUDA and Metal; the fallback list names wgpu, then cpu
#[gpu_kernel(backends = [cuda, metal], fallback = [wgpu, cpu])]
fn saxpy(x: &[f32], y: &mut [f32], a: f32, n: i32) {
    let idx = global_thread_id_x();
    if idx < n {
        y[idx as usize] = a * x[idx as usize] + y[idx as usize];
    }
}

// Require specific capabilities at compile time
#[gpu_kernel(backends = [cuda], requires = [f64, atomic64])]
fn double_precision(data: &mut [f64], n: i32) {
    // Uses f64 operations - validated at compile time
}
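
To make the per-thread logic of the saxpy kernel above concrete, here is a minimal CPU-only sketch of the same arithmetic. It does not use #[gpu_kernel] and is not the code the macro generates. The name saxpy_reference is hypothetical, and each loop iteration stands in for one GPU thread whose index would come from global_thread_id_x().

```rust
/// CPU reference for the kernel body: y[i] = a * x[i] + y[i] for every i < n.
/// `saxpy_reference` is a hypothetical helper, not part of ringkernel_derive.
fn saxpy_reference(x: &[f32], y: &mut [f32], a: f32, n: i32) {
    // Clamp n to the slice lengths so the host version cannot index out of
    // bounds; a negative n is treated as zero elements.
    let limit = (n.max(0) as usize).min(x.len()).min(y.len());

    // Each iteration plays the role of one GPU thread with index `idx`.
    for idx in 0..limit {
        y[idx] = a * x[idx] + y[idx];
    }
}
```

A reference like this can be handy for checking a GPU result element by element, since elements at index n and beyond must be left untouched by both versions.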

§Stencil Kernels (with cuda-codegen feature)

use ringkernel_derive::stencil_kernel;
use ringkernel_cuda_codegen::GridPos;

#[stencil_kernel(id = "fdtd", grid = "2d", tile_size = 16, halo = 1)]
fn fdtd(p: &[f32], p_prev: &mut [f32], c2: f32, pos: GridPos) {
    let curr = p[pos.idx()];
    let lap = pos.north(p) + pos.south(p) + pos.east(p) + pos.west(p) - 4.0 * curr;
    p_prev[pos.idx()] = 2.0 * curr - p_prev[pos.idx()] + c2 * lap;
}
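
The fdtd update above can be sketched on the CPU as follows. This is a plain-Rust illustration under stated assumptions, not the CUDA code the macro emits: the grid is assumed to be row-major, north/south are assumed to mean the previous/next row, east/west the next/previous column, and only interior cells are updated (mirroring halo = 1). The function name and the width/height parameters are hypothetical.

```rust
/// One FDTD step over the interior of a row-major `width` x `height` grid.
/// `fdtd_step_reference` is a hypothetical helper for illustration only.
fn fdtd_step_reference(p: &[f32], p_prev: &mut [f32], c2: f32, width: usize, height: usize) {
    assert_eq!(p.len(), width * height);
    assert_eq!(p_prev.len(), width * height);

    // A grid smaller than 3x3 has no interior cells to update.
    if width < 3 || height < 3 {
        return;
    }

    // Skip the one-cell border so every neighbor access stays in bounds.
    for y in 1..height - 1 {
        for x in 1..width - 1 {
            let idx = y * width + x;
            let curr = p[idx];
            // Five-point Laplacian: four neighbors minus 4x the center value.
            let lap = p[idx - width] + p[idx + width] + p[idx + 1] + p[idx - 1] - 4.0 * curr;
            // Same update as the kernel body: the new value overwrites p_prev.
            p_prev[idx] = 2.0 * curr - p_prev[idx] + c2 * lap;
        }
    }
}
```

Writing the result into p_prev follows the kernel shown above, where the previous-step buffer doubles as the output buffer.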

§Attribute Macros

  • gpu_kernel - Attribute macro for defining multi-backend GPU kernels.
  • ring_kernel - Attribute macro for defining ring kernel handlers.
  • stencil_kernel - Attribute macro for defining stencil kernels that transpile to CUDA.

§Derive Macros

  • ControlBlockState - Derive macro for implementing the EmbeddedState trait.
  • GpuType - Derive macro for GPU-compatible types.
  • PersistentMessage - Derive macro for implementing the PersistentMessage trait.
  • RingMessage - Derive macro for implementing the RingMessage trait.