Module dsl

Rust DSL functions for writing CUDA kernels.

This module provides Rust functions that map to CUDA intrinsics during transpilation. These functions have CPU fallback implementations for testing but are transpiled to the corresponding CUDA operations when used in kernel code.

§Thread/Block Index Access

use ringkernel_cuda_codegen::dsl::*;

fn my_kernel(/* kernel parameters */) {
    let tx = thread_idx_x();  // -> threadIdx.x
    let bx = block_idx_x();   // -> blockIdx.x
    let idx = bx * block_dim_x() + tx;  // global thread index along x
}
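The index arithmetic above can be checked on the CPU without the crate. The following is a minimal sketch: `global_index_x` and `launch_indices` are hypothetical helpers, not part of `ringkernel_cuda_codegen`, and the use of `u32` is an assumption, since the return type of the DSL functions is not shown here.

```rust
/// Hypothetical stand-in for the value a kernel computes from
/// block_idx_x(), block_dim_x() and thread_idx_x(). The u32 type is assumed.
fn global_index_x(block_idx: u32, block_dim: u32, thread_idx: u32) -> u32 {
    block_idx * block_dim + thread_idx
}

/// Enumerates the global x index of every thread in a 1D launch of
/// `grid_dim` blocks with `block_dim` threads each, block by block.
fn launch_indices(grid_dim: u32, block_dim: u32) -> Vec<u32> {
    let mut out = Vec::new();
    for bx in 0..grid_dim {
        for tx in 0..block_dim {
            out.push(global_index_x(bx, block_dim, tx));
        }
    }
    out
}
```

For a launch of 2 blocks with 4 threads each, `launch_indices(2, 4)` yields the indices 0 through 7, each exactly once, which is why this formula is the usual way to assign one buffer element per thread.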

§Thread Synchronization

sync_threads();  // -> __syncthreads()
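To illustrate what a block-wide barrier guarantees, here is a CPU analogy built on `std::sync::Barrier`. It is only a sketch of the semantics: `reverse_with_barrier` is a hypothetical function, OS threads stand in for the threads of one block, and nothing here reflects how the crate's CPU fallback for `sync_threads()` is actually implemented.

```rust
use std::sync::{Arc, Barrier, Mutex};
use std::thread;

/// Each simulated thread writes its own slot of a shared buffer, waits at the
/// barrier (the role sync_threads() plays in a kernel), then reads the slot
/// written by a different thread.
fn reverse_with_barrier(input: Vec<i32>) -> Vec<i32> {
    let n = input.len();
    let shared = Arc::new(Mutex::new(vec![0i32; n]));
    let barrier = Arc::new(Barrier::new(n));

    // Spawn every thread before joining any, so all of them can reach the barrier.
    let handles: Vec<_> = (0..n)
        .map(|tx| {
            let shared = Arc::clone(&shared);
            let barrier = Arc::clone(&barrier);
            let value = input[tx];
            thread::spawn(move || {
                shared.lock().unwrap()[tx] = value; // phase 1: write own slot
                barrier.wait(); // all writes are done past this point
                let seen = shared.lock().unwrap()[n - 1 - tx]; // phase 2: read mirror slot
                seen
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Without the barrier, a thread could read its mirror slot before the owning thread has written it. The same hazard applies to shared memory in a kernel, which is the case `sync_threads()` addresses.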

Functions§

block_dim_x
Get the block dimension (x dimension). Transpiles to: blockDim.x
block_dim_y
Get the block dimension (y dimension). Transpiles to: blockDim.y
block_dim_z
Get the block dimension (z dimension). Transpiles to: blockDim.z
block_idx_x
Get the block index within a grid (x dimension). Transpiles to: blockIdx.x
block_idx_y
Get the block index within a grid (y dimension). Transpiles to: blockIdx.y
block_idx_z
Get the block index within a grid (z dimension). Transpiles to: blockIdx.z
grid_dim_x
Get the grid dimension (x dimension). Transpiles to: gridDim.x
grid_dim_y
Get the grid dimension (y dimension). Transpiles to: gridDim.y
grid_dim_z
Get the grid dimension (z dimension). Transpiles to: gridDim.z
sync_threads
Synchronize all threads in a block. Transpiles to: __syncthreads()
thread_fence
Device-wide memory fence: orders the calling thread's memory writes as observed by all threads on the device. Transpiles to: __threadfence()
thread_fence_block
Block-level memory fence: orders the calling thread's memory writes as observed by threads in the same block. Transpiles to: __threadfence_block()
thread_idx_x
Get the thread index within a block (x dimension). Transpiles to: threadIdx.x
thread_idx_y
Get the thread index within a block (y dimension). Transpiles to: threadIdx.y
thread_idx_z
Get the thread index within a block (z dimension). Transpiles to: threadIdx.z
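The x and y variants listed above are typically combined to address a 2D buffer. The sketch below shows that arithmetic on the CPU under stated assumptions: `Coords2D` and `pixel_offset` are hypothetical names, the fields hold the values the corresponding DSL functions would return for one thread, and `u32` is an assumed type.

```rust
/// Hypothetical snapshot of the built-in values seen by one thread,
/// stored as (x, y) pairs.
#[derive(Clone, Copy)]
struct Coords2D {
    thread_idx: (u32, u32), // thread_idx_x(), thread_idx_y()
    block_idx: (u32, u32),  // block_idx_x(), block_idx_y()
    block_dim: (u32, u32),  // block_dim_x(), block_dim_y()
}

/// Maps a thread to a row-major offset in a `width` x `height` buffer.
/// Returns None for threads that fall outside the buffer, which happens
/// when the grid is rounded up to a whole number of blocks.
fn pixel_offset(c: Coords2D, width: u32, height: u32) -> Option<usize> {
    let col = c.block_idx.0 * c.block_dim.0 + c.thread_idx.0;
    let row = c.block_idx.1 * c.block_dim.1 + c.thread_idx.1;
    if col < width && row < height {
        Some(row as usize * width as usize + col as usize)
    } else {
        None
    }
}
```

The bounds check matters because a launch rarely divides the buffer evenly: with 16x16 blocks over a 20x10 buffer, the second block along x contains threads whose column is 20 or more, and those threads must do no work.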