Crate ringkernel


§RingKernel

GPU-native persistent actor model framework for Rust.

RingKernel is a Rust port of DotCompute’s Ring Kernel system, enabling GPU-accelerated actor systems with persistent kernels, lock-free message passing, and hybrid logical clocks for causal ordering.

§Features

  • Persistent GPU-resident state across kernel invocations
  • Lock-free message passing between kernels (K2K messaging)
  • Hybrid Logical Clocks (HLC) for causal ordering of messages
  • Multiple GPU backends: CUDA, Metal, WebGPU
  • Type-safe serialization via rkyv/zerocopy
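To make the hybrid logical clock feature concrete, the sketch below implements the standard HLC update rules using only the standard library. It is an illustration under stated assumptions, not RingKernel's implementation: the names `Stamp`, `Clock`, `tick`, and `update` are hypothetical, and the caller supplies the physical time reading so the logic stays deterministic.

```rust
use std::cmp::max;

/// Hypothetical timestamp: physical time plus a logical counter.
/// Field names are illustrative and are NOT taken from `HlcTimestamp`.
/// The derived ordering compares `physical` first, then `logical`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Stamp {
    pub physical: u64, // wall-clock reading, e.g. in microseconds
    pub logical: u32,  // tie-breaker when physical time does not advance
}

/// Hypothetical per-node (or per-kernel) clock state.
#[derive(Debug)]
pub struct Clock {
    last: Stamp,
}

impl Clock {
    pub fn new() -> Self {
        Clock { last: Stamp { physical: 0, logical: 0 } }
    }

    /// Local or send event: advance to `now` if it moved forward,
    /// otherwise bump the logical counter.
    pub fn tick(&mut self, now: u64) -> Stamp {
        if now > self.last.physical {
            self.last = Stamp { physical: now, logical: 0 };
        } else {
            self.last.logical += 1;
        }
        self.last
    }

    /// Receive event: merge a remote stamp so the result is strictly
    /// greater than both the previous local stamp and the remote one.
    pub fn update(&mut self, remote: Stamp, now: u64) -> Stamp {
        let prev = self.last;
        let physical = max(now, max(prev.physical, remote.physical));
        let logical = if physical == prev.physical && physical == remote.physical {
            max(prev.logical, remote.logical) + 1
        } else if physical == prev.physical {
            prev.logical + 1
        } else if physical == remote.physical {
            remote.logical + 1
        } else {
            0 // physical time alone already orders this event
        };
        self.last = Stamp { physical, logical };
        self.last
    }
}
```

Because every stamp produced by `update` is greater than the stamp carried by the message, a receive is always ordered after the corresponding send, even when the receiver's physical clock lags behind the sender's.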

§Quick Start

use ringkernel::prelude::*;

#[derive(RingMessage)]
struct AddRequest {
    #[message(id)]
    id: MessageId,
    a: f32,
    b: f32,
}

#[tokio::main]
async fn main() -> Result<()> {
    // Create runtime with auto-detected backend
    let runtime = RingKernel::builder()
        .backend(Backend::Auto)
        .build()
        .await?;

    // Launch a kernel
    let kernel = runtime.launch("adder", LaunchOptions::default()).await?;
    kernel.activate().await?;

    // Send a message
    kernel.send(AddRequest {
        id: MessageId::generate(),
        a: 1.0,
        b: 2.0,
    }).await?;

    // Receive response
    let response = kernel.receive().await?;
    println!("Result: {:?}", response);

    kernel.terminate().await?;
    Ok(())
}

§Backends

RingKernel supports multiple backends, including a CPU fallback:

  • CPU - Testing and fallback (always available)
  • CUDA - NVIDIA GPUs (requires cuda feature)
  • Metal - Apple GPUs (requires metal feature, macOS only)
  • WebGPU - Cross-platform via wgpu (requires wgpu feature)

Enable backends via Cargo features:

[dependencies]
ringkernel = { version = "0.1", features = ["cuda", "wgpu"] }
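The Quick Start uses `Backend::Auto`, which implies some fallback order among the backends that were compiled in. The sketch below shows one way such a resolution could work. Everything in it is an assumption for illustration: the `BackendKind` enum, the `resolve_auto` function, and the preference order (CUDA, then Metal, then WebGPU, then CPU) are hypothetical and may not match what RingKernel actually does.

```rust
/// Hypothetical mirror of the backends listed above. Variant names are
/// assumptions and may differ from RingKernel's `Backend` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendKind {
    Cpu,
    Cuda,
    Metal,
    Wgpu,
}

/// Pick a backend from the set that is available at runtime.
///
/// In a real crate, `available` would typically be derived from Cargo
/// features plus a runtime device probe; here the caller passes it in so
/// the selection logic can be read and tested in isolation.
pub fn resolve_auto(available: &[BackendKind]) -> BackendKind {
    // Assumed preference order: native GPU APIs first, portable API next.
    const PREFERENCE: [BackendKind; 3] =
        [BackendKind::Cuda, BackendKind::Metal, BackendKind::Wgpu];

    PREFERENCE
        .iter()
        .copied()
        .find(|candidate| available.contains(candidate))
        // The CPU backend is documented as always available.
        .unwrap_or(BackendKind::Cpu)
}
```

With no GPU backend available, `resolve_auto(&[])` falls back to `BackendKind::Cpu`, which matches the description of the CPU backend as the testing and fallback option.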

§Architecture

┌─────────────────────────────────────────────────────────┐
│                    Host (CPU)                           │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │ Application │──│   Runtime   │──│  Message Bridge │  │
│  └─────────────┘  └─────────────┘  └─────────────────┘  │
└──────────────────────────┬──────────────────────────────┘
                           │ DMA Transfers
┌──────────────────────────┴──────────────────────────────┐
│                   Device (GPU)                          │
│  ┌───────────┐  ┌───────────────┐  ┌───────────────┐    │
│  │ Control   │  │ Input Queue   │  │ Output Queue  │    │
│  │ Block     │  │ (lock-free)   │  │ (lock-free)   │    │
│  └───────────┘  └───────────────┘  └───────────────┘    │
│  ┌─────────────────────────────────────────────────┐    │
│  │          Persistent Kernel (your code)          │    │
│  └─────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────┘
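The input and output queues in the diagram are described as lock-free. As a rough host-side illustration of that idea, the sketch below is a bounded single-producer/single-consumer ring built on atomics from the standard library. It is not RingKernel's queue: the type `SpscRing`, its methods, and the `u64` payload are hypothetical, and a real device queue would also have to deal with GPU memory visibility.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

/// Hypothetical bounded ring for exactly one producer and one consumer.
/// `head` and `tail` count operations monotonically; the slot index is
/// the counter masked by `capacity - 1`.
pub struct SpscRing {
    slots: Vec<AtomicU64>,
    head: AtomicUsize, // next position to read, written only by the consumer
    tail: AtomicUsize, // next position to write, written only by the producer
}

impl SpscRing {
    pub fn with_capacity(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two(), "capacity must be a power of two");
        SpscRing {
            slots: (0..capacity).map(|_| AtomicU64::new(0)).collect(),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side. Returns `false` instead of blocking when full.
    pub fn try_push(&self, value: u64) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        // Acquire pairs with the consumer's Release store to `head`.
        let head = self.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == self.slots.len() {
            return false; // full
        }
        let index = tail & (self.slots.len() - 1);
        self.slots[index].store(value, Ordering::Relaxed);
        // Release publishes the slot write before the new tail is visible.
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }

    /// Consumer side. Returns `None` instead of blocking when empty.
    pub fn try_pop(&self) -> Option<u64> {
        let head = self.head.load(Ordering::Relaxed);
        // Acquire pairs with the producer's Release store to `tail`.
        let tail = self.tail.load(Ordering::Acquire);
        if head == tail {
            return None; // empty
        }
        let index = head & (self.slots.len() - 1);
        let value = self.slots[index].load(Ordering::Relaxed);
        // Release hands the slot back to the producer.
        self.head.store(head.wrapping_add(1), Ordering::Release);
        Some(value)
    }
}
```

Neither side ever takes a lock or waits on the other: a full or empty ring is reported to the caller, who decides whether to retry, yield, or drop the message.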

Re-exports§

pub use ringkernel_codegen as codegen;

Modules§

alerting
Alert routing system for enterprise monitoring.
analytics_context
Analytics context for grouped buffer lifecycle management.
audit
Audit logging for enterprise security and compliance.
auth
Authentication framework for RingKernel.
availability
Check availability of backends at runtime.
checkpoint
Kernel checkpointing for persistent state snapshot and restore.
config
Unified configuration for RingKernel enterprise features.
context
Ring context providing GPU intrinsics facade for kernel handlers.
control
Control block for kernel state management.
dispatcher
Multi-kernel message dispatcher.
domain
Business domain classification for kernel messages.
error
Error types for RingKernel operations.
health
Health monitoring and resilience infrastructure for RingKernel.
hlc
Hybrid Logical Clock (HLC) implementation for causal ordering.
k2k
Kernel-to-Kernel (K2K) direct messaging.
logging
Structured logging with trace correlation.
memory
GPU and host memory management abstractions.
message
Message types and traits for kernel-to-kernel communication.
multi_gpu
Multi-GPU coordination, topology discovery, and cross-GPU messaging.
observability
Observability infrastructure for RingKernel.
persistent_message
Persistent message traits for type-based kernel dispatch.
prelude
Prelude module for convenient imports.
pubsub
Topic-based publish/subscribe messaging.
queue
Lock-free message queue implementation.
rate_limiting
Rate limiting for enterprise workloads.
rbac
Role-Based Access Control (RBAC) for RingKernel.
reduction
Global reduction primitives.
runtime
Runtime traits and types for kernel management.
runtime_context
Unified runtime context for RingKernel enterprise features.
secrets
Secrets management for secure key storage and retrieval.
security
Security features for GPU kernel protection and compliance.
state
Control block state helpers for GPU-compatible kernel state.
telemetry
Telemetry and metrics collection for kernel monitoring.
telemetry_pipeline
Real-time telemetry pipeline for streaming metrics.
tenancy
Multi-tenancy support for RingKernel.
timeout
Operation-level timeouts and deadline management.
types
Core type definitions for GPU thread identification and coordination.
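The thread identification types relate to each other through simple index arithmetic. The sketch below shows the usual one-dimensional case. The function names are hypothetical, the plain integer types stand in for the crate's `BlockId`, `ThreadId`, `WarpId`, and `GlobalThreadId` wrappers, and the warp width of 32 is an assumption that holds on NVIDIA hardware but may differ on other backends.

```rust
/// Assumed warp width. 32 is the value on NVIDIA GPUs; other backends
/// may use a different subgroup size.
pub const WARP_SIZE: u32 = 32;

/// Global index of a thread across all blocks of a one-dimensional grid.
/// `thread_id` must lie in `0..block_size`.
pub fn global_thread_id(block_id: u32, block_size: u32, thread_id: u32) -> u64 {
    assert!(thread_id < block_size, "thread_id must be below block_size");
    // Widen before multiplying so large grids cannot overflow u32.
    u64::from(block_id) * u64::from(block_size) + u64::from(thread_id)
}

/// Index of the warp that contains `thread_id` within its block.
pub fn warp_id(thread_id: u32) -> u32 {
    thread_id / WARP_SIZE
}

/// Position of `thread_id` inside its warp (often called the lane).
pub fn lane_id(thread_id: u32) -> u32 {
    thread_id % WARP_SIZE
}
```

For example, thread 5 of block 2 with 256 threads per block has global index 2 * 256 + 5 = 517, and thread 70 of any block sits in warp 2 at lane 6.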

Macros§

gpu_profile
Macro for scoped GPU profiling.

Structs§

BlockId
Block ID within a grid (0 to grid_size - 1).
ControlBlock
Kernel control block (128 bytes, cache-line aligned).
CpuRuntime
CPU-based implementation of RingKernelRuntime.
GlobalThreadId
Global thread ID across all blocks.
HlcTimestamp
Hybrid Logical Clock timestamp.
KernelHandle
Handle to a launched kernel.
KernelId
Unique kernel identifier.
KernelStatus
Kernel status including state and metrics.
LaunchOptions
Options for launching a kernel.
MemoryPool
Memory pool for efficient allocation/deallocation.
MessageHeader
Fixed-size message header (256 bytes, cache-line aligned).
MessageId
Unique message identifier.
PinnedMemory
Pinned (page-locked) host memory for efficient DMA transfers.
QueueStats
Statistics for a message queue.
RingContext
GPU intrinsics facade for kernel handlers.
RingKernel
Main RingKernel runtime facade.
RingKernelBuilder
Builder for RingKernel runtime.
TelemetryBuffer
Telemetry buffer (64 bytes, cache-line aligned).
ThreadId
Thread ID within a block (0 to block_size - 1).
WarpId
Warp ID within a block.
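Several structs above are documented with a fixed size and cache-line alignment. The sketch below shows how such a layout can be pinned down and checked at compile time in Rust. The struct names, every field, and the chosen alignments are hypothetical stand-ins; only the target sizes (128 and 256 bytes) come from the descriptions of `ControlBlock` and `MessageHeader`.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical control block. `align(128)` makes the compiler pad the
/// 40 bytes of fields up to one full 128-byte unit.
#[repr(C, align(128))]
pub struct ControlBlockSketch {
    pub is_active: u32,
    pub should_terminate: u32,
    pub messages_processed: u64,
    pub input_head: u64,
    pub input_tail: u64,
    pub output_head: u64,
    pub output_tail: u64,
}

/// Hypothetical message header. The explicit reserved bytes bring the
/// struct to 256 bytes, a multiple of the assumed 64-byte cache line.
#[repr(C, align(64))]
pub struct MessageHeaderSketch {
    pub magic: u64,
    pub message_id: u64,
    pub payload_size: u64,
    pub timestamp_physical: u64,
    pub timestamp_logical: u64,
    pub reserved: [u8; 216],
}

// Compile-time layout checks: the build fails if a field change breaks
// the size or alignment that host and device code both rely on.
const _: () = assert!(size_of::<ControlBlockSketch>() == 128);
const _: () = assert!(align_of::<ControlBlockSketch>() == 128);
const _: () = assert!(size_of::<MessageHeaderSketch>() == 256);
const _: () = assert!(align_of::<MessageHeaderSketch>() == 64);
```

`repr(C)` fixes the field order so that code on both sides of the host/device boundary can agree on byte offsets, which the default Rust layout does not guarantee.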

Enums§

Backend
GPU backend type.
Domain
Business domain classification for kernel messages.
KernelState
Kernel lifecycle state.
Priority
Message priority levels.
RingKernelError
Comprehensive error type for RingKernel operations.
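The Quick Start walks a kernel through launch, activation, and termination, which suggests a small lifecycle state machine. The sketch below models only those three steps. The `Lifecycle` and `Command` enums and the `step` function are hypothetical; the actual `KernelState` enum may define more states and different transition rules.

```rust
/// Hypothetical lifecycle states, limited to what the Quick Start shows.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Launched,   // resources allocated, not yet processing messages
    Active,     // processing messages
    Terminated, // shut down, no further transitions allowed
}

/// Hypothetical commands mirroring `activate()` and `terminate()`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    Activate,
    Terminate,
}

/// Apply a command to a state, rejecting transitions that make no sense
/// (for example, activating a kernel that has already terminated).
pub fn step(state: Lifecycle, command: Command) -> Result<Lifecycle, String> {
    match (state, command) {
        (Lifecycle::Launched, Command::Activate) => Ok(Lifecycle::Active),
        (Lifecycle::Launched | Lifecycle::Active, Command::Terminate) => {
            Ok(Lifecycle::Terminated)
        }
        (s, c) => Err(format!("cannot apply {c:?} in state {s:?}")),
    }
}
```

Encoding the transitions in one `match` keeps the legal paths visible in a single place and turns every other combination into an explicit error rather than a silent no-op.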

Traits§

DeviceMemory
Trait for device memory allocation.
DomainMessage
Trait for messages that belong to a specific business domain.
GpuBuffer
Trait for GPU buffer operations.
MessageQueue
Trait for message queue implementations.
RingKernelRuntime
Backend-agnostic runtime trait for kernel management.
RingMessage
Trait for types that can be sent as kernel messages.

Functions§

registered_kernels
Get list of registered kernels from the inventory.

Type Aliases§

Result
Result type alias for RingKernel operations.

Attribute Macros§

gpu_kernel
Attribute macro for defining multi-backend GPU kernels.
ring_kernel
Attribute macro for defining ring kernel handlers.
stencil_kernel
Attribute macro for defining stencil kernels that transpile to CUDA.

Derive Macros§

ControlBlockState
Derive macro for implementing the EmbeddedState trait.
GpuType
Derive macro for GPU-compatible types.
PersistentMessage
Derive macro for implementing the PersistentMessage trait.
RingMessage
Derive macro for implementing the RingMessage trait.