Crate ferrum_kernels

Ferrum unified compute kernels for high-performance inference.

Provides the Backend trait and implementations for CUDA, Metal, and CPU. On CUDA builds, kernels are compiled to PTX during cargo build and loaded on demand at runtime.
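As a rough illustration of how a build-time backend choice can be surfaced at runtime, the sketch below uses Cargo feature flags. The `metal` feature name appears elsewhere on this page; the `cuda` feature name and the `active_backend` helper are assumptions made for the example and are not taken from this crate.

```rust
/// Hypothetical helper (not part of ferrum_kernels): reports which backend
/// this build was compiled for, based on Cargo feature flags.
/// Assumption: the CUDA build is selected by a feature named "cuda".
pub fn active_backend() -> &'static str {
    if cfg!(feature = "cuda") {
        "cuda"
    } else if cfg!(feature = "metal") {
        "metal"
    } else {
        // No GPU feature enabled: fall back to the CPU implementation.
        "cpu"
    }
}
```

`cfg!` is evaluated at compile time, so a build without either feature always takes the CPU branch.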

Re-exports

pub use linear::Linear;

Modules

backend: Unified Backend trait for CUDA, Metal, and CPU compute.
linear: Linear<B> trait, a weight-bearing projection abstraction.
moe_host: Backend-agnostic mixture-of-experts (MoE) host-side helpers, used by every backend and compiled in every build (not gated behind cfg(feature = "metal")).
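To make the relationship between the modules above more concrete, here is a minimal sketch of a backend trait and a projection trait generic over it. It shows the general shape of such a design only: every name, signature, and type below (`Buffer`, `from_host`, `to_host`, `matmul_nt`, `CpuBackend`, `DenseLinear`, `forward`) is hypothetical, and the real `Backend` and `Linear<B>` definitions in this crate may look quite different.

```rust
// Hypothetical sketch: none of these signatures are taken from ferrum_kernels.

/// Stand-in for a unified compute backend trait.
pub trait Backend {
    /// Backend-owned buffer type (a plain Vec<f32> for the CPU stand-in).
    type Buffer;

    /// Copy host data into a backend buffer.
    fn from_host(data: &[f32]) -> Self::Buffer;
    /// Copy a backend buffer back to the host.
    fn to_host(buf: &Self::Buffer) -> Vec<f32>;
    /// y = x * W^T, with x of shape [m, k] and W of shape [n, k], row-major.
    fn matmul_nt(x: &Self::Buffer, w: &Self::Buffer, m: usize, k: usize, n: usize) -> Self::Buffer;
}

/// Naive CPU implementation of the stand-in trait.
pub struct CpuBackend;

impl Backend for CpuBackend {
    type Buffer = Vec<f32>;

    fn from_host(data: &[f32]) -> Vec<f32> {
        data.to_vec()
    }

    fn to_host(buf: &Vec<f32>) -> Vec<f32> {
        buf.clone()
    }

    fn matmul_nt(x: &Vec<f32>, w: &Vec<f32>, m: usize, k: usize, n: usize) -> Vec<f32> {
        assert_eq!(x.len(), m * k);
        assert_eq!(w.len(), n * k);
        let mut y = vec![0.0f32; m * n];
        for i in 0..m {
            for j in 0..n {
                // Dot product of row i of x with row j of W.
                let mut acc = 0.0f32;
                for p in 0..k {
                    acc += x[i * k + p] * w[j * k + p];
                }
                y[i * n + j] = acc;
            }
        }
        y
    }
}

/// Stand-in for a weight-bearing projection, generic over the backend.
pub trait Linear<B: Backend> {
    fn in_features(&self) -> usize;
    fn out_features(&self) -> usize;
    /// Apply the projection to `rows` input rows.
    fn forward(&self, x: &B::Buffer, rows: usize) -> B::Buffer;
}

/// Dense projection that stores its weight in a backend buffer.
pub struct DenseLinear<B: Backend> {
    weight: B::Buffer,
    in_features: usize,
    out_features: usize,
}

impl<B: Backend> DenseLinear<B> {
    /// `weight` is row-major with shape [out_features, in_features].
    pub fn new(weight: &[f32], in_features: usize, out_features: usize) -> Self {
        assert_eq!(weight.len(), in_features * out_features);
        Self { weight: B::from_host(weight), in_features, out_features }
    }
}

impl<B: Backend> Linear<B> for DenseLinear<B> {
    fn in_features(&self) -> usize {
        self.in_features
    }

    fn out_features(&self) -> usize {
        self.out_features
    }

    fn forward(&self, x: &B::Buffer, rows: usize) -> B::Buffer {
        B::matmul_nt(x, &self.weight, rows, self.in_features, self.out_features)
    }
}
```

In this sketch the layer never touches raw device memory itself. It holds a `B::Buffer` and delegates the matrix product to the backend, so the same `DenseLinear<B>` code could in principle run on a GPU backend by swapping the type parameter.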