Crate ferrum_kernels

Ferrum unified compute kernels for high-performance inference.

Provides the Backend trait and implementations for CUDA, Metal, and CPU. On CUDA builds, kernels are compiled to PTX during cargo build and loaded on demand at runtime.
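The "loaded on demand" behaviour can be pictured as a cache that registers kernel sources up front and only loads a module the first time it is requested. The following is a minimal sketch of that pattern, not the crate's actual loader: `KernelCache`, `LoadedModule`, and `ADD_PTX` are hypothetical names, and the "load" step is simulated in plain Rust instead of calling a CUDA driver.

```rust
use std::collections::HashMap;

// Hypothetical placeholder for PTX text that a build step would produce.
pub const ADD_PTX: &str = "// placeholder PTX for a hypothetical `add` kernel";

/// Hypothetical handle to a loaded module. A real handle would wrap a driver object.
#[derive(Debug, Clone, PartialEq)]
pub struct LoadedModule {
    pub name: &'static str,
    pub ptx_len: usize,
}

/// Hypothetical cache: sources are registered eagerly, modules are loaded lazily.
pub struct KernelCache {
    sources: HashMap<&'static str, &'static str>,
    loaded: HashMap<&'static str, LoadedModule>,
    loads: usize,
}

impl KernelCache {
    pub fn new(sources: &[(&'static str, &'static str)]) -> Self {
        KernelCache {
            sources: sources.iter().copied().collect(),
            loaded: HashMap::new(),
            loads: 0,
        }
    }

    /// Returns the module for `name`, loading it on first use only.
    pub fn get(&mut self, name: &str) -> Option<&LoadedModule> {
        // Copy out the 'static key and source so the borrow of `sources` ends here.
        let (&key, &ptx) = self.sources.get_key_value(name)?;
        if !self.loaded.contains_key(key) {
            // Simulated load; a CUDA backend would hand `ptx` to the driver here.
            self.loads += 1;
            self.loaded.insert(key, LoadedModule { name: key, ptx_len: ptx.len() });
        }
        self.loaded.get(key)
    }

    /// Number of simulated loads performed so far.
    pub fn loads(&self) -> usize {
        self.loads
    }
}
```

With this shape, requesting the same kernel twice performs one load, and a kernel that is never requested costs nothing at runtime.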

Re-exports

pub use linear::Linear;

Modules

backend
Unified Backend trait for CUDA, Metal, and CPU compute.
linear
Linear<B> trait: a weight-bearing projection abstraction.
moe_host
Backend-agnostic host-side helpers for mixture-of-experts (MoE) models. Used by every backend and compiled in all builds (not gated behind cfg(feature = "metal")).
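
To illustrate how a backend trait and a Linear<B> abstraction generic over it might fit together, here is a simplified sketch. It is not the crate's real API: the method names (`from_slice`, `to_vec`, `matmul_nt`), the `Cpu` and `DenseLinear` types, and the row-major weight layout are all assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for a unified backend trait.
pub trait Backend {
    type Buffer;
    fn from_slice(data: &[f32]) -> Self::Buffer;
    fn to_vec(buf: &Self::Buffer) -> Vec<f32>;
    /// Computes out[i, j] = sum over p of x[i, p] * w[j, p], with x as (m, k) and w as (n, k).
    fn matmul_nt(x: &Self::Buffer, w: &Self::Buffer, m: usize, n: usize, k: usize) -> Self::Buffer;
}

/// Hypothetical CPU backend that stores tensors as plain row-major vectors.
pub struct Cpu;

impl Backend for Cpu {
    type Buffer = Vec<f32>;

    fn from_slice(data: &[f32]) -> Vec<f32> {
        data.to_vec()
    }

    fn to_vec(buf: &Vec<f32>) -> Vec<f32> {
        buf.clone()
    }

    fn matmul_nt(x: &Vec<f32>, w: &Vec<f32>, m: usize, n: usize, k: usize) -> Vec<f32> {
        let mut out = vec![0.0f32; m * n];
        for i in 0..m {
            for j in 0..n {
                let mut acc = 0.0f32;
                for p in 0..k {
                    acc += x[i * k + p] * w[j * k + p];
                }
                out[i * n + j] = acc;
            }
        }
        out
    }
}

/// Hypothetical projection trait, generic over the backend that holds the weights.
pub trait Linear<B: Backend> {
    fn in_features(&self) -> usize;
    fn out_features(&self) -> usize;
    /// Projects `rows` input rows of width `in_features` to width `out_features`.
    fn forward(&self, x: &B::Buffer, rows: usize) -> B::Buffer;
}

/// Hypothetical dense implementation with an (out_features, in_features) weight matrix.
pub struct DenseLinear<B: Backend> {
    weight: B::Buffer,
    in_f: usize,
    out_f: usize,
}

impl<B: Backend> DenseLinear<B> {
    pub fn new(weight: &[f32], in_f: usize, out_f: usize) -> Self {
        assert_eq!(weight.len(), in_f * out_f, "weight must hold out_f * in_f values");
        DenseLinear { weight: B::from_slice(weight), in_f, out_f }
    }
}

impl<B: Backend> Linear<B> for DenseLinear<B> {
    fn in_features(&self) -> usize {
        self.in_f
    }

    fn out_features(&self) -> usize {
        self.out_f
    }

    fn forward(&self, x: &B::Buffer, rows: usize) -> B::Buffer {
        B::matmul_nt(x, &self.weight, rows, self.out_f, self.in_f)
    }
}
```

Because model code in this sketch depends only on `Linear<B>`, the same layer definition could run on any type that implements `Backend`; only the buffer type and the kernel behind `matmul_nt` would change.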