moeflux 0.1.0-pre.3

Pure-Rust streaming-experts MoE inference on Metal. Forked from flash-moe; only the Metal kernels remain from upstream.
pub mod full_attn_forward;
pub mod gpu_attn;
pub mod gpu_linear_attn;
pub mod gpu_mla;
pub mod gpu_rope;
pub mod linear_attn;
pub mod linear_attn_forward;
pub mod mla_attn_cpu;
pub mod mla_attn_forward;
pub mod rms_norm;
pub mod rope;
pub mod sdpa;