realizar 0.8.5

Pure Rust ML inference engine built from scratch - model serving for GGUF and safetensors
//! Transformer layer operations: SwiGLU FFN, full transformer layer, batched processing
//!
//! This module implements:
//! - PAR-023: GPU-Resident SwiGLU FFN
//! - PAR-044: GPU-Resident Transformer Layer
//! - PAR-111: Batched Transformer Layer for multi-sequence processing
//! - PAR-062: CUDA Graph-captured decode
//! - Full forward pass with all layers

#![allow(clippy::wildcard_imports)] // Internal module organization uses super::*

use super::*;

include!("layer_cuda_executor.rs");
include!("layer_tests_ffn_swiglu.rs");