//! Transformer layer building blocks.
//!
//! # Modules
//!
//! | Module | Contents |
//! |--------|----------|
//! | [`attention`] | Multi-head and grouped-query attention (MHA/GQA) with a KV cache |
//! | [`embedding`] | Token embedding, learned positional embedding, and rotary position embedding (RoPE) |
//! | [`ffn`] | MLP (GELU) and SwiGLU feed-forward networks |
//! | [`norm`] | RMSNorm and LayerNorm |
//! | [`transformer`] | GPT-2 and LLaMA transformer blocks, plus `PastKvCache` |
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;