//! Burn model building blocks for local LLM inference.
//!
//! Shared components used by both Qwen3.5 and Nemotron model implementations.
//! Uses Burn's built-in `RmsNorm` and `RotaryEncoding` directly; only the
//! attention (grouped-query attention, GQA, with QK-norm) and feed-forward
//! (SwiGLU) layers are custom.
use *;
/// Trait for models that can produce logits from input token IDs.
///
/// Implemented by both `Qwen3TextModel` and (soon) `NemotronModel`,
/// allowing `greedy_generate` to work with any architecture.