//! Pre-trained model architectures for inference.
//!
//! This module provides ready-to-use model implementations that combine
//! the primitives from `nn` into complete architectures.
//!
//! # Available Models
//!
//! - `Qwen2Model` - Qwen2-0.5B-Instruct decoder-only transformer
//!
//! # Design Philosophy
//!
//! Models follow the "assembly pattern" - they compose existing primitives
//! (attention, normalization, feedforward) rather than duplicating code.
//!
//! ```text
//! ┌─────────────────────────────────────────────────────────────────┐
//! │                       Model Architecture                        │
//! ├─────────────────────────────────────────────────────────────────┤
//! │                                                                 │
//! │  ┌─────────────┐    ┌─────────────────────┐    ┌────────────┐   │
//! │  │  Embedding  │ -> │  N × DecoderLayer   │ -> │  LM Head   │   │
//! │  │  (vocab→d)  │    │ (GQA + FFN + Norm)  │    │ (d→vocab)  │   │
//! │  └─────────────┘    └─────────────────────┘    └────────────┘   │
//! │                                                                 │
//! └─────────────────────────────────────────────────────────────────┘
//! ```
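//!
//! The pipeline in the diagram can be sketched as plain composition. This is
//! a minimal illustration only: `Embedding`, `DecoderLayer`, `LmHead`, and
//! `ToyModel` below are hypothetical stand-ins, not the types in `nn` or the
//! actual `Qwen2Model`, and the layer body is a placeholder for the real
//! attention, feedforward, and normalization steps.
//!
//! ```rust
//! // Hypothetical embedding table (vocab -> d): one row per token id.
//! struct Embedding {
//!     table: Vec<Vec<f32>>,
//! }
//!
//! impl Embedding {
//!     fn forward(&self, token: usize) -> Vec<f32> {
//!         self.table[token].clone()
//!     }
//! }
//!
//! // Hypothetical decoder layer. A real layer would own GQA, FFN, and norm
//! // primitives; here a scaled residual update stands in for all of them.
//! struct DecoderLayer {
//!     scale: f32,
//! }
//!
//! impl DecoderLayer {
//!     fn forward(&self, x: &[f32]) -> Vec<f32> {
//!         x.iter().map(|v| v + self.scale * v).collect()
//!     }
//! }
//!
//! // Hypothetical LM head (d -> vocab): one weight row per vocabulary entry.
//! struct LmHead {
//!     weight: Vec<Vec<f32>>,
//! }
//!
//! impl LmHead {
//!     fn forward(&self, x: &[f32]) -> Vec<f32> {
//!         self.weight
//!             .iter()
//!             .map(|row| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>())
//!             .collect()
//!     }
//! }
//!
//! // The model owns no math of its own; it only wires the parts together.
//! struct ToyModel {
//!     embed: Embedding,
//!     layers: Vec<DecoderLayer>,
//!     lm_head: LmHead,
//! }
//!
//! impl ToyModel {
//!     // Returns one logit per vocabulary entry for a single token id.
//!     fn forward(&self, token: usize) -> Vec<f32> {
//!         let mut h = self.embed.forward(token);
//!         for layer in &self.layers {
//!             h = layer.forward(&h);
//!         }
//!         self.lm_head.forward(&h)
//!     }
//! }
//! ```
//!
//! Because `ToyModel::forward` only sequences calls, swapping in a different
//! layer implementation would not require touching the model itself.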
//!
//! # References
//!
//! - Bai et al. (2023). "Qwen Technical Report"
//! - Vaswani et al. (2017). "Attention Is All You Need"
pub use ;
pub use Qwen2Model;