Module common

Common model components shared across text model architectures.

Re-exports

pub use chatml_history::*;

Modules

chatml_history
text_model

Structs

Cache
Abstraction over cosine and sine tables, KV caching, and attention masking.
CausalSelfAttention
Config
Generalized LLM configuration shared by all decoder-only text models.
LinearAttnConfig
Configuration for linear (recurrent) attention layers (e.g. Gated DeltaNet in Qwen3.5).
MLP
Multi-layer perceptron (MLP) implementation with fused gate+up projection.
RopeScaling
RoPE scaling configuration for models with extended context (e.g. LLaMA 3.1+).
Transformer
Transformer block with causal self-attention and several caching strategies.
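The fused gate+up projection mentioned for MLP can be illustrated as follows. This is a minimal sketch with plain Vecs rather than the crate's tensor types, and the function and parameter names (`fused_mlp`, `w_gate_up`, `w_down`) are illustrative assumptions: one matmul produces both gate and up activations, which are split, combined with SiLU (the SwiGLU pattern), and down-projected.

```rust
// Minimal sketch of a fused gate+up MLP (SwiGLU), using plain Vecs.
// All names and shapes are illustrative, not the module's actual API.
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

fn matvec(w: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    w.iter()
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum())
        .collect()
}

fn fused_mlp(x: &[f32], w_gate_up: &[Vec<f32>], w_down: &[Vec<f32>]) -> Vec<f32> {
    let h = matvec(w_gate_up, x);             // one fused projection: [2 * inner]
    let (gate, up) = h.split_at(h.len() / 2); // split into gate and up halves
    let act: Vec<f32> = gate
        .iter()
        .zip(up)
        .map(|(g, u)| silu(*g) * u)           // SwiGLU: silu(gate) * up
        .collect();
    matvec(w_down, &act)                      // down projection back to model dim
}

fn main() {
    let x = vec![1.0, 2.0];
    // w_gate_up stacks the gate rows on top of the up rows: [2 * inner, dim]
    let w_gate_up = vec![
        vec![1.0, 0.0], // gate row 0
        vec![0.0, 1.0], // gate row 1
        vec![1.0, 1.0], // up row 0
        vec![0.5, 0.5], // up row 1
    ];
    let w_down = vec![vec![1.0, 1.0]];
    let y = fused_mlp(&x, &w_gate_up, &w_down);
    assert_eq!(y.len(), 1);
    println!("{:?}", y);
}
```

Fusing the gate and up weights into one matrix turns two matmuls into one, which is why the stacked `[2 * inner, dim]` layout is common in inference code.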

Enums

EosTokenId
EOS token ID(s); deserializes from either a single u32 or an array of u32s.
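An enum that accepts either a single ID or an array typically normalizes to a list before use. This sketch shows the two-variant shape such a type would have, plus a hypothetical `as_vec` helper (not part of this module's documented API); in practice the dual deserialization would be handled by serde's `#[serde(untagged)]` attribute.

```rust
// Sketch of the two shapes EosTokenId can take, with a hypothetical
// normalization helper. The real type derives its deserialization
// (e.g. via serde untagged); this sketch only models the enum shape.
#[derive(Debug, PartialEq)]
enum EosTokenId {
    Single(u32),
    Multiple(Vec<u32>),
}

impl EosTokenId {
    /// Normalize either form into a flat list of EOS token IDs.
    fn as_vec(&self) -> Vec<u32> {
        match self {
            EosTokenId::Single(id) => vec![*id],
            EosTokenId::Multiple(ids) => ids.clone(),
        }
    }
}

fn main() {
    // "eos_token_id": 2        -> Single(2)
    // "eos_token_id": [2, 32000] -> Multiple([2, 32000])
    assert_eq!(EosTokenId::Single(2).as_vec(), vec![2]);
    assert_eq!(EosTokenId::Multiple(vec![2, 32000]).as_vec(), vec![2, 32000]);
    println!("ok");
}
```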

Functions

detect_text_model_arch
Auto-detect text model architecture from config.json’s “architectures” field.
load_rms_norm
Load an RMS norm, optionally applying the residual weight pattern (1 + weight). When residual is true (Qwen3.5), the stored weight is treated as a residual and 1.0 is added.