Crate llama_models

llama-models

Foundational model blocks for llama.rs:

RMSNorm
RoPE
Attention (scaled dot-product, single-step decode form)
MLP (SwiGLU)
Safetensors weight loading
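
As a rough illustration of two of the blocks listed above, here is a minimal sketch of RMSNorm and a SwiGLU MLP on plain `f32` slices. It is not this crate's implementation: the names `rms_norm_sketch`, `swiglu_sketch`, `silu`, and `matvec`, the `eps` parameter, and the row-major weight layout are all assumptions made for the example.

```rust
/// Hypothetical helper (not the crate's `rms_norm`):
/// y_i = x_i / sqrt(mean(x^2) + eps) * weight_i
fn rms_norm_sketch(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(x.len(), weight.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect()
}

/// SiLU activation: x * sigmoid(x).
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// Row-major matrix-vector product; `w` holds `rows * x.len()` entries.
fn matvec(w: &[f32], x: &[f32], rows: usize) -> Vec<f32> {
    let cols = x.len();
    assert_eq!(w.len(), rows * cols);
    (0..rows)
        .map(|r| {
            w[r * cols..(r + 1) * cols]
                .iter()
                .zip(x)
                .map(|(a, b)| a * b)
                .sum::<f32>()
        })
        .collect()
}

/// Hypothetical helper (not the crate's `mlp_swiglu`):
/// down( silu(gate(x)) * up(x) ), with `hidden` as the inner width.
fn swiglu_sketch(
    x: &[f32],
    w_gate: &[f32],
    w_up: &[f32],
    w_down: &[f32],
    hidden: usize,
) -> Vec<f32> {
    let gate = matvec(w_gate, x, hidden);
    let up = matvec(w_up, x, hidden);
    // Elementwise gating: the SiLU-activated gate scales the up projection.
    let act: Vec<f32> = gate.iter().zip(&up).map(|(g, u)| silu(*g) * u).collect();
    matvec(w_down, &act, x.len())
}
```

The sketch handles a single token vector and skips bias terms, which Llama-style blocks typically omit. The crate's own `rms_norm` and `mlp_swiglu` may use different signatures and tensor layouts.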

Structs

LlamaBlock: Minimal Llama block composition.
ModelWeights: Named weight storage loaded from safetensors.
QwenBlock: Minimal Qwen block composition (same block primitives at this stage).
Tensor: Lightweight tensor holder for loaded model weights.

Enums

ModelError: Errors for model operations and weight loading.

Functions

apply_rope: Apply rotary positional embeddings in-place to query and key vectors.
attention_decode: Single-step decode attention (scaled dot-product).
mlp_swiglu: SwiGLU MLP.
rms_norm: Root mean square normalization.
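
To make the rotary embedding and single-step decode attention entries more concrete, the following sketch shows one common formulation of each for a single head. The function names `apply_rope_sketch`, `rotate_pair`, and `attention_decode_sketch`, the interleaved-pair RoPE convention, the `theta` base, and the `Vec<f32>`-per-position cache layout are assumptions for illustration. The signatures and conventions of the crate's `apply_rope` and `attention_decode` are not shown on this page and may differ.

```rust
/// Rotate the pair (v[2i], v[2i+1]) by the angle whose sine and cosine are given.
fn rotate_pair(v: &mut [f32], i: usize, sin: f32, cos: f32) {
    let (a, b) = (v[2 * i], v[2 * i + 1]);
    v[2 * i] = a * cos - b * sin;
    v[2 * i + 1] = a * sin + b * cos;
}

/// Hypothetical in-place RoPE for one head, assuming interleaved pairs and
/// per-pair frequencies theta^(-2i/dim).
fn apply_rope_sketch(q: &mut [f32], k: &mut [f32], position: usize, theta: f32) {
    assert_eq!(q.len(), k.len());
    assert!(q.len() % 2 == 0);
    let dim = q.len();
    for i in 0..dim / 2 {
        let freq = theta.powf(-2.0 * i as f32 / dim as f32);
        let (sin, cos) = (position as f32 * freq).sin_cos();
        rotate_pair(q, i, sin, cos);
        rotate_pair(k, i, sin, cos);
    }
}

/// Hypothetical single-step decode attention: one query vector attends over
/// all cached keys and returns the softmax-weighted sum of cached values.
fn attention_decode_sketch(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>]) -> Vec<f32> {
    assert_eq!(keys.len(), values.len());
    assert!(!keys.is_empty());
    let scale = 1.0 / (q.len() as f32).sqrt();
    // Scaled dot product of the query with every cached key.
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| k.iter().zip(q).map(|(a, b)| a * b).sum::<f32>() * scale)
        .collect();
    // Subtract the maximum score before exponentiating for numerical stability.
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let denom: f32 = exps.iter().sum();
    let mut out = vec![0.0; values[0].len()];
    for (w, v) in exps.iter().zip(values) {
        for (o, x) in out.iter_mut().zip(v) {
            *o += (w / denom) * x;
        }
    }
    out
}
```

In the decode form no causal mask is needed, because the cache only holds positions up to and including the current token. Implementations also differ on whether RoPE pairs adjacent elements, as here, or splits the vector into two halves, so the pairing should be checked against the layout of the checkpoint being loaded.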