A safe wrapper around llama_context_params.
Use LlamaContextParams to configure context size, batching, KV layout,
RoPE / YaRN scaling, flash attention, per-sequence samplers, and pairing
with another context (ctx_other).
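The builder style described above can be sketched with a self-contained stand-in. This is only an illustration of the pattern, not this crate's API: `SketchContextParams`, `FlashAttnPolicy`, the `with_*` method names, and every default value below are hypothetical placeholders chosen for the example.

```rust
use std::num::NonZeroU32;

/// Hypothetical stand-in for a flash-attention policy enum (not the crate's type).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FlashAttnPolicy {
    Auto,
    Disabled,
    Enabled,
}

/// Hypothetical stand-in for a context-parameters builder (not the crate's type).
#[derive(Debug, Clone, PartialEq)]
struct SketchContextParams {
    /// Context size in tokens; `None` would mean "defer to the model".
    n_ctx: Option<NonZeroU32>,
    /// Maximum number of tokens submitted per batch.
    n_batch: u32,
    /// Flash-attention enablement policy.
    flash_attn: FlashAttnPolicy,
}

impl Default for SketchContextParams {
    fn default() -> Self {
        // Placeholder defaults, picked only for this sketch.
        Self {
            n_ctx: NonZeroU32::new(512),
            n_batch: 2048,
            flash_attn: FlashAttnPolicy::Auto,
        }
    }
}

#[allow(dead_code)]
impl SketchContextParams {
    /// Each setter consumes `self` and returns it, so calls can be chained.
    fn with_n_ctx(mut self, n_ctx: Option<NonZeroU32>) -> Self {
        self.n_ctx = n_ctx;
        self
    }

    fn with_n_batch(mut self, n_batch: u32) -> Self {
        self.n_batch = n_batch;
        self
    }

    fn with_flash_attn(mut self, policy: FlashAttnPolicy) -> Self {
        self.flash_attn = policy;
        self
    }
}

/// Start from the defaults and override only the fields that matter.
#[allow(dead_code)]
fn build_sketch_params() -> SketchContextParams {
    SketchContextParams::default()
        .with_n_ctx(NonZeroU32::new(4096))
        .with_n_batch(512)
        .with_flash_attn(FlashAttnPolicy::Enabled)
}
```

Consuming setters keep configuration in a single expression and leave untouched fields at their defaults. Check the LlamaContextParams item page for the actual method names and signatures.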
Structs
- LlamaContextParams - Builder for llama_context_params.
Enums
- LlamaAttentionType - Attention mask mode used when the context runs in embedding mode.
- LlamaContextType - A rusty wrapper around llama_context_type.
- LlamaFlashAttnType - Flash-attention enablement policy for the context.
- LlamaPoolingType - A rusty wrapper around LLAMA_POOLING_TYPE.
- ParamsCloneError - Error returned when LlamaContextParams::try_clone cannot duplicate state.
- RopeScalingType - A rusty wrapper around rope_scaling_type.