Module params

A safe wrapper around llama_context_params.

Use LlamaContextParams to configure context size, batching, KV layout, RoPE / YaRN scaling, flash attention, per-sequence samplers, and pairing with another context (ctx_other).
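As a rough illustration of the chained-builder style this kind of type typically follows, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `SketchContextParams`, `SketchFlashAttn`, the `with_*` method names, and the default values are assumptions made for illustration and are not taken from this crate. Check the `LlamaContextParams` page for the real method names and signatures.

```rust
use std::num::NonZeroU32;

/// Hypothetical stand-in for a flash-attention policy enum (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchFlashAttn {
    Auto,
    Disabled,
    Enabled,
}

/// Hypothetical stand-in for a context-params builder (not the crate's type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SketchContextParams {
    n_ctx: Option<NonZeroU32>,
    n_batch: u32,
    flash_attn: SketchFlashAttn,
}

impl Default for SketchContextParams {
    fn default() -> Self {
        // Arbitrary illustrative defaults, not the library's defaults.
        Self {
            n_ctx: NonZeroU32::new(512),
            n_batch: 512,
            flash_attn: SketchFlashAttn::Auto,
        }
    }
}

impl SketchContextParams {
    /// Sets the context size; `None` stands for "let the backend decide".
    #[must_use]
    pub fn with_n_ctx(mut self, n_ctx: Option<NonZeroU32>) -> Self {
        self.n_ctx = n_ctx;
        self
    }

    /// Sets the maximum batch size.
    #[must_use]
    pub fn with_n_batch(mut self, n_batch: u32) -> Self {
        self.n_batch = n_batch;
        self
    }

    /// Sets the flash-attention policy.
    #[must_use]
    pub fn with_flash_attn(mut self, policy: SketchFlashAttn) -> Self {
        self.flash_attn = policy;
        self
    }

    pub fn n_ctx(&self) -> Option<NonZeroU32> {
        self.n_ctx
    }

    pub fn n_batch(&self) -> u32 {
        self.n_batch
    }

    pub fn flash_attn(&self) -> SketchFlashAttn {
        self.flash_attn
    }
}

/// Builds a configuration by chaining setters on the defaults.
pub fn example_params() -> SketchContextParams {
    SketchContextParams::default()
        .with_n_ctx(NonZeroU32::new(4096))
        .with_n_batch(1024)
        .with_flash_attn(SketchFlashAttn::Enabled)
}
```

Each setter takes `self` by value and returns it, so a configuration reads as a single expression and unset fields keep their defaults.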

Structs§

LlamaContextParams
Builder for llama_context_params.

Enums§

LlamaAttentionType
Attention mask mode used when the context runs in embedding mode.
LlamaContextType
A rusty wrapper around llama_context_type.
LlamaFlashAttnType
Flash-attention enablement policy for the context.
LlamaPoolingType
A rusty wrapper around llama_pooling_type.
ParamsCloneError
Error returned when LlamaContextParams::try_clone cannot duplicate state.
RopeScalingType
A rusty wrapper around llama_rope_scaling_type.
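
The `ParamsCloneError` entry above implies that cloning parameters is fallible. The sketch below shows one way a fallible clone can be shaped in Rust. It is an assumption-laden illustration: `SketchParams`, `SketchCloneError`, its single variant, and the "exclusive state" flag are all hypothetical, and this page does not say which state actually prevents `LlamaContextParams::try_clone` from succeeding.

```rust
use std::fmt;

/// Hypothetical error type; the crate's `ParamsCloneError` variants may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchCloneError {
    /// The params hold state that cannot be duplicated.
    NonDuplicableState,
}

impl fmt::Display for SketchCloneError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NonDuplicableState => write!(f, "params hold state that cannot be duplicated"),
        }
    }
}

impl std::error::Error for SketchCloneError {}

/// Hypothetical stand-in for a params type that is only sometimes cloneable.
#[derive(Debug)]
pub struct SketchParams {
    n_ctx: u32,
    /// Models attached state that must not be copied (an assumption for this sketch).
    has_exclusive_state: bool,
}

impl SketchParams {
    pub fn new(n_ctx: u32) -> Self {
        Self { n_ctx, has_exclusive_state: false }
    }

    /// Marks the params as carrying non-duplicable state.
    #[must_use]
    pub fn with_exclusive_state(mut self) -> Self {
        self.has_exclusive_state = true;
        self
    }

    pub fn n_ctx(&self) -> u32 {
        self.n_ctx
    }

    /// Returns a copy, or an error when exclusive state is attached.
    pub fn try_clone(&self) -> Result<Self, SketchCloneError> {
        if self.has_exclusive_state {
            return Err(SketchCloneError::NonDuplicableState);
        }
        Ok(Self { n_ctx: self.n_ctx, has_exclusive_state: false })
    }
}
```

Returning `Result` from `try_clone` instead of implementing `Clone` keeps the failure case visible at the call site, since `Clone::clone` has no way to report an error.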