Module params

A safe wrapper around llama_context_params.

Use LlamaContextParams to configure context size, batching, KV layout, RoPE / YaRN scaling, flash attention, per-sequence samplers, and pairing with another context (ctx_other).
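The builder style described above can be illustrated with a small, self-contained sketch. It does not call this crate: `ContextParamsSketch`, `FlashAttn`, the `with_*` methods, and the default values are all hypothetical stand-ins chosen for illustration, so check the `LlamaContextParams` and `LlamaFlashAttnType` item pages for the real method names and defaults.

```rust
use std::num::NonZeroU32;

/// Hypothetical stand-in for a flash-attention policy enum.
/// The variant names are assumptions, not this crate's API.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FlashAttn {
    Auto,
    Disabled,
    Enabled,
}

/// Hypothetical stand-in for a context-params builder.
/// It only mirrors the general shape of a safe wrapper over a C params struct.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ContextParamsSketch {
    /// Context size in tokens; `None` means "defer to the model's own default".
    n_ctx: Option<NonZeroU32>,
    /// Logical batch size.
    n_batch: u32,
    /// Flash-attention policy.
    flash_attn: FlashAttn,
}

impl Default for ContextParamsSketch {
    fn default() -> Self {
        // Illustrative defaults only; the real crate's defaults may differ.
        Self {
            n_ctx: NonZeroU32::new(512),
            n_batch: 2048,
            flash_attn: FlashAttn::Auto,
        }
    }
}

impl ContextParamsSketch {
    /// Set the context size; takes and returns `self` so calls can be chained.
    fn with_n_ctx(mut self, n_ctx: Option<NonZeroU32>) -> Self {
        self.n_ctx = n_ctx;
        self
    }

    /// Set the logical batch size.
    fn with_n_batch(mut self, n_batch: u32) -> Self {
        self.n_batch = n_batch;
        self
    }

    /// Set the flash-attention policy.
    fn with_flash_attn(mut self, policy: FlashAttn) -> Self {
        self.flash_attn = policy;
        self
    }
}

/// Build a configuration by chaining setters on top of the defaults.
fn build_example() -> ContextParamsSketch {
    ContextParamsSketch::default()
        .with_n_ctx(NonZeroU32::new(4096))
        .with_n_batch(512)
        .with_flash_attn(FlashAttn::Enabled)
}
```

Using `Option<NonZeroU32>` for the context size is one way a safe wrapper can make a "zero means use the default" convention explicit in the type, instead of relying on a sentinel value.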

Structs§

LlamaContextParams: Builder for llama_context_params.

Enums§

LlamaAttentionType: Attention mask mode used when the context runs in embedding mode.
LlamaContextType: A rusty wrapper around llama_context_type.
LlamaFlashAttnType: Flash-attention enablement policy for the context.
LlamaPoolingType: A rusty wrapper around LLAMA_POOLING_TYPE.
ParamsCloneError: Error returned when LlamaContextParams::try_clone cannot duplicate state.
RopeScalingType: A rusty wrapper around rope_scaling_type.
