Chat-template engine for RLX runners.
Replaces LlamaModel::apply_chat_template (llama-cpp-4) end-to-end. A
template can come from two sources: an inline Jinja2 string, or the
tokenizer.chat_template (or tokenizer.ggml.chat_template) key read
directly from a GGUF file’s metadata. Rendering uses minijinja.
BOS/EOS strings are looked up via tokenizer.ggml.bos_token_id /
eos_token_id against the tokenizer.ggml.tokens array (the GGUF
convention).
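The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not this crate's implementation: TokenizerMeta, token_string and bos_eos are hypothetical names, and falling back to an empty string when an ID is missing or out of range is an assumption.

```rust
/// Hypothetical stand-in for the tokenizer-related GGUF metadata.
struct TokenizerMeta {
    tokens: Vec<String>,       // tokenizer.ggml.tokens
    bos_token_id: Option<u32>, // tokenizer.ggml.bos_token_id
    eos_token_id: Option<u32>, // tokenizer.ggml.eos_token_id
}

/// Resolve a token ID to its string by indexing into the tokens array.
/// Returns None if the ID is absent or out of range.
fn token_string(tokens: &[String], id: Option<u32>) -> Option<&str> {
    let idx = id? as usize;
    tokens.get(idx).map(String::as_str)
}

/// Resolve both special tokens. The empty-string fallback is an
/// assumption made for this sketch.
fn bos_eos(meta: &TokenizerMeta) -> (String, String) {
    let bos = token_string(&meta.tokens, meta.bos_token_id).unwrap_or("");
    let eos = token_string(&meta.tokens, meta.eos_token_id).unwrap_or("");
    (bos.to_owned(), eos.to_owned())
}

fn main() {
    let meta = TokenizerMeta {
        tokens: vec!["<unk>".into(), "<s>".into(), "</s>".into()],
        bos_token_id: Some(1),
        eos_token_id: Some(2),
    };
    let (bos, eos) = bos_eos(&meta);
    println!("bos={bos:?} eos={eos:?}");
}
```

Resolving the strings once at load time lets the template refer to them by name while rendering, without touching the token table again.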
Structs§
- ChatMessage - One chat turn. role is conventionally one of system, user, assistant, tool, but templates can accept anything.
- ChatTemplate - Compiled Jinja chat template + BOS/EOS strings.
Enums§
- ChatTemplateSource - Where a ChatTemplate was loaded from. Useful for diagnostics and for letting a caller round-trip the source string into config.
Functions§
- auto_chat_template - Convenience for the M3 auto-dispatch family: load the chat template