Granite (IBM) runner.
Granite's GGUF converters emit general.architecture = granite,
granitemoe, or granitehybrid. The model is Llama-shaped, with
additional attention-scale and embedding-scale multipliers. This
crate is a thin wrapper over rlx_llama32::Llama32Runner that adds
architecture validation.
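As an illustration of the architecture validation mentioned above, the check can be sketched as a match over the three general.architecture values. The names GraniteArch and validate_arch are hypothetical and are not taken from this crate's API; the sketch only shows the shape of such a check.

```rust
/// Hypothetical enum covering the three architecture strings named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GraniteArch {
    Dense,  // general.architecture = "granite"
    Moe,    // general.architecture = "granitemoe"
    Hybrid, // general.architecture = "granitehybrid"
}

/// Hypothetical validator: accepts only the Granite family and rejects
/// every other architecture string with a descriptive error.
fn validate_arch(arch: &str) -> Result<GraniteArch, String> {
    match arch {
        "granite" => Ok(GraniteArch::Dense),
        "granitemoe" => Ok(GraniteArch::Moe),
        "granitehybrid" => Ok(GraniteArch::Hybrid),
        other => Err(format!("unsupported general.architecture: {other}")),
    }
}
```

Returning a typed variant rather than a bool lets a caller reject granitemoe and granitehybrid separately while their MoE and hybrid wiring is still pending.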
Caveat: Granite's attention.scale and embedding.scale keys
aren't yet read or applied in rlx-llama32. Runs will produce
tokens, but the output won't match the upstream reference until
support for those keys lands. granitemoe/granitehybrid
additionally need MoE / hybrid wiring (PLAN.md M4 + M5).
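To make the effect of the two missing multipliers concrete, here is a minimal sketch of where they would typically enter a Llama-style forward pass. It assumes the embedding scale multiplies the token embeddings and the attention scale replaces the default 1/sqrt(head_dim) factor on the attention scores. That is an assumption about the intended semantics, not a description of what rlx-llama32 does, and both function names are hypothetical.

```rust
/// Hypothetical helper: multiply a token embedding in place by the
/// model's embedding scale (assumed semantics).
fn scale_embedding(embedding: &mut [f32], embedding_scale: f32) {
    for x in embedding.iter_mut() {
        *x *= embedding_scale;
    }
}

/// Hypothetical helper: pick the factor applied to Q·K scores.
/// With no attention scale present, fall back to the standard
/// Llama-style 1/sqrt(head_dim).
fn attention_score_scale(head_dim: usize, attention_scale: Option<f32>) -> f32 {
    attention_scale.unwrap_or(1.0 / (head_dim as f32).sqrt())
}
```

Under these assumptions, skipping both multipliers leaves the computation well-formed but numerically different, which is consistent with runs that produce tokens without matching the reference.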
Structs§
Constants§
Functions§
Type Aliases§
- Llama32ConfigSource - Where to load the Llama 3.2 config from.