§Model File Loader
Loads a .tern.bin ModelCoherence file produced by transmute_llama.py
and maps its ternarized layers into a 13-expert ExpertBank for EPIS inference.
§Layer → Expert mapping
Round-robin: layer i → expert (i % 13). When multiple layers land on the same expert they are fused by majority vote: agreement → keep the value, disagreement → 0 (hold).
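As a minimal sketch of the mapping and fusion rule above (not the crate's actual code): `NUM_EXPERTS`, `expert_for_layer`, and `fuse_votes` are hypothetical names chosen for illustration. The text does not say how three or more layers on the same expert are resolved, so this sketch assumes a strict majority per weight position. With exactly two layers that reduces to the stated rule: agreement keeps the value, disagreement yields 0.

```rust
// Hypothetical names for illustration; not taken from the crate.
const NUM_EXPERTS: usize = 13;

/// Round-robin assignment: layer `i` goes to expert `i % 13`.
fn expert_for_layer(layer_index: usize) -> usize {
    layer_index % NUM_EXPERTS
}

/// Fuse one weight position across every layer assigned to the same expert.
/// Inputs are ternary values (-1, 0, +1).
///
/// Assumption: a non-zero value wins only with a strict majority of the
/// votes; anything else, including a tie, resolves to 0 (hold).
fn fuse_votes(votes: &[i8]) -> i8 {
    for candidate in [-1i8, 1] {
        let count = votes.iter().filter(|&&v| v == candidate).count();
        if 2 * count > votes.len() {
            return candidate;
        }
    }
    0
}
```

For example, layers 0, 13, and 26 all map to expert 0 under this scheme, so each of that expert's weights would be fused from three votes.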
§Dimension handling
Large transformer layers (e.g. 2048×2048) are truncated to EXPERT_DIMS. Layers smaller than EXPERT_DIMS are zero-padded.
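The truncate-or-pad step can be sketched as follows. This is an illustration under assumptions, not the crate's implementation: `fit_to_dims` is a hypothetical helper, the layer is assumed to be stored row-major as `i8` ternary values, and truncation is assumed to keep the top-left block. The target shape is passed in as parameters because the value of `EXPERT_DIMS` is not given here.

```rust
/// Hypothetical helper: copy a row-major `rows x cols` ternary matrix into a
/// `target_rows x target_cols` buffer.
///
/// Assumption: truncation keeps the top-left block. Rows and columns beyond
/// the target are dropped; missing rows and columns stay zero.
fn fit_to_dims(
    src: &[i8],
    rows: usize,
    cols: usize,
    target_rows: usize,
    target_cols: usize,
) -> Vec<i8> {
    assert_eq!(src.len(), rows * cols, "src must hold rows * cols values");

    // Start from an all-zero buffer so padding needs no extra work.
    let mut out = vec![0i8; target_rows * target_cols];
    let keep_rows = rows.min(target_rows);
    let keep_cols = cols.min(target_cols);

    for r in 0..keep_rows {
        let src_start = r * cols;
        let dst_start = r * target_cols;
        out[dst_start..dst_start + keep_cols]
            .copy_from_slice(&src[src_start..src_start + keep_cols]);
    }
    out
}
```

Under these assumptions a 2048×2048 layer loses everything outside its top-left block, so truncation is lossy by design, while a smaller layer is preserved exactly and surrounded by zeros.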
Structs§
- ModelFileInfo - Metadata returned alongside a loaded ExpertBank.
Constants§
Functions§
- load_expert_bank - Load a .tern.bin file and construct a real ExpertBank13 from its layers.