Native BitNet MoE model implementation for Hydra.
This module implements the Hydra model architecture with native Rust inference, loading weights directly from safetensors format.
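Loading from safetensors relies on the container's simple layout: an 8-byte little-endian `u64` header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below illustrates that layout with plain `std`; the tensor name and header contents are hand-written examples, not Hydra's actual file:

```rust
/// Read the safetensors header: the first 8 bytes are a little-endian
/// u64 giving the length of the JSON header that immediately follows.
fn read_header(file: &[u8]) -> &str {
    let n = u64::from_le_bytes(file[..8].try_into().unwrap()) as usize;
    std::str::from_utf8(&file[8..8 + n]).unwrap()
}

/// Build a minimal in-memory safetensors buffer for demonstration;
/// the tensor name and shape here are illustrative placeholders.
fn demo_file() -> Vec<u8> {
    let header = br#"{"embedding":{"dtype":"F32","shape":[32000,192],"data_offsets":[0,24576000]}}"#;
    let mut file = Vec::new();
    file.extend_from_slice(&(header.len() as u64).to_le_bytes());
    file.extend_from_slice(header);
    file // raw tensor bytes would follow in a real file
}

fn main() {
    let file = demo_file();
    println!("{}", read_header(&file));
}
```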
# Architecture (from actual model weights)
```text
Input tokens → Embedding [32000, 192]
        ↓
4x MoE Layers:
    - Gate: Linear(192, 4) → softmax → top-k selection
    - Experts: Heterogeneous MLPs (different depths/widths)
        ↓
LayerNorm [192]
        ↓
SemanticHead: Linear(192, 192)
        ↓
CompressionHead: Linear(192, 4) → [NONE, BPE, BROTLI, ZLIB]
SecurityHead: Linear(192, 2) → [SAFE, UNSAFE]
```

# Structs
- `Expert` — Expert MLP with variable architecture
- `HydraBitNet` — Complete Hydra model
- `HydraConfig` — Model configuration derived from actual weights
- `LayerNorm` — Layer normalization
- `Linear` — Linear layer (dense)
- `MoELayer` — MoE layer with gating
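The gating step in each MoE layer (Linear(192, 4) → softmax → top-k) can be sketched in plain Rust. The function names and the choice of `k = 2` below are illustrative assumptions, not the module's actual API:

```rust
/// Softmax over the gate logits, stabilized by subtracting the
/// maximum logit before exponentiating.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Pick the k experts with the highest gate probability and
/// renormalize their weights so they sum to 1.
fn top_k(probs: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = probs.iter().cloned().enumerate().collect();
    indexed.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    indexed.truncate(k);
    let norm: f32 = indexed.iter().map(|&(_, p)| p).sum();
    indexed.into_iter().map(|(i, p)| (i, p / norm)).collect()
}

fn main() {
    // Hypothetical gate logits for the 4 experts of one MoE layer.
    let logits = [1.0_f32, 3.0, 0.5, 2.0];
    let probs = softmax(&logits);
    // Route the token to the top-2 experts; each expert's output
    // would then be blended using the renormalized weights.
    for (expert, weight) in top_k(&probs, 2) {
        println!("expert {expert}: weight {weight:.3}");
    }
}
```

Renormalizing over only the selected experts keeps the blended expert outputs on the same scale regardless of how much probability mass fell outside the top-k set.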