Phi3 model implementation with quantization support.
Phi3 is a language model intended for research purposes. This implementation loads quantized weights to reduce memory usage.
Key characteristics:
- Multi-head attention
- RMSNorm for layer normalization
- Rotary positional embeddings (RoPE)
- Support for quantization
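To make two of the characteristics above concrete, here is a minimal sketch of RMSNorm and RoPE on plain `f32` slices. It is not this module's implementation: the function names `rms_norm` and `rope`, the interleaved pairing of dimensions, and the `theta` base parameter are all assumptions chosen for illustration, and real implementations operate on batched tensors and may pair dimensions differently.

```rust
/// RMSNorm on a single vector (illustrative helper, not the crate's API).
/// Each element is divided by the root-mean-square of the vector,
/// then multiplied by a learned per-channel weight.
fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(x.len(), weight.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect()
}

/// Rotary positional embedding on a single head vector (illustrative helper).
/// Assumption: dimensions are paired as (2i, 2i+1); each pair is rotated by
/// an angle of pos * theta^(-2i/d).
fn rope(x: &[f32], pos: usize, theta: f32) -> Vec<f32> {
    assert!(x.len() % 2 == 0, "head dimension must be even");
    let d = x.len();
    let mut out = vec![0.0; d];
    for i in 0..d / 2 {
        let freq = theta.powf(-2.0 * i as f32 / d as f32);
        let (sin, cos) = (pos as f32 * freq).sin_cos();
        // 2D rotation of the pair (x[2i], x[2i+1]).
        out[2 * i] = x[2 * i] * cos - x[2 * i + 1] * sin;
        out[2 * i + 1] = x[2 * i] * sin + x[2 * i + 1] * cos;
    }
    out
}
```

Because RoPE only rotates pairs of dimensions, it leaves the vector's norm unchanged, and position 0 leaves the input untouched. With unit weights and `eps = 0`, RMSNorm produces an output whose mean square is 1.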
References: