SqueezeBERT (Iandola et al., 2020) + ALBERT (Lan et al., 2020)
This model combines SqueezeBERT and ALBERT:
- SqueezeBERT replaces most matrix multiplications with grouped convolutions, resulting in smaller models and faster inference.
- ALBERT allows layers to share parameters and decouples the embedding size from the hidden state size. Both result in smaller models.
Combined, your models can be even smaller and faster.
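As a rough illustration of where the savings come from, the sketch below counts weights for each of the three techniques. It is not part of this module's API: the function names are hypothetical, the sizes (vocabulary of 30,000, embedding size 128, hidden size 768, 4 groups, 12 layers) are assumed values chosen for illustration, and bias terms are ignored.

```rust
/// Weights in a dense projection from `input` to `output` features.
fn dense_params(input: usize, output: usize) -> usize {
    input * output
}

/// Weights in a grouped convolution with kernel size 1: each group maps
/// `input / groups` channels to `output / groups` channels.
fn grouped_conv_params(input: usize, output: usize, groups: usize) -> usize {
    assert!(input % groups == 0 && output % groups == 0);
    groups * (input / groups) * (output / groups)
}

/// Embedding weights when a smaller embedding of size `embed` is
/// projected up to the hidden size (decoupled embedding size).
fn factorized_embedding_params(vocab: usize, embed: usize, hidden: usize) -> usize {
    vocab * embed + embed * hidden
}

/// Weights in a stack of layers; with sharing, one set of weights is
/// reused by every layer.
fn stack_params(per_layer: usize, layers: usize, shared: bool) -> usize {
    if shared { per_layer } else { per_layer * layers }
}

fn main() {
    // Assumed, illustrative sizes (not taken from this crate).
    let (vocab, embed, hidden, groups, layers) = (30_000, 128, 768, 4, 12);

    let dense = dense_params(hidden, hidden);
    let grouped = grouped_conv_params(hidden, hidden, groups);
    println!("projection: dense = {dense}, grouped = {grouped}");

    let tied = dense_params(vocab, hidden);
    let factorized = factorized_embedding_params(vocab, embed, hidden);
    println!("embeddings: tied = {tied}, factorized = {factorized}");

    let unshared = stack_params(grouped, layers, false);
    let shared = stack_params(grouped, layers, true);
    println!("layer stack: unshared = {unshared}, shared = {shared}");
}
```

With `g` groups, a grouped projection needs `1/g` of the weights of the dense one, and sharing divides the stack's weight count by the number of layers, which is why the two approaches compound.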
Structs
- SqueezeAlbertConfig - SqueezeALBERT model configuration.
- SqueezeAlbertEncoder - SqueezeALBERT encoder.