Module squeeze_albert

SqueezeBERT (Iandola et al., 2020) + ALBERT (Lan et al., 2020)

This model combines SqueezeBERT and ALBERT:

- SqueezeBERT replaces most matrix multiplications with grouped convolutions, resulting in smaller models and faster inference.
- ALBERT shares parameters across layers and decouples the embedding size from the hidden state size. Both changes reduce model size.

Combined, the two techniques can yield models that are even smaller and faster.
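To make the size reductions concrete, here is a rough parameter-counting sketch. It is not part of this module's API: the function names and the sizes in `main` (vocabulary, embedding size, hidden size, number of groups, number of layers) are hypothetical values chosen for illustration, and biases are ignored.

```rust
/// Weights in a dense projection from `d_in` to `d_out` features (bias omitted).
fn dense_params(d_in: usize, d_out: usize) -> usize {
    d_in * d_out
}

/// Weights in a grouped convolution with kernel size 1: each of the `groups`
/// groups maps `d_in / groups` input channels to `d_out / groups` output channels.
fn grouped_conv_params(d_in: usize, d_out: usize, groups: usize) -> usize {
    assert!(groups > 0 && d_in % groups == 0 && d_out % groups == 0);
    groups * (d_in / groups) * (d_out / groups)
}

/// Embedding weights when tokens are embedded directly at the hidden size.
fn full_embedding_params(vocab: usize, hidden: usize) -> usize {
    vocab * hidden
}

/// ALBERT-style factorized embeddings: a `vocab x embed` lookup table
/// followed by an `embed x hidden` projection.
fn factorized_embedding_params(vocab: usize, embed: usize, hidden: usize) -> usize {
    vocab * embed + embed * hidden
}

/// Unique weights in a stack of layers when only `distinct_layers` sets of
/// weights exist and the remaining layers reuse them.
fn stack_params(per_layer: usize, distinct_layers: usize) -> usize {
    per_layer * distinct_layers
}

fn main() {
    // Hypothetical sizes, chosen only for illustration.
    let (vocab, embed, hidden, groups, layers) = (30_000, 128, 768, 4, 12);

    let dense = dense_params(hidden, hidden);
    let grouped = grouped_conv_params(hidden, hidden, groups);
    println!("projection: dense = {dense}, grouped = {grouped}");

    println!(
        "embeddings: full = {}, factorized = {}",
        full_embedding_params(vocab, hidden),
        factorized_embedding_params(vocab, embed, hidden)
    );

    println!(
        "{layers} projections: unshared dense = {}, shared grouped = {}",
        stack_params(dense, layers),
        stack_params(grouped, 1)
    );
}
```

With `g` groups, a grouped projection needs `1/g` of the weights of the dense one. Factorizing the embeddings pays off whenever the embedding size is much smaller than the hidden size, and sharing weights keeps the count independent of depth.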

Structs