Module models


Torch implementation of language models
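The architectures listed below are usually driven through the crate's higher-level pipelines rather than instantiated by hand. A minimal usage sketch, assuming this listing is the `models` module of the rust-bert crate and that default pipeline weights are downloaded from the Hugging Face Hub on first use:

```rust
use rust_bert::pipelines::sentiment::SentimentModel;

fn main() -> anyhow::Result<()> {
    // The default sentiment pipeline wraps the DistilBERT architecture
    // listed below, fine-tuned on SST-2; weights are fetched on first run.
    let sentiment_model = SentimentModel::new(Default::default())?;

    let input = ["Torch implementations of language models are convenient."];
    let output = sentiment_model.predict(&input);
    println!("{output:?}");
    Ok(())
}
```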

Modules

albert
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (Lan et al.)
bart
BART (Lewis et al.)
bert
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al.)
deberta
DeBERTa: Decoding-enhanced BERT with Disentangled Attention (He et al.)
deberta_v2
DeBERTa V2 (He et al.)
distilbert
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Sanh et al.)
electra
Electra: Pre-training Text Encoders as Discriminators Rather Than Generators (Clark et al.)
fnet
FNet, Mixing Tokens with Fourier Transforms (Lee-Thorp et al.)
gpt2
GPT2 (Radford et al.)
gpt_j
GPT-J
gpt_neo
GPT-Neo
longformer
Longformer: The Long-Document Transformer (Beltagy et al.)
longt5
LongT5 (Efficient Text-To-Text Transformer for Long Sequences)
m2m_100
M2M-100 (Fan et al.)
marian
Marian
mbart
MBart (Liu et al.)
mobilebert
MobileBERT (A Compact Task-agnostic BERT for Resource-Limited Devices)
nllb
NLLB: No Language Left Behind (NLLB Team)
openai_gpt
GPT (Radford et al.)
pegasus
Pegasus (Zhang et al.)
prophetnet
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training (Qi et al.)
reformer
Reformer: The Efficient Transformer (Kitaev et al.)
roberta
RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al.)
t5
T5 (Text-To-Text Transfer Transformer)
xlnet
XLNet (Generalized Autoregressive Pretraining for Language Understanding)
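For finer control, the individual model modules can also be used directly with a tch variable store. A minimal sketch of loading a BERT masked-LM head, again assuming the rust-bert crate with the tch backend; the config and weight paths are hypothetical placeholders, and exact constructor signatures can differ between crate versions:

```rust
use rust_bert::bert::{BertConfig, BertForMaskedLM};
use rust_bert::Config;
use tch::{nn, Device};

fn main() -> anyhow::Result<()> {
    let device = Device::cuda_if_available();
    let mut vs = nn::VarStore::new(device);

    // Hyperparameters are read from a Hugging Face style config file
    // (hypothetical path).
    let config = BertConfig::from_file("path/to/config.json");

    // Build the graph in the variable store, then load pretrained weights
    // converted to the `.ot` format used by the tch backend.
    let model = BertForMaskedLM::new(vs.root(), &config);
    vs.load("path/to/model.ot")?;

    let _ = model; // ready for `forward_t` calls
    Ok(())
}
```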