Module quantized_t5

T5 model implementation with quantization support.

T5 is an encoder-decoder model pre-trained on a multi-task mixture of supervised and unsupervised tasks. This implementation provides quantization for reduced memory and compute requirements.

Key characteristics:

  • Encoder-decoder architecture
  • Layer normalization
  • Relative positional encodings
  • Support for 8-bit quantization
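
To make the 8-bit quantization point concrete, here is a minimal sketch of symmetric 8-bit quantization over one block of weights. It is an illustration of the general idea only, not this module's code: `QuantizedBlock`, `quantize_block`, and `dequantize_block` are hypothetical names, and the storage layout actually used by the quantized weights may differ.

```rust
// Illustrative sketch only: symmetric 8-bit quantization of one block of
// weights. All names here are hypothetical and are not items of this module.

/// One block of weights stored as 8-bit integers plus a shared f32 scale.
pub struct QuantizedBlock {
    pub scale: f32,
    pub values: Vec<i8>,
}

/// Quantize a block so that its largest magnitude maps to +/-127.
pub fn quantize_block(weights: &[f32]) -> QuantizedBlock {
    // The largest absolute value in the block determines the scale.
    let max_abs = weights.iter().fold(0.0f32, |m, w| m.max(w.abs()));
    // Avoid dividing by zero when every weight in the block is zero.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let values = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    QuantizedBlock { scale, values }
}

/// Recover approximate f32 weights from a quantized block.
pub fn dequantize_block(block: &QuantizedBlock) -> Vec<f32> {
    block.values.iter().map(|&q| q as f32 * block.scale).collect()
}
```

Each weight then occupies one byte instead of four, plus one shared scale per block, and the round trip through `quantize_block` and `dequantize_block` changes each weight by at most about half of `scale`.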

References:

Re-exports

pub use crate::quantized_var_builder::VarBuilder;

Structs

Config
T5EncoderModel
T5ForConditionalGeneration