Stochastic decoders for autoregressive sequence generation.
This module collects sampling-based decoding strategies that are
complementary to the greedy/beam-search decoders in crate::beam.
Each submodule implements a different truncation criterion on the
softmax distribution before drawing a single token:
- top_k: Fan, Lewis & Dauphin (2018), Hierarchical Neural Story Generation. Restricts the support to the k most likely tokens.
- nucleus: Holtzman, Buys, Du, Forbes & Choi (2020), The Curious Case of Neural Text Degeneration. Restricts the support to the smallest set whose cumulative probability exceeds p.
- typical: Meister, Pimentel, Wiher & Cotterell (2022), Typical Decoding for Natural Language Generation. Restricts the support to tokens whose negative log-probability is closest to the conditional entropy of the distribution.
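The three truncation criteria can be sketched on a plain probability slice. This is a minimal illustration, not this crate's implementation: the function names (`top_k_support`, `nucleus_support`, `typical_support`) and the helpers are hypothetical, the nucleus cutoff is taken as "cumulative mass reaches at least p", and the cumulative-mass parameter `tau` for typical decoding is an assumption borrowed from the cited paper rather than from this page.

```rust
/// Token indices sorted by descending probability (stable, so ties keep index order).
fn sorted_desc(probs: &[f64]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));
    idx
}

/// Keep tokens from `order` until their cumulative probability reaches `mass`.
fn take_until_mass(probs: &[f64], order: Vec<usize>, mass: f64) -> Vec<usize> {
    let mut cum = 0.0;
    let mut keep = Vec::new();
    for i in order {
        keep.push(i);
        cum += probs[i];
        if cum >= mass {
            break;
        }
    }
    keep
}

/// Top-k: the k most likely tokens.
fn top_k_support(probs: &[f64], k: usize) -> Vec<usize> {
    let mut idx = sorted_desc(probs);
    idx.truncate(k);
    idx
}

/// Nucleus (top-p): the smallest probability-sorted prefix whose mass reaches p.
fn nucleus_support(probs: &[f64], p: f64) -> Vec<usize> {
    take_until_mass(probs, sorted_desc(probs), p)
}

/// Typical: order tokens by |-ln p_i - H| and keep them until mass `tau` is reached.
/// `tau` is an assumed parameter; the module may expose this differently.
fn typical_support(probs: &[f64], tau: f64) -> Vec<usize> {
    // Conditional entropy H = -sum p ln p; zero-probability tokens contribute nothing.
    let entropy: f64 = probs
        .iter()
        .filter(|&&q| q > 0.0)
        .map(|&q| -q * q.ln())
        .sum();
    let mut idx: Vec<usize> = (0..probs.len()).filter(|&i| probs[i] > 0.0).collect();
    idx.sort_by(|&a, &b| {
        let da = (-probs[a].ln() - entropy).abs();
        let db = (-probs[b].ln() - entropy).abs();
        da.total_cmp(&db)
    });
    take_until_mass(probs, idx, tau)
}
```

In each case the decoder would renormalize the probabilities over the returned support before drawing a token. Note that typical truncation can drop the single most likely token when its surprisal lies far from the entropy, which top-k and nucleus truncation never do.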
All decoders are deterministic given a seeded crate::handle::LcgRng
and return crate::error::SeqResult.
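Determinism under a fixed seed can be illustrated with a toy generator. The sketch below does not use crate::handle::LcgRng, whose API is not shown on this page; `DemoLcg` and `sample_index` are hypothetical stand-ins, and the multiplier and increment are the commonly cited 64-bit constants from Knuth's MMIX generator.

```rust
/// Hypothetical 64-bit linear congruential generator (not the crate's LcgRng).
struct DemoLcg {
    state: u64,
}

impl DemoLcg {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Advance the state and return a float in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.state >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draw one token index by inverse-CDF sampling.
/// Assumes `probs` is non-empty and sums to (approximately) 1.
fn sample_index(probs: &[f64], rng: &mut DemoLcg) -> usize {
    let u = rng.next_f64();
    let mut cum = 0.0;
    for (i, &p) in probs.iter().enumerate() {
        cum += p;
        if u < cum {
            return i;
        }
    }
    // Guard against rounding error leaving `u` just above the final cumulative sum.
    probs.len().saturating_sub(1)
}
```

Two generators created with the same seed produce the same sequence of draws, which is the property that makes seeded sampling reproducible in tests.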
Re-exports§
pub use contrastive::ContrastiveConfig;
pub use contrastive::ContrastiveSearcher;
pub use pointer_network::PointerGrad;
pub use pointer_network::PointerNetwork;
pub use nucleus::*;
pub use top_k::*;
pub use typical::*;
Modules§
- contrastive
- Contrastive search decoding (Su et al. 2022 ACL).
- nucleus
- Nucleus (top-p) sampling for autoregressive sequence decoding.
- pointer_network
- Pointer Network.
- top_k
- Top-k sampling for autoregressive sequence decoding.
- typical
- Typical (locally typical) decoding for autoregressive sequence generation.