Natural Language Generation utilities
Set of text generation utilities serving as a basis for the TextGenerationModel, SummarizationModel and TranslationModel pipelines. It includes techniques such as beam search, top-k and nucleus sampling, temperature scaling and repetition penalty. Batch generation from several prompts is supported: sequences are left-padded with the model's padding token if one exists, and with the unknown token otherwise. This padding may affect the results, so it is recommended to submit prompts of similar length.
```rust
use rust_bert::gpt2::GPT2Generator;
use rust_bert::pipelines::generation_utils::{
    GenerateConfig, GenerateOptions, LanguageGenerator,
};

fn main() -> anyhow::Result<()> {
    let generate_config = GenerateConfig {
        do_sample: true,
        num_beams: 5,
        temperature: 1.1,
        num_return_sequences: 3,
        ..Default::default()
    };
    let mut gpt2_generator = GPT2Generator::new(generate_config)?;

    let input_context = "The dog";
    let second_input_context = "The cat was";

    let generate_options = GenerateOptions {
        min_length: Some(32),
        max_length: Some(128),
        output_scores: true,
        ..Default::default()
    };

    // With 2 prompts and num_return_sequences = 3, six sequences are generated.
    let output = gpt2_generator.generate(
        Some(&[input_context, second_input_context]),
        Some(generate_options),
    );
    Ok(())
}
```
Example output:

```text
[
    "The dog's owners, however, did not want to be named. According to the lawsuit, the animal's owner, a 29-year",
    "The dog has always been part of the family. \"He was always going to be my dog and he was always looking out for me",
    "The dog has been able to stay in the home for more than three months now. \"It's a very good dog. She's",
    "The cat was discovered earlier this month in the home of a relative of the deceased. The cat's owner, who wished to remain anonymous,",
    "The cat was pulled from the street by two-year-old Jazmine. \"I didn't know what to do,\" she said",
    "The cat was attacked by two stray dogs and was taken to a hospital. Two other cats were also injured in the attack and are being treated."
]
```
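As a rough intuition for two of the sampling techniques named above, the following is a toy sketch of temperature scaling and nucleus (top-p) sampling over a made-up next-token distribution. It is a self-contained illustration in plain Rust, not rust_bert's implementation; the logits, temperature and cutoff values are arbitrary.

```rust
// Softmax over logits after temperature scaling: temperature > 1 flattens the
// distribution (more diverse sampling), temperature < 1 sharpens it.
fn softmax(logits: &[f64], temperature: f64) -> Vec<f64> {
    let scaled: Vec<f64> = logits.iter().map(|l| l / temperature).collect();
    let max = scaled.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scaled.iter().map(|l| (l - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Nucleus (top-p) filtering: keep the smallest set of highest-probability
/// tokens whose cumulative probability reaches `top_p`; sampling then happens
/// only among these candidates.
fn nucleus_candidates(probs: &[f64], top_p: f64) -> Vec<usize> {
    let mut ranked: Vec<usize> = (0..probs.len()).collect();
    ranked.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());
    let mut kept = Vec::new();
    let mut cumulative = 0.0;
    for id in ranked {
        kept.push(id);
        cumulative += probs[id];
        if cumulative >= top_p {
            break;
        }
    }
    kept
}

fn main() {
    let logits = [2.0, 1.0, 0.5, 0.1, -1.0]; // toy vocabulary of 5 tokens
    let probs = softmax(&logits, 1.1);
    let kept = nucleus_candidates(&probs, 0.9);
    println!("{kept:?}"); // ids of the tokens retained for sampling
}
```

Low-probability tokens fall outside the nucleus and can never be sampled, which is what distinguishes top-p from plain temperature sampling.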
Structs
- GenerateConfig - Configuration for text generation
- GenerateOptions - Type alias for a function defining allowed tokens based on the tokens generated so far. This function should take a `batch_id` and the associated tensor of already generated tokens, and return a vector of allowed tokens. This is useful for controlled generation, i.e. deterministic generation of a token continuation when a given sequence of tokens occurs.
- GeneratedIndicesOutput - Generated indices output
- GeneratedTextOutput - Generated text output
- LMModelOutput - Container holding a language model output for generation tasks
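The allowed-tokens mechanism described above can be sketched as follows. This is a toy illustration of the idea, not rust_bert's API: `allowed_tokens` and `mask_logits` are hypothetical names, and the vocabulary and constraint are made up. The hook returns the permitted next-token ids, and the decoder masks every other logit so a forbidden token can never be chosen.

```rust
// Hypothetical allowed-tokens hook: given a batch id and the tokens generated
// so far, return the token ids that may be sampled next.
fn allowed_tokens(_batch_id: i64, generated: &[i64]) -> Vec<i64> {
    // Toy constraint: whenever token 7 was just produced, force token 8 next.
    match generated.last() {
        Some(&7) => vec![8],
        _ => (0..10).collect(), // otherwise allow the full toy vocabulary
    }
}

// Mask the logits of all disallowed tokens to negative infinity so that
// neither greedy decoding nor sampling can ever select them.
fn mask_logits(logits: &mut [f64], allowed: &[i64]) {
    for (id, logit) in logits.iter_mut().enumerate() {
        if !allowed.contains(&(id as i64)) {
            *logit = f64::NEG_INFINITY;
        }
    }
}

fn main() {
    let mut logits = vec![0.5; 10]; // uniform toy logits over 10 tokens
    let generated = [3, 5, 7];
    let allowed = allowed_tokens(0, &generated);
    mask_logits(&mut logits, &allowed);
    // Only token 8 keeps a finite logit, so decoding must pick it next.
    println!("{logits:?}");
}
```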
Enums

Traits
- LanguageGenerator - Common trait for text generation models.