Struct rust_bert::models::gpt_neo::GptNeoForCausalLM
pub struct GptNeoForCausalLM { /* private fields */ }
GPT-Neo Model for causal language modeling
GPT-Neo model with a vocabulary decoding head. The language model decoding head is tied to the word embedding matrix weights. It is made of the following blocks:
- transformer: GptNeoModel (base GPT-Neo model)
Implementations
impl GptNeoForCausalLM

pub fn new<'p, P>(
    p: P,
    config: &GptNeoConfig,
) -> Result<GptNeoForCausalLM, RustBertError>
Build a new GptNeoForCausalLM
Arguments
- p - Variable store path for the root of the GPT-Neo model
- config - GptNeoConfig object defining the model architecture
Example
use rust_bert::gpt_neo::{GptNeoConfig, GptNeoForCausalLM};
use rust_bert::Config;
use std::path::Path;
use tch::{nn, Device};
let config_path = Path::new("path/to/config.json");
let device = Device::Cpu;
let p = nn::VarStore::new(device);
let config = GptNeoConfig::from_file(config_path);
let gpt_neo_model = GptNeoForCausalLM::new(&p.root(), &config).unwrap();
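The call above only allocates randomly initialized variables in the VarStore. A minimal sketch of loading pretrained weights afterwards, assuming a weights file already converted to the tch format is available locally (the rust_model.ot path below is illustrative):

use rust_bert::gpt_neo::{GptNeoConfig, GptNeoForCausalLM};
use rust_bert::Config;
use std::path::Path;
use tch::{nn, Device};

let device = Device::Cpu;
// The variable store must be mutable so that weights can be loaded into it.
let mut vs = nn::VarStore::new(device);
let config = GptNeoConfig::from_file(Path::new("path/to/config.json"));
let gpt_neo_model = GptNeoForCausalLM::new(&vs.root(), &config).unwrap();
// Overwrite the random initialization with pretrained weights
// (illustrative path to a converted rust_model.ot file).
vs.load(Path::new("path/to/rust_model.ot")).unwrap();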
pub fn forward_t(
    &self,
    input_ids: Option<&Tensor>,
    input_embeds: Option<&Tensor>,
    token_type_ids: Option<&Tensor>,
    position_ids: Option<&Tensor>,
    layer_states: Option<Vec<Option<LayerState>>>,
    attention_mask: Option<&Tensor>,
    train: bool,
) -> Result<GptNeoModelLMOutput, RustBertError>
Forward pass through the model
Arguments
- input_ids - Optional input tensor of shape (batch size, sequence_length). This or input_embeds must be provided.
- input_embeds - Optional input tensor of shape (batch size, sequence_length, embeddings dimension). This or input_ids must be provided.
- token_type_ids - Optional token type ids used to indicate the portion of the input the token belongs to. If not None, token type embeddings will be added to the token and position embeddings.
- position_ids - Optional position ids of shape (batch size, sequence_length). If None, will be incremented starting from the length of the past input.
- layer_states - Optional Vec<Option<LayerState>> of length n_layer containing the past keys and values for the self-attention of each layer.
- attention_mask - Optional attention mask of shape (batch size, sequence_length) for the encoder positions. Positions with a mask value of 0 will be masked.
- train - Boolean flag to turn on/off the dropout layers in the model. Should be set to false for inference.
Returns
Result<GptNeoModelLMOutput, RustBertError> containing:
- lm_logits - Tensor of shape (batch size, sequence_length, vocab_size) representing the logits for each vocabulary item and position
- next_cache - Option<Vec<Option<LayerState>>> of length n_layer containing the past content for the attention layers
- all_hidden_states - Option<Vec<Tensor>> of length n_layer + 1 with shape (batch size, sequence_length, hidden_size)
- all_attentions - Option<Vec<Tensor>> of length n_layer containing the attention weights for each layer
Example
use rust_bert::gpt_neo::{GptNeoConfig, GptNeoForCausalLM};
use rust_bert::Config;
use std::path::Path;
use tch::{nn, no_grad, Device, Kind, Tensor};

let device = Device::Cpu;
let vs = nn::VarStore::new(device);
let config = GptNeoConfig::from_file(Path::new("path/to/config.json"));
let gpt_neo_model = GptNeoForCausalLM::new(&vs.root(), &config).unwrap();

let (batch_size, sequence_length) = (64, 128);
// Random token ids in the model vocabulary used as dummy input
let input_tensor =
    Tensor::randint(config.vocab_size, &[batch_size, sequence_length], (Kind::Int64, device));
let attention_mask = Tensor::ones(&[batch_size, sequence_length], (Kind::Int64, device));

let model_output = no_grad(|| {
    gpt_neo_model.forward_t(
        Some(&input_tensor),
        None,
        None,
        None,
        None,
        Some(&attention_mask),
        false,
    )
});
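The fields documented under Returns can be used directly for decoding. Below is a sketch continuing from the example above (same imports in scope) that picks the greedy next token from lm_logits and feeds next_cache back into a second forward pass; the argmax-based token selection is an illustration, not a dedicated rust_bert API:

let output = model_output.unwrap();
// Logits for the last position of every sequence: shape (batch size, vocab_size)
let last_logits = output.lm_logits.select(1, sequence_length - 1);
// Greedy next-token ids, shape (batch size)
let next_tokens = last_logits.argmax(-1, false);

// Reuse the cached keys and values: only the newly generated token is passed
// as input_ids, while the attention mask covers past and new positions.
let new_input = next_tokens.unsqueeze(-1); // shape (batch size, 1)
let new_mask = Tensor::ones(&[batch_size, sequence_length + 1], (Kind::Int64, device));
let cache = output.next_cache;
let next_output = no_grad(|| {
    gpt_neo_model.forward_t(
        Some(&new_input),
        None,
        None,
        None,
        cache,
        Some(&new_mask),
        false,
    )
});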
Auto Trait Implementations
impl RefUnwindSafe for GptNeoForCausalLM
impl Send for GptNeoForCausalLM
impl !Sync for GptNeoForCausalLM
impl Unpin for GptNeoForCausalLM
impl UnwindSafe for GptNeoForCausalLM
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.