pub struct GptNeoForCausalLM { /* private fields */ }
§GPT-Neo Model for causal language modeling
GPT-Neo model with a vocabulary decoding head. The language model decoding head is tied to the word embedding matrix weights. It is made of the following blocks:
- transformer: GptNeoModel, the base GPT-Neo model
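To illustrate what a tied decoding head means, here is a minimal plain-Rust sketch. It does not use rust_bert or tch: the embedding matrix is a hypothetical Vec of rows, and the function names embed and tied_lm_head are assumptions made for this illustration only. The same matrix is read once to look up an input embedding and once to project a hidden state back onto the vocabulary.

```rust
/// Look up the embedding row for one token id.
fn embed(token_id: usize, embeddings: &[Vec<f32>]) -> &[f32] {
    &embeddings[token_id]
}

/// Tied decoding head: reuse the embedding matrix as the output projection,
/// so that logits[v] = dot(hidden, embeddings[v]) for every vocabulary item v.
fn tied_lm_head(hidden: &[f32], embeddings: &[Vec<f32>]) -> Vec<f32> {
    embeddings
        .iter()
        .map(|row| row.iter().zip(hidden).map(|(w, h)| w * h).sum::<f32>())
        .collect()
}
```

Because both functions borrow the same matrix, no separate output weight is stored, which is the practical consequence of tying the head to the word embeddings.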
Implementations§
impl GptNeoForCausalLM
pub fn new<'p, P>(
    p: P,
    config: &GptNeoConfig,
) -> Result<GptNeoForCausalLM, RustBertError>
Build a new GptNeoForCausalLM
§Arguments
- p: Variable store path for the root of the GPT-Neo model
- config: GptNeoConfig object defining the model architecture
§Example
use rust_bert::gpt_neo::{GptNeoConfig, GptNeoForCausalLM};
use rust_bert::Config;
use std::path::Path;
use tch::{nn, Device};
let config_path = Path::new("path/to/config.json");
let device = Device::Cpu;
let p = nn::VarStore::new(device);
let config = GptNeoConfig::from_file(config_path);
let gpt_neo_model = GptNeoForCausalLM::new(&p.root(), &config).unwrap();
pub fn forward_t(
    &self,
    input_ids: Option<&Tensor>,
    input_embeds: Option<&Tensor>,
    token_type_ids: Option<&Tensor>,
    position_ids: Option<&Tensor>,
    layer_states: Option<Vec<Option<LayerState>>>,
    attention_mask: Option<&Tensor>,
    train: bool,
) -> Result<GptNeoModelLMOutput, RustBertError>
Forward pass through the model
§Arguments
- input_ids: Optional input tensor of shape (batch size, sequence_length). Either this or input_embeds must be provided.
- input_embeds: Optional input tensor of shape (batch size, sequence_length, embeddings dimension). Either this or input_ids must be provided.
- token_type_ids: Optional token type ids used to indicate the portion of the input the token belongs to. If not None, token type embeddings will be added to the token and position embeddings.
- position_ids: Optional position ids of shape (batch size, sequence_length). If None, they are incremented starting from the length of the past input.
- layer_states: Optional vector Option<Vec<Option<LayerState>>> of length n_layer containing the past keys and values for the self-attention of each layer.
- attention_mask: Optional attention mask of shape (batch size, sequence_length) for the input positions. Positions with a mask value of 0 will be masked.
- train: Boolean flag to turn the dropout layers in the model on or off. Should be set to false for inference.
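The default position ids and the attention mask convention described above can be sketched in plain Rust. This is an illustration under stated assumptions, not the crate's internal code: it uses Vec<i64> instead of tch tensors, and the function names as well as pad_token_id are hypothetical.

```rust
/// Default position ids for one sequence: consecutive positions that
/// continue after `past_length` tokens already held in the layer states.
fn default_position_ids(past_length: i64, sequence_length: i64) -> Vec<i64> {
    (past_length..past_length + sequence_length).collect()
}

/// Attention mask for a padded batch: 1 for real tokens, 0 for padding.
/// `pad_token_id` is a hypothetical padding id chosen by the caller.
fn padding_attention_mask(batch: &[Vec<i64>], pad_token_id: i64) -> Vec<Vec<i64>> {
    batch
        .iter()
        .map(|row| {
            row.iter()
                .map(|&id| i64::from(id != pad_token_id))
                .collect::<Vec<i64>>()
        })
        .collect()
}
```

With no cached state the positions start at 0. When five tokens are already cached and two new tokens are passed, the positions are 5 and 6.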
§Returns
Result<GptNeoModelLMOutput, RustBertError> containing:
- lm_logits: Tensor of shape (batch size, sequence_length, vocab_size) representing the logits for each vocab item and position
- next_cache: Option<Vec<Option<LayerState>>> of length n_layer containing the past content for the attention layers
- all_hidden_states: Option<Vec<Tensor>> of length n_layer + 1 with shape (batch size, sequence_length, hidden_size)
- all_attentions: Option<Vec<Tensor>> of length n_layer containing the attention weights for each layer
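As an illustration of how lm_logits is typically consumed, the sketch below picks the next token greedily from the logits of the last sequence position. It is plain Rust over a slice of f32 rather than a tch Tensor, and greedy_next_token is a hypothetical helper, not part of rust_bert. Greedy selection is only one of several decoding strategies.

```rust
/// Greedy decoding step: return the index (token id) of the largest logit
/// among the vocabulary logits of the last position, or None if empty.
fn greedy_next_token(last_position_logits: &[f32]) -> Option<usize> {
    last_position_logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(index, _)| index)
}
```

In a generation loop, the selected id would be appended to the input and passed back to the model together with next_cache as layer_states.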
§Example
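use rust_bert::gpt_neo::{GptNeoConfig, GptNeoForCausalLM};
use tch::{no_grad, Kind::Int64, Tensor};
// `device` and `gpt_neo_model` are defined as in the example for `new`.
let (batch_size, sequence_length) = (64, 128);
let input_tensor = Tensor::rand(&[batch_size, sequence_length], (Int64, device));
let attention_mask = Tensor::ones(&[batch_size, sequence_length], (Int64, device));
let model_output = no_grad(|| {
    gpt_neo_model.forward_t(
        Some(&input_tensor),   // input_ids
        None,                  // input_embeds
        None,                  // token_type_ids
        None,                  // position_ids
        None,                  // layer_states
        Some(&attention_mask), // attention_mask
        false,                 // train
    )
});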
Auto Trait Implementations§
impl Freeze for GptNeoForCausalLM
impl RefUnwindSafe for GptNeoForCausalLM
impl Send for GptNeoForCausalLM
impl !Sync for GptNeoForCausalLM
impl Unpin for GptNeoForCausalLM
impl UnwindSafe for GptNeoForCausalLM
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.