Struct dfdx::nn::TransformerEncoderBlock
pub struct TransformerEncoderBlock<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> {
pub self_attn: MultiHeadAttention<MODEL_DIM, NUM_HEADS>,
pub norm1: LayerNorm1D<MODEL_DIM>,
pub ff: Residual<(Linear<MODEL_DIM, FF_DIM>, ReLU, Linear<FF_DIM, MODEL_DIM>)>,
pub norm2: LayerNorm1D<MODEL_DIM>,
}
Requires Nightly. A single transformer encoder block.
Generics
MODEL_DIM: The size of query/key/value tensors. Given to MultiHeadAttention.
NUM_HEADS: The number of heads in MultiHeadAttention.
FF_DIM: The size of the hidden layer in the feedforward network.
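As a rough illustration of how the three const generics relate, the sketch below does the shape bookkeeping at compile time. `BlockShapes` is a hypothetical helper, not part of dfdx. It assumes that the model dimension is split evenly across heads and that each `Linear` layer carries a weight matrix plus a bias vector.

```rust
/// Hypothetical helper (not a dfdx type): compile-time shape bookkeeping
/// for one encoder block with the same three const generics.
pub struct BlockShapes<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize>;

impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize>
    BlockShapes<MODEL_DIM, NUM_HEADS, FF_DIM>
{
    /// Width of each attention head, assuming MODEL_DIM is split evenly across heads.
    pub const HEAD_DIM: usize = MODEL_DIM / NUM_HEADS;

    /// Parameter count of Linear<MODEL_DIM, FF_DIM> followed by Linear<FF_DIM, MODEL_DIM>,
    /// counting one weight matrix and one bias vector per layer.
    pub const FF_PARAMS: usize =
        (MODEL_DIM * FF_DIM + FF_DIM) + (FF_DIM * MODEL_DIM + MODEL_DIM);

    /// True when every head gets the same number of features.
    pub fn heads_divide_evenly() -> bool {
        MODEL_DIM % NUM_HEADS == 0
    }
}
```

For example, `BlockShapes::<8, 2, 16>::HEAD_DIM` evaluates to 4, and the two feedforward layers together hold 280 parameters under the assumptions above.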
PyTorch equivalent:
encoder = torch.nn.TransformerEncoderLayer(
    MODEL_DIM, NUM_HEADS, dim_feedforward=FF_DIM, batch_first=True, dropout=0.0
)
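The PyTorch layer above, with its default settings, applies layer normalization after each residual connection. A minimal plain-Rust sketch of that data flow on a single token vector follows. The `attn` and `ff` closures are hypothetical stand-ins for the real sublayers, and `layer_norm` omits the learned scale and shift. In the struct itself the second residual add is wrapped inside `Residual`, while here it is written out explicitly.

```rust
/// Layer normalization over one vector, without learned parameters
/// (assumes gamma = 1 and beta = 0, with a small epsilon for stability).
fn layer_norm(x: &[f32]) -> Vec<f32> {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    let denom = (var + 1e-5).sqrt();
    x.iter().map(|v| (v - mean) / denom).collect()
}

/// Elementwise sum used for the residual connections.
fn add(a: &[f32], b: &[f32]) -> Vec<f32> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Post-norm encoder block: norm2(h + ff(h)) where h = norm1(x + attn(x)).
/// `attn` and `ff` are hypothetical stand-ins, not dfdx modules.
fn encoder_block(
    x: &[f32],
    attn: impl Fn(&[f32]) -> Vec<f32>,
    ff: impl Fn(&[f32]) -> Vec<f32>,
) -> Vec<f32> {
    let h = layer_norm(&add(x, &attn(x)));
    layer_norm(&add(&h, &ff(&h)))
}
```

Calling `encoder_block(&x, |v| v.to_vec(), |v| vec![0.0; v.len()])` runs the block with an identity attention stand-in and a zero feedforward stand-in, which makes it easy to check that the output keeps the input length and has roughly zero mean.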
TODO: Doctests
Fields
self_attn: MultiHeadAttention<MODEL_DIM, NUM_HEADS>
norm1: LayerNorm1D<MODEL_DIM>
ff: Residual<(Linear<MODEL_DIM, FF_DIM>, ReLU, Linear<FF_DIM, MODEL_DIM>)>
norm2: LayerNorm1D<MODEL_DIM>
Trait Implementations
impl<const M: usize, const H: usize, const F: usize> CanUpdateWithGradients for TransformerEncoderBlock<M, H, F>
fn update<G: GradientProvider>(
&mut self,
grads: &mut G,
unused: &mut UnusedTensors
)
Updates self given the GradientProvider. When any parameters are NOT present in G, this function should add the tensor’s UniqueId to UnusedTensors.
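The contract described for `update` can be sketched with standard-library types only. Everything below is a hypothetical stand-in: `ParamId`, `Param`, the `HashMap` used as a gradient provider, and the `Vec` used to collect unused ids are illustrative and are not the dfdx `GradientProvider`, `UniqueId`, or `UnusedTensors` types.

```rust
use std::collections::HashMap;

/// Hypothetical parameter identifier, standing in for a tensor's unique id.
type ParamId = usize;

/// Hypothetical scalar parameter.
struct Param {
    id: ParamId,
    value: f32,
}

/// Subtracts the provided step from each parameter that has one;
/// parameters with no entry in `grads` have their id recorded in `unused`.
fn update(params: &mut [Param], grads: &mut HashMap<ParamId, f32>, unused: &mut Vec<ParamId>) {
    for p in params.iter_mut() {
        match grads.remove(&p.id) {
            Some(g) => p.value -= g,
            None => unused.push(p.id),
        }
    }
}
```

Collecting the missing ids instead of failing immediately lets the caller decide whether a parameter without a gradient is an error or is expected, for example when part of a model is frozen.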
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> Clone for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
fn clone(&self) -> TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> Debug for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> Default for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
fn default() -> TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
Returns the “default value” for a type.
impl<const M: usize, const H: usize, const F: usize> LoadFromNpz for TransformerEncoderBlock<M, H, F>
impl<const M: usize, const H: usize, const F: usize, Src> Module<Src> for TransformerEncoderBlock<M, H, F> where
Src: Tensor<Dtype = f32>,
MultiHeadAttention<M, H>: Module<(Src, Src::NoTape, Src::NoTape), Output = Src>,
LayerNorm1D<M>: Module<Src, Output = Src>,
Residual<(Linear<M, F>, ReLU, Linear<F, M>)>: Module<Src, Output = Src>,
impl<const M: usize, const H: usize, const F: usize, T> ModuleMut<T> for TransformerEncoderBlock<M, H, F> where
Self: Module<T>,
type Output = <TransformerEncoderBlock<M, H, F> as Module<T>>::Output
The type that this unit produces given Input.
fn forward_mut(&mut self, t: T) -> Self::Output
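The bound `Self: Module<T>` together with the reused `Output` type suggests that `forward_mut` can simply delegate to the immutable forward pass. The sketch below shows that pattern with simplified stand-in traits and a hypothetical `Double` layer. The real dfdx trait definitions may differ, and the delegation itself is an assumption rather than something stated on this page.

```rust
/// Simplified stand-in for an immutable forward pass.
trait Module<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// Simplified stand-in for a forward pass that may mutate the module.
trait ModuleMut<Input> {
    type Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output;
}

/// Hypothetical layer that doubles its input.
struct Double;

impl Module<f32> for Double {
    type Output = f32;
    fn forward(&self, input: f32) -> f32 {
        input * 2.0
    }
}

/// Mirrors the `Self: Module<T>` bound and reuses Module's Output type.
impl<T> ModuleMut<T> for Double
where
    Self: Module<T>,
{
    type Output = <Double as Module<T>>::Output;

    fn forward_mut(&mut self, input: T) -> Self::Output {
        // Delegate to the immutable forward pass (assumed behaviour).
        self.forward(input)
    }
}
```

With this pattern, a module whose forward pass needs no mutable state exposes the same result through both entry points, so generic code written against `ModuleMut` accepts it as well.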
impl<const M: usize, const H: usize, const F: usize> ResetParams for TransformerEncoderBlock<M, H, F>
Auto Trait Implementations
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> RefUnwindSafe for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> Send for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> Sync for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> Unpin for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
impl<const MODEL_DIM: usize, const NUM_HEADS: usize, const FF_DIM: usize> UnwindSafe for TransformerEncoderBlock<MODEL_DIM, NUM_HEADS, FF_DIM>
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.