Structs containing initialized Tensors & impls for `super::Module`. See `super::builders` for helpful utilities for creating these in a device/dtype-agnostic way.
Re-exports
`pub use super::*;`
Structs
- Calls `abs()`.
- Add inputs together into a single tensor. `T` should be a tuple.
- Average pool with a 2d kernel that operates on images (3d) and batches of images (4d). Each patch reduces to the average of the values in the patch.
- Applies average pooling over an entire image, fully reducing the height and width dimensions.
- Batch normalization for sequences, as described in Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
- Batch normalization for images, as described in Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
- Adds a learnable 1d bias to 3d and 4d inputs. Can be used with `crate::nn::modules::Conv2D` to create a biased conv.
- (Requires Nightly) Performs unbiased 2d convolutions on 3d and 4d images.
- Calls `cos()`.
- An embedding. Initializes `Self::weight` from a Uniform distribution over `[-1 / sqrt(I), 1 / sqrt(I)]`.
- Calls `exp()`.
- (Requires Nightly) Flattens 3d tensors to 1d, and 4d tensors to 2d.
- Calls `gelu()`.
- A residual connection `R` around `F`: `F(x) + R(x)`, as introduced in Deep Residual Learning for Image Recognition.
- Implements layer normalization as described in Layer Normalization.
- A linear transformation of the form `weight * x + bias`, where `weight` is a matrix, `x` is a vector or matrix, and `bias` is a vector.
- Calls `ln()`.
- Max pool with a 2d kernel that operates on images (3d) and batches of images (4d). Each patch reduces to the maximum value in that patch.
- Applies max pooling over an entire image, fully reducing the height and width dimensions.
- Minimum pool with a 2d kernel that operates on images (3d) and batches of images (4d). Each patch reduces to the minimum of the values in the patch.
- Applies min pooling over an entire image, fully reducing the height and width dimensions.
- A multi-head attention layer.
- Calls `relu()`.
- Repeats `T` `N` times. This requires that `T`'s input type is the same as its output type.
- A residual connection around `F`: `F(x) + x`, as introduced in Deep Residual Learning for Image Recognition.
- Calls `sigmoid()`.
- Calls `sin()`.
- Calls `softmax()`.
- Splits input into multiple heads. `T` should be a tuple, where every element of the tuple accepts the same input type.
- Calls `sqrt()`.
- Calls `square()`.
- Calls `tanh()`.
- Transformer architecture as described in Attention Is All You Need.
- A transformer decoder.
- A transformer decoder block. Differs from the normal transformer block in that its self-attention accepts an additional sequence from the encoder.
- A single transformer encoder block.
- A linear transformation of the form `weight * x`, where `weight` is a matrix and `x` is a vector or matrix.
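The linear and residual entries above can be sketched numerically in plain Rust, using `Vec<f32>` in place of this crate's tensor types. All names in this sketch are illustrative, not the crate's API:

```rust
// Minimal sketches of `weight * x + bias` and `F(x) + x` over plain slices.
// These illustrate the math only; the crate's modules operate on typed tensors.

/// Linear: `weight * x + bias`, where `weight` is a matrix (each row has
/// `x.len()` entries), `x` is a vector, and `bias` is a vector.
fn linear(weight: &[Vec<f32>], bias: &[f32], x: &[f32]) -> Vec<f32> {
    weight
        .iter()
        .zip(bias)
        .map(|(row, b)| row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f32>() + b)
        .collect()
}

/// Residual: `F(x) + x`. Requires `F`'s output shape to match its input shape.
fn residual(f: impl Fn(&[f32]) -> Vec<f32>, x: &[f32]) -> Vec<f32> {
    f(x).iter().zip(x).map(|(fx, xi)| fx + xi).collect()
}

fn main() {
    let weight = vec![vec![1.0, 0.0], vec![0.0, 2.0]];
    let bias = vec![0.5, -0.5];
    let x = vec![3.0, 4.0];
    // [1*3 + 0*4 + 0.5, 0*3 + 2*4 - 0.5] = [3.5, 7.5]
    println!("{:?}", linear(&weight, &bias, &x));
    // F squares each element: [9, 16] + [3, 4] = [12, 20]
    println!("{:?}", residual(|v| v.iter().map(|a| a * a).collect(), &x));
}
```

The `GeneralizedResidual` entry is the same shape with a second module `R` replacing the identity: `F(x) + R(x)`.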
Type Definitions
- A transformer encoder.
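The softmax and layer-normalization entries can likewise be sketched with std-only Rust. This is a simplified illustration: it uses scalar scale/shift parameters, whereas the layer-norm module learns per-element parameters; all names are illustrative:

```rust
// Std-only sketches of softmax and layer normalization over a slice.

/// Softmax: `exp(x_i - max) / sum_j exp(x_j - max)`.
/// Subtracting the max does not change the result but avoids overflow.
fn softmax(x: &[f32]) -> Vec<f32> {
    let m = x.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = x.iter().map(|v| (v - m).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Layer norm: `gamma * (x - mean) / sqrt(var + eps) + beta`, with the mean
/// and variance computed over the normalized dimension of `x`.
fn layer_norm(x: &[f32], gamma: f32, beta: f32, eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    x.iter()
        .map(|v| gamma * (v - mean) / (var + eps).sqrt() + beta)
        .collect()
}

fn main() {
    // Softmax outputs are positive and sum to 1.
    let p = softmax(&[1.0, 2.0, 3.0]);
    assert!((p.iter().sum::<f32>() - 1.0).abs() < 1e-6);

    // With gamma = 1, beta = 0, the normalized output has mean ~0.
    let y = layer_norm(&[1.0, 2.0, 3.0], 1.0, 0.0, 1e-5);
    assert!(y.iter().sum::<f32>().abs() < 1e-5);
    println!("{p:?} {y:?}");
}
```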