pub struct MultiHeadAttentionConfig {
    pub d_model: usize,
    pub n_heads: usize,
    pub dropout: f64,
    pub min_float: f64,
    pub quiet_softmax: bool,
    pub initializer: Initializer,
}
Configuration to create a Multi Head Attention layer using the init function.
Fields

d_model: usize
    The size of each linear layer.

n_heads: usize
    The number of heads.

dropout: f64
    The dropout rate. Default: 0.1

min_float: f64
    The minimum value a float can take. Default: -1.0e4
    This is used to mask attention scores before calculating attention weights. A value too low might result in NaN.

quiet_softmax: bool
    Use “quiet softmax” instead of regular softmax.
    - Usage may improve performance by allowing attention heads to deposit no information (if the sequence contains no information relevant to that head).
    - Usage may reduce the entropy of weights in the model, enhancing quantization and compression.
    Reference: https://www.evanmiller.org/attention-is-off-by-one.html
    (See the sketch after this field list for how this interacts with min_float.)

initializer: Initializer
    The type of function used to initialize neural network parameters.
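How min_float masking and the softmax variant interact can be shown with a small, self-contained sketch. This is plain Rust with no burn dependency; the helper names softmax and quiet_softmax are hypothetical, not part of the burn API. Masked positions are pushed down to min_float before the softmax; the quiet variant adds 1 to the denominator (following the formula from the reference above), so a head that finds nothing relevant can assign almost zero weight everywhere instead of being forced into a uniform distribution.

// Standalone illustration of `min_float` masking and “quiet softmax”.
// `softmax` and `quiet_softmax` are hypothetical helpers, not burn API.
const MIN_FLOAT: f64 = -1.0e4; // the config's default masking value

// Regular softmax over one row of attention scores (max-subtracted for stability).
fn softmax(scores: &[f64]) -> Vec<f64> {
    let m = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - m).exp()).collect();
    let denom: f64 = exps.iter().sum();
    exps.iter().map(|e| e / denom).collect()
}

// “Quiet” softmax (softmax_1 from the reference): an extra 1 in the
// denominator lets all weights go to ~0 when nothing in the row is relevant.
fn quiet_softmax(scores: &[f64]) -> Vec<f64> {
    let m = scores
        .iter()
        .cloned()
        .fold(f64::NEG_INFINITY, f64::max)
        .max(0.0); // clamp so exp(-m) stays bounded
    let exps: Vec<f64> = scores.iter().map(|s| (s - m).exp()).collect();
    let denom: f64 = (-m).exp() + exps.iter().sum::<f64>();
    exps.iter().map(|e| e / denom).collect()
}

fn main() {
    // Two valid positions and one masked position (score forced to MIN_FLOAT).
    let partly_masked = [0.2, -0.1, MIN_FLOAT];
    println!("softmax:       {:?}", softmax(&partly_masked));
    println!("quiet softmax: {:?}", quiet_softmax(&partly_masked));

    // With every position masked, regular softmax is forced to a uniform
    // distribution, while the quiet variant collapses to ~0 everywhere.
    let fully_masked = [MIN_FLOAT; 3];
    println!("softmax:       {:?}", softmax(&fully_masked));
    println!("quiet softmax: {:?}", quiet_softmax(&fully_masked));
}

The masked position gets essentially zero weight under both variants; the difference shows up in the fully masked row, which illustrates why masking uses a large negative finite value (rather than -inf, which could produce NaN) and what quiet softmax changes.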
Implementations

impl MultiHeadAttentionConfig

pub fn with_dropout(self, dropout: f64) -> Self
    The dropout rate. Default: 0.1

pub fn with_min_float(self, min_float: f64) -> Self
    The minimum value a float can take. Default: -1.0e4

pub fn with_quiet_softmax(self, quiet_softmax: bool) -> Self
    Use “quiet softmax” instead of regular softmax.

pub fn with_initializer(self, initializer: Initializer) -> Self
    The type of function used to initialize neural network parameters.
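A minimal construction sketch under two assumptions: that the `new` constructor generated by the Config derive takes the two fields without defaults (d_model, n_heads) in declaration order, and that the type is importable from burn::nn::attention (adjust the path to your burn version):

use burn::nn::attention::MultiHeadAttentionConfig;

fn main() {
    // d_model = 512, n_heads = 8; the remaining fields keep their defaults
    // unless overridden by the generated with_* setters.
    let config = MultiHeadAttentionConfig::new(512, 8)
        .with_dropout(0.2)
        .with_min_float(-1.0e4)
        .with_quiet_softmax(true);
    // The Display implementation (see Trait Implementations below) prints the settings.
    println!("{config}");
}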
impl MultiHeadAttentionConfig

pub fn init<B: Backend>(&self, device: &B::Device) -> MultiHeadAttention<B>
    Initialize a new multihead attention module.
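A sketch of calling init generically over the backend; the import paths (burn::nn::attention, burn::tensor::backend) are assumptions and may differ between burn versions:

use burn::nn::attention::{MultiHeadAttention, MultiHeadAttentionConfig};
use burn::tensor::backend::Backend;

// Build the module on a given device; works for any backend B because
// `init` is generic over `B: Backend`.
fn build_attention<B: Backend>(
    config: &MultiHeadAttentionConfig,
    device: &B::Device,
) -> MultiHeadAttention<B> {
    config.init(device)
}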
Trait Implementations

impl Clone for MultiHeadAttentionConfig

impl Config for MultiHeadAttentionConfig

fn save<P: AsRef<Path>>(&self, file: P) -> Result<()>
    Available on crate feature std only. Saves the configuration to a file.

fn load<P: AsRef<Path>>(file: P) -> Result<Self, ConfigError>
    Available on crate feature std only. Loads the configuration from a file.

fn load_binary(data: &[u8]) -> Result<Self, ConfigError>
    Loads the configuration from a binary buffer.
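A round-trip sketch for the save/load methods above, assuming the Config trait can be brought into scope from burn::config, that both error types implement Debug, and that the .json file name is only a naming choice here; the std crate feature is required:

use burn::config::Config; // brings the trait's save/load methods into scope (path assumed)
use burn::nn::attention::MultiHeadAttentionConfig;

fn main() {
    let config = MultiHeadAttentionConfig::new(256, 4).with_quiet_softmax(true);

    // Persist the configuration to disk.
    config.save("mha_config.json").expect("failed to save config");

    // Restore it later, e.g. when rebuilding a model from a checkpoint.
    let restored =
        MultiHeadAttentionConfig::load("mha_config.json").expect("failed to load config");
    println!("{restored}");
}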
impl<'de> Deserialize<'de> for MultiHeadAttentionConfig

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> where D: Deserializer<'de>
    Deserialize this value from the given Serde deserializer.

impl Display for MultiHeadAttentionConfig
Auto Trait Implementations
impl Freeze for MultiHeadAttentionConfig
impl RefUnwindSafe for MultiHeadAttentionConfig
impl Send for MultiHeadAttentionConfig
impl Sync for MultiHeadAttentionConfig
impl Unpin for MultiHeadAttentionConfig
impl UnwindSafe for MultiHeadAttentionConfig
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized

fn borrow_mut(&mut self) -> &mut T
    Mutably borrows from an owned value.

impl<T> CloneToUninit for T where T: Clone

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>
    Converts self into a Left variant of Either<Self, Self> if into_left is true; otherwise converts self into a Right variant of Either<Self, Self>.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
    Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true; otherwise converts self into a Right variant of Either<Self, Self>.