pub struct MultiHeadAttention<const M: usize, const N: usize, const K: usize, const V: usize, const H: usize> {
    pub w_q: Linear<M, K>,
    pub w_k: Linear<N, K>,
    pub w_v: Linear<N, V>,
    pub w_o: Linear<V, M>,
}

Requires Nightly.

A multi-head attention layer.

Generics

  • M: The embedding size of token vectors from the decoder.
  • N: The embedding size of token vectors from the encoder.
  • K: The size of the keys in self attention.
  • V: The size of the values.
  • H: The number of attention heads.

Examples

MultiHeadAttention<8, 10, 10, 10, 2> is an attention layer with 2 heads, a decoder token size of 8, and encoder token, key and value sizes of 10.

TODO: Doctests fail for some reason
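To make the role of each generic concrete, here is a minimal sketch that reads the projection sizes off the field types in the struct definition. The helpers `projection_shapes` and `per_head_sizes` are hypothetical and not part of this crate. The even split of K and V across the H heads is an assumption, not something this page states.

```rust
/// Hypothetical helper, not part of the documented API: the (input, output)
/// sizes of w_q, w_k, w_v and w_o, read off the field types
/// Linear<M, K>, Linear<N, K>, Linear<N, V> and Linear<V, M>.
pub const fn projection_shapes<const M: usize, const N: usize, const K: usize, const V: usize>(
) -> [(usize, usize); 4] {
    [(M, K), (N, K), (N, V), (V, M)]
}

/// Assumption: the H heads share K and V evenly, so both must be divisible by H.
/// Returns (key size per head, value size per head).
pub const fn per_head_sizes<const K: usize, const V: usize, const H: usize>() -> (usize, usize) {
    assert!(H > 0 && K % H == 0 && V % H == 0, "K and V must be divisible by H");
    (K / H, V / H)
}
```

For MultiHeadAttention<8, 10, 10, 10, 2>, `projection_shapes::<8, 10, 10, 10>()` gives `[(8, 10), (10, 10), (10, 10), (10, 8)]`, and under the even-split assumption `per_head_sizes::<10, 10, 2>()` gives `(5, 5)`.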

Fields

w_q: Linear<M, K>
w_k: Linear<N, K>
w_v: Linear<N, V>
w_o: Linear<V, M>

Trait Implementations

Updates self given the GradientProvider. When any parameters are NOT present in G, this function should add the tensor’s UniqueId to UnusedTensors.

Returns a copy of the value.

Performs copy-assignment from source.

Formats the value using the given formatter.

Returns the “default value” for a type.

Encoder-Decoder style self attention where one set of tensors is used for values and keys, and another is used for queries

The type that this unit produces given Input.

Pass an Input through the unit and produce Self::Output. Can be implemented for multiple Input types.
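As an illustration of the encoder-decoder case, the following is a minimal single-head sketch of scaled dot-product attention, where the queries come from one sequence and the keys and values from another. It uses plain `Vec`s rather than this crate's tensor types and leaves out the w_q, w_k, w_v and w_o projections and the split into heads. The function name `attention` is hypothetical, and this is not the crate's implementation.

```rust
/// Scaled dot-product attention for a single head (illustrative sketch).
///
/// `q` has shape [S1][d], `k` has shape [S2][d], `v` has shape [S2][dv];
/// the result has shape [S1][dv]. Panics if `k` or `v` is empty.
pub fn attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>]) -> Vec<Vec<f32>> {
    // Scores are divided by sqrt(d), the key size.
    let scale = (k[0].len() as f32).sqrt();
    q.iter()
        .map(|qi| {
            // Similarity of this query to every key.
            let scores: Vec<f32> = k
                .iter()
                .map(|kj| qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() / scale)
                .collect();
            // Numerically stable softmax over the keys.
            let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
            let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
            let total: f32 = exps.iter().sum();
            // Output row is the softmax-weighted sum of the value rows.
            let mut out = vec![0.0_f32; v[0].len()];
            for (w, vj) in exps.iter().zip(v) {
                for (o, x) in out.iter_mut().zip(vj) {
                    *o += w / total * x;
                }
            }
            out
        })
        .collect()
}
```

The output has one row per query, so its sequence length follows the query sequence, not the key and value sequence. Passing the same sequence for `q`, `k` and `v` gives the self-attention case.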

Batched Encoder-Decoder style self attention where one set of tensors is used for values and keys, and another is used for queries

The type that this unit produces given Input.

Pass an Input through the unit and produce Self::Output. Can be implemented for multiple Input types.

Normal self attention (where same tensors are used for keys, queries and values)

The type that this unit produces given Input.

Pass an Input through the unit and produce Self::Output. Can be implemented for multiple Input types.
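The sketch below shows how one layer can accept several Input types, as described above. It is shape-only: `Forward`, `Layer` and `Seq` are hypothetical stand-ins, not this crate's trait or tensor types, and the attention math is replaced by a zero-filled placeholder. It assumes that the output keeps the query sequence length and size M (suggested by w_o: Linear<V, M>) and that self attention requires N = M.

```rust
/// Hypothetical stand-in for a tensor of S tokens with D features each.
pub type Seq<const S: usize, const D: usize> = [[f32; D]; S];

/// Hypothetical stand-in for the forward trait: one unit, many Input types.
pub trait Forward<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// Shape-only stand-in for the attention layer (weights omitted).
pub struct Layer<const M: usize, const N: usize>;

/// Encoder-decoder style: queries are [S1][M], keys and values are [S2][N].
impl<const M: usize, const N: usize, const S1: usize, const S2: usize>
    Forward<(Seq<S1, M>, Seq<S2, N>, Seq<S2, N>)> for Layer<M, N>
{
    type Output = Seq<S1, M>;
    fn forward(&self, _input: (Seq<S1, M>, Seq<S2, N>, Seq<S2, N>)) -> Self::Output {
        // Placeholder: a real layer would compute attention here.
        // Only the output shape [S1][M] is modelled.
        [[0.0; M]; S1]
    }
}

/// Normal self attention: one sequence serves as queries, keys and values,
/// which only type-checks when N equals M.
impl<const M: usize, const S: usize> Forward<Seq<S, M>> for Layer<M, M> {
    type Output = Seq<S, M>;
    fn forward(&self, x: Seq<S, M>) -> Self::Output {
        // Delegate to the three-input implementation with the same sequence.
        self.forward((x, x, x))
    }
}
```

With these definitions, `Layer::<8, 10>` accepts a tuple of an [S1][8] query sequence and two [S2][10] sequences, while the single-sequence call is only available on layers such as `Layer::<8, 8>`.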

Batched normal self attention (where same tensors are used for keys, queries and values)

The type that this unit produces given Input.

Pass an Input through the unit and produce Self::Output. Can be implemented for multiple Input types.

Mutate the unit’s parameters using rand::Rng. Each implementor of this trait decides how the parameters are initialized. In fact, some impls may not even use the rng.
