pub struct MultiHeadAttention { /* private fields */ }
Multi-head attention.
Projects input through Q, K, V linear layers, splits into n_heads heads,
runs per-head scaled dot-product attention (via the fused Attention op),
concatenates heads, and applies an output projection.
Implementations
impl MultiHeadAttention
pub fn new(
    wq: Linear,
    wk: Linear,
    wv: Linear,
    wo: Linear,
    n_heads: usize,
) -> Self
Create a new MultiHeadAttention from pre-built linear layers.
wq, wk, wv: projection layers with weight shape [n_heads * head_dim, model_dim]
wo: output projection with weight shape [model_dim, n_heads * head_dim]
n_heads: number of attention heads
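The weight shapes above are mutually constrained: the Q/K/V projections map model_dim to n_heads * head_dim, and the output projection maps back. A small standalone sketch of that invariant (illustrative dimensions only, not this crate's API):

```rust
/// Returns true when the projection shapes are mutually consistent:
/// wq/wk/wv map model_dim -> n_heads * head_dim, and wo maps back.
fn shapes_consistent(
    qkv_weight: (usize, usize), // [n_heads * head_dim, model_dim]
    wo_weight: (usize, usize),  // [model_dim, n_heads * head_dim]
    n_heads: usize,
) -> bool {
    let (proj_out, model_dim) = qkv_weight;
    // The concatenated head width must split evenly across heads,
    // and wo must be the transpose shape of the Q/K/V weights.
    proj_out % n_heads == 0 && wo_weight == (model_dim, proj_out)
}

fn main() {
    // Hypothetical numbers: 4 heads of width 16, 64-dim model.
    assert!(shapes_consistent((4 * 16, 64), (64, 4 * 16), 4));
    println!("shapes consistent");
}
```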
pub fn forward_causal(&self, x: &Tensor) -> Result<Tensor>
Forward pass with causal masking (self-attention, auto-regressive).
x has shape [seq_len, model_dim].
Returns [seq_len, model_dim].
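The per-head computation behind forward_causal can be sketched with plain slices standing in for the crate's Tensor (the real implementation uses the fused Attention op; this is an illustrative reimplementation, not the crate's code):

```rust
/// Causal scaled dot-product attention for a single head.
/// q, k, v are row-major [seq_len, head_dim]; returns [seq_len, head_dim].
fn causal_attention(q: &[f32], k: &[f32], v: &[f32], seq_len: usize, head_dim: usize) -> Vec<f32> {
    let scale = 1.0 / (head_dim as f32).sqrt();
    let mut out = vec![0.0f32; seq_len * head_dim];
    for i in 0..seq_len {
        // Causal mask: position i only attends to keys 0..=i.
        let mut scores: Vec<f32> = (0..=i)
            .map(|j| {
                (0..head_dim)
                    .map(|d| q[i * head_dim + d] * k[j * head_dim + d])
                    .sum::<f32>()
                    * scale
            })
            .collect();
        // Numerically stable softmax over the unmasked positions.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0;
        for s in scores.iter_mut() {
            *s = (*s - max).exp();
            sum += *s;
        }
        for s in scores.iter_mut() {
            *s /= sum;
        }
        // Weighted sum of value rows.
        for (j, w) in scores.iter().enumerate() {
            for d in 0..head_dim {
                out[i * head_dim + d] += w * v[j * head_dim + d];
            }
        }
    }
    out
}

fn main() {
    // seq_len = 2, head_dim = 2. Row 0 can only attend to itself,
    // so its output equals the first value row exactly.
    let q = [1.0, 0.0, 0.0, 1.0];
    let k = [1.0, 0.0, 0.0, 1.0];
    let v = [1.0, 2.0, 3.0, 4.0];
    let out = causal_attention(&q, &k, &v, 2, 2);
    assert_eq!(&out[0..2], &[1.0, 2.0]);
    println!("{:?}", out);
}
```

MultiHeadAttention runs this for each of n_heads slices of the projected input, concatenates the per-head outputs, and applies wo.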
Auto Trait Implementations
impl Freeze for MultiHeadAttention
impl !RefUnwindSafe for MultiHeadAttention
impl Send for MultiHeadAttention
impl Sync for MultiHeadAttention
impl Unpin for MultiHeadAttention
impl UnsafeUnpin for MultiHeadAttention
impl !UnwindSafe for MultiHeadAttention