Crate concision_ext


§concision-ext

This library uses the concision framework to implement a variety of additional machine learning models and layers.

Modules§

attention
Attention
prelude
simple

Structs§

MultiHeadAttention
Multi-head attention extends the scaled dot-product attention mechanism by running several attention heads in parallel. It allows the model to jointly attend to information from different representation subspaces at different positions.
QkvParamsBase
This object stores the parameters of the QKV (Query, Key, Value) projections used by an attention mechanism.
ScaledDotProductAttention
The scaled dot-product attention mechanism is at the core of the Transformer architecture. It computes attention scores as the dot product of the query and key vectors, scales them by the square root of the key dimension, and applies a softmax to obtain the attention weights.
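The scaled dot-product computation described above can be sketched in plain Rust. This is a minimal, dependency-free illustration over `Vec`-based matrices; it does not use the crate's `ScaledDotProductAttention` type or any part of the concision API, and all names here are hypothetical.

```rust
// Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
// Illustrative sketch only; not the concision-ext implementation.

type Matrix = Vec<Vec<f64>>;

fn matmul(a: &Matrix, b: &Matrix) -> Matrix {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0.0; m]; n];
    for i in 0..n {
        for j in 0..m {
            for l in 0..k {
                out[i][j] += a[i][l] * b[l][j];
            }
        }
    }
    out
}

fn transpose(a: &Matrix) -> Matrix {
    (0..a[0].len())
        .map(|j| a.iter().map(|row| row[j]).collect())
        .collect()
}

// Numerically stable row-wise softmax (subtract the row max before exp).
fn softmax_rows(a: &mut Matrix) {
    for row in a.iter_mut() {
        let max = row.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        let sum: f64 = row.iter().map(|x| (x - max).exp()).sum();
        for x in row.iter_mut() {
            *x = (*x - max).exp() / sum;
        }
    }
}

fn scaled_dot_product_attention(q: &Matrix, k: &Matrix, v: &Matrix) -> Matrix {
    let d_k = k[0].len() as f64;
    // scores = Q K^T / sqrt(d_k)
    let mut scores = matmul(q, &transpose(k));
    for row in scores.iter_mut() {
        for x in row.iter_mut() {
            *x /= d_k.sqrt();
        }
    }
    softmax_rows(&mut scores); // attention weights: each row sums to 1
    matmul(&scores, v)         // weighted sum of value vectors
}

fn main() {
    let q = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let k = q.clone();
    let v = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let out = scaled_dot_product_attention(&q, &k, &v);
    println!("{:?}", out);
}
```

Because each row of attention weights sums to 1, every output row is a convex combination of the value rows, which keeps the output in the same range as the values.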
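The "different representation subspaces" idea behind multi-head attention can also be sketched: split the model dimension into `h` head slices, attend independently within each slice, and concatenate the per-head outputs. The learned projection matrices of full multi-head attention are omitted for brevity, and nothing below reflects the crate's `MultiHeadAttention` API; all names are hypothetical.

```rust
// Simplified multi-head attention: per-head scaled dot-product attention
// over column slices, without learned Q/K/V/output projections.
// Illustrative sketch only; not the concision-ext implementation.

type Matrix = Vec<Vec<f64>>;

fn softmax(row: &[f64]) -> Vec<f64> {
    let max = row.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = row.iter().map(|x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

// Scaled dot-product attention restricted to columns [lo, hi) of q, k, v.
fn attend_slice(q: &Matrix, k: &Matrix, v: &Matrix, lo: usize, hi: usize) -> Matrix {
    let d_k = (hi - lo) as f64;
    q.iter()
        .map(|qi| {
            // score of this query against every key, scaled by sqrt(d_k)
            let scores: Vec<f64> = k
                .iter()
                .map(|kj| (lo..hi).map(|c| qi[c] * kj[c]).sum::<f64>() / d_k.sqrt())
                .collect();
            let w = softmax(&scores);
            // weighted sum of the value slices
            (lo..hi)
                .map(|c| w.iter().zip(v).map(|(wi, vj)| wi * vj[c]).sum::<f64>())
                .collect()
        })
        .collect()
}

fn multi_head_attention(q: &Matrix, k: &Matrix, v: &Matrix, heads: usize) -> Matrix {
    let d_model = q[0].len();
    assert_eq!(d_model % heads, 0, "d_model must divide evenly into heads");
    let d_head = d_model / heads;
    let mut out = vec![Vec::with_capacity(d_model); q.len()];
    for h in 0..heads {
        let part = attend_slice(q, k, v, h * d_head, (h + 1) * d_head);
        for (row, p) in out.iter_mut().zip(part) {
            row.extend(p); // concatenate head outputs along the feature axis
        }
    }
    out
}

fn main() {
    // Self-attention: q = k = v, with d_model = 4 split into 2 heads of size 2.
    let x = vec![vec![1.0, 0.0, 0.0, 1.0], vec![0.0, 1.0, 1.0, 0.0]];
    let out = multi_head_attention(&x, &x, &x, 2);
    println!("{:?}", out);
}
```

Each head only sees its own slice of the feature dimension, which is the mechanism by which the heads attend to different representation subspaces.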