Streaming linear attention models.
This module provides StreamingAttentionModel, a streaming machine learning
model that uses multi-head linear attention as its temporal feature extractor,
feeding into a Recursive Least Squares (RLS) readout layer. It integrates
with irithyll’s StreamingLearner and StreamingPreprocessor traits.
§Architecture
input features ──→ [MultiHeadAttention] ──→ temporal features ──→ [RLS] ──→ prediction
   (d_model)        (recurrent state)          (d_model)                       (1)

The attention layer processes each feature vector as a timestep, maintaining per-head recurrent state that captures temporal dependencies via linear attention mechanisms (RetNet, Hawk, GLA, DeltaNet, GatedDeltaNet, RWKV, mLSTM). The RLS readout learns a linear mapping from the attention output to the target.
§Components
- StreamingAttentionModel – full model implementing StreamingLearner
- AttentionPreprocessor – attention-only preprocessor implementing StreamingPreprocessor
- StreamingAttentionConfig / StreamingAttentionConfigBuilder – validated configuration
§Example
use irithyll::attention::{StreamingAttentionModel, StreamingAttentionConfig, AttentionMode};
use irithyll::learner::StreamingLearner;
let config = StreamingAttentionConfig::builder()
    .d_model(8)
    .n_heads(2)
    .mode(AttentionMode::GLA)
    .build()
    .unwrap();
let mut model = StreamingAttentionModel::new(config);
model.train(&[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 5.0);
let pred = model.predict(&[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]);
assert!(pred.is_finite());

Re-exports§
pub use attention_config::StreamingAttentionConfig;
pub use attention_config::StreamingAttentionConfigBuilder;
pub use attention_preprocessor::AttentionPreprocessor;
pub use streaming_attention::StreamingAttentionModel;
Modules§
- attention_config - Configuration and builder for StreamingAttentionModel.
- attention_preprocessor - Attention-based streaming preprocessor for pipeline composition.
- streaming_attention - Streaming linear attention model: multi-head attention + RLS readout.
Structs§
- AttentionConfig - Full configuration for a multi-head streaming attention layer.
- MultiHeadAttention - Multi-head streaming linear attention layer.
Enums§
- AttentionMode - Selects the attention architecture variant.
Traits§
- AttentionLayer - Trait for streaming attention layers.
Functions§
- delta_net - Create a Gated DeltaNet model (NVIDIA, strongest retrieval).
- gla - Create a Gated Linear Attention model (SOTA streaming attention).
- hawk - Create a Hawk model (lightest, vector state).
- ret_net - Create a RetNet model (simplest, fixed decay).
- streaming_attention - Create a generic streaming attention model with any mode.