```rust
pub fn default_lambda_init(max_levels: usize) -> f64
```

Available on crate feature `alloc` only.
Default initial λ for `AttentionMode::LogLinear`. Since Σ λ ≤ 1
after softplus-softmax mixing, an init of `1/max_levels` makes the
untrained mixture uniform: every level contributes equally.
Paper §3.3 (R1 §5.3) notes that in the streaming setting without
backprop, the λ projection is fixed at init time, so a uniform
mixture is the principled choice when no information about which
levels are useful is available.
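A minimal sketch of why this init yields a uniform mixture. The body of `default_lambda_init` and the `mix_weights` helper are assumptions for illustration (the crate's actual mixing may differ); the point is that any positive monotone transform followed by normalization maps equal λ values to equal weights:

```rust
// Hypothetical standalone version of `default_lambda_init`,
// assumed to return 1/max_levels as described above.
fn default_lambda_init(max_levels: usize) -> f64 {
    1.0 / max_levels as f64
}

// softplus(x) = ln(1 + e^x), kept positive for all x.
fn softplus(x: f64) -> f64 {
    x.exp().ln_1p()
}

// Assumed softplus-softmax-style mixing: apply softplus to each λ,
// then normalize so the weights sum to 1 (hence Σ λ ≤ 1 holds).
fn mix_weights(lambdas: &[f64]) -> Vec<f64> {
    let sp: Vec<f64> = lambdas.iter().map(|&l| softplus(l)).collect();
    let sum: f64 = sp.iter().sum();
    sp.iter().map(|s| s / sum).collect()
}

fn main() {
    let max_levels = 4;
    let init = default_lambda_init(max_levels); // 1/4 = 0.25
    let lambdas = vec![init; max_levels];
    let weights = mix_weights(&lambdas);
    // Equal inputs normalize to equal weights: each level gets ~1/4.
    for w in &weights {
        assert!((w - 0.25).abs() < 1e-12);
    }
    println!("{weights:?}");
}
```

Because the λ projection is frozen at init in the streaming setting, this uniform split is what the model actually uses at inference time, not just a starting point for training.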