

Constant SAMPLING_TEMPERATURE 

pub const SAMPLING_TEMPERATURE: Entropy = 2.0;

Temperature (T): controls sampling entropy by scaling the policy. Higher T → more uniform (exploratory) sampling; lower T → more peaked (greedy) sampling. Formula: σ′(a) = max(ε, (σ(a)/T + β) / (Σσ + β)).
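As a rough illustration of the formula, the sketch below applies it literally to a small policy vector. Several things here are assumptions and do not come from this page: `Entropy` is taken to be an alias for `f32`, Σσ is read as the sum of the unscaled policy over all actions, and the function name `sampling_weights`, the helper `spread`, and the values passed for ε and β are hypothetical.

```rust
// Assumption: `Entropy` is a plain float alias; the real definition is not shown on this page.
type Entropy = f32;

pub const SAMPLING_TEMPERATURE: Entropy = 2.0;

/// Hypothetical helper: applies σ′(a) = max(ε, (σ(a)/T + β) / (Σσ + β)) to every action.
/// Σσ is interpreted as the sum of the unscaled policy over all actions (an assumption).
fn sampling_weights(policy: &[f32], t: Entropy, eps: f32, beta: f32) -> Vec<f32> {
    let total: f32 = policy.iter().sum();
    policy
        .iter()
        .map(|&s| ((s / t + beta) / (total + beta)).max(eps))
        .collect()
}

/// Hypothetical helper: ratio of the largest to the smallest weight.
/// A ratio closer to 1 means a more uniform distribution.
fn spread(weights: &[f32]) -> f32 {
    let max = weights.iter().cloned().fold(f32::MIN, f32::max);
    let min = weights.iter().cloned().fold(f32::MAX, f32::min);
    max / min
}
```

Calling `sampling_weights(&[0.7, 0.2, 0.1], SAMPLING_TEMPERATURE, 0.01, 0.5)` yields weights that are closer together than the same call with `t = 0.5`, which matches the "higher T → more uniform" behavior described above. Under this reading the weights do not necessarily sum to 1, so a sampler would presumably normalize them before drawing an action.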