Trait vrp_core::algorithms::mdp::LearningStrategy
A learning strategy for the MDP.
Required methods
fn value(
&self,
reward_value: f64,
old_value: f64,
estimates: &ActionEstimates<S>
) -> f64
Estimates an action value given the received reward, the current value, and the action values of the new state.
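As an illustration of how such a method might be implemented, the sketch below applies a Q-learning style update, `old_value + alpha * (reward_value + gamma * max_next - old_value)`. It is a minimal, self-contained sketch and not the crate's own code: `ActionEstimatesSketch`, `LearningStrategySketch`, `QLearningSketch`, `max_value`, and the `alpha` and `gamma` fields are hypothetical names introduced here, and the real `ActionEstimates<S>` type may expose a different API.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for `ActionEstimates<S>`: maps each action
/// available in the new state to its current value estimate.
pub struct ActionEstimatesSketch<A> {
    values: HashMap<A, f64>,
}

impl<A: Eq + Hash> ActionEstimatesSketch<A> {
    pub fn new(values: HashMap<A, f64>) -> Self {
        Self { values }
    }

    /// Returns the largest estimate, or `None` when no actions are known.
    pub fn max_value(&self) -> Option<f64> {
        self.values.values().copied().reduce(f64::max)
    }
}

/// Simplified mirror of the trait documented above (assumed shape).
pub trait LearningStrategySketch<A> {
    fn value(
        &self,
        reward_value: f64,
        old_value: f64,
        estimates: &ActionEstimatesSketch<A>,
    ) -> f64;
}

/// Q-learning style strategy: `alpha` is the learning rate and
/// `gamma` is the discount factor (both names are assumptions).
pub struct QLearningSketch {
    pub alpha: f64,
    pub gamma: f64,
}

impl<A: Eq + Hash> LearningStrategySketch<A> for QLearningSketch {
    fn value(
        &self,
        reward_value: f64,
        old_value: f64,
        estimates: &ActionEstimatesSketch<A>,
    ) -> f64 {
        // Best estimate reachable from the new state; treated as 0 when
        // the new state has no known actions.
        let next_max = estimates.max_value().unwrap_or(0.0);
        // Move the old value towards the discounted target by `alpha`.
        old_value + self.alpha * (reward_value + self.gamma * next_max - old_value)
    }
}
```

With `alpha = 0.5` and `gamma = 0.5`, a reward of 2, an old value of 1, and a best next-state estimate of 3, the update moves the value halfway from 1 towards the target 2 + 0.5 * 3 = 3.5. Other strategies could ignore `estimates` entirely and update from the reward alone.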