Trait vrp_core::algorithms::mdp::LearningStrategy
pub trait LearningStrategy<S: State> {
    fn value(
        &self,
        reward_value: f64,
        old_value: f64,
        estimates: &ActionEstimates<S>
    ) -> f64;
}
A learning strategy for the MDP.
Required methods
fn value(
    &self,
    reward_value: f64,
    old_value: f64,
    estimates: &ActionEstimates<S>
) -> f64
Estimates an action value given the received reward, the current (old) value of that action, and the action value estimates for the new state.
Implementors
impl<S: State> LearningStrategy<S> for MonteCarlo
impl<S: State> LearningStrategy<S> for QLearning
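To illustrate how the three inputs of value might be combined, the sketch below writes out the textbook Monte Carlo and Q-learning update rules. It is not the crate's implementation: Estimates, Strategy, MonteCarloSketch, QLearningSketch, and the alpha (learning rate) and gamma (discount factor) fields are hypothetical stand-ins, and the real ActionEstimates<S> type is replaced by a plain list of value estimates so the example stays self-contained.

```rust
/// Hypothetical stand-in for the crate's `ActionEstimates<S>`: only the
/// value estimates of the actions available in the new state.
pub struct Estimates(pub Vec<f64>);

impl Estimates {
    /// Largest estimate, or 0.0 when no action has an estimate yet.
    pub fn max_value(&self) -> f64 {
        self.0.iter().copied().reduce(f64::max).unwrap_or(0.0)
    }
}

/// Hypothetical, simplified mirror of `LearningStrategy` without the
/// `S: State` type parameter.
pub trait Strategy {
    fn value(&self, reward_value: f64, old_value: f64, estimates: &Estimates) -> f64;
}

/// Textbook constant-step Monte Carlo update (assumed form).
pub struct MonteCarloSketch {
    pub alpha: f64, // learning rate
}

impl Strategy for MonteCarloSketch {
    fn value(&self, reward_value: f64, old_value: f64, _estimates: &Estimates) -> f64 {
        // Move the old value towards the observed reward; the estimates
        // of the new state are ignored.
        old_value + self.alpha * (reward_value - old_value)
    }
}

/// Textbook Q-learning update (assumed form).
pub struct QLearningSketch {
    pub alpha: f64, // learning rate
    pub gamma: f64, // discount factor for future value
}

impl Strategy for QLearningSketch {
    fn value(&self, reward_value: f64, old_value: f64, estimates: &Estimates) -> f64 {
        // Bootstrap from the best action estimate in the new state.
        let next_max = estimates.max_value();
        old_value + self.alpha * (reward_value + self.gamma * next_max - old_value)
    }
}
```

Under these assumptions, the Monte Carlo variant depends only on reward_value and old_value, while the Q-learning variant also uses the largest estimate from the new state. That difference is one plausible reason the trait passes estimates to every strategy and lets each implementor decide whether to use it.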