Module vrp_core::algorithms::mdp
This module contains the definition of a Markov Decision Process (MDP) model and related reinforcement learning logic.
Structs
| ActionEstimates | Keeps track of action estimates. |
| EpsilonGreedy | An e-greedy action selection strategy which acts greedily, except that it selects a random action with the specified probability. |
| EpsilonWeighted | An e-weighted action selection strategy. |
| Greedy | A greedy strategy. |
| MonteCarlo | Applies a Monte Carlo learning strategy to calculate values for taken actions. |
| QLearning | Applies a Q-learning strategy to calculate values for taken actions. |
| Simulator | A simulator to train an agent over multiple episodes. |
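The policy and learning structs above can be illustrated with a minimal, standalone sketch. It does not use this crate's types: `greedy`, `epsilon_greedy`, and `q_learning_update` are hypothetical helper names, estimates are assumed to be plain `(action, value)` pairs, and the random draws (`roll`, `pick`) are assumed to be supplied by the caller so that no random-number crate is needed.

```rust
use std::cmp::Ordering;

/// Greedy choice: the action with the highest estimated value.
/// (Hypothetical helper, not part of this crate's API.)
fn greedy<'a>(estimates: &[(&'a str, f64)]) -> Option<&'a str> {
    estimates
        .iter()
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal))
        .map(|(action, _)| *action)
}

/// e-greedy choice: explore with probability `epsilon`, otherwise act greedily.
/// `roll` is assumed to be a caller-supplied sample from [0, 1) and
/// `pick` a caller-supplied random index used when exploring.
fn epsilon_greedy<'a>(
    estimates: &[(&'a str, f64)],
    epsilon: f64,
    roll: f64,
    pick: usize,
) -> Option<&'a str> {
    if estimates.is_empty() {
        return None;
    }
    if roll < epsilon {
        // Exploration branch: take the action at the supplied index.
        Some(estimates[pick % estimates.len()].0)
    } else {
        // Exploitation branch: fall back to the greedy choice.
        greedy(estimates)
    }
}

/// One tabular Q-learning update:
/// Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
fn q_learning_update(old: f64, reward: f64, max_next: f64, alpha: f64, gamma: f64) -> f64 {
    old + alpha * (reward + gamma * max_next - old)
}
```

With `epsilon = 0.0` the e-greedy sketch reduces to the greedy one, which mirrors how the two strategies are described in the table.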
Traits
| Agent | Represents an agent in an MDP. |
| LearningStrategy | A learning strategy for an MDP. |
| PolicyStrategy | A policy strategy for an MDP. |
| State | Represents a state in an MDP. |
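To show how an agent, a state, and an episode loop typically fit together in an MDP, here is a rough sketch. The traits `ToyState` and `ToyAgent`, the `run_episode` function, and the `Pos`/`Walker` example types are all hypothetical and deliberately named differently from the traits listed above; the actual trait signatures in this module may differ substantially.

```rust
/// Hypothetical state abstraction: exposes a reward and a terminal flag.
trait ToyState {
    fn reward(&self) -> f64;
    fn is_terminal(&self) -> bool;
}

/// Hypothetical agent abstraction: owns a current state and can act on it.
trait ToyAgent {
    type S: ToyState;
    fn state(&self) -> &Self::S;
    fn act(&mut self);
}

/// Runs one episode until a terminal state or the step limit is reached
/// and returns the total reward collected along the way.
fn run_episode<A: ToyAgent>(agent: &mut A, max_steps: usize) -> f64 {
    let mut total = 0.0;
    for _ in 0..max_steps {
        if agent.state().is_terminal() {
            break;
        }
        agent.act();
        total += agent.state().reward();
    }
    total
}

/// Example state: a position on a line; position 3 is the goal.
struct Pos(i32);

impl ToyState for Pos {
    fn reward(&self) -> f64 {
        // Reaching the goal pays 10, every other step costs 1.
        if self.0 == 3 { 10.0 } else { -1.0 }
    }
    fn is_terminal(&self) -> bool {
        self.0 == 3
    }
}

/// Example agent: always moves one step to the right.
struct Walker {
    pos: Pos,
}

impl ToyAgent for Walker {
    type S = Pos;
    fn state(&self) -> &Pos {
        &self.pos
    }
    fn act(&mut self) {
        self.pos.0 += 1;
    }
}
```

A simulator in this sense would call something like `run_episode` repeatedly, feeding the collected rewards to a learning strategy between episodes.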
Type Definitions
| StateEstimates | A type which keeps track of all state-action estimates. |
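As a rough illustration of what tracking state-action estimates can look like, the sketch below stores them in a nested map and updates them from an episode's discounted returns in a Monte Carlo style. The `Table` alias and the `discounted_returns` and `record` functions are hypothetical names chosen for this example; they are assumptions and not the definitions used by `StateEstimates` or `MonteCarlo`.

```rust
use std::collections::HashMap;

/// Hypothetical table: state -> (action -> estimated value).
type Table = HashMap<&'static str, HashMap<&'static str, f64>>;

/// Discounted return for every step of an episode, computed backwards:
/// G_t = r_t + gamma * G_{t+1}.
fn discounted_returns(rewards: &[f64], gamma: f64) -> Vec<f64> {
    let mut returns = vec![0.0; rewards.len()];
    let mut g = 0.0;
    for (i, r) in rewards.iter().enumerate().rev() {
        g = r + gamma * g;
        returns[i] = g;
    }
    returns
}

/// Constant step-size Monte Carlo update: moves the stored estimate
/// toward the observed return by a fraction `alpha`.
/// Unseen state-action pairs are assumed to start at 0.0.
fn record(table: &mut Table, state: &'static str, action: &'static str, ret: f64, alpha: f64) {
    let value = table.entry(state).or_default().entry(action).or_insert(0.0);
    *value += alpha * (ret - *value);
}
```

Unlike a Q-learning update, which bootstraps from the best estimate of the next state, this update waits for the episode to finish and uses the full observed return.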