Module vrp_core::algorithms::mdp
This module contains the definition of a Markov Decision Process (MDP) model and related reinforcement learning logic.
Structs
ActionEstimates | Keeps track of action estimates. |
EpsilonGreedy | An e-greedy action selection strategy which acts greedily, except that it selects a random action with the specified probability. |
EpsilonWeighted | An e-weighted action selection strategy. |
Greedy | A greedy strategy. |
MonteCarlo | Applies the Monte Carlo learning strategy to calculate values for taken actions. |
QLearning | Applies the Q-learning strategy to calculate values for taken actions. |
Simulator | A simulator to train an agent over multiple episodes. |
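The selection and learning strategies listed above can be illustrated with a minimal, self-contained sketch. It does not use the types of this module: the function names `epsilon_greedy`, `q_learning_update` and `monte_carlo_update`, and the parameters `alpha` (learning rate) and `gamma` (discount factor), are assumptions introduced for illustration only. Random draws are passed in by the caller so that the example needs no external crate. The Monte Carlo update shown is the constant step-size variant, which may differ from what `MonteCarlo` does internally.

```rust
/// Hypothetical e-greedy selection over per-action value estimates.
/// `explore_draw` and `pick_draw` are uniform samples in [0, 1) supplied by the caller.
fn epsilon_greedy(
    estimates: &[f64],
    epsilon: f64,
    explore_draw: f64,
    pick_draw: f64,
) -> Option<usize> {
    if estimates.is_empty() {
        return None;
    }

    if explore_draw < epsilon {
        // Explore: map the second draw onto an action index.
        let idx = (pick_draw * estimates.len() as f64) as usize;
        return Some(idx.min(estimates.len() - 1));
    }

    // Exploit: take the index of the highest estimate.
    estimates
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(idx, _)| idx)
}

/// Textbook Q-learning update: moves the old estimate towards
/// `reward + gamma * next_max`, where `next_max` is the best estimate in the next state.
fn q_learning_update(old: f64, reward: f64, next_max: f64, alpha: f64, gamma: f64) -> f64 {
    old + alpha * (reward + gamma * next_max - old)
}

/// Constant step-size Monte Carlo update: moves the old estimate towards
/// the return observed at the end of an episode.
fn monte_carlo_update(old: f64, episode_return: f64, alpha: f64) -> f64 {
    old + alpha * (episode_return - old)
}
```

With `epsilon` set to 0 the selection above behaves like a purely greedy strategy; larger values trade exploitation for exploration.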
Traits
Agent | Represents an agent in an MDP. |
LearningStrategy | A learning strategy for the MDP. |
PolicyStrategy | A policy strategy for the MDP. |
State | Represents a state in an MDP. |
Type Definitions
StateEstimates | A type which keeps track of all state-action estimates. |
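As a rough illustration of what tracking all state-action estimates can look like, the sketch below uses a nested map from state to action to estimated value. The alias `Estimates`, the helper `best_action`, and the string keys are hypothetical and are not the definitions used by this module; the actual `StateEstimates` type may be laid out differently.

```rust
use std::collections::HashMap;

/// Hypothetical layout: state name -> (action name -> estimated value).
type Estimates = HashMap<&'static str, HashMap<&'static str, f64>>;

/// Returns the action with the highest estimate for the given state,
/// or `None` if the state is unknown or has no recorded actions.
fn best_action(estimates: &Estimates, state: &str) -> Option<(&'static str, f64)> {
    estimates
        .get(state)?
        .iter()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(action, value)| (*action, *value))
}
```

A policy strategy can then be seen as a rule for choosing among the actions stored for the current state, while a learning strategy decides how the stored values are updated.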