pub struct Mdp {
    pub n_states: usize,
    pub n_actions: usize,
    pub transition: Array3<f64>,
    pub reward: Array3<f64>,
    pub gamma: f64,
    pub terminal_states: Vec<usize>,
}
A finite Markov Decision Process.
Transition probabilities: T[s, a, s'] = P(s' | s, a).
Rewards: R[s, a, s'] (triple-index form; use Mdp::with_state_action_reward for 2-D rewards).
Fields
n_states: usize
Number of states.
n_actions: usize
Number of actions.
transition: Array3<f64>
Transition tensor (n_states × n_actions × n_states).
reward: Array3<f64>
Reward tensor (n_states × n_actions × n_states).
gamma: f64
Discount factor γ ∈ [0, 1).
terminal_states: Vec<usize>
Optional absorbing / terminal states (no transitions away).
Implementations
impl Mdp
pub fn new(
    n_states: usize,
    n_actions: usize,
    transition: Array3<f64>,
    reward: Array3<f64>,
    gamma: f64,
) -> Result<Self, OptimizeError>
Create a new MDP and validate it.
pub fn validate(&self) -> Result<(), OptimizeError>
Validate that transition probabilities sum to 1 for each (s, a).
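The row-sum check described above can be illustrated without the crate. The following is a minimal standalone sketch, not the crate's implementation: it uses nested Vecs in place of Array3<f64>, and the function name first_invalid_row and the tolerance parameter tol are assumptions made for this example.

```rust
/// Returns the first (s, a) pair whose outgoing probabilities do not sum
/// to 1 within `tol`, or `None` if every (s, a) row sums to 1.
/// `transition[s][a][s2]` plays the role of T[s, a, s'].
/// Hypothetical helper: the name and the tolerance are illustrative only.
fn first_invalid_row(transition: &[Vec<Vec<f64>>], tol: f64) -> Option<(usize, usize)> {
    for (s, per_action) in transition.iter().enumerate() {
        for (a, row) in per_action.iter().enumerate() {
            // Sum P(s' | s, a) over all successor states s'.
            let total: f64 = row.iter().sum();
            if (total - 1.0).abs() > tol {
                return Some((s, a));
            }
        }
    }
    None
}
```

A floating-point tolerance is used here because sums of probabilities rarely equal 1 exactly; the tolerance actually applied by validate is not stated in this page.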
pub fn expected_reward(&self) -> Array2<f64>
Expected reward R(s, a) = Σ_{s'} T[s, a, s'] · R[s, a, s'].
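As a worked illustration of this formula, the sketch below computes the same sum on nested Vecs instead of Array3<f64> / Array2<f64>. It is a standalone example under that assumption, not the crate's code; the name expected_reward_sketch is hypothetical.

```rust
/// Computes R(s, a) as the sum over s2 of T[s][a][s2] * R[s][a][s2].
/// Both inputs are indexed [state][action][next_state]; the result is
/// indexed [state][action]. Hypothetical standalone helper.
fn expected_reward_sketch(
    transition: &[Vec<Vec<f64>>],
    reward: &[Vec<Vec<f64>>],
) -> Vec<Vec<f64>> {
    transition
        .iter()
        .zip(reward)
        .map(|(t_s, r_s)| {
            t_s.iter()
                .zip(r_s)
                .map(|(t_sa, r_sa)| {
                    // Weight each successor-state reward by its probability.
                    t_sa.iter().zip(r_sa).map(|(p, r)| p * r).sum::<f64>()
                })
                .collect::<Vec<f64>>()
        })
        .collect()
}
```

For example, with T[0][0] = [0.5, 0.5] and R[0][0] = [2.0, 4.0], the expected reward for (s = 0, a = 0) is 0.5 · 2.0 + 0.5 · 4.0 = 3.0.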
Auto Trait Implementations
impl Freeze for Mdp
impl RefUnwindSafe for Mdp
impl Send for Mdp
impl Sync for Mdp
impl Unpin for Mdp
impl UnsafeUnpin for Mdp
impl UnwindSafe for Mdp
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.