Struct QLearner 

pub struct QLearner { /* private fields */ }

A simple reinforcement learning framework that can be used to learn optimal policies for Markov decision processes using Q-learning. Q-learning is a model-free reinforcement learning algorithm that learns an optimal action-value function from experience by repeatedly updating estimates of the Q-value of state-action pairs.
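
The update rule behind tabular Q-learning can be sketched as follows. This is a minimal, self-contained illustration and not this crate's implementation: `QTable`, its fields, and the extra `next_state` and `actions` parameters are assumptions introduced here, since the documented `update` method takes only a state, an action, and a reward.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a Q-table; not the crate's internal layout.
struct QTable {
    q: HashMap<(u64, u32), f64>,
    alpha: f64, // learning rate
    gamma: f64, // discount factor
}

impl QTable {
    fn new(alpha: f64, gamma: f64) -> Self {
        QTable { q: HashMap::new(), alpha, gamma }
    }

    /// Current estimate for a pair; unseen pairs are assumed to start at 0.0.
    fn get(&self, state: u64, action: u32) -> f64 {
        *self.q.get(&(state, action)).unwrap_or(&0.0)
    }

    /// Highest estimate among `actions` in `state` (0.0 if `actions` is empty).
    fn max_q(&self, state: u64, actions: &[u32]) -> f64 {
        actions
            .iter()
            .map(|&a| self.get(state, a))
            .reduce(f64::max)
            .unwrap_or(0.0)
    }

    /// Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    fn update(&mut self, state: u64, action: u32, reward: f64, next_state: u64, actions: &[u32]) {
        let target = reward + self.gamma * self.max_q(next_state, actions);
        let old = self.get(state, action);
        self.q.insert((state, action), old + self.alpha * (target - old));
    }
}
```

With `alpha = 0.5` and `gamma = 0.9`, a first reward of 10.0 moves an unseen pair halfway toward the target, and later updates in neighbouring states pick up a discounted share of that estimate.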

Implementations§

impl QLearner

pub fn update(&mut self, state: u64, action: u32, reward: f64)

Updates the Q-value for a state-action pair based on the reward received.

§Arguments
  • state - An integer representing the state.
  • action - An integer representing the action.
  • reward - A number representing the reward received for the action in the state.

pub fn get_best_action(&mut self, state: u64) -> i32

Returns the best action for a given state based on the current Q-values.

§Arguments
  • state - The current state.
§Returns
  • i32 - The action with the highest Q-value for the given state.
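
Selecting the best action amounts to an argmax over the Q-values recorded for one state. The sketch below shows that idea on a plain `HashMap`; the function name `best_action`, the table layout, the tie-breaking rule, and the `-1` result for an unknown state are all assumptions made for illustration (the signed return type hints at a sentinel, but the documentation does not say so).

```rust
use std::collections::HashMap;

/// Returns the action with the highest Q-value recorded for `state`.
/// Ties go to the lower action index so the result does not depend on
/// HashMap iteration order. Returns -1 when the state has no entries
/// (an assumed convention, not documented behaviour).
fn best_action(q: &HashMap<(u64, u32), f64>, state: u64) -> i32 {
    let mut best: Option<(u32, f64)> = None;
    for (&(s, a), &v) in q {
        if s != state {
            continue;
        }
        match best {
            // Keep the current best if it is strictly better, or equal with a lower index.
            Some((ba, bv)) if bv > v || (bv == v && ba < a) => {}
            _ => best = Some((a, v)),
        }
    }
    // The cast assumes action indices fit in an i32.
    best.map_or(-1, |(a, _)| a as i32)
}
```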

pub fn visit_matrix(&mut self, handler: Box<dyn FnMut(u64, u32, f64)>)

Visits all state-action pairs and calls the provided handler function for each pair.

§Arguments
  • handler - A function that is called for each state-action pair.
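
Because the handler is a `Box<dyn FnMut(u64, u32, f64)>`, the closure must own everything it captures (a boxed trait object defaults to a `'static` bound), so results are usually collected through shared ownership rather than a borrowed local. The sketch below illustrates this with a hypothetical stand-in, `visit_all`, that has the same handler type but walks a slice instead of a learner's table; `visit_all` and `collect` are names invented here.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in with the same handler type as `visit_matrix`.
fn visit_all(entries: &[(u64, u32, f64)], mut handler: Box<dyn FnMut(u64, u32, f64)>) {
    for &(state, action, q) in entries {
        handler(state, action, q);
    }
}

/// Collects every visited (state, action, q) triple into a vector.
fn collect(entries: &[(u64, u32, f64)]) -> Vec<(u64, u32, f64)> {
    let seen: Rc<RefCell<Vec<(u64, u32, f64)>>> = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&seen);
    // `move` hands the boxed closure its own handle to the shared vector.
    visit_all(
        entries,
        Box::new(move |s: u64, a: u32, q: f64| sink.borrow_mut().push((s, a, q))),
    );
    let out = seen.borrow().clone();
    out
}
```

The same pattern should carry over to a real `visit_matrix` call: build the `Rc<RefCell<...>>`, move a clone into the boxed closure, and read the original handle afterwards.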

pub fn pack(hints: &Vec<i32>, values: &Vec<i32>) -> u64

Constructs a state from given hints and condition values.

§Arguments
  • hints - A vector of integers giving the byte length of each provided value.
  • values - The condition values as discrete values.
§Returns
  • u64 - The packed state value.

pub fn unpack(hints: &Vec<i32>, state: u64) -> Vec<i32>

Deconstructs a state from given hints to get condition values.

§Arguments
  • hints - A vector of integers giving the byte length of each packed value.
  • state - The state integer to unpack.
§Returns
  • Vec<i32> - The condition values as discrete values.
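
One way to read "hints as byte lengths" is that each value gets a fixed-width field inside the 64-bit state. The sketch below shows such a scheme and its round trip. It is an assumption-laden illustration, not the crate's actual layout: the names `pack_sketch` and `unpack_sketch`, the low-bytes-first field order, the restriction to non-negative values, and the `Option` return for invalid input are all choices made here.

```rust
/// Packs each value into `hints[i]` bytes of a u64, first value in the lowest bytes.
/// Returns None if the lengths differ, a hint is outside 1..=4, the fields exceed
/// 8 bytes in total, or a value is negative or too large for its field.
fn pack_sketch(hints: &[i32], values: &[i32]) -> Option<u64> {
    if hints.len() != values.len() {
        return None;
    }
    let mut state = 0u64;
    let mut shift = 0u32;
    for (&h, &v) in hints.iter().zip(values) {
        if !(1..=4).contains(&h) {
            return None;
        }
        let bits = 8 * h as u32;
        if shift + bits > 64 {
            return None;
        }
        let mask = (1u64 << bits) - 1;
        if v < 0 || (v as u64) > mask {
            return None;
        }
        state |= (v as u64) << shift;
        shift += bits;
    }
    Some(state)
}

/// Reverses `pack_sketch` for the same hints.
fn unpack_sketch(hints: &[i32], state: u64) -> Option<Vec<i32>> {
    let mut values = Vec::with_capacity(hints.len());
    let mut shift = 0u32;
    for &h in hints {
        if !(1..=4).contains(&h) {
            return None;
        }
        let bits = 8 * h as u32;
        if shift + bits > 64 {
            return None;
        }
        let mask = (1u64 << bits) - 1;
        // For a 4-byte field the cast may wrap if `state` was not produced by `pack_sketch`.
        values.push(((state >> shift) & mask) as i32);
        shift += bits;
    }
    Some(values)
}
```

Packing the same hints and values always yields the same state, which is what lets a packed condition vector serve as a stable key for the `state` argument of the methods above.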

pub fn new(gamma: f64, alpha: f64, max_q: f64) -> QLearner

Creates a new QLearner from the given gamma, alpha, and max_q. All three arguments are required by this signature.

§Arguments
  • gamma - The discount factor for future rewards.
  • alpha - The learning rate for updating Q-values.
  • max_q - The maximum Q-value. 100.0 is the documented default; this signature requires the value to be passed explicitly.
§Returns
  • QLearner - The newly created QLearner object.

Trait Implementations§

impl Clone for QLearner

fn clone(&self) -> QLearner

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since 1.0.0.

impl Drop for QLearner

fn drop(&mut self)

Executes the destructor for this type.

impl IObject for QLearner

fn raw(&self) -> i64

fn obj(&self) -> &dyn IObject

fn as_any(&self) -> &dyn Any

fn get_id(&self) -> i32

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.