pub struct DqnConfig {
    pub gamma: f64,
    pub learning_rate: f64,
    pub batch_size: usize,
    pub buffer_capacity: usize,
    pub min_replay_size: usize,
    pub target_update_freq: usize,
    pub hidden_sizes: Vec<usize>,
    pub epsilon_start: f64,
    pub epsilon_end: f64,
    pub epsilon_decay_steps: usize,
}
Configuration for a DQN agent.
All hyperparameters live here. Pass this to DqnAgent::new().
The defaults reflect standard DQN practice suitable for moderately
complex environments. Simple environments like CartPole will want
smaller buffer/warmup values and faster epsilon decay.
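As a rough illustration of overriding the defaults for a simple environment, the sketch below uses a local stand-in for the struct. It assumes the type implements `Default` with the values documented under Fields; that impl, the helper name `cartpole_config`, and the specific CartPole numbers are assumptions for illustration, not values taken from the crate.

```rust
/// Local stand-in mirroring the documented fields (not the crate's own type).
#[derive(Debug, Clone, PartialEq)]
pub struct DqnConfig {
    pub gamma: f64,
    pub learning_rate: f64,
    pub batch_size: usize,
    pub buffer_capacity: usize,
    pub min_replay_size: usize,
    pub target_update_freq: usize,
    pub hidden_sizes: Vec<usize>,
    pub epsilon_start: f64,
    pub epsilon_end: f64,
    pub epsilon_decay_steps: usize,
}

/// Assumed `Default` impl, filled in with the documented default values.
impl Default for DqnConfig {
    fn default() -> Self {
        Self {
            gamma: 0.99,
            learning_rate: 1e-4,
            batch_size: 32,
            buffer_capacity: 100_000,
            min_replay_size: 10_000,
            target_update_freq: 1_000,
            hidden_sizes: vec![128, 128],
            epsilon_start: 1.0,
            epsilon_end: 0.05,
            epsilon_decay_steps: 50_000,
        }
    }
}

/// Hypothetical overrides for a small environment such as CartPole:
/// smaller buffer, shorter warm-up, faster epsilon decay.
pub fn cartpole_config() -> DqnConfig {
    DqnConfig {
        buffer_capacity: 10_000,
        min_replay_size: 1_000,
        epsilon_decay_steps: 5_000,
        // Struct update syntax keeps every other field at its default.
        ..DqnConfig::default()
    }
}
```

The resulting value would then be passed to DqnAgent::new(). Struct update syntax keeps the override list short and makes it obvious which hyperparameters deviate from the defaults.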
Fields§
gamma: f64
Discount factor γ. Controls how much future rewards are valued relative to immediate ones. Typical values: 0.95–0.999. Default: 0.99
learning_rate: f64
Learning rate for the Adam optimiser. Default: 1e-4
batch_size: usize
Number of experiences sampled per gradient update. Default: 32
buffer_capacity: usize
Maximum number of experiences in the replay buffer. The oldest experiences are overwritten when the buffer is full. Default: 100_000
min_replay_size: usize
Minimum number of experiences collected before training begins. During warm-up, actions are sampled randomly. Must be >= batch_size. Default: 10_000
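The size constraints among these fields can be checked up front. The sketch below is a hypothetical helper (`check_replay_sizes` is not part of the documented API). Only the `min_replay_size >= batch_size` rule comes from the documentation; the other two checks are assumptions that seem reasonable for a fixed-capacity buffer.

```rust
/// Hypothetical validation helper for the replay-related fields.
pub fn check_replay_sizes(
    batch_size: usize,
    min_replay_size: usize,
    buffer_capacity: usize,
) -> Result<(), String> {
    // Assumption: an empty batch is never meaningful.
    if batch_size == 0 {
        return Err("batch_size must be positive".to_string());
    }
    // Documented rule: warm-up must collect at least one full batch.
    if min_replay_size < batch_size {
        return Err(format!(
            "min_replay_size ({min_replay_size}) must be >= batch_size ({batch_size})"
        ));
    }
    // Assumption: if the buffer can never hold min_replay_size experiences,
    // a warm-up that waits on the buffer length would never finish.
    if buffer_capacity < min_replay_size {
        return Err(format!(
            "buffer_capacity ({buffer_capacity}) must be >= min_replay_size ({min_replay_size})"
        ));
    }
    Ok(())
}
```

Whether the agent itself panics, clamps, or returns an error on an invalid configuration is not stated here, so a check like this is best run before the configuration is handed over.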
target_update_freq: usize
Number of steps between hard target network updates.
The target network is a frozen copy of the online network used to compute stable TD targets. Updating it too frequently causes instability; too rarely slows learning. Default: 1_000
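A hard update can be pictured as a periodic wholesale copy of the online weights into the target network. The sketch below assumes weights stored as a flat `Vec<f64>` and a sync on every multiple of the frequency; both function names and the exact step at which the crate syncs are assumptions.

```rust
/// Returns true on the steps where the target network would be refreshed.
/// Assumption: sync happens on every positive multiple of `target_update_freq`.
pub fn should_sync_target(step: usize, target_update_freq: usize) -> bool {
    target_update_freq > 0 && step > 0 && step % target_update_freq == 0
}

/// Hard update: overwrite the target parameters with the online parameters.
/// (A soft/Polyak update would blend the two instead; that is not what is
/// described for this field.)
pub fn hard_update(target: &mut Vec<f64>, online: &[f64]) {
    target.clear();
    target.extend_from_slice(online);
}
```

Between syncs the target parameters stay untouched, which is what keeps the TD targets stable while the online network changes on every gradient step.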
hidden_sizes: Vec<usize>
Hidden layer sizes for the Q-network.
The network architecture is:
obs_size -> hidden[0] -> hidden[1] -> ... -> num_actions
All hidden layers use ReLU activations.
Default: [128, 128]
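To make the architecture line concrete, the sketch below expands a list of hidden sizes into per-layer (input, output) shapes and counts parameters. It assumes fully connected layers with a bias term, which is not confirmed here; `layer_shapes` and `param_count` are hypothetical helpers, not part of the documented API.

```rust
/// Expands obs_size -> hidden[0] -> ... -> num_actions into (in, out) pairs.
pub fn layer_shapes(
    obs_size: usize,
    hidden_sizes: &[usize],
    num_actions: usize,
) -> Vec<(usize, usize)> {
    let mut widths = Vec::with_capacity(hidden_sizes.len() + 2);
    widths.push(obs_size);
    widths.extend_from_slice(hidden_sizes);
    widths.push(num_actions);
    // Each adjacent pair of widths is one layer.
    widths.windows(2).map(|w| (w[0], w[1])).collect()
}

/// Parameter count, assuming dense layers with one bias per output unit.
pub fn param_count(shapes: &[(usize, usize)]) -> usize {
    shapes.iter().map(|&(inp, out)| inp * out + out).sum()
}
```

For a CartPole-sized problem (4 observations, 2 actions) with the default [128, 128], this gives three layers of shape (4, 128), (128, 128), and (128, 2), or 17,410 parameters under the dense-with-bias assumption.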
epsilon_start: f64
Starting epsilon for ε-greedy exploration. At step 0, actions are random with this probability. Default: 1.0
epsilon_end: f64
Final epsilon after decay is complete. Default: 0.05
epsilon_decay_steps: usize
Number of steps over which epsilon decays linearly from epsilon_start to epsilon_end. Default: 50_000
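The linear schedule described by the three epsilon fields can be sketched as a pure function of the step count. `epsilon_at` is a hypothetical helper; whether the crate counts steps from the start of training or from the end of warm-up is not stated here, so the step origin is an assumption.

```rust
/// Linearly interpolates from `start` to `end` over `decay_steps` steps,
/// then holds at `end`.
pub fn epsilon_at(step: usize, start: f64, end: f64, decay_steps: usize) -> f64 {
    // Guard against division by zero and clamp once decay is complete.
    if decay_steps == 0 || step >= decay_steps {
        return end;
    }
    let frac = step as f64 / decay_steps as f64;
    start + frac * (end - start)
}
```

With the defaults (1.0, 0.05, 50_000), epsilon is 1.0 at step 0, about 0.525 halfway through at step 25_000, and stays at 0.05 from step 50_000 onward.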
Auto Trait Implementations§
impl Freeze for DqnConfig
impl RefUnwindSafe for DqnConfig
impl Send for DqnConfig
impl Sync for DqnConfig
impl Unpin for DqnConfig
impl UnsafeUnpin for DqnConfig
impl UnwindSafe for DqnConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.