pub struct DeterministicPolicy { /* private fields */ }
Deterministic policy wrapper for DDPG/TD3.
Holds only the action bounds and exploration noise configuration; the
parameters θ of the policy network μ_θ(s) are managed externally.
Implementations§
impl DeterministicPolicy
pub fn with_bounds(action_dim: usize, action_low: f32, action_high: f32) -> Self
Create a policy with custom action bounds.
pub fn action_dim(&self) -> usize
Number of action dimensions.
pub fn clip_action(&self, action: &[f32]) -> RlResult<Vec<f32>>
Clip action to [action_low, action_high].
§Errors
RlError::DimensionMismatch if action.len() != action_dim.
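The clipping behaviour described above can be sketched without the crate. This is a minimal illustration, not the crate's implementation: `SketchError` and `clip_action_sketch` are hypothetical names, and the bounds and dimension are passed as plain arguments because the struct's fields are private.

```rust
/// Hypothetical stand-in for the crate's error type; only the variant
/// named in the documentation is modelled here.
#[derive(Debug, PartialEq)]
enum SketchError {
    DimensionMismatch { expected: usize, got: usize },
}

/// Check the action length, then clamp every component to `[low, high]`.
/// Assumes `low <= high` and non-NaN bounds (`f32::clamp` panics otherwise).
fn clip_action_sketch(
    action: &[f32],
    action_dim: usize,
    low: f32,
    high: f32,
) -> Result<Vec<f32>, SketchError> {
    if action.len() != action_dim {
        return Err(SketchError::DimensionMismatch {
            expected: action_dim,
            got: action.len(),
        });
    }
    Ok(action.iter().map(|&a| a.clamp(low, high)).collect())
}
```

With bounds [-1, 1], the input [-2.0, 0.5, 3.0] maps to [-1.0, 0.5, 1.0]; an input of the wrong length yields the dimension-mismatch error instead of a clipped vector.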
pub fn exploration_action(
    &self,
    action: &[f32],
    sigma: f32,
    handle: &mut RlHandle,
) -> RlResult<Vec<f32>>
Add Gaussian exploration noise and clip.
Returns clip(action + N(0, σ²), low, high).
§Errors
RlError::DimensionMismatch if action.len() != action_dim.
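The formula clip(action + N(0, σ²), low, high) can be illustrated with the standard library alone. The sketch below is an assumption-laden illustration: `SketchRng` is a hypothetical stand-in for whatever random source `RlHandle` provides (an LCG with a Box-Muller transform, chosen only to avoid external crates), and `exploration_action_sketch` is not the crate's method.

```rust
/// Hypothetical stand-in for the random source supplied by `RlHandle`.
struct SketchRng(u64);

impl SketchRng {
    /// Uniform sample strictly inside (0, 1) from one 64-bit LCG step.
    fn uniform(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 23 bits and offset by half a step so 0.0 and 1.0
        // are never returned.
        (((self.0 >> 41) as f32) + 0.5) / 8_388_608.0
    }

    /// Standard normal sample via the Box-Muller transform.
    fn standard_normal(&mut self) -> f32 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f32::consts::PI * u2).cos()
    }
}

/// Add independent N(0, sigma^2) noise to each component, then clamp to
/// `[low, high]`. Returns an error message on a length mismatch.
fn exploration_action_sketch(
    action: &[f32],
    action_dim: usize,
    low: f32,
    high: f32,
    sigma: f32,
    rng: &mut SketchRng,
) -> Result<Vec<f32>, String> {
    if action.len() != action_dim {
        return Err(format!(
            "dimension mismatch: expected {action_dim}, got {}",
            action.len()
        ));
    }
    Ok(action
        .iter()
        .map(|&a| (a + sigma * rng.standard_normal()).clamp(low, high))
        .collect())
}
```

Setting `sigma` to 0.0 reduces this to plain clipping, which is a convenient sanity check; for any `sigma`, every returned component stays inside the bounds.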
pub fn smooth_target_action(
    &self,
    action: &[f32],
    sigma: f32,
    clip_c: f32,
    handle: &mut RlHandle,
) -> RlResult<Vec<f32>>
TD3 target policy smoothing: add clipped noise to target actions.
ã = clip(action + clip(N(0, σ²), -c, c), low, high)
§Errors
RlError::DimensionMismatch if action.len() != action_dim.
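The two nested clips in the smoothing formula are easiest to follow with the randomness factored out. In the sketch below, `smooth_target_action_sketch` is a hypothetical helper, not the crate's method: it assumes the caller supplies pre-drawn standard normal samples in `unit_noise` (the real method draws them through `RlHandle`), so the result is deterministic.

```rust
/// Apply TD3-style target smoothing to `action`.
///
/// `unit_noise` holds one pre-drawn N(0, 1) sample per action component
/// (an assumption made here so the example is deterministic).
/// Returns `None` if the two slices differ in length.
/// Assumes `clip_c >= 0` and `low <= high` (`f32::clamp` panics otherwise).
fn smooth_target_action_sketch(
    action: &[f32],
    unit_noise: &[f32],
    sigma: f32,
    clip_c: f32,
    low: f32,
    high: f32,
) -> Option<Vec<f32>> {
    if action.len() != unit_noise.len() {
        return None;
    }
    Some(
        action
            .iter()
            .zip(unit_noise)
            .map(|(&a, &z)| {
                // Inner clip: bound the scaled noise to [-c, c].
                let eps = (sigma * z).clamp(-clip_c, clip_c);
                // Outer clip: keep the perturbed action inside the bounds.
                (a + eps).clamp(low, high)
            })
            .collect(),
    )
}
```

The inner clip limits how far the target action can move regardless of how extreme a noise sample is, while the outer clip keeps the result a valid action. For example, with sigma = 0.5 and c = 0.5, a noise sample of 10.0 contributes only 0.5 to the action.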
Trait Implementations§
impl Clone for DeterministicPolicy
fn clone(&self) -> DeterministicPolicy
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for DeterministicPolicy
impl RefUnwindSafe for DeterministicPolicy
impl Send for DeterministicPolicy
impl Sync for DeterministicPolicy
impl Unpin for DeterministicPolicy
impl UnsafeUnpin for DeterministicPolicy
impl UnwindSafe for DeterministicPolicy
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.