Crate stateset_rl_core

StateSet RL Core - High-performance Rust implementations for RL operations

This crate provides optimized implementations of performance-critical operations for the StateSet RL framework, exposed to Python via PyO3.

§Features

  • SIMD-accelerated advantage computation
  • Parallel trajectory processing
  • Efficient reward normalization
  • Fast GAE computation
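As a plain-Rust illustration of the simplest kernel in this list, batch reward normalization (mean 0, std 1) can be sketched as below. This is a scalar sketch, not the crate's SIMD-accelerated implementation, and the function name, `eps` parameter, and zero-guard behavior are illustrative assumptions rather than the crate's actual API:

```rust
/// Normalize a batch of rewards to zero mean and unit standard deviation.
/// A small epsilon guards against division by zero for constant rewards.
/// (Illustrative sketch; not the crate's SIMD implementation.)
fn batch_normalize_sketch(rewards: &[f64], eps: f64) -> Vec<f64> {
    if rewards.is_empty() {
        return Vec::new();
    }
    let n = rewards.len() as f64;
    let mean = rewards.iter().sum::<f64>() / n;
    let var = rewards.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt().max(eps);
    rewards.iter().map(|r| (r - mean) / std).collect()
}
```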

Structs§

RewardStatistics
Summary statistics computed over a set of rewards
RustTrajectory
Lightweight trajectory representation for Rust processing
RustTrajectoryGroup
Trajectory group for GRPO processing
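The listing above does not show these structs' fields. Purely as an illustration of how a trajectory group supports GRPO-style processing, a hypothetical minimal shape might look like the following, where each trajectory's advantage is taken relative to its own group's mean reward. All names and fields here are assumptions, not the crate's actual types:

```rust
/// Hypothetical minimal trajectory shape (the real struct's fields
/// are not shown in this listing).
struct Trajectory {
    reward: f64,
}

/// Hypothetical group of trajectories for GRPO-style processing.
struct TrajectoryGroup {
    trajectories: Vec<Trajectory>,
}

impl TrajectoryGroup {
    /// GRPO-style advantages: each trajectory's reward relative to the
    /// group mean, scaled by the group standard deviation. The group
    /// itself serves as the baseline.
    fn advantages(&self, eps: f64) -> Vec<f64> {
        let n = self.trajectories.len().max(1) as f64;
        let mean = self.trajectories.iter().map(|t| t.reward).sum::<f64>() / n;
        let var = self
            .trajectories
            .iter()
            .map(|t| (t.reward - mean).powi(2))
            .sum::<f64>()
            / n;
        let std = var.sqrt().max(eps);
        self.trajectories
            .iter()
            .map(|t| (t.reward - mean) / std)
            .collect()
    }
}
```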

Functions§

auto_scale_rewards
Apply reward scaling with automatic range detection
batch_normalize
Batch normalize rewards (mean 0, std 1)
batch_trajectories
Batch multiple trajectory groups for efficient processing
clip_rewards
Clip rewards to a range
compute_advantages_for_group
Compute advantages for a single group
compute_advantages_global_baseline
Compute advantages with a global baseline
compute_cumulative_rewards
Compute cumulative rewards for a trajectory
compute_discounted_rewards
Compute discounted cumulative rewards
compute_gae_internal
Compute GAE advantages from rewards and value estimates
compute_gae_with_dones
Compute GAE with done flags for episode boundaries
compute_lambda_returns
Compute lambda returns directly (alternative to GAE)
compute_returns
Compute returns (advantages + values) for value function training
exponential_moving_average
Compute exponential moving average of rewards
normalize_with_running_stats
Normalize rewards using Welford’s online algorithm for running statistics
shape_rewards
Reward shaping: transform sparse rewards to dense signals
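Several of the functions above are standard RL recurrences. Discounted cumulative rewards, for instance, are conventionally computed in a single backward pass via G_t = r_t + γ·G_{t+1}. The sketch below shows that textbook recurrence; the function name and signature are illustrative, not necessarily those of `compute_discounted_rewards`:

```rust
/// Discounted return at each step t: G_t = r_t + gamma * G_{t+1},
/// computed with a single backward pass over the reward sequence.
/// (Illustrative sketch of the standard recurrence.)
fn discounted_rewards_sketch(rewards: &[f64], gamma: f64) -> Vec<f64> {
    let mut out = vec![0.0; rewards.len()];
    let mut acc = 0.0;
    for t in (0..rewards.len()).rev() {
        acc = rewards[t] + gamma * acc;
        out[t] = acc;
    }
    out
}
```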
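GAE with done flags likewise has a well-known form: the TD error δ_t = r_t + γ·V(s_{t+1})·(1 − done_t) − V(s_t) is accumulated backward as A_t = δ_t + γ·λ·(1 − done_t)·A_{t+1}, with the done flag zeroing the bootstrap across episode boundaries. A sketch under the assumption that `compute_gae_with_dones` follows this standard recurrence (the signature here is illustrative):

```rust
/// Generalized Advantage Estimation with done flags.
/// delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
/// A_t     = delta_t + gamma * lambda * (1 - done_t) * A_{t+1}
/// (Illustrative sketch of the standard recurrence.)
fn gae_with_dones_sketch(
    rewards: &[f64],
    values: &[f64], // V(s_t) for each step
    dones: &[bool],
    gamma: f64,
    lam: f64,
) -> Vec<f64> {
    let n = rewards.len();
    let mut adv = vec![0.0; n];
    let mut acc = 0.0;
    for t in (0..n).rev() {
        let not_done = if dones[t] { 0.0 } else { 1.0 };
        // Bootstrap from the next state's value; zero past the last step.
        let next_value = if t + 1 < n { values[t + 1] } else { 0.0 };
        let delta = rewards[t] + gamma * next_value * not_done - values[t];
        acc = delta + gamma * lam * not_done * acc;
        adv[t] = acc;
    }
    adv
}
```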
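Finally, normalization with running statistics is documented above as using Welford's online algorithm. The textbook form of that update is shown below as a standalone sketch; the struct name, the epsilon, and the small-sample fallback are illustrative choices, not the crate's:

```rust
/// Welford's online algorithm: update a running mean and sum of squared
/// deviations one sample at a time, then normalize with the current
/// statistics. (Illustrative sketch.)
struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the current mean
}

impl RunningStats {
    fn new() -> Self {
        Self { count: 0, mean: 0.0, m2: 0.0 }
    }

    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        let delta2 = x - self.mean;
        self.m2 += delta * delta2;
    }

    /// Population standard deviation under the running statistics.
    fn std(&self) -> f64 {
        if self.count < 2 {
            return 1.0; // fall back until enough samples accumulate
        }
        (self.m2 / self.count as f64).sqrt()
    }

    fn normalize(&self, x: f64) -> f64 {
        (x - self.mean) / self.std().max(1e-8)
    }
}
```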