StateSet RL Core - High-performance Rust implementations for RL operations
This crate provides optimized implementations of performance-critical operations for the StateSet RL framework, exposed to Python via PyO3.
Features§
- SIMD-accelerated advantage computation
- Parallel trajectory processing
- Efficient reward normalization
- Fast GAE computation
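The GAE computation listed above follows the standard backward recursion A_t = δ_t + γλ·A_{t+1}, with δ_t = r_t + γ·V_{t+1} − V_t. A minimal scalar sketch of that recursion, assuming rewards and value estimates of equal length plus a bootstrap value for the final step (the function name and signature here are illustrative, not this crate's actual `compute_gae_internal` API):

```rust
/// Illustrative GAE recursion (not the crate's real signature):
///   delta[t] = r[t] + gamma * V[t+1] - V[t]
///   adv[t]   = delta[t] + gamma * lambda * adv[t+1]
/// `last_value` bootstraps V at the step after the final reward.
fn gae(rewards: &[f64], values: &[f64], last_value: f64, gamma: f64, lam: f64) -> Vec<f64> {
    let mut adv = vec![0.0; rewards.len()];
    let mut running = 0.0;
    for t in (0..rewards.len()).rev() {
        // Value of the next state: either the stored estimate or the bootstrap.
        let next_v = if t + 1 < values.len() { values[t + 1] } else { last_value };
        let delta = rewards[t] + gamma * next_v - values[t];
        running = delta + gamma * lam * running;
        adv[t] = running;
    }
    adv
}
```

The backward sweep makes the computation a single O(n) pass, which is what makes a vectorized or SIMD-accelerated Rust implementation worthwhile compared to a Python loop.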
Structs§
- RewardStatistics - Compute reward statistics
- RustTrajectory - Lightweight trajectory representation for Rust processing
- RustTrajectoryGroup - Trajectory group for GRPO processing
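In GRPO-style processing, a trajectory group holds several sampled trajectories for the same prompt, and advantages are computed relative to the group's own reward statistics. A sketch of that idea, with a hypothetical struct whose field names are illustrative and not this crate's actual `RustTrajectoryGroup` layout:

```rust
/// Hypothetical trajectory group; field name is illustrative only.
struct TrajectoryGroup {
    /// One scalar return per trajectory sampled for the same prompt.
    total_rewards: Vec<f64>,
}

impl TrajectoryGroup {
    /// Group-relative advantages: (R_i - mean) / std, the GRPO baseline.
    fn group_advantages(&self) -> Vec<f64> {
        let n = self.total_rewards.len() as f64;
        let mean = self.total_rewards.iter().sum::<f64>() / n;
        let var = self
            .total_rewards
            .iter()
            .map(|r| (r - mean).powi(2))
            .sum::<f64>()
            / n;
        // Guard against zero variance when all trajectories score equally.
        let std = var.sqrt().max(1e-8);
        self.total_rewards.iter().map(|r| (r - mean) / std).collect()
    }
}
```

Using the group mean as a baseline removes the need for a learned value function, which is the main appeal of group-based advantage estimation.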
Functions§
- auto_scale_rewards - Apply reward scaling with automatic range detection
- batch_normalize - Batch normalize rewards (mean 0, std 1)
- batch_trajectories - Batch multiple trajectory groups for efficient processing
- clip_rewards - Clip rewards to a range
- compute_advantages_for_group - Compute advantages for a single group
- compute_advantages_global_baseline - Compute advantages with a global baseline
- compute_cumulative_rewards - Compute cumulative rewards for a trajectory
- compute_discounted_rewards - Compute discounted cumulative rewards
- compute_gae_internal - Compute GAE advantages from rewards and value estimates
- compute_gae_with_dones - Compute GAE with done flags for episode boundaries
- compute_lambda_returns - Compute lambda returns directly (alternative to GAE)
- compute_returns - Compute returns (advantages + values) for value function training
- exponential_moving_average - Compute exponential moving average of rewards
- normalize_with_running_stats - Normalize rewards using Welford's online algorithm for running statistics
- shape_rewards - Reward shaping: transform sparse rewards to dense signals
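Welford's online algorithm, referenced by `normalize_with_running_stats`, maintains a running mean and variance in one pass without storing past rewards. A minimal sketch of the technique, assuming a standalone accumulator (the struct and method names are illustrative, not this crate's actual API):

```rust
/// Running mean/variance via Welford's online algorithm (illustrative only).
struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the current mean
}

impl RunningStats {
    fn new() -> Self {
        Self { count: 0, mean: 0.0, m2: 0.0 }
    }

    /// Fold one observation into the running statistics.
    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Population standard deviation; 1.0 until enough samples are seen.
    fn std(&self) -> f64 {
        if self.count < 2 {
            1.0
        } else {
            (self.m2 / self.count as f64).sqrt()
        }
    }

    /// Normalize a reward against the statistics accumulated so far.
    fn normalize(&self, x: f64) -> f64 {
        (x - self.mean) / self.std().max(1e-8)
    }
}
```

The one-pass update is numerically stabler than the naive sum-of-squares formula and suits streaming reward normalization, where rewards arrive trajectory by trajectory.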