Core types and traits for evolutionary deep reinforcement learning.
rlevo-core defines the shared vocabulary used across the entire rlevo
workspace. Every other crate — rlevo-reinforcement-learning,
rlevo-evolution, rlevo-environments, rlevo-benchmarks — depends on
these primitives. No concrete algorithms or environments live here.
§Module map
| Module | What it provides |
|---|---|
| base | Reward, Observation, State, Action, TensorConvertible, UpdateFunction: the primitive trait vocabulary |
| action | DiscreteAction, MultiDiscreteAction, ContinuousAction: layered action-space extensions |
| state | MarkovState, BeliefState, HiddenState, LatentState, StateAggregation: POMDP and latent-space extensions |
| environment | Environment, Snapshot, SnapshotBase, EpisodeStatus, EnvironmentError: the agent/environment protocol |
| reward | ScalarReward: the standard single-value concrete reward type |
| evaluation | BenchEnv, BenchStep, BenchError: object-safe environment interface for harnesses |
| fitness | BenchableAgent, FitnessEvaluable, Landscape, Metric, MetricsProvider: inference-only agent and fitness evaluation |
| render | AsciiRenderable, Renderer, styled/palette/payload sub-modules: optional debug and TUI visualization layer |
| agent | Reserved; empty in v0.1.x while the unified agent trait hierarchy stabilizes |
| util | Shared utility helpers |
§Const-generic RANK
Several traits — Observation, State, Action — are parameterized
by a const generic R (rank) that denotes the number of tensor axes
(equivalent to NumPy’s ndim or Burn’s Tensor<B, R>). This encodes
shape compatibility at compile time: a rank-1 observation and a rank-2
observation cannot be mixed up without a compile error.
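
The compile-time rank check described above can be illustrated with a minimal sketch. The trait `RankedObservation` and the types `VectorObs` and `GridObs` are hypothetical stand-ins invented for this example; the real `Observation<R>` trait in rlevo-core may declare different items.

```rust
// Hypothetical stand-in for a rank-parameterized trait. This is NOT the
// rlevo-core `Observation<R>` definition, only an illustration of the idea.
trait RankedObservation<const R: usize> {
    /// Extent of each of the R tensor axes.
    fn shape(&self) -> [usize; R];
}

/// Rank-1 observation: a flat feature vector.
struct VectorObs {
    values: Vec<f32>,
}

impl RankedObservation<1> for VectorObs {
    fn shape(&self) -> [usize; 1] {
        [self.values.len()]
    }
}

/// Rank-2 observation: a grid described by its row and column counts.
struct GridObs {
    rows: usize,
    cols: usize,
}

impl RankedObservation<2> for GridObs {
    fn shape(&self) -> [usize; 2] {
        [self.rows, self.cols]
    }
}

/// Accepts only rank-1 observations: the rank is part of the trait bound.
fn feature_count<O: RankedObservation<1>>(obs: &O) -> usize {
    obs.shape()[0]
}

/// Generic over any rank: total number of elements across all axes.
fn element_count<O, const R: usize>(obs: &O) -> usize
where
    O: RankedObservation<R>,
{
    obs.shape().iter().product()
}
```

Under these assumptions, `feature_count(&VectorObs { .. })` compiles, while passing a `GridObs` is rejected by the compiler because `GridObs` does not implement `RankedObservation<1>`.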
Environment<R, SR, AR>
├── StateType : State<SR> (shape: [usize; SR])
├── ObservationType: Observation<R> (shape: [usize; R])
├── ActionType : Action<AR> (shape: [usize; AR])
└── SnapshotType : Snapshot<R, ...>

§Episode loop sketch
The basic agent/environment interaction loop follows this pattern:
env.reset() → Snapshot { observation, reward, status: Running }
loop:
    agent selects action from observation
    env.step(action) → Snapshot { observation, reward, status }
    break when status.is_done()

EpisodeStatus::Terminated and EpisodeStatus::Truncated are kept
distinct so that RL algorithms can bootstrap value estimates correctly:
a truncated episode was cut off by an external limit, so the next
state's value still counts toward the target, whereas a terminated
episode has no future return.
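
The loop and the Terminated/Truncated distinction described above can be sketched in Rust as follows. Every name here (`Status`, `Snap`, `Corridor`, `td_target`, `run_episode`) is a hypothetical stand-in written for this example and does not reproduce the rlevo-core `Environment`, `Snapshot`, or `EpisodeStatus` signatures. The bootstrapping rule shown is the standard one-step TD target, assumed here for illustration.

```rust
// All names below are hypothetical stand-ins, not the rlevo-core API.

#[derive(Clone, Copy, Debug, PartialEq)]
enum Status {
    Running,
    Terminated, // the task itself ended (e.g. the goal was reached)
    Truncated,  // an external limit (here a step budget) cut the episode short
}

impl Status {
    fn is_done(self) -> bool {
        self != Status::Running
    }
}

struct Snap {
    observation: u32,
    reward: f64,
    status: Status,
}

/// Toy environment: start at cell 0; reaching `goal` terminates the
/// episode, and using up `max_steps` without reaching it truncates it.
struct Corridor {
    pos: u32,
    steps: u32,
    goal: u32,
    max_steps: u32,
}

impl Corridor {
    fn new(goal: u32, max_steps: u32) -> Self {
        Corridor { pos: 0, steps: 0, goal, max_steps }
    }

    fn reset(&mut self) -> Snap {
        self.pos = 0;
        self.steps = 0;
        Snap { observation: 0, reward: 0.0, status: Status::Running }
    }

    /// `advance == true` moves one cell right; `false` stays in place.
    fn step(&mut self, advance: bool) -> Snap {
        self.steps += 1;
        if advance {
            self.pos += 1;
        }
        let status = if self.pos >= self.goal {
            Status::Terminated
        } else if self.steps >= self.max_steps {
            Status::Truncated
        } else {
            Status::Running
        };
        let reward = if status == Status::Terminated { 1.0 } else { 0.0 };
        Snap { observation: self.pos, reward, status }
    }
}

/// One-step TD target: bootstrap from the next state's value unless the
/// episode truly terminated, in which case there is no future return.
fn td_target(reward: f64, gamma: f64, next_value: f64, status: Status) -> f64 {
    match status {
        Status::Terminated => reward,
        Status::Running | Status::Truncated => reward + gamma * next_value,
    }
}

/// Runs one episode with a fixed action and returns
/// (final status, steps taken, undiscounted return).
fn run_episode(env: &mut Corridor, advance: bool) -> (Status, u32, f64) {
    let mut snap = env.reset();
    let mut steps = 0;
    let mut total = 0.0;
    loop {
        let _obs = snap.observation; // a real agent would choose its action from this
        snap = env.step(advance);
        steps += 1;
        total += snap.reward;
        if snap.status.is_done() {
            break;
        }
    }
    (snap.status, steps, total)
}
```

If truncation were folded into termination, `td_target` would drop the `gamma * next_value` term at time-limit boundaries and systematically underestimate values for states that merely ran out of steps.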
Modules§
- action
  Layered action-space traits: DiscreteAction, MultiDiscreteAction, and ContinuousAction.
- agent
  Reserved for a future unified agent trait hierarchy.
- base
  Primitive trait vocabulary: Reward, Observation, State, Action, TensorConvertible, and UpdateFunction.
- environment
  Agent/environment interaction protocol.
- evaluation
  Object-safe environment interface for benchmarking harnesses.
- fitness
  Inference-only agent and fitness-evaluation traits.
- render
  Optional rendering layer for debug output and TUI visualization.
- reward
  Concrete reward types.
- state
  Advanced state abstractions for POMDPs and latent representations.
- util
  Shared utility helpers used across rlevo-core modules and consumers.