
Crate rlevo_core


Core types and traits for evolutionary deep reinforcement learning.

rlevo-core defines the shared vocabulary used across the entire rlevo workspace. Every other crate — rlevo-reinforcement-learning, rlevo-evolution, rlevo-environments, rlevo-benchmarks — depends on these primitives. No concrete algorithms or environments live here.

§Module map

Module        What it provides
base          Reward, Observation, State, Action, TensorConvertible, UpdateFunction: the primitive trait vocabulary
action        DiscreteAction, MultiDiscreteAction, ContinuousAction: layered action-space extensions
state         MarkovState, BeliefState, HiddenState, LatentState, StateAggregation: POMDP and latent-space extensions
environment   Environment, Snapshot, SnapshotBase, EpisodeStatus, EnvironmentError: the agent/environment protocol
reward        ScalarReward: the standard single-value reward concrete type
evaluation    BenchEnv, BenchStep, BenchError: object-safe environment interface for harnesses
fitness       BenchableAgent, FitnessEvaluable, Landscape, Metric, MetricsProvider: inference-only agent and fitness evaluation
render        AsciiRenderable, Renderer, styled/palette/payload sub-modules: optional debug and TUI visualization layer
agent         Reserved; empty in v0.1.x while the unified agent trait hierarchy stabilizes
util          Shared utility helpers

§Const-generic RANK

Several traits — Observation, State, Action — are parameterized by a const generic R (rank) that denotes the number of tensor axes (equivalent to NumPy’s ndim or Burn’s Tensor<B, R>). This encodes shape compatibility at compile time: a rank-1 observation and a rank-2 observation cannot be mixed up without a compile error.

Environment<R, SR, AR>
  ├── StateType       : State<SR>        (shape: [usize; SR])
  ├── ObservationType : Observation<R>   (shape: [usize; R])
  ├── ActionType      : Action<AR>       (shape: [usize; AR])
  └── SnapshotType    : Snapshot<R, ...>
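
The compile-time rank check can be illustrated with a minimal sketch. The trait and types below (`Observation`, `VectorObs`, `GridObs`, `flat_len`) are hypothetical stand-ins written for this example, not the actual rlevo-core definitions, which may carry more methods and bounds.

```rust
// Hypothetical stand-ins for illustration; NOT the rlevo-core definitions.

/// An observation with `R` tensor axes.
trait Observation<const R: usize> {
    /// Size of each axis, one entry per axis.
    fn shape(&self) -> [usize; R];
}

/// Rank-1 observation: a flat feature vector.
struct VectorObs {
    values: Vec<f32>,
}

impl Observation<1> for VectorObs {
    fn shape(&self) -> [usize; 1] {
        [self.values.len()]
    }
}

/// Rank-2 observation: a grid described by its row and column counts.
struct GridObs {
    rows: usize,
    cols: usize,
}

impl Observation<2> for GridObs {
    fn shape(&self) -> [usize; 2] {
        [self.rows, self.cols]
    }
}

/// Accepts only rank-1 observations; the rank is part of the trait bound.
fn flat_len<O: Observation<1>>(obs: &O) -> usize {
    obs.shape()[0]
}
```

Under these assumptions, `flat_len(&VectorObs { values: vec![0.0; 4] })` compiles, while passing a `GridObs` is rejected by the compiler because `GridObs` does not implement `Observation<1>`.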

§Episode loop sketch

The basic agent/environment interaction loop follows this pattern:

env.reset() → Snapshot { observation, reward, status: Running }
  loop:
    agent selects action from observation
    env.step(action) → Snapshot { observation, reward, status }
    break when status.is_done()
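
The loop above could look roughly as follows in Rust. This is a sketch under stated assumptions: `LineWalk`, its fields, the simplified non-generic `Snapshot`, and `run_episode` are toy stand-ins invented for this example. The real `Environment` trait is generic over ranks and may return `Result` values, which this sketch omits.

```rust
// Toy stand-ins for illustration; the real rlevo-core types and signatures may differ.

#[derive(Debug, Clone, Copy, PartialEq)]
enum EpisodeStatus {
    Running,
    Terminated, // the environment reached a terminal state
    Truncated,  // the episode was cut short, e.g. by a step limit
}

impl EpisodeStatus {
    fn is_done(self) -> bool {
        self != EpisodeStatus::Running
    }
}

struct Snapshot {
    observation: i32,
    reward: f32,
    status: EpisodeStatus,
}

/// Hypothetical environment: walk along a line toward position 0.
/// Terminates on reaching 0, truncates after `max_steps` steps.
struct LineWalk {
    start: i32,
    position: i32,
    steps: u32,
    max_steps: u32,
}

impl LineWalk {
    fn new(start: i32, max_steps: u32) -> Self {
        LineWalk { start, position: start, steps: 0, max_steps }
    }

    fn reset(&mut self) -> Snapshot {
        self.position = self.start;
        self.steps = 0;
        Snapshot { observation: self.position, reward: 0.0, status: EpisodeStatus::Running }
    }

    fn step(&mut self, action: i32) -> Snapshot {
        self.position += action;
        self.steps += 1;
        let status = if self.position == 0 {
            EpisodeStatus::Terminated
        } else if self.steps >= self.max_steps {
            EpisodeStatus::Truncated
        } else {
            EpisodeStatus::Running
        };
        // Each step costs one unit of reward.
        Snapshot { observation: self.position, reward: -1.0, status }
    }
}

/// Runs one episode and returns the total reward and the final status.
fn run_episode(env: &mut LineWalk) -> (f32, EpisodeStatus) {
    let mut snapshot = env.reset();
    let mut total = snapshot.reward;
    loop {
        // "Agent": a fixed policy that always steps toward zero.
        let action = if snapshot.observation > 0 { -1 } else { 1 };
        snapshot = env.step(action);
        total += snapshot.reward;
        if snapshot.status.is_done() {
            break;
        }
    }
    (total, snapshot.status)
}
```

With this toy environment, `run_episode(&mut LineWalk::new(3, 10))` ends in `Terminated` after three steps, while `LineWalk::new(5, 2)` hits the step limit first and ends in `Truncated`.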

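EpisodeStatus::Terminated and EpisodeStatus::Truncated are kept distinct so RL algorithms can bootstrap value estimates correctly: at a truncation boundary (for example, a step limit) the next state still has value and the target should bootstrap from it, whereas after termination there is no future return to bootstrap.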
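
As a worked illustration of why the distinction matters, here is a one-step TD target that treats the two cases differently. The `td_target` function and the local `EpisodeStatus` enum are hypothetical, written for this example only; rlevo-core itself contains no algorithms, so code like this would live in a downstream crate.

```rust
// Local stand-in for illustration; the real EpisodeStatus may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EpisodeStatus {
    Running,
    Terminated,
    Truncated,
}

/// One-step TD target: r + gamma * V(s'), except after true termination,
/// where there is no future return and the target is just r.
fn td_target(reward: f32, next_value: f32, gamma: f32, status: EpisodeStatus) -> f32 {
    match status {
        // Terminal state: nothing follows, so do not bootstrap.
        EpisodeStatus::Terminated => reward,
        // Truncation only stops data collection; the next state still has value.
        EpisodeStatus::Running | EpisodeStatus::Truncated => reward + gamma * next_value,
    }
}
```

Collapsing both cases into a single "done" flag would zero out the bootstrap term at truncation as well, which biases value estimates for states near a time limit.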

§Modules

action
Layered action-space traits: DiscreteAction, MultiDiscreteAction, and ContinuousAction.
agent
Reserved for a future unified agent trait hierarchy.
base
Primitive trait vocabulary: Reward, Observation, State, Action, TensorConvertible, and UpdateFunction.
environment
Agent/environment interaction protocol.
evaluation
Object-safe environment interface for benchmarking harnesses.
fitness
Inference-only agent and fitness-evaluation traits.
render
Optional rendering layer for debug output and TUI visualization.
reward
Concrete reward types.
state
Advanced state abstractions for POMDPs and latent representations.
util
Shared utility helpers used across rlevo-core modules and consumers.