Crate rlevo_core


Core types and traits for evolutionary deep reinforcement learning.

rlevo-core defines the shared vocabulary used across the entire rlevo workspace. Every other crate — rlevo-reinforcement-learning, rlevo-evolution, rlevo-environments, rlevo-benchmarks — depends on these primitives. No concrete algorithms or environments live here.

§Module map

Module	What it provides
`base`	`Reward`, `Observation`, `State`, `Action`, `TensorConvertible`, `UpdateFunction` — the primitive trait vocabulary
`action`	`DiscreteAction`, `MultiDiscreteAction`, `ContinuousAction` — layered action-space extensions
`state`	`MarkovState`, `BeliefState`, `HiddenState`, `LatentState`, `StateAggregation` — POMDP and latent-space extensions
`environment`	`Environment`, `Snapshot`, `SnapshotBase`, `EpisodeStatus`, `EnvironmentError` — the agent/environment protocol
`reward`	`ScalarReward` — the standard single-value reward concrete type
`evaluation`	`BenchEnv`, `BenchStep`, `BenchError` — object-safe environment interface for harnesses
`fitness`	`BenchableAgent`, `FitnessEvaluable`, `Landscape`, `Metric`, `MetricsProvider` — inference-only agent and fitness evaluation
`render`	`AsciiRenderable`, `Renderer`, styled/palette/payload sub-modules — optional debug and TUI visualization layer
`agent`	Reserved; empty in v0.1.x while the unified agent trait hierarchy stabilizes
`util`	Shared utility helpers

§Const-generic `RANK`

Several traits (Observation, State, Action) are parameterized by a const generic R (rank) that denotes the number of tensor axes, equivalent to NumPy’s ndim or the R in Burn’s Tensor<B, R>. This encodes rank compatibility at compile time: passing a rank-1 observation where a rank-2 observation is expected is a compile error rather than a runtime shape mismatch.

Environment<R, SR, AR>
  ├── StateType     : State<SR>        (shape: [usize; SR])
  ├── ObservationType: Observation<R>  (shape: [usize; R])
  ├── ActionType    : Action<AR>       (shape: [usize; AR])
  └── SnapshotType  : Snapshot<R, ...>
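
The compile-time rank check described above can be illustrated with a small, self-contained sketch. The `Shaped` trait and the `Vector` and `Image` types are hypothetical stand-ins invented for this example; they are not the rlevo-core definitions of `Observation`, `State`, or `Action`, whose exact signatures are not shown on this page.

```rust
/// Hypothetical stand-in for a rank-parameterized trait (not the rlevo-core API).
/// `R` is the number of axes, so `shape()` returns exactly `R` extents.
trait Shaped<const R: usize> {
    fn shape(&self) -> [usize; R];
}

/// A rank-1 value: a flat vector of `len` elements.
struct Vector {
    len: usize,
}

/// A rank-2 value: a `height` x `width` grid.
struct Image {
    height: usize,
    width: usize,
}

impl Shaped<1> for Vector {
    fn shape(&self) -> [usize; 1] {
        [self.len]
    }
}

impl Shaped<2> for Image {
    fn shape(&self) -> [usize; 2] {
        [self.height, self.width]
    }
}

/// Accepts only rank-1 values; the bound is checked at compile time.
fn rank1_len<T: Shaped<1>>(x: &T) -> usize {
    x.shape()[0]
}

/// Works for any rank by multiplying all extents.
fn num_elements<T, const R: usize>(x: &T) -> usize
where
    T: Shaped<R>,
{
    x.shape().iter().product()
}

fn main() {
    let v = Vector { len: 4 };
    let img = Image { height: 3, width: 5 };

    assert_eq!(rank1_len(&v), 4);
    assert_eq!(num_elements::<Image, 2>(&img), 15);

    // Uncommenting the next line fails to compile, because `Image`
    // implements `Shaped<2>` but not `Shaped<1>`:
    // rank1_len(&img);
}
```

Because the rank is part of the trait bound, a mismatch is rejected by the type checker instead of surfacing later as a runtime shape error.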

§Episode loop sketch

The basic agent/environment interaction loop follows this pattern:

env.reset() → Snapshot { observation, reward, status: Running }
  loop:
    agent selects action from observation
    env.step(action) → Snapshot { observation, reward, status }
    break when status.is_done()
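
As a rough illustration, the loop above might look like the following in Rust. Everything here is an assumption made for the sketch: `Status`, `Snapshot`, and the toy `Corridor` environment are simplified, hypothetical types that only mirror the names used in the pattern, and they do not reproduce the actual `Environment`, `Snapshot`, or `EpisodeStatus` signatures from rlevo-core.

```rust
/// Hypothetical, simplified stand-in for an episode status.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Running,
    Terminated,
    Truncated,
}

impl Status {
    /// An episode is done once it is no longer running.
    fn is_done(self) -> bool {
        !matches!(self, Status::Running)
    }
}

/// Hypothetical snapshot: what the environment returns from reset and step.
struct Snapshot {
    observation: i32,
    reward: f32,
    status: Status,
}

/// Toy environment: walk along a line until the goal or a step limit is reached.
struct Corridor {
    position: i32,
    steps: u32,
}

impl Corridor {
    const GOAL: i32 = 3;
    const MAX_STEPS: u32 = 10;

    fn reset(&mut self) -> Snapshot {
        self.position = 0;
        self.steps = 0;
        Snapshot { observation: self.position, reward: 0.0, status: Status::Running }
    }

    fn step(&mut self, action: i32) -> Snapshot {
        self.position += action;
        self.steps += 1;
        let status = if self.position >= Self::GOAL {
            Status::Terminated // reached the goal: a true end of the task
        } else if self.steps >= Self::MAX_STEPS {
            Status::Truncated // ran out of time: cut off, not finished
        } else {
            Status::Running
        };
        let reward = if status == Status::Terminated { 1.0 } else { 0.0 };
        Snapshot { observation: self.position, reward, status }
    }
}

/// Runs one episode and returns the total reward and the final status.
fn run_episode(env: &mut Corridor, policy: impl Fn(i32) -> i32) -> (f32, Status) {
    let mut snapshot = env.reset();
    let mut total = snapshot.reward;
    loop {
        let action = policy(snapshot.observation); // agent selects an action
        snapshot = env.step(action);
        total += snapshot.reward;
        if snapshot.status.is_done() {
            break;
        }
    }
    (total, snapshot.status)
}

fn main() {
    let mut env = Corridor { position: 0, steps: 0 };
    // Always stepping right reaches the goal after three steps.
    assert_eq!(run_episode(&mut env, |_| 1), (1.0, Status::Terminated));
    // Standing still never reaches the goal, so the step limit cuts the episode off.
    assert_eq!(run_episode(&mut env, |_| 0), (0.0, Status::Truncated));
}
```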

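EpisodeStatus::Terminated and EpisodeStatus::Truncated are kept distinct so that RL algorithms can bootstrap correctly: when an episode is truncated (for example by a step limit) the value of the next state should still be bootstrapped, whereas a terminated episode has no future return.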
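
To make the distinction concrete, here is a minimal sketch of a one-step TD target that treats the two cases differently. The `Status` enum and the `td_target` function are hypothetical names introduced for this example; they are not part of rlevo-core, and a real algorithm would obtain `next_value` from its own value estimator.

```rust
/// Hypothetical, simplified stand-in for an episode status.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Running,
    Terminated,
    Truncated,
}

/// One-step TD target: reward + gamma * V(next_state), except that a
/// terminated episode has no future return, so the bootstrap term is dropped.
fn td_target(reward: f32, gamma: f32, next_value: f32, status: Status) -> f32 {
    match status {
        // True end of the task: nothing follows, so do not bootstrap.
        Status::Terminated => reward,
        // Still running, or cut off by a limit: the next state still has value.
        Status::Running | Status::Truncated => reward + gamma * next_value,
    }
}

fn main() {
    let (reward, gamma, next_value) = (1.0, 0.5, 4.0);
    assert_eq!(td_target(reward, gamma, next_value, Status::Terminated), 1.0);
    assert_eq!(td_target(reward, gamma, next_value, Status::Truncated), 3.0);
    assert_eq!(td_target(reward, gamma, next_value, Status::Running), 3.0);
}
```

Collapsing both cases into a single "done" flag would zero the bootstrap term at truncation too, which biases value estimates for states near a time limit.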

Modules§

action: Layered action-space traits: DiscreteAction, MultiDiscreteAction, and ContinuousAction.
agent: Reserved for a future unified agent trait hierarchy.
base: Primitive trait vocabulary: Reward, Observation, State, Action, TensorConvertible, and UpdateFunction.
environment: Agent/environment interaction protocol.
evaluation: Object-safe environment interface for benchmarking harnesses.
fitness: Inference-only agent and fitness-evaluation traits.
render: Optional rendering layer for debug output and TUI visualization.
reward: Concrete reward types.
state: Advanced state abstractions for POMDPs and latent representations.
util: Shared utility helpers used across rlevo-core modules and consumers.
