Expand description
The Cervo runtime is a combined front for multiple inference models, managing data routing and offering higher-level constructs for execution.
The goal of the runtime is to simplify batched execution, especially in the face of time-slotted execution.
use cervo_asset::{AssetData, AssetKind};
use cervo_runtime::{BrainId, Runtime, AgentId};
use std::time::Duration;
let mut runtime = Runtime::new();
let racer_asset = load_model("racing-car");
let racer_inferer = racer_asset.load_basic()?;
let racer_infer_key = runtime.add_inferer(racer_inferer);
let truck_asset = load_model("monster-truck");
let truck_inferer = truck_asset.load_basic()?;
let truck_infer_key = runtime.add_inferer(truck_inferer);
game::observe_racecars(racer_infer_key, &mut runtime);
game::observe_trucks(truck_infer_key, &mut runtime);
let responses = runtime.run_for(Duration::from_millis(1))?;
game::assign_actions(responses);
§Notes on time-slotted execution
There is a very experimental feature for using time-slotted execution based on collected performance metrics. Each time a model is executed the runtime records the batch size and time-cost. This is then used to estimate the cost of future batches.
The implementation currently requires that whichever model is first in
the queue for execution gets to run. This is to ensure that models
don’t end up in back-off. This means that
Runtime::run_for(Duration::from_millis(0))
will still execute one
model.
Once the first model has executed, models will be processed in order, starting by the one that has waited longest for running. Models that would take too long to run will be skipped and end up at the back of the queue. This ensures that the first skipped model is at the start of the queue next round.
The estimation algorithm uses Welford’s Online Algorithm which can integrate mean and variance without requiring extra storage. However, this update method can be quite unstable with few samples. This can lead to some stuttering early on by underestimating cost, or running too few models by overestimation.
Structs§
- Identifier for a specific brain.
- The runtime wraps a multitude of inference models with batching support, and support for time-limited execution.
Enums§
- Errors that can be returned by Cervo Runtime.
Type Aliases§
- Identifier for a specific agent.