Module worker

Local-GPU worker — two-tier supervision adapter (doc §4, §5.3).

WorkerActor is the stable parent: it is addressable, supervised by the engine-core, and never restarted. Its child, ContextActor, is restartable and owns the runtime-specific resources (CUDA context, weights, etc.). When the runner reports CudaContextPoisoned, the parent panics with the rakka_accel::cuda::error::CONTEXT_POISONED_TAG marker so that rakka_accel::cuda::error::device_supervisor_strategy routes the failure to Directive::Restart.

The supervision policy (3 retries / 60s, decider, marker tags) is reused verbatim from rakka-accel’s error module, which is the upstream substrate for the doc’s §5.11 two-tier pattern. What this crate adds is the runtime-polymorphic Box<dyn ModelRunner> slot, which is inference-specific.
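
The marker-based routing and the 3 retries / 60s budget can be illustrated with a minimal, standard-library-only sketch. It is not rakka-accel's implementation: the tag value, the `RestartBudget` type, the locally defined `Directive` enum, and the choice to return `Stop` for unmarked failures or an exhausted budget are all assumptions made for illustration.

```rust
use std::time::{Duration, Instant};

/// Hypothetical marker value; the real constant lives in
/// rakka_accel::cuda::error and may look different.
const CONTEXT_POISONED_TAG: &str = "[context-poisoned]";

/// Local stand-in for the supervisor directive (assumption: only two variants shown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Directive {
    Restart,
    Stop,
}

/// Hypothetical restart budget: at most `max_retries` restarts per `window`.
struct RestartBudget {
    max_retries: usize,
    window: Duration,
    restarts: Vec<Instant>,
}

impl RestartBudget {
    fn new(max_retries: usize, window: Duration) -> Self {
        Self { max_retries, window, restarts: Vec::new() }
    }

    /// Decide what to do with a failed child, given its panic message.
    fn decide(&mut self, panic_msg: &str, now: Instant) -> Directive {
        // Only failures carrying the marker are treated as restartable.
        // Assumption: anything else stops the child.
        if !panic_msg.contains(CONTEXT_POISONED_TAG) {
            return Directive::Stop;
        }
        // Forget restarts that have aged out of the window.
        let window = self.window;
        self.restarts
            .retain(|t| now.saturating_duration_since(*t) < window);
        // Assumption: an exhausted budget stops the child.
        if self.restarts.len() >= self.max_retries {
            return Directive::Stop;
        }
        self.restarts.push(now);
        Directive::Restart
    }
}
```

With `RestartBudget::new(3, Duration::from_secs(60))`, three marked failures in quick succession each yield `Directive::Restart`, a fourth inside the same window yields `Directive::Stop`, and the budget is available again once the earlier restarts are more than 60 seconds old.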

Per-runtime crates supply the runner via the WorkerSlot factory. Remote runtimes go through inference-remote-core::RemoteWorkerActor instead.
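
The factory hand-off described above can be sketched as follows. Everything here is a simplified assumption rather than this crate's actual API: the `ModelRunner` trait is reduced to a single hypothetical method, `EchoRunner`, `SlotFactory`, and `Parent` are invented for illustration, and the actor plumbing (mailboxes, `Send` bounds, supervision) is omitted.

```rust
/// Hypothetical, heavily reduced stand-in for the crate's runner trait.
trait ModelRunner {
    fn name(&self) -> &str;
}

/// Hypothetical runner a per-runtime crate might provide.
struct EchoRunner;

impl ModelRunner for EchoRunner {
    fn name(&self) -> &str {
        "echo"
    }
}

/// What the parent hands to its child on construction.
struct WorkerSlot {
    runner: Box<dyn ModelRunner>,
}

/// Factory supplied by the per-runtime crate; invoked once per (re)build.
type SlotFactory = Box<dyn Fn() -> WorkerSlot>;

/// Simplified parent: keeps the factory so it can rebuild the child's slot.
struct Parent {
    factory: SlotFactory,
    generation: u32,
    child: Option<WorkerSlot>,
}

impl Parent {
    fn new(factory: SlotFactory) -> Self {
        Self { factory, generation: 0, child: None }
    }

    /// Drop the old slot (and with it the old runner) and construct a
    /// fresh one, so the child starts anew.
    fn rebuild(&mut self) {
        self.child = Some((self.factory)());
        self.generation += 1;
    }
}
```

Keeping the factory in the parent rather than the runner itself means a poisoned runner is never reused: each rebuild drops the previous `Box<dyn ModelRunner>` and asks the per-runtime crate for a new one.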

Structs§

ContextActor: Restartable child holding the CUDA context (or the remote-network analogue). Distinct from rakka_accel::cuda::device::ContextActor: that one specialises to CUDA memory / streams, while this one holds the polymorphic Box<dyn ModelRunner> so the same supervision shape covers remote-network runners too.
WorkerActor
WorkerSlot: What the parent hands to its child on construction. The runner owns the GPU context indirectly (via cudarc::driver::CudaContext, rakka_accel::cuda::device::DeviceState, or whatever the backend uses); when the parent decides to rebuild, it constructs a fresh WorkerSlot and the child cell starts anew.

Enums§

ContextMsg
WorkerMsg
