pub struct SwarmHealthMonitor { /* private fields */ }
Swarm health monitor — tracks worker health, detects failures, and provides aggregate telemetry.
§Heartbeat Protocol
Workers send heartbeats at regular intervals. If a worker misses
max_missed_beats consecutive heartbeats, it is marked as Unhealthy.
After max_missed_beats * 2 missed beats, it is marked as Dead.
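The thresholds above can be sketched as a small classification function. This is an illustration only, not the crate's implementation: the `HealthStatus` enum, the `missed_beats` and `classify` helpers, the idea of deriving missed beats from elapsed seconds, and the inclusive (`>=`) boundaries are all assumptions made for the example.

```rust
/// Hypothetical status enum; the variant names mirror the states named in the prose.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthStatus {
    Healthy,
    Unhealthy,
    Dead,
}

/// Number of whole heartbeat intervals that have passed since the last beat.
/// Assumption: missed beats are derived from elapsed time.
pub fn missed_beats(secs_since_last_beat: u64, heartbeat_interval_secs: u64) -> u32 {
    if heartbeat_interval_secs == 0 {
        // Guard against division by zero; a zero interval counts as "never late".
        return 0;
    }
    (secs_since_last_beat / heartbeat_interval_secs).min(u32::MAX as u64) as u32
}

/// Map a missed-beat count onto a status using the documented thresholds:
/// `max_missed_beats` for Unhealthy and `max_missed_beats * 2` for Dead.
/// Assumption: both boundaries are inclusive.
pub fn classify(missed: u32, max_missed_beats: u32) -> HealthStatus {
    let dead_threshold = max_missed_beats.saturating_mul(2);
    if missed >= dead_threshold {
        HealthStatus::Dead
    } else if missed >= max_missed_beats {
        HealthStatus::Unhealthy
    } else {
        HealthStatus::Healthy
    }
}
```

Under these assumptions, with `max_missed_beats = 3` a worker would become Unhealthy at 3 missed beats and Dead at 6.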
§Dead-Agent Detection
The health monitor periodically scans all tracked workers. Workers that
have been Dead for longer than replacement_timeout_secs are candidates
for automatic replacement.
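A minimal sketch of how replacement candidates might be selected, assuming the monitor remembers the instant at which each worker was first marked Dead. The `replacement_candidates` function and the `dead_since` map are hypothetical names for this example; the real type keeps its fields private.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Return the roles that have been dead for strictly longer than the timeout.
/// `dead_since` maps a worker role to the moment it was marked Dead
/// (a hypothetical bookkeeping structure, not the crate's actual layout).
pub fn replacement_candidates(
    dead_since: &HashMap<String, Instant>,
    now: Instant,
    replacement_timeout_secs: u64,
) -> Vec<String> {
    let timeout = Duration::from_secs(replacement_timeout_secs);
    let mut roles: Vec<String> = dead_since
        .iter()
        // saturating_duration_since yields zero if `since` is later than `now`.
        .filter(|(_, since)| now.saturating_duration_since(**since) > timeout)
        .map(|(role, _)| role.clone())
        .collect();
    // HashMap iteration order is unspecified, so sort for stable output.
    roles.sort();
    roles
}
```

Passing `now` as a parameter rather than calling `Instant::now()` inside the function keeps the check deterministic and easy to test.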
§Metrics
The monitor tracks task throughput, worker utilization, error rates, and
communication latency. These are exposed via metrics().
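To illustrate how such aggregates could be derived from per-worker counters, here is a hedged sketch. The `WorkerCounters` and `AggregateMetrics` structs and their fields are invented for this example and are not the fields of `WorkerTelemetry` or `SwarmMetrics`, which this page does not list. Throughput and latency are left out because they would require timing data.

```rust
/// Hypothetical per-worker counters; field names are illustrative only.
#[derive(Debug, Clone, Default)]
pub struct WorkerCounters {
    pub tasks_completed: u64,
    pub tasks_failed: u64,
    /// True while the worker has a task in flight.
    pub busy: bool,
}

/// Hypothetical aggregate view across all workers.
#[derive(Debug, Clone, PartialEq)]
pub struct AggregateMetrics {
    pub total_completed: u64,
    pub total_failed: u64,
    /// Failed tasks divided by all finished tasks; 0.0 when nothing has finished.
    pub error_rate: f64,
    /// Fraction of workers currently busy; 0.0 when there are no workers.
    pub utilization: f64,
}

pub fn aggregate(workers: &[WorkerCounters]) -> AggregateMetrics {
    let total_completed: u64 = workers.iter().map(|w| w.tasks_completed).sum();
    let total_failed: u64 = workers.iter().map(|w| w.tasks_failed).sum();
    let finished = total_completed + total_failed;
    let busy = workers.iter().filter(|w| w.busy).count();
    AggregateMetrics {
        total_completed,
        total_failed,
        error_rate: if finished == 0 {
            0.0
        } else {
            total_failed as f64 / finished as f64
        },
        utilization: if workers.is_empty() {
            0.0
        } else {
            busy as f64 / workers.len() as f64
        },
    }
}
```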
Implementations§
impl SwarmHealthMonitor
pub fn new(
    heartbeat_interval_secs: u64,
    max_missed_beats: u32,
    replacement_timeout_secs: u64,
) -> Self
Create a new health monitor with custom parameters.
pub fn register_worker(&mut self, role: &str)
Register a new worker for health tracking.
pub fn task_started(&mut self, role: &str)
Record that a worker started a task.
pub fn task_completed(&mut self, role: &str)
Record that a worker completed a task successfully.
pub fn task_failed(&mut self, role: &str)
Record that a worker’s task failed.
pub fn record_error(&mut self, role: &str)
Record an error from a worker.
pub fn message_sent(&mut self, role: &str)
Record a message sent by a worker.
pub fn message_received(&mut self, role: &str)
Record a message received by a worker.
pub fn check_health(&mut self) -> Vec<String>
Scan all workers and update their health status based on heartbeat timing. Returns a list of workers that have been detected as dead.
pub fn worker_telemetry(&self, role: &str) -> Option<WorkerTelemetry>
Get telemetry for a specific worker.
pub fn all_worker_telemetry(&self) -> Vec<WorkerTelemetry>
Get telemetry for all workers.
pub fn metrics(&self) -> SwarmMetrics
Get aggregate swarm metrics.
pub fn dead_workers_for_replacement(&self) -> Vec<String>
Get workers that are candidates for replacement, i.e. those that have been Dead for longer than replacement_timeout_secs.
pub fn remove_worker(&mut self, role: &str)
Remove a worker from tracking (after replacement).
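One way the replacement flow might be wired together, shown as a hedged sketch. The `WorkerTracking` trait is a hypothetical stand-in that copies three method signatures from this page so the loop can be exercised without the real type. The `spawn` callback is likewise an assumed hook for starting a substitute worker, and re-registering the same role after removal is an assumption about the intended workflow.

```rust
/// Hypothetical trait mirroring three signatures documented on this page.
pub trait WorkerTracking {
    fn dead_workers_for_replacement(&self) -> Vec<String>;
    fn remove_worker(&mut self, role: &str);
    fn register_worker(&mut self, role: &str);
}

/// For each replacement candidate: stop tracking the dead worker, start a
/// substitute through the caller-supplied `spawn` hook, then track the role
/// again. Returns the roles that were replaced.
pub fn replace_dead_workers<M, F>(monitor: &mut M, mut spawn: F) -> Vec<String>
where
    M: WorkerTracking,
    F: FnMut(&str),
{
    let candidates = monitor.dead_workers_for_replacement();
    for role in &candidates {
        monitor.remove_worker(role.as_str());
        spawn(role.as_str());
        monitor.register_worker(role.as_str());
    }
    candidates
}
```

Collecting the candidates into a `Vec` before mutating the monitor avoids holding a shared borrow across the `&mut self` calls.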
pub fn worker_count(&self) -> usize
Get the number of tracked workers.
pub fn format_status(&self) -> String
Format health status for logging.
Trait Implementations§
impl Clone for SwarmHealthMonitor
fn clone(&self) -> SwarmHealthMonitor
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for SwarmHealthMonitor
Auto Trait Implementations§
impl Freeze for SwarmHealthMonitor
impl RefUnwindSafe for SwarmHealthMonitor
impl Send for SwarmHealthMonitor
impl Sync for SwarmHealthMonitor
impl Unpin for SwarmHealthMonitor
impl UnsafeUnpin for SwarmHealthMonitor
impl UnwindSafe for SwarmHealthMonitor
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> FutureExt for T
fn with_context(self, otel_cx: Context) -> WithContext<Self>
fn with_current_context(self) -> WithContext<Self>
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request