flux-keeper 🏥
Health monitoring, stuck detection, and watchdog for agent fleets. Tracks vitals (energy, memory, task completion, response time) and triggers alerts, recovery, or apoptosis on failure cascades.
use ;
let mut kp = new;
kp.add_check;
kp.report_ok; // heartbeat received
kp.report_failure; // first failure → Warning
let alerts = kp.tick; // periodic eval
println!;
Why Keeper?
Agents get sick. They OOM, deadlock, lose connectivity, or spin in loops. Keeper catches it before it cascades:
- Configurable per-check — interval, timeout, failure threshold
- Three status states — Ok → Warning → Critical (automatic escalation)
- Timeout detection — if a check hasn't reported in
timeoutseconds, it fires - Consecutive failure limit — transient glitch ≠ death (3 failures = alert)
- No deps — pure Rust, no runtime, fits in embedded
API
let mut kp = new;
// Register a health check
kp.add_check;
// Report status
kp.report_ok;
kp.report_failure; // increments consecutive_failures
// Periodic evaluation
let new_alerts: = kp.tick;
// Read alerts
for alert in kp.alerts
// Status query
println!;
let status = kp.status; // Option<&CheckStatus>
Alert Escalation
| State | Cue | Action |
|---|---|---|
| Ok | Heartbeat received | Nothing |
| Warning | 1-2 failures | Log alert, continue |
| Critical | ≥3 consecutive failures | Apoptosis trigger |
Cargo.toml
[]
= { = "https://github.com/Lucineer/flux-keeper" }
Fleet Context
Part of the Lucineer/Cocapn fleet. Sister crate to flux-telepathy (alert routing) and flux-confidence (calibrated decision-making under uncertainty).