pub struct HealthCheckConfig {
pub enabled: bool,
pub server_bind_address: Option<SocketAddr>,
pub cache_ttl: Duration,
pub min_connections: Option<usize>,
pub max_memory_mb: Option<usize>,
pub max_time_drift_ms: Option<i64>,
pub max_pending_events: Option<usize>,
}
Comprehensive health check configuration for NodeConfig.
This configuration enables and configures the health check system for a node.
When set in NodeConfig, the node will automatically initialize health checks
and optionally start an HTTP server to expose health endpoints.
§Health Check System
The health check system provides:
- Built-in checks for connections, memory, time drift, and state convergence
- Configurable thresholds for each check
- HTTP endpoints for Kubernetes probes and load balancers
- Result caching to minimize overhead
§HTTP Endpoints
When server_bind_address is set, the following endpoints are exposed:
- GET /health - Overall health status (200 OK if healthy/degraded, 503 if unhealthy)
- GET /ready - Readiness probe (200 OK if healthy/degraded, 503 if unhealthy)
- GET /live - Liveness probe (200 OK if healthy/degraded, 503 if unhealthy)
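The status-to-response mapping in the endpoint list can be sketched as a small function. This is only an illustration: the `HealthStatus` enum and `http_status_code` function are hypothetical stand-ins, not the crate's actual types.

```rust
/// Hypothetical stand-in for the runtime's health status type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}

/// Maps a status to the HTTP code described for the endpoints:
/// healthy and degraded nodes answer 200, unhealthy nodes answer 503.
fn http_status_code(status: HealthStatus) -> u16 {
    match status {
        HealthStatus::Healthy | HealthStatus::Degraded => 200,
        HealthStatus::Unhealthy => 503,
    }
}

fn main() {
    for status in [
        HealthStatus::Healthy,
        HealthStatus::Degraded,
        HealthStatus::Unhealthy,
    ] {
        println!("{:?} -> {}", status, http_status_code(status));
    }
}
```

Under this mapping a degraded node keeps receiving traffic, and only an unhealthy node fails its probe.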
§Example
use elara_runtime::health::HealthCheckConfig;
use elara_runtime::node::NodeConfig;
use std::time::Duration;
let health_config = HealthCheckConfig {
enabled: true,
server_bind_address: Some("0.0.0.0:8080".parse().unwrap()),
cache_ttl: Duration::from_secs(30),
min_connections: Some(3),
max_memory_mb: Some(1800),
max_time_drift_ms: Some(100),
max_pending_events: Some(1000),
};
let node_config = NodeConfig {
health_checks: Some(health_config),
..Default::default()
};
§Production Recommendations
§Small Deployment (10 nodes)
use elara_runtime::health::HealthCheckConfig;
use std::time::Duration;
let config = HealthCheckConfig {
enabled: true,
server_bind_address: Some("0.0.0.0:8080".parse().unwrap()),
cache_ttl: Duration::from_secs(30),
min_connections: Some(2),
max_memory_mb: Some(1000),
max_time_drift_ms: Some(100),
max_pending_events: Some(500),
};
§Medium Deployment (100 nodes)
use elara_runtime::health::HealthCheckConfig;
use std::time::Duration;
let config = HealthCheckConfig {
enabled: true,
server_bind_address: Some("0.0.0.0:8080".parse().unwrap()),
cache_ttl: Duration::from_secs(30),
min_connections: Some(5),
max_memory_mb: Some(2000),
max_time_drift_ms: Some(100),
max_pending_events: Some(1000),
};
§Large Deployment (1000 nodes)
use elara_runtime::health::HealthCheckConfig;
use std::time::Duration;
let config = HealthCheckConfig {
enabled: true,
server_bind_address: Some("0.0.0.0:8080".parse().unwrap()),
cache_ttl: Duration::from_secs(30),
min_connections: Some(10),
max_memory_mb: Some(4000),
max_time_drift_ms: Some(100),
max_pending_events: Some(2000),
};
Fields§
§enabled: bool
Enable or disable health checks.
When false, no health checks are performed and no HTTP server is started.
This allows health checks to be completely disabled in environments where
they are not needed.
Default: true
server_bind_address: Option<SocketAddr>
Optional bind address for the health check HTTP server.
When Some, an HTTP server is started on this address to expose health
check endpoints (/health, /ready, /live). When None, health checks
are still performed but no HTTP server is started (useful for programmatic
health checking without exposing endpoints).
Format: "host:port" (e.g., "0.0.0.0:8080", "127.0.0.1:8080")
Default: Some("0.0.0.0:8080")
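The examples on this page call `.parse().unwrap()` on the address string, which panics on malformed input. A minimal sketch of a non-panicking alternative follows, assuming the address comes from user-supplied configuration. The helper name `parse_bind_address` is hypothetical; the parsing itself is the standard library's `FromStr` implementation for `SocketAddr`.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: parses "host:port" into a SocketAddr and returns
/// a readable error instead of panicking.
fn parse_bind_address(raw: &str) -> Result<SocketAddr, String> {
    raw.parse::<SocketAddr>()
        .map_err(|e| format!("invalid bind address {raw:?}: {e}"))
}

fn main() {
    match parse_bind_address("127.0.0.1:8080") {
        Ok(addr) => println!("binding health server to {addr}"),
        Err(msg) => eprintln!("{msg}"),
    }

    // SocketAddr parsing accepts IP literals only, so a hostname is rejected.
    assert!(parse_bind_address("localhost:8080").is_err());
}
```

The resulting value can then be wrapped in `Some(...)` and assigned to `server_bind_address`.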
cache_ttl: Duration
Cache TTL for health check results.
Health check results are cached for this duration to avoid excessive checking overhead. Subsequent health check requests within the TTL return cached results.
Recommended values:
- High-frequency checks: 10-15 seconds
- Normal checks: 30 seconds
- Low-frequency checks: 60 seconds
Default: 30 seconds
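The caching behaviour described for `cache_ttl` can be illustrated with a small time-based cache. This is a sketch under assumptions, not the crate's implementation: the `CachedResult` type is hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-entry cache that reuses a value until its TTL expires.
struct CachedResult<T> {
    ttl: Duration,
    entry: Option<(Instant, T)>,
}

impl<T: Clone> CachedResult<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entry: None }
    }

    /// Returns the cached value if it is younger than the TTL;
    /// otherwise runs `compute`, stores the result, and returns it.
    fn get_or_compute(&mut self, now: Instant, compute: impl FnOnce() -> T) -> T {
        if let Some((stored_at, value)) = &self.entry {
            if now.duration_since(*stored_at) < self.ttl {
                return value.clone();
            }
        }
        let value = compute();
        self.entry = Some((now, value.clone()));
        value
    }
}

fn main() {
    let mut cache = CachedResult::new(Duration::from_secs(30));
    let start = Instant::now();
    let mut runs = 0;

    // First request runs the check; the second, 10 s later, is served from cache.
    cache.get_or_compute(start, || { runs += 1; "healthy" });
    cache.get_or_compute(start + Duration::from_secs(10), || { runs += 1; "healthy" });
    // After the TTL has elapsed the check runs again.
    cache.get_or_compute(start + Duration::from_secs(31), || { runs += 1; "healthy" });

    println!("checks executed: {runs}");
}
```

A shorter TTL gives probes fresher results at the cost of running the underlying checks more often.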
min_connections: Option<usize>
Minimum number of active connections for ConnectionHealthCheck.
When Some, a ConnectionHealthCheck is registered that monitors
the number of active connections. The check returns Degraded if
the connection count falls below this threshold.
When None, no connection health check is performed.
Recommended values:
- Small deployment: 2-3
- Medium deployment: 5-10
- Large deployment: 10-20
Default: Some(3)
max_memory_mb: Option<usize>
Maximum memory usage in megabytes for MemoryHealthCheck.
When Some, a MemoryHealthCheck is registered that monitors
process memory usage. The check returns Unhealthy if memory
usage exceeds this threshold.
When None, no memory health check is performed.
Recommended values:
- Small deployment: 1000 MB (1 GB)
- Medium deployment: 2000 MB (2 GB)
- Large deployment: 4000 MB (4 GB)
Set this to 80-90% of your container memory limit to allow for graceful degradation before OOM kills.
Default: Some(1800) (1.8 GB)
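The 80-90% guidance is simple arithmetic, sketched here as a helper. The function name `max_memory_mb_for_limit` and its percentage parameter are hypothetical and not part of the crate.

```rust
/// Hypothetical helper: derives a `max_memory_mb` threshold as a percentage
/// of the container memory limit. Returns None for a zero limit, a
/// percentage outside 1..=100, or an arithmetic overflow.
fn max_memory_mb_for_limit(container_limit_mb: usize, percent: usize) -> Option<usize> {
    if container_limit_mb == 0 || percent == 0 || percent > 100 {
        return None;
    }
    container_limit_mb.checked_mul(percent).map(|v| v / 100)
}

fn main() {
    // A 2000 MB container limit at 90% yields the documented default of 1800 MB.
    let threshold = max_memory_mb_for_limit(2000, 90);
    println!("max_memory_mb: {threshold:?}");
}
```

The returned `Option<usize>` can be assigned to `max_memory_mb` directly; a `None` result would leave the memory check unregistered, so callers may prefer to treat it as a configuration error.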
max_time_drift_ms: Option<i64>
Maximum time drift in milliseconds for TimeDriftCheck.
When Some, a TimeDriftCheck is registered that monitors
time drift between the local node and network consensus time.
The check returns Degraded if drift exceeds this threshold.
When None, no time drift check is performed.
Recommended value: 100 ms
Excessive time drift can cause synchronization issues and state divergence in distributed systems.
Default: Some(100)
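The threshold rule for time drift can be sketched as follows. This assumes drift is the absolute difference between two millisecond timestamps; the `HealthStatus` enum and `time_drift_status` function are hypothetical, and the way the real TimeDriftCheck obtains network consensus time is not shown here.

```rust
/// Hypothetical two-state status for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HealthStatus {
    Healthy,
    Degraded,
}

/// Returns None when no threshold is configured (the check is not registered),
/// Degraded when the absolute drift exceeds the threshold, Healthy otherwise.
fn time_drift_status(
    local_ms: i64,
    network_ms: i64,
    max_time_drift_ms: Option<i64>,
) -> Option<HealthStatus> {
    let limit = max_time_drift_ms?;
    // A negative threshold is treated as zero in this sketch.
    let limit = u64::try_from(limit).unwrap_or(0);
    let drift = local_ms.abs_diff(network_ms);
    Some(if drift > limit {
        HealthStatus::Degraded
    } else {
        HealthStatus::Healthy
    })
}

fn main() {
    // 250 ms of drift against a 100 ms threshold.
    println!("{:?}", time_drift_status(1_000, 1_250, Some(100)));
}
```

The comparison is symmetric, so a local clock that runs ahead is treated the same as one that runs behind.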
max_pending_events: Option<usize>
Maximum pending events for StateDivergenceCheck.
When Some, a StateDivergenceCheck is registered that monitors
the state reconciliation engine. The check returns Degraded if
the number of pending events exceeds this threshold.
When None, no state divergence check is performed.
Recommended values:
- Small deployment: 500
- Medium deployment: 1000
- Large deployment: 2000
High pending event counts may indicate network partitions or reconciliation issues.
Default: Some(1000)
Implementations§
impl HealthCheckConfig
pub fn disabled() -> Self
Creates a new HealthCheckConfig with all checks disabled.
This is useful when you want to selectively enable only specific checks.
§Example
use elara_runtime::health::HealthCheckConfig;
let mut config = HealthCheckConfig::disabled();
config.enabled = true;
config.max_memory_mb = Some(2000); // Only enable the memory check
pub fn small_deployment() -> Self
Creates a configuration for small deployments (10 nodes).
Recommended thresholds:
- Min connections: 2
- Max memory: 1000 MB
- Max time drift: 100 ms
- Max pending events: 500
pub fn medium_deployment() -> Self
Creates a configuration for medium deployments (100 nodes).
Recommended thresholds:
- Min connections: 5
- Max memory: 2000 MB
- Max time drift: 100 ms
- Max pending events: 1000
pub fn large_deployment() -> Self
Creates a configuration for large deployments (1000 nodes).
Recommended thresholds:
- Min connections: 10
- Max memory: 4000 MB
- Max time drift: 100 ms
- Max pending events: 2000
pub fn validate(&self) -> Result<(), String>
Validates the configuration.
Returns Ok(()) if the configuration is valid, or an error message
describing the validation failure.
§Validation Rules
- cache_ttl must be at least 1 second
- If min_connections is set, it must be > 0
- If max_memory_mb is set, it must be > 0
- If max_time_drift_ms is set, it must be > 0
- If max_pending_events is set, it must be > 0
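The validation rules can be sketched against a local mirror of the struct. This is an illustration only: the struct is redeclared here so the example compiles on its own, and the error messages and the `example_config` helper are hypothetical rather than taken from the crate.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Local mirror of the documented struct, declared so the sketch is self-contained.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct HealthCheckConfig {
    enabled: bool,
    server_bind_address: Option<SocketAddr>,
    cache_ttl: Duration,
    min_connections: Option<usize>,
    max_memory_mb: Option<usize>,
    max_time_drift_ms: Option<i64>,
    max_pending_events: Option<usize>,
}

impl HealthCheckConfig {
    /// Applies the documented rules; the error strings are illustrative.
    fn validate(&self) -> Result<(), String> {
        if self.cache_ttl < Duration::from_secs(1) {
            return Err("cache_ttl must be at least 1 second".to_string());
        }
        if self.min_connections == Some(0) {
            return Err("min_connections must be > 0".to_string());
        }
        if self.max_memory_mb == Some(0) {
            return Err("max_memory_mb must be > 0".to_string());
        }
        if matches!(self.max_time_drift_ms, Some(ms) if ms <= 0) {
            return Err("max_time_drift_ms must be > 0".to_string());
        }
        if self.max_pending_events == Some(0) {
            return Err("max_pending_events must be > 0".to_string());
        }
        Ok(())
    }
}

/// Hypothetical helper that builds a config using the documented default values.
fn example_config() -> HealthCheckConfig {
    HealthCheckConfig {
        enabled: true,
        server_bind_address: None,
        cache_ttl: Duration::from_secs(30),
        min_connections: Some(3),
        max_memory_mb: Some(1800),
        max_time_drift_ms: Some(100),
        max_pending_events: Some(1000),
    }
}

fn main() {
    let mut config = example_config();
    println!("{:?}", config.validate());

    // A sub-second TTL violates the first rule.
    config.cache_ttl = Duration::from_millis(500);
    println!("{:?}", config.validate());
}
```

Fields left as `None` pass validation in this sketch, matching the rules that constrain a threshold only when it is set.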
Trait Implementations§
impl Clone for HealthCheckConfig
fn clone(&self) -> HealthCheckConfig
fn clone_from(&mut self, source: &Self)
impl Debug for HealthCheckConfig
Auto Trait Implementations§
impl Freeze for HealthCheckConfig
impl RefUnwindSafe for HealthCheckConfig
impl Send for HealthCheckConfig
impl Sync for HealthCheckConfig
impl Unpin for HealthCheckConfig
impl UnsafeUnpin for HealthCheckConfig
impl UnwindSafe for HealthCheckConfig