Crate rusmes_metrics

Observability layer for RusMES

This crate provides a complete observability stack for the RusMES mail server:

  • Prometheus-compatible metrics exported over HTTP (pull-based scraping)
  • OpenTelemetry distributed tracing via the OTLP exporter (see tracing module)
  • Kubernetes-compatible health probes (/health, /ready, /live)
  • Grafana dashboard support via standard Prometheus metric naming

§Key Features

  • Counter metrics for SMTP, IMAP, and JMAP protocol operations (connections, messages, commands, errors)
  • Gauge metrics for queue depth, mailbox count, message count, and storage bytes
  • Histogram metrics with carefully chosen bucket boundaries for:
    • Message processing latency: 1 ms – 10 s
    • SMTP session duration: 100 ms – 600 s
  • Lock-free atomic counters (AtomicU64), so counter updates take no locks on the hot path
  • Mutex-guarded histogram state for thread-safe observation
  • Integration with tracing-opentelemetry for correlating traces and logs
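The split between lock-free counters and mutex-guarded histogram state can be illustrated with a minimal sketch that uses only the standard library. SketchCollector, its fields, and count_from_threads are hypothetical names chosen for illustration; they are not the crate's actual internals, and the histogram state is reduced here to a plain list of samples.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a metrics collector (assumption: not the
/// real layout of rusmes_metrics::MetricsCollector).
#[derive(Clone, Default)]
pub struct SketchCollector {
    // Cloning copies only the Arc pointer, so every clone shares state.
    inner: Arc<Inner>,
}

#[derive(Default)]
struct Inner {
    // Counter: updated with a single atomic add, no lock taken.
    smtp_connections: AtomicU64,
    // Histogram state (simplified to raw samples): guarded by a mutex.
    latencies: Mutex<Vec<f64>>,
}

impl SketchCollector {
    pub fn inc_smtp_connections(&self) {
        self.inner.smtp_connections.fetch_add(1, Ordering::Relaxed);
    }

    pub fn smtp_connections(&self) -> u64 {
        self.inner.smtp_connections.load(Ordering::Relaxed)
    }

    pub fn observe_latency(&self, seconds: f64) {
        self.inner.latencies.lock().unwrap().push(seconds);
    }

    pub fn latency_samples(&self) -> usize {
        self.inner.latencies.lock().unwrap().len()
    }
}

/// Increments one shared counter from several threads and returns the total.
pub fn count_from_threads(threads: usize, per_thread: u64) -> u64 {
    let collector = SketchCollector::default();
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = collector.clone();
            std::thread::spawn(move || {
                for _ in 0..per_thread {
                    c.inc_smtp_connections();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    collector.smtp_connections()
}
```

Relaxed ordering is enough in this sketch because the counter is only ever summed, and joining the threads orders their increments before the final read.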

§Usage

use rusmes_metrics::MetricsCollector;
use rusmes_config::MetricsConfig;

// Create a shared metrics collector (cheap to clone, backed by Arc)
let metrics = MetricsCollector::new();

// Increment counters on protocol events
metrics.inc_smtp_connections();
metrics.inc_smtp_messages_received();

// Time an operation with a histogram
let timer = metrics.start_message_processing_timer();
// ... process message ...
timer.observe();     // records elapsed seconds into the histogram

// Expose a Prometheus scrape endpoint (must run inside an async fn that returns a Result)
let config = MetricsConfig {
    enabled: true,
    bind_address: "0.0.0.0:9090".to_string(),
    path: "/metrics".to_string(),
    basic_auth: None,
};
metrics.start_http_server(config).await?;

§HTTP Endpoints

Path       Description
/metrics   Prometheus text-format metrics
/health    JSON health report with component checks
/ready     Kubernetes readiness probe (HTTP 200)
/live      Kubernetes liveness probe (HTTP 200)

curl http://localhost:9090/metrics
curl http://localhost:9090/health
curl http://localhost:9090/ready
curl http://localhost:9090/live

§Histogram Buckets

  • Message processing latency (rusmes_message_processing_latency_seconds): 1 ms, 5 ms, 10 ms, 25 ms, 50 ms, 100 ms, 250 ms, 500 ms, 1 s, 2.5 s, 5 s, 10 s
  • SMTP session duration (rusmes_smtp_session_duration_seconds): 100 ms, 500 ms, 1 s, 5 s, 10 s, 30 s, 60 s, 120 s, 300 s, 600 s
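To show how bucket boundaries like these are typically applied, here is a minimal sketch of a Prometheus-style histogram. Each boundary is an inclusive upper bound, an implicit +Inf bucket catches everything larger, and exported counts are cumulative. SketchHistogram and LATENCY_BUCKETS are hypothetical names; the crate's real histogram type may be structured differently.

```rust
/// Message processing latency boundaries from the list above, in seconds.
pub const LATENCY_BUCKETS: [f64; 12] = [
    0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0,
];

/// Hypothetical histogram (assumption: not the crate's actual type).
pub struct SketchHistogram {
    bounds: Vec<f64>,
    // One slot per boundary plus a final slot for the +Inf bucket.
    counts: Vec<u64>,
    sum: f64,
}

impl SketchHistogram {
    pub fn new(bounds: &[f64]) -> Self {
        Self {
            bounds: bounds.to_vec(),
            counts: vec![0; bounds.len() + 1],
            sum: 0.0,
        }
    }

    /// Records one observation in the first bucket whose bound is >= the value.
    pub fn observe(&mut self, seconds: f64) {
        let idx = self
            .bounds
            .iter()
            .position(|&bound| seconds <= bound)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
        self.sum += seconds;
    }

    /// Cumulative counts, as a Prometheus `le` series would report them.
    pub fn cumulative(&self) -> Vec<u64> {
        let mut running = 0;
        self.counts
            .iter()
            .map(|&count| {
                running += count;
                running
            })
            .collect()
    }

    pub fn sum(&self) -> f64 {
        self.sum
    }
}
```

With these boundaries, an observation of 3 ms lands in the 5 ms bucket, 200 ms lands in the 250 ms bucket, and anything above 10 s is counted only in +Inf.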

§OpenTelemetry / Distributed Tracing

See the tracing sub-module for span helpers (smtp_span, imap_span, jmap_span, mailet_span, delivery_span) and the init_tracing function that wires up an OTLP exporter with configurable gRPC or HTTP transport.

Modules§

tls_label
TLS label values for the rusmes_tls_sessions_total counter.
tracing
OpenTelemetry distributed tracing integration for RusMES

Structs§

ConnectionGuard
RAII guard that decrements the active-connections gauge on drop.
GlobalMetricsAlreadySet
Error returned by set_global_metrics when the global has already been initialised.
HealthChecks
Individual health checks
HealthResponse
Health check response
LiveResponse
Liveness probe response
MetricsCollector
Server metrics collector
ReadyResponse
Readiness probe response
Timer
Timer for tracking operation duration
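The RAII pattern behind ConnectionGuard can be sketched with the standard library alone. SketchGuard is a hypothetical name, and the choice to increment the gauge when the guard is created is an assumption: the description above only states that the gauge is decremented on drop.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Hypothetical guard (assumption: not the real ConnectionGuard).
pub struct SketchGuard {
    gauge: Arc<AtomicI64>,
}

impl SketchGuard {
    /// Assumption: creating the guard counts the connection as active.
    pub fn new(gauge: Arc<AtomicI64>) -> Self {
        gauge.fetch_add(1, Ordering::Relaxed);
        Self { gauge }
    }
}

impl Drop for SketchGuard {
    /// Runs on every exit path, including early returns and panics
    /// that unwind, so the gauge cannot be left stale.
    fn drop(&mut self) {
        self.gauge.fetch_sub(1, Ordering::Relaxed);
    }
}

/// Returns the gauge value while a guard is alive and after it is dropped.
pub fn gauge_during_and_after() -> (i64, i64) {
    let gauge = Arc::new(AtomicI64::new(0));
    let during = {
        let _guard = SketchGuard::new(Arc::clone(&gauge));
        gauge.load(Ordering::Relaxed)
    };
    (during, gauge.load(Ordering::Relaxed))
}
```

Tying the decrement to Drop means a connection handler does not need a matching cleanup call at each return point.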

Functions§

create_health_router
Create health check router
global_metrics
Get the process-wide metrics collector, lazily installing a fresh one the first time it is requested if set_global_metrics was never called.
set_global_metrics
Install the process-wide MetricsCollector so protocol crates can record events without having to thread the handle through every constructor.
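The behaviour described for global_metrics and set_global_metrics (install once, fail on a second install, lazily create a default on first read) matches what std::sync::OnceLock provides. The following is a minimal sketch under that assumption; SketchMetrics, AlreadySet, set_global, and global are hypothetical names, and the crate's real signatures and return types may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical collector (assumption: stands in for MetricsCollector).
#[derive(Clone, Default)]
pub struct SketchMetrics {
    hits: Arc<AtomicU64>,
}

impl SketchMetrics {
    pub fn inc(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }

    pub fn hits(&self) -> u64 {
        self.hits.load(Ordering::Relaxed)
    }
}

/// Hypothetical error, playing the role of GlobalMetricsAlreadySet.
#[derive(Debug, PartialEq)]
pub struct AlreadySet;

static GLOBAL: OnceLock<SketchMetrics> = OnceLock::new();

/// Installs the process-wide collector; fails if one is already present.
pub fn set_global(metrics: SketchMetrics) -> Result<(), AlreadySet> {
    GLOBAL.set(metrics).map_err(|_| AlreadySet)
}

/// Returns the process-wide collector, creating a default one on first
/// use if set_global was never called.
pub fn global() -> &'static SketchMetrics {
    GLOBAL.get_or_init(SketchMetrics::default)
}
```

One consequence of this design is ordering sensitivity: if any code reads the global before the installer runs, the lazily created default wins and the later install returns an error.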

Type Aliases§

DomainStatsSource
Callback type used to feed the per-recipient-domain counter from a queue.