Crate rusmes_metrics

Observability layer for RusMES

This crate provides a complete observability stack for the RusMES mail server:

- Prometheus-compatible metrics exported over HTTP (pull-based scraping)
- OpenTelemetry distributed tracing via the OTLP exporter (see tracing module)
- Kubernetes-compatible health probes (/health, /ready, /live)
- Grafana dashboard support via standard Prometheus metric naming

§Key Features

- Counter metrics for SMTP, IMAP, and JMAP protocol operations (connections, messages, commands, errors)
- Gauge metrics for queue depth, mailbox count, message count, and storage bytes
- Histogram metrics with carefully chosen bucket boundaries for:
  - Message processing latency: 1 ms – 10 s
  - SMTP session duration: 100 ms – 600 s
- Lock-free atomic counters (AtomicU64), so counter updates take no lock on the hot path
- Mutex-guarded histogram state for thread-safe observation
- Integration with tracing-opentelemetry for correlating traces and logs
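
The lock-free counter idea can be illustrated with a minimal, standard-library-only sketch. The `Counter` type and `count_concurrently` helper below are hypothetical names chosen for illustration and are not part of the rusmes_metrics API; the crate's actual internals may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a single counter inside a metrics collector.
#[derive(Default)]
pub struct Counter {
    value: AtomicU64,
}

impl Counter {
    /// Increment without taking a lock. Relaxed ordering is enough here
    /// because the counter does not guard any other memory.
    pub fn inc(&self) {
        self.value.fetch_add(1, Ordering::Relaxed);
    }

    /// Read the current value.
    pub fn get(&self) -> u64 {
        self.value.load(Ordering::Relaxed)
    }
}

/// Spawn `threads` workers that each increment one shared counter
/// `per_thread` times, then return the final count.
pub fn count_concurrently(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Counter::default());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.inc();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.get()
}
```

Because `fetch_add` is a single atomic read-modify-write, no increments are lost even when many threads update the same counter, and no thread ever blocks waiting for a lock.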

§Usage

use rusmes_metrics::MetricsCollector;
use rusmes_config::MetricsConfig;

// Create a shared metrics collector (cheap to clone, backed by Arc)
let metrics = MetricsCollector::new();

// Increment counters on protocol events
metrics.inc_smtp_connections();
metrics.inc_smtp_messages_received();

// Time an operation with a histogram
let timer = metrics.start_message_processing_timer();
// ... process message ...
timer.observe();     // records elapsed seconds into the histogram

// Expose a Prometheus-scrape endpoint
let config = MetricsConfig {
    enabled: true,
    bind_address: "0.0.0.0:9090".to_string(),
    path: "/metrics".to_string(),
    basic_auth: None,
};
// Call this from an async context whose return type can carry the error from `?`
metrics.start_http_server(config).await?;

§HTTP Endpoints

| Path | Description |
|------|-------------|
| `/metrics` | Prometheus text-format metrics |
| `/health` | JSON health report with component checks |
| `/ready` | Kubernetes readiness probe (HTTP 200) |
| `/live` | Kubernetes liveness probe (HTTP 200) |

curl http://localhost:9090/metrics
curl http://localhost:9090/health
curl http://localhost:9090/ready
curl http://localhost:9090/live
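
For readers unfamiliar with what a scrape of `/metrics` returns, the sketch below renders one counter in the Prometheus text exposition format (a `# HELP` line, a `# TYPE` line, then the sample). The `render_counter` function and the metric name used in the usage note are hypothetical; this does not reproduce how rusmes_metrics builds its response.

```rust
use std::fmt::Write;

/// Render a single unlabelled counter in Prometheus text exposition format.
/// Hypothetical helper for illustration only.
pub fn render_counter(name: &str, help: &str, value: u64) -> String {
    let mut out = String::new();
    // Writing into a String cannot fail, so the expects never trigger.
    writeln!(out, "# HELP {name} {help}").expect("write to String");
    writeln!(out, "# TYPE {name} counter").expect("write to String");
    writeln!(out, "{name} {value}").expect("write to String");
    out
}
```

Calling `render_counter("rusmes_example_total", "Example counter", 3)` yields three newline-terminated lines, the last of which is `rusmes_example_total 3`.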

§Histogram Buckets

Message processing latency (rusmes_message_processing_latency_seconds): 1 ms, 5 ms, 10 ms, 25 ms, 50 ms, 100 ms, 250 ms, 500 ms, 1 s, 2.5 s, 5 s, 10 s
SMTP session duration (rusmes_smtp_session_duration_seconds): 100 ms, 500 ms, 1 s, 5 s, 10 s, 30 s, 60 s, 120 s, 300 s, 600 s
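
To show how bucket boundaries like these are typically used, here is a minimal sketch of a mutex-guarded histogram that reports cumulative counts in the Prometheus style. The `Histogram` type is a hypothetical illustration, not the crate's implementation; only the bucket values are taken from the message-processing list above.

```rust
use std::sync::Mutex;

/// Upper bounds in seconds, copied from the message-processing latency list.
pub const LATENCY_BUCKETS: [f64; 12] = [
    0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0,
];

struct State {
    /// One slot per bound plus a final +Inf slot (non-cumulative).
    counts: Vec<u64>,
    /// Number of observations recorded so far.
    total: u64,
}

/// Hypothetical histogram whose state is protected by a Mutex.
pub struct Histogram {
    bounds: &'static [f64],
    state: Mutex<State>,
}

impl Histogram {
    pub fn new(bounds: &'static [f64]) -> Self {
        let state = State { counts: vec![0; bounds.len() + 1], total: 0 };
        Self { bounds, state: Mutex::new(state) }
    }

    /// Record one observation in the first bucket whose upper bound is
    /// greater than or equal to the value, or in +Inf if none matches.
    pub fn observe(&self, seconds: f64) {
        let idx = self
            .bounds
            .iter()
            .position(|bound| seconds <= *bound)
            .unwrap_or(self.bounds.len());
        let mut state = self.state.lock().expect("histogram mutex poisoned");
        state.counts[idx] += 1;
        state.total += 1;
    }

    /// Total number of observations.
    pub fn count(&self) -> u64 {
        self.state.lock().expect("histogram mutex poisoned").total
    }

    /// Cumulative counts: each bucket includes every smaller bucket.
    pub fn cumulative(&self) -> Vec<u64> {
        let state = self.state.lock().expect("histogram mutex poisoned");
        let mut running = 0u64;
        state
            .counts
            .iter()
            .map(|count| {
                running += *count;
                running
            })
            .collect()
    }
}
```

With these bounds, a 3 ms observation lands in the 5 ms bucket, a 200 ms observation in the 250 ms bucket, and anything above 10 s only in the final +Inf slot.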

§OpenTelemetry / Distributed Tracing

See the tracing sub-module for span helpers (smtp_span, imap_span, jmap_span, mailet_span, delivery_span) and the init_tracing function that wires up an OTLP exporter with configurable gRPC or HTTP transport.

Modules§

tls_label: TLS label values for the rusmes_tls_sessions_total counter.
tracing: OpenTelemetry distributed tracing integration for RusMES

Structs§

ConnectionGuard: RAII guard that decrements the active-connections gauge on drop.
GlobalMetricsAlreadySet: Error returned by set_global_metrics when the global has already been initialised.
HealthChecks: Individual health checks
HealthResponse: Health check response
LiveResponse: Liveness probe response
MetricsCollector: Server metrics collector
ReadyResponse: Readiness probe response
Timer: Timer for tracking operation duration
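
The RAII pattern described for ConnectionGuard can be sketched with standard-library types only. The `Guard` type below is hypothetical; in particular, incrementing the gauge in the constructor is an assumption, since the description above only states that the real guard decrements on drop.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Hypothetical guard tied to a shared active-connections gauge.
pub struct Guard {
    gauge: Arc<AtomicI64>,
}

impl Guard {
    /// Assumption: the gauge is incremented when the guard is created.
    pub fn new(gauge: Arc<AtomicI64>) -> Self {
        gauge.fetch_add(1, Ordering::Relaxed);
        Self { gauge }
    }
}

impl Drop for Guard {
    /// Decrement when the guard goes out of scope, including on early
    /// return or panic unwinding.
    fn drop(&mut self) {
        self.gauge.fetch_sub(1, Ordering::Relaxed);
    }
}

/// Return the gauge value while a guard is alive and after it is dropped.
pub fn guard_demo() -> (i64, i64) {
    let gauge = Arc::new(AtomicI64::new(0));
    let during = {
        let _guard = Guard::new(Arc::clone(&gauge));
        gauge.load(Ordering::Relaxed)
    }; // `_guard` is dropped here
    let after = gauge.load(Ordering::Relaxed);
    (during, after)
}
```

Tying the decrement to `Drop` means a connection handler cannot forget to update the gauge, whichever path it exits through.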

Functions§

create_health_router: Create health check router
global_metrics: Get the process-wide metrics collector, lazily installing a fresh one the first time it is requested if set_global_metrics was never called.
set_global_metrics: Install the process-wide MetricsCollector so protocol crates can record events without having to thread the handle through every constructor.
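
The set-once, lazily-initialised behaviour described for these two functions resembles what `std::sync::OnceLock` provides. The sketch below assumes that approach; `Collector`, `AlreadySet`, `set_global`, and `global` are hypothetical stand-ins, and the crate may implement its global differently.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for the metrics collector.
#[derive(Debug, Default)]
pub struct Collector;

/// Hypothetical error for a second installation attempt.
#[derive(Debug)]
pub struct AlreadySet;

static GLOBAL: OnceLock<Arc<Collector>> = OnceLock::new();

/// Install the process-wide collector; fails if one is already present.
pub fn set_global(collector: Arc<Collector>) -> Result<(), AlreadySet> {
    GLOBAL.set(collector).map_err(|_| AlreadySet)
}

/// Return the process-wide collector, installing a default one on first
/// use if `set_global` was never called.
pub fn global() -> Arc<Collector> {
    Arc::clone(GLOBAL.get_or_init(|| Arc::new(Collector::default())))
}
```

One consequence of this design is ordering sensitivity: once `global()` has lazily installed a default collector, a later `set_global` call returns an error, so an explicit collector should be installed early during startup.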

Type Aliases§

DomainStatsSource: Callback type used to feed the per-recipient-domain counter from a queue.
