Module metrics

Metrics for observability.

Exports Prometheus-compatible metrics for:

  • Peer connection status
  • Stream tailing performance
  • Replication lag
  • Deduplication stats
  • Circuit breaker state
  • Batch processing stats

§Metric Naming Convention

All metrics are prefixed with replication_ and follow Prometheus conventions:

  • Counters end in _total
  • Gauges represent current state
  • Histograms track distributions (duration, size)
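The convention above can be checked mechanically. The following is a minimal sketch, not part of the `metrics` module: `MetricKind` and `follows_convention` are hypothetical names introduced for illustration, and only the two rules stated above (the `replication_` prefix and the `_total` suffix on counters) are enforced.

```rust
/// Kinds of Prometheus metric named in the convention above.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetricKind {
    Counter,
    Gauge,
    Histogram,
}

/// Hypothetical helper: returns true when `name` follows the two rules
/// stated in the naming convention.
pub fn follows_convention(name: &str, kind: MetricKind) -> bool {
    // Every metric carries the `replication_` prefix.
    if !name.starts_with("replication_") {
        return false;
    }
    match kind {
        // Counters must end in `_total`.
        MetricKind::Counter => name.ends_with("_total"),
        // The convention states no suffix rule for gauges or histograms.
        MetricKind::Gauge | MetricKind::Histogram => true,
    }
}
```

A check like this could run in a unit test over the list of registered metric names, so that a counter missing its `_total` suffix is caught before it reaches a dashboard.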

§Usage

use replication_engine::metrics;
use std::time::Duration;

// In hot_path after reading events
metrics::record_cdc_events_read("peer-1", 42);

// In batch processor after flush
metrics::record_batch_flush("peer-1", 100, 85, 5, 8, 2, Duration::from_millis(50));
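The usage example passes a fixed `Duration` to `record_batch_flush`. In practice that value would probably be measured around the flush itself. A minimal sketch of one way to do so with the standard library follows; `timed` is a hypothetical helper, not part of this crate.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `op` and returns its result together with
/// the wall-clock time it took.
pub fn timed<T>(op: impl FnOnce() -> T) -> (T, Duration) {
    // `Instant` is monotonic, so the measurement is not affected by
    // system clock adjustments.
    let start = Instant::now();
    let result = op();
    (result, start.elapsed())
}
```

The returned `Duration` could then be passed as the final argument of `metrics::record_batch_flush`, in place of `Duration::from_millis(50)` in the example above.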

Functions§

cursor_retries_total
Record cursor SQLite retry (for SQLITE_BUSY/SQLITE_LOCKED).
record_adaptive_batch_size
Record current adaptive batch size for a peer.
record_backpressure_pause
Record backpressure pause (sync-engine under load).
record_batch_dedup
Record batch dedup stats (for monitoring dedup efficiency).
record_batch_flush
Record batch flush with detailed stats.
record_cdc_events_applied
Record CDC events applied (not deduplicated).
record_cdc_events_deduped
Record CDC events deduplicated (skipped).
record_cdc_events_read
Record CDC events read from a peer.
record_circuit_call
Record circuit breaker call outcome.
record_circuit_rejection
Record circuit breaker rejection (circuit was open).
record_cursor_flush
Record cursor flush batch (debounced writes).
record_cursor_persist
Record cursor persistence.
record_error
Record errors by type.
record_event_processing_latency
Record event processing latency.
record_merkle_divergence
Record divergent peer detected during repair.
record_peer_circuit_state
Record peer circuit breaker state change.
record_peer_connection
Record a peer connection event.
record_peer_operation_latency
Record peer Redis operation latency by operation type. Useful for tracking Merkle queries, item fetches, etc.
record_peer_ping
Record peer ping result.
record_peer_ping_latency
Record peer ping latency (for idle peer health checks).
record_peer_state
Record peer connection state.
record_repair_cycle
Record cold path repair cycle.
record_repair_cycle_complete
Record cold path repair cycle completion.
record_repair_skipped
Record cold path repair cycle skipped.
record_replication_lag
Record replication lag (time since last successful sync).
record_replication_lag_events
Record replication lag in events (how many events behind stream head).
record_replication_lag_ms
Record replication lag in milliseconds (based on stream ID timestamps).
record_slo_violation
Record an SLO violation (latency threshold exceeded).
record_stream_read
Record stream read result.
record_stream_read_latency
Record stream read (XREAD) latency.
record_stream_trimmed
Record stream trimmed event (potential data gap).
set_circuit_state
Set circuit breaker state gauge (0=closed, 1=half_open, 2=open).
set_connected_peers
Set the gauge for the number of connected peers.
set_engine_state
Set the gauge for the engine state.
set_replication_lag_slo
Set current replication lag gauge (for SLO monitoring).
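Two entries in the list above describe encodings that a short sketch may make concrete: the gauge values documented for `set_circuit_state`, and the stream-ID-based lag recorded by `record_replication_lag_ms`. The code below is illustrative only. `CircuitState`, `gauge_value`, `stream_id_millis`, and `lag_ms` are hypothetical names, the `f64` gauge type is an assumption, and computing lag as "stream head minus last applied ID" is an assumption about how the timestamps are compared. It relies on the Redis stream ID format `<milliseconds>-<sequence>`.

```rust
/// Circuit breaker states (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed,
    HalfOpen,
    Open,
}

impl CircuitState {
    /// Gauge encoding as documented: 0 = closed, 1 = half_open, 2 = open.
    /// The `f64` return type is an assumption.
    pub fn gauge_value(self) -> f64 {
        match self {
            CircuitState::Closed => 0.0,
            CircuitState::HalfOpen => 1.0,
            CircuitState::Open => 2.0,
        }
    }
}

/// Extracts the millisecond timestamp from a Redis stream ID of the
/// form `<milliseconds>-<sequence>`. Returns `None` if the ID is malformed.
pub fn stream_id_millis(id: &str) -> Option<u64> {
    let (millis, _sequence) = id.split_once('-')?;
    millis.parse().ok()
}

/// Lag in milliseconds between the stream head and the last applied ID
/// (assumed definition). Saturates at zero if the applied ID is newer.
pub fn lag_ms(head_id: &str, applied_id: &str) -> Option<u64> {
    let head = stream_id_millis(head_id)?;
    let applied = stream_id_millis(applied_id)?;
    Some(head.saturating_sub(applied))
}
```

Saturating subtraction keeps the lag from underflowing when clocks or IDs are briefly out of order, and malformed IDs yield `None` so the caller can skip the sample rather than record a bogus value.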