ibc-monitor
A monitoring tool for IBC light clients.
Overview
ibc-monitor tracks the status of IBC light clients across chains, alerting operators before clients expire.
It integrates with modern observability stacks through structured logs, Prometheus metrics, and webhook notifications (e.g., Slack or Discord).
Quick Start
- Install from crates.io
- Obtain RPC and gRPC endpoints for each chain
  Public endpoints can be found at:
  - https://cosmos.directory
  - https://polkachu.com/rpc
  - Chain documentation sites
  You need:
  - RPC endpoint (typically port 26657)
  - gRPC endpoint (typically port 9090)
- Identify the IBC connections to monitor
  Find your channels using:
  - Block explorers (mintscan.io, etc.)
  - Hermes CLI: hermes query channels --chain <chain-id>
  - Chain registry data
  You only need to specify channels; client IDs are auto-discovered unless you set them explicitly.
- Create your configuration
  Edit monitor.toml:
  - Set your RPC/gRPC endpoints
  - Add monitors for each client you want to track
  - Adjust thresholds based on your requirements
  - Add a webhook URL if using Slack notifications
- Test your configuration
  This performs a single check and displays results.
- Run continuous monitoring
  Access metrics at http://localhost:9090/metrics
Installation
Usage
Check all configured clients once:
Run continuous monitoring:
Configuration
The monitor reads a TOML configuration file specifying global settings and client endpoints:
[]
= 300 # seconds
= 120 # hours
= 24 # hours
= "https://hooks.slack.com/..." # optional
= true # enable prometheus metrics (default: true)
= "0.0.0.0:9090" # metrics endpoint address (default: 0.0.0.0:9090)
[[]]
= "Penumbra on Cosmos Hub"
= "cosmoshub-4"
= "https://cosmos-rpc.example.com/"
= "http://cosmos-grpc.example.com:9090"
= "07-tendermint-1317" # optional - will auto-discover from channel
= "channel-940"
Client IDs can be omitted and will be discovered automatically via channel queries.
Observability
Metrics
The monitor exposes Prometheus metrics on a configurable endpoint (default: http://localhost:9090/metrics).
To disable metrics collection:
[]
= false
To change the metrics endpoint address:
[]
= "127.0.0.1:9191" # custom address and port
Available metrics:
# HELP ibc_client_hours_until_expiry Hours until IBC client expires
# TYPE ibc_client_hours_until_expiry gauge
ibc_client_hours_until_expiry{chain="penumbra-1",client="07-tendermint-0",counterparty="cosmoshub-4"} 2906.3333333333335
ibc_client_hours_until_expiry{chain="cosmoshub-4",client="07-tendermint-1317",counterparty="penumbra-1"} 2881.3333333333335
# HELP ibc_client_status Current status of IBC client (0=expired, 1=critical, 2=warning, 3=healthy)
# TYPE ibc_client_status gauge
ibc_client_status{chain="penumbra-1",client="07-tendermint-0",counterparty="cosmoshub-4",status="healthy"} 1
ibc_client_status{chain="cosmoshub-4",client="07-tendermint-1317",counterparty="penumbra-1",status="healthy"} 1
# HELP ibc_monitor_check_duration_seconds Duration of monitor check
# TYPE ibc_monitor_check_duration_seconds histogram
ibc_monitor_check_duration_seconds_bucket{le="0.005"} 0
ibc_monitor_check_duration_seconds_bucket{le="0.01"} 0
ibc_monitor_check_duration_seconds_bucket{le="0.025"} 0
ibc_monitor_check_duration_seconds_bucket{le="0.05"} 0
ibc_monitor_check_duration_seconds_bucket{le="0.1"} 0
ibc_monitor_check_duration_seconds_bucket{le="0.25"} 0
ibc_monitor_check_duration_seconds_bucket{le="0.5"} 4
ibc_monitor_check_duration_seconds_bucket{le="1"} 8
ibc_monitor_check_duration_seconds_bucket{le="2.5"} 20
ibc_monitor_check_duration_seconds_bucket{le="5"} 20
ibc_monitor_check_duration_seconds_bucket{le="10"} 20
ibc_monitor_check_duration_seconds_bucket{le="+Inf"} 20
ibc_monitor_check_duration_seconds_sum 23.177515625
ibc_monitor_check_duration_seconds_count 20
# HELP ibc_monitor_checks_total Total number of monitor checks
# TYPE ibc_monitor_checks_total counter
ibc_monitor_checks_total 20
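For ad-hoc checks without a full Prometheus setup, the text output above can be parsed directly. The following is a minimal Python sketch, not part of ibc-monitor: the function name `expiring_clients` and the 2900-hour cutoff are illustrative assumptions, and only the metric name and labels come from the sample output.

```python
import re

# Matches one sample line of the ibc_client_hours_until_expiry gauge.
SAMPLE_RE = re.compile(
    r'^ibc_client_hours_until_expiry\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
# Matches individual key="value" label pairs inside the braces.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def expiring_clients(metrics_text, max_hours):
    """Return (chain, client, hours) tuples for clients with fewer than max_hours left."""
    results = []
    for line in metrics_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match is None:
            continue  # skip comment lines and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        hours = float(match.group("value"))
        if hours < max_hours:
            results.append((labels.get("chain"), labels.get("client"), hours))
    return results


# Sample lines copied from the metrics output shown above.
sample = '''\
# TYPE ibc_client_hours_until_expiry gauge
ibc_client_hours_until_expiry{chain="penumbra-1",client="07-tendermint-0",counterparty="cosmoshub-4"} 2906.3333333333335
ibc_client_hours_until_expiry{chain="cosmoshub-4",client="07-tendermint-1317",counterparty="penumbra-1"} 2881.3333333333335
'''

# Hypothetical cutoff chosen so that exactly one sample client is reported.
print(expiring_clients(sample, max_hours=2900))
```

In practice the text would be fetched from the metrics endpoint (for example with `urllib.request.urlopen`) rather than from an inline string.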
Structured Logging
The monitor emits structured log events for state changes:
- INFO: Client recovered to healthy state
- WARN: Client approaching expiry
- ERROR: Client critical or expired
Enable JSON output for log aggregation:
LOG_FORMAT=json
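JSON logs can be filtered with a few lines of scripting before they reach an aggregator. The sketch below is illustrative only: the field names `level` and `message` are assumptions about the JSON schema (this README does not specify it), and the sample records are hypothetical, not captured ibc-monitor output.

```python
import json


def alert_records(lines, levels=("WARN", "ERROR")):
    """Yield parsed JSON log records whose level is one of `levels`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # ignore JSON values that are not objects
        # "level" is an assumed field name; adjust to the real schema.
        if str(record.get("level", "")).upper() in levels:
            yield record


# Hypothetical records; the message texts mirror the events listed above.
sample_lines = [
    '{"level": "INFO", "message": "Client recovered to healthy state"}',
    '{"level": "WARN", "message": "Client approaching expiry"}',
    'not json',
    '{"level": "ERROR", "message": "Client critical or expired"}',
]

for record in alert_records(sample_lines):
    print(record["level"], record["message"])
```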
Webhook Notifications
Configure a webhook URL to receive alerts on state changes. The monitor sends formatted messages compatible with Slack incoming webhooks.
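To test a webhook endpoint independently of the monitor, a Slack-compatible request can be built by hand. Slack incoming webhooks accept a JSON body with a `text` field; everything else in this sketch (the function name, the message wording, the placeholder URL) is a hypothetical illustration and not the monitor's actual message format.

```python
import json
import urllib.request


def build_slack_request(webhook_url, chain, client_id, status, hours_left):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    # Message wording is illustrative; ibc-monitor's real format may differ.
    text = (
        f"[{status.upper()}] IBC client {client_id} on {chain}: "
        f"{hours_left:.1f} hours until expiry"
    )
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; substitute the webhook URL from your configuration.
request = build_slack_request(
    "https://example.invalid/webhook",
    chain="cosmoshub-4",
    client_id="07-tendermint-1317",
    status="critical",
    hours_left=18.5,
)
print(request.get_method(), request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would deliver it; the sketch stops short of sending so it can run without network access.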
Integration
Prometheus + Alertmanager
scrape_configs:
  - job_name: 'ibc-monitor'
    static_configs:
      - targets:

groups:
  - name: ibc
    rules:
      - alert: IBCClientExpiringSoon
        expr: ibc_client_hours_until_expiry < 24
        annotations:
          summary: "IBC client {{ $labels.client }} expiring in {{ $value }} hours"
Loki + Promtail
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: ibc-monitor
    static_configs:
      - targets:
          - localhost
        labels:
          job: ibc-monitor
          __path__: /var/log/ibc-monitor.log
Logwatch
Add to /etc/logwatch/conf/logfiles/ibc-monitor.conf:
LogFile = /var/log/ibc-monitor.log
Archive = /var/log/ibc-monitor.log.*
Architecture
The monitor performs the following operations each check interval:
- Queries each configured chain's gRPC endpoint
- Retrieves client and consensus states
- Calculates time until trusting period expiry
- Compares against configured thresholds
- Emits appropriate metrics and log events
- Sends webhook notifications on state changes
State tracking prevents duplicate alerts when a client remains in the same state across multiple checks.
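The duplicate-suppression idea can be sketched in a few lines of Python. This is an assumed, simplified model rather than the monitor's actual implementation: the class name `StateTracker` and the choice to treat the first observation of a client as a change are both illustrative.

```python
class StateTracker:
    """Remember the last status per client and report only transitions."""

    def __init__(self):
        # Maps (chain, client_id) to the most recently observed status.
        self._last = {}

    def update(self, chain, client_id, status):
        """Record `status` and return True if it differs from the previous one."""
        key = (chain, client_id)
        changed = self._last.get(key) != status
        self._last[key] = status
        return changed


tracker = StateTracker()
observations = ["healthy", "healthy", "warning", "warning", "healthy"]
for status in observations:
    # A notification would be sent only when update() returns True.
    if tracker.update("cosmoshub-4", "07-tendermint-1317", status):
        print("state changed:", status)
```

With this model a client that stays in the warning state for many check intervals triggers one notification on entry and another when it leaves that state.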
Client States
- Healthy: More than warning_threshold hours until expiry
- Warning: Between warning_threshold and critical_threshold hours remaining
- Critical: Less than critical_threshold hours remaining
- Expired: Trusting period has elapsed
The unbonding period represents the hard deadline after which a client cannot be revived.
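The state rules above can be expressed as a short Python sketch. It assumes that a Tendermint light client expires once the timestamp of its latest consensus state plus the trusting period has passed, and that boundary values fall into the warning state. The function names, the 14-day trusting period, and the mapping of 120 and 24 hours to the warning and critical thresholds are illustrative assumptions, not values confirmed by the monitor.

```python
from datetime import datetime, timedelta, timezone


def hours_until_expiry(latest_consensus_time, trusting_period, now):
    """Hours remaining before the client's trusting period elapses."""
    expiry = latest_consensus_time + trusting_period
    return (expiry - now).total_seconds() / 3600.0


def classify(hours_left, warning_threshold, critical_threshold):
    """Map remaining hours to one of the four client states."""
    if hours_left <= 0:
        return "expired"
    if hours_left < critical_threshold:
        return "critical"
    if hours_left <= warning_threshold:
        return "warning"
    return "healthy"


# Hypothetical client: last updated on 1 January with a 14-day trusting period.
latest = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 1, 11, tzinfo=timezone.utc)
remaining = hours_until_expiry(latest, timedelta(days=14), now)

# Thresholds assumed to be 120 and 24 hours for this example.
print(remaining, classify(remaining, warning_threshold=120, critical_threshold=24))
```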
Development
Build from source: