Vigil
Microservices Status Page. Monitors a distributed infrastructure and sends alerts to Slack.
Vigil is an open-source Status Page you can host on your infrastructure, used to monitor all your servers and apps, and visible to your users (on a domain of your choice, eg. status.example.com).
It is useful in microservices contexts to monitor both apps and backends. If a node goes down in your infrastructure, you receive a status change notification in a Slack channel.
👉 See a live demo of Vigil on Crisp Status Page.
Who uses it?
👋 You use Vigil and you want to be listed there? Contact me.
Features
- Automatically monitors your infrastructure services
- Notifies you when a service goes down or comes back up (via a configured channel, eg. Slack or Email)
- Generates a status page that you can host on your domain for your public users (eg. https://status.example.com)
How does it work?
Vigil monitors all your infrastructure services. You first need to configure target services to be monitored, and then Vigil does the rest for you.
There are two kinds of services Vigil can monitor:
- HTTP / TCP services: Vigil frequently probes an HTTP or TCP target and checks for reachability
- Application services: Install the Vigil Reporter library (eg. on your NodeJS app) and get reports when your app goes down, as well as when the host server system is overloaded
It is recommended to configure Vigil or Vigil Reporter to send frequent probe checks, so that you are quickly notified when a service goes down (and can thus reduce unexpected downtime on your services).
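To make the HTTP / TCP reachability idea concrete, here is a minimal sketch of a TCP check in Python. It is not Vigil's own code (Vigil is written in Rust); the function name `tcp_reachable` and the 5 second default timeout are assumptions chosen for illustration.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False
```

A monitor would call something like `tcp_reachable("db.internal.example.com", 5432)` on a fixed interval (the host name here is hypothetical) and raise a status change when the result flips.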
How to use it?
Installation
Install from releases:
The best way to install Vigil is to pull the latest release from the Vigil releases page.
Make sure to pick the correct server architecture (either Intel 32 bits, Intel 64 bits, or ARM).
Install from Cargo:
If you prefer managing vigil via Rust's Cargo, install it directly via cargo install.
Ensure that your $PATH is properly configured to source the Crates binaries, and then run Vigil using the vigil command.
Install from sources:
The last option is to pull the source code from Git and compile Vigil in release mode via cargo.
You can find the built binaries in the ./target/release directory.
Configuration
Use the sample config.cfg configuration file and adjust it to your own environment.
Available configuration options are described below, with their types, allowed values and defaults:
[server]
- log_level (type: string, allowed: debug, info, warn, error, default: warn): Verbosity of logging, set it to error in production
- inet (type: string, allowed: IPv4 / IPv6 + port, default: [::1]:8080): Host and TCP port the Vigil public status page should listen on
- workers (type: integer, allowed: any number, default: 4): Number of workers for the Vigil public status page to run on
- reporter_token (type: string, allowed: secret token, no default): Reporter secret token (ie. secret password)
[assets]
- path (type: string, allowed: UNIX path, default: ./res/assets/): Path to Vigil assets directory
[branding]
- page_title (type: string, allowed: any string, default: Status Page): Status page title
- page_url (type: string, allowed: URL, no default): Status page URL
- company_name (type: string, allowed: any string, no default): Company name (ie. your company)
- icon_color (type: string, allowed: hexadecimal color code, no default): Icon color (ie. your icon background color)
- icon_url (type: string, allowed: URL, no default): Icon URL, the icon should be your squared logo, used as status page favicon (PNG format recommended)
- logo_color (type: string, allowed: hexadecimal color code, no default): Logo color (ie. your logo primary color)
- logo_url (type: string, allowed: URL, no default): Logo URL, the logo should be your full-width logo, used as status page header logo (SVG format recommended)
- website_url (type: string, allowed: URL, no default): Website URL to be used in status page header
- support_url (type: string, allowed: URL, no default): Support URL to be used in status page header (ie. where users can contact you if something is wrong)
- custom_html (type: string, allowed: HTML, default: empty): Custom HTML to include in status page head (optional)
[metrics]
- poll_interval (type: integer, allowed: seconds, default: 120): Interval at which to probe nodes in poll mode
- poll_retry (type: integer, allowed: seconds, default: 2): Delay after which to probe a node in poll mode a second time (only when the first check fails)
- poll_http_status_healthy_above (type: integer, allowed: HTTP status code, default: 200): HTTP status above which poll checks to HTTP replicas report as healthy
- poll_http_status_healthy_below (type: integer, allowed: HTTP status code, default: 400): HTTP status under which poll checks to HTTP replicas report as healthy
- poll_delay_dead (type: integer, allowed: seconds, default: 30): Delay after which a node in poll mode is to be considered dead (ie. check response delay)
- poll_delay_sick (type: integer, allowed: seconds, default: 10): Delay after which a node in poll mode is to be considered sick (ie. check response delay)
- push_delay_dead (type: integer, allowed: seconds, default: 20): Delay after which a node in push mode is to be considered dead (ie. time after which the node did not report)
- push_system_cpu_sick_above (type: float, allowed: system CPU loads, default: 0.90): System load index for CPU above which to consider a node in push mode sick (ie. UNIX system load)
- push_system_ram_sick_above (type: float, allowed: system RAM loads, default: 0.90): System load index for RAM above which to consider a node in push mode sick (ie. percent RAM used)
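As an illustration of how the poll thresholds could combine, the sketch below maps a single HTTP check to a status in Python. It is an interpretation of the option descriptions, not Vigil's actual logic: the function name `classify_poll_check`, the inclusive lower bound and exclusive upper bound on the status range, and the choice to treat an out-of-range status as dead are all assumptions.

```python
from typing import Optional

# Defaults copied from the [metrics] options documented above.
POLL_HTTP_STATUS_HEALTHY_ABOVE = 200
POLL_HTTP_STATUS_HEALTHY_BELOW = 400
POLL_DELAY_SICK = 10.0  # seconds
POLL_DELAY_DEAD = 30.0  # seconds


def classify_poll_check(http_status: Optional[int], response_delay: float) -> str:
    """Map one HTTP poll result to "healthy", "sick" or "dead".

    `http_status` is None when no response was received at all.
    """
    # No reply, or a reply at least as slow as poll_delay_dead: treated as an outage.
    if http_status is None or response_delay >= POLL_DELAY_DEAD:
        return "dead"
    # Assumed range: lower bound inclusive, upper bound exclusive.
    in_range = POLL_HTTP_STATUS_HEALTHY_ABOVE <= http_status < POLL_HTTP_STATUS_HEALTHY_BELOW
    if not in_range:
        return "dead"
    # A valid reply that took at least poll_delay_sick: the node is slow.
    if response_delay >= POLL_DELAY_SICK:
        return "sick"
    return "healthy"
```

With the defaults, a 302 answered in one second counts as healthy, a 200 answered in 12 seconds as sick, and a 404 as dead.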
[notify]
[notify.email]
- to (type: string, allowed: email address, no default): Email address to which to send emails
- from (type: string, allowed: email address, no default): Email address from which to send emails
- smtp_host (type: string, allowed: hostname, IPv4, IPv6, default: localhost): SMTP host to connect to
- smtp_port (type: integer, allowed: TCP port, default: 587): SMTP TCP port to connect to
- smtp_username (type: string, allowed: any string, no default): SMTP username to use for authentication (if any)
- smtp_password (type: string, allowed: any string, no default): SMTP password to use for authentication (if any)
- smtp_encrypt (type: boolean, allowed: true, false, default: true): Whether to encrypt SMTP connection with STARTTLS or not
[notify.slack]
- hook_url (type: string, allowed: URL, no default): Slack hook URL (ie. https://hooks.slack.com/[..])
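For readers unfamiliar with Slack hook URLs, the following Python sketch shows the general shape of a request to a Slack incoming webhook: a JSON body with a `text` field, sent via POST. The function name `build_slack_request` and the message wording are hypothetical; Vigil formats and sends its own notifications, so this only illustrates what the hook URL is used for.

```python
import json
import urllib.request


def build_slack_request(hook_url: str, node_label: str, status: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    # Hypothetical message wording, for illustration only.
    payload = {"text": f"Status change: {node_label} is now {status}"}
    return urllib.request.Request(
        hook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen(request, timeout=10)` would deliver the message to the channel tied to the hook.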
[probe]
[[probe.service]]
- id (type: string, allowed: any unique lowercase string, no default): Unique identifier of the probed service (not visible on the status page)
- label (type: string, allowed: any string, no default): Name of the probed service (visible on the status page)
[[probe.service.node]]
- id (type: string, allowed: any unique lowercase string, no default): Unique identifier of the probed service node (not visible on the status page)
- label (type: string, allowed: any string, no default): Name of the probed service node (visible on the status page)
- mode (type: string, allowed: poll, push, no default): Probe mode for this node (ie. poll is direct HTTP or TCP poll to the URLs set in replicas, while push is for Vigil Reporter nodes)
- replicas (type: array[string], allowed: TCP or HTTP URLs, default: empty): Node replica URLs to be probed (only used if mode is poll)
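The constraints on a node entry can be checked before starting Vigil. The Python sketch below validates one [[probe.service.node]] entry represented as a dictionary. The helper name `validate_node`, the accepted URL schemes (`tcp://`, `http://`, `https://`) and the rule that a poll node needs at least one replica are assumptions made for this example, not documented Vigil behavior.

```python
from typing import List


def validate_node(node: dict) -> List[str]:
    """Return a list of problems found in one [[probe.service.node]] entry."""
    problems = []
    node_id = node.get("id", "")
    if not node_id or node_id != node_id.lower():
        problems.append("id must be a non-empty lowercase string")
    if not node.get("label"):
        problems.append("label is required")
    mode = node.get("mode")
    if mode not in ("poll", "push"):
        problems.append("mode must be 'poll' or 'push'")
    elif mode == "poll":
        # Replicas only matter in poll mode; push nodes report on their own.
        replicas = node.get("replicas", [])
        if not replicas:
            problems.append("poll nodes need at least one replica URL")
        for url in replicas:
            # Assumed URL schemes for "TCP or HTTP URLs".
            if not url.startswith(("tcp://", "http://", "https://")):
                problems.append(f"unsupported replica URL: {url}")
    return problems
```

An empty list means the entry passed every check; uniqueness of `id` across nodes would need a separate pass over the whole service.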
Run Vigil
Vigil can be run as such:
./vigil -c /path/to/config.cfg
Usage recommendations
Consider the following recommendations when using Vigil:
- Vigil should be hosted on a safe, separate server. This server should run on a different physical machine and network than your monitored infrastructure servers.
- Make sure to whitelist the Vigil server public IP (both IPv4 and IPv6) on your monitored HTTP services; this applies if you use a bot protection service that challenges bot IPs, eg. Distil Networks or Cloudflare. Vigil will see the HTTP service as down if a bot challenge is raised.
What do status variants look like?
Vigil has 3 status variants: healthy (no issue ongoing), sick (services under high load) and dead (outage):
Healthy status variant
Sick status variant
Dead status variant
How can I integrate Vigil Reporter in my code?
Vigil Reporter is used to actively submit health information to Vigil from your apps. Apps are best monitored via application probes, which are able to report detailed system information such as CPU and RAM load. This lets Vigil show if an application host system is under high load.
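To show how reported load figures could translate into a status, here is a small Python sketch based on the push options from the [metrics] configuration section. It is an interpretation rather than Vigil's real implementation: the function name `classify_push_node`, the strict "above" comparison for load, and the expression of load values as fractions between 0 and 1 are assumptions.

```python
# Defaults copied from the push options in the [metrics] configuration section.
PUSH_DELAY_DEAD = 20.0  # seconds
PUSH_SYSTEM_CPU_SICK_ABOVE = 0.90
PUSH_SYSTEM_RAM_SICK_ABOVE = 0.90


def classify_push_node(seconds_since_report: float, cpu_load: float, ram_load: float) -> str:
    """Map the latest report of a push node to "healthy", "sick" or "dead"."""
    # The node stopped reporting for too long: treated as an outage.
    if seconds_since_report >= PUSH_DELAY_DEAD:
        return "dead"
    # Either load figure strictly above its threshold marks the host as overloaded.
    if cpu_load > PUSH_SYSTEM_CPU_SICK_ABOVE or ram_load > PUSH_SYSTEM_RAM_SICK_ABOVE:
        return "sick"
    return "healthy"
```

Under these assumptions, a node that reported 5 seconds ago with 95% CPU load is sick, while a node silent for 25 seconds is dead regardless of its last load figures.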
📦 Vigil Reporter Libraries:
- NodeJS: node-vigil-reporter
- Rust: rs-vigil-reporter
- Golang: go-vigil-reporter
👉 Cannot find the library for your programming language? Build your own and be referenced here! (contact me)
🔥 Report A Vulnerability
If you find a vulnerability in Vigil, you are more than welcome to report it directly to @valeriansaliou by sending an encrypted email to valerian@valeriansaliou.name. Do not report vulnerabilities in public GitHub issues, as they may be exploited by malicious people to target production servers running an unpatched Vigil server.
⚠️ You must encrypt your email using the @valeriansaliou GPG public key: 🔑 valeriansaliou.gpg.pub.asc.
🎁 Based on the severity of the vulnerability, I may offer a $100 (US) bounty to whoever reported it.