# AzothBalancer - Resilient RPC Infrastructure for Decentralized Solvers
## What's New in v0.3.1 (2025-09-30)
### Key Features
- **Environment Variable Support**: Secure configuration via `.env` files
- **IP Whitelist Security**: Protected `/reload` endpoint with configurable IP restrictions
- **Endpoint Naming System**: Custom names for better observability in metrics and logs
- **Enhanced 3-Tier Routing**: Fixed LRU-based fair load distribution across all tiers
- **Load Testing Toolkit**: Added Bash load testing script with mixed single/batch request simulation, rate limiting detection, and performance analytics
### 🔧 Improvements
- **Strategy Logic**: Fixed LRU tie-breaker that was causing incorrect endpoint prioritization
- **Endpoint Identification**: Human-readable names instead of URLs in metrics and logs
- **Security**: IP-based access control for configuration reloads
- **Configuration**: Environment variable support for sensitive endpoint URLs
[See full changelog](#changelog)
<p align="center">
<img src="https://raw.githubusercontent.com/AzothSolver/azoth-balancer/main/azoth-balancer-logo.png" alt="AzothBalancer Logo" width="150"/>
</p>
> ⚠️ **Experimental Software / Security Notice**
> AzothBalancer is open-source and usable today, but it is still **early-stage software**.
> - Expect breaking changes and incomplete features.
> - **Do not expose the service publicly.** Run only on localhost or within a private network.
> - **Keep server ports behind a firewall or bound to `127.0.0.1`** to avoid external access.
> - **The reload endpoint now has IP whitelisting** via the `RELOAD_ALLOWED_IP` environment variable.
> - No TLS/HTTPS or authentication exists yet for the main RPC endpoint.
>
> Feedback and contributions are welcome. Do not rely on this as your sole production RPC layer.
**AzothBalancer** is a Rust-based JSON-RPC load balancer for blockchain infrastructure. It is built with a consistency-first architecture that prioritizes request integrity, and it aims to deliver the reliability and cost-efficiency that demanding DeFi applications require. A token bucket rate limiter keeps traffic within each provider's limits so that clients are not throttled upstream. The current release is v0.3.1.
---
## Core Functionality
* **3-Tier Endpoint Routing:** Prioritizes requests by configurable endpoint weight (Tier 1: ≥ 100, Tier 2: 50–99, Tier 3: 1–49)
* **Health Monitoring & Failover:** Automatic cooldown for failing or rate-limited endpoints
* **Per-Endpoint Rate Limiting:** Supports burstable limits to prevent provider throttling
* **Batch Request Handling:** Handles JSON-RPC batch requests safely
* **Hot Configuration Reloading:** `/reload` endpoint updates endpoints without downtime
* **Environment Variable Support:** Secure configuration via `.env` files and environment variables
* **IP Whitelist Security:** Configurable IP restriction for the `/reload` endpoint
* **Endpoint Naming:** Configurable names with automatic domain-based generation for better observability
* **Prometheus Metrics:** `/metrics` exposes health and performance stats with readable endpoint names
* **Graceful Shutdown:** Completes in-flight requests before termination
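
To illustrate the burstable per-endpoint rate limiting listed above, here is a minimal token bucket sketch. It is not AzothBalancer's actual code: the `TokenBucket` type, its fields, and the choice to charge a batch one token per contained call are assumptions made for illustration.

```rust
use std::time::Instant;

/// Hypothetical token bucket; names and fields are illustrative only.
#[derive(Debug)]
pub struct TokenBucket {
    capacity: f64,       // maximum burst size, in requests
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // sustained rate, in requests per second
    last_refill: Instant,
}

impl TokenBucket {
    /// Creates a bucket that starts full, so an initial burst is allowed.
    pub fn new(refill_per_sec: f64, burst: f64, now: Instant) -> Self {
        Self { capacity: burst, tokens: burst, refill_per_sec, last_refill: now }
    }

    /// Adds tokens for the time elapsed since the last refill, capped at capacity.
    fn refill(&mut self, now: Instant) {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
    }

    /// Tries to take `cost` tokens. Assumption: a single call costs 1 and a
    /// batch costs as many tokens as it contains calls.
    pub fn try_acquire(&mut self, cost: u32, now: Instant) -> bool {
        self.refill(now);
        let cost = f64::from(cost);
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

Passing `now` explicitly keeps the sketch deterministic and easy to test; a real limiter would more likely read the clock internally.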
---
## Technical Architecture
### Intelligent 3-Tier Routing Strategy
AzothBalancer selects endpoints using a 3-tier priority system driven by each endpoint's configured weight:
**Tier 1 (Local Nodes | Weight ≥ 100)**
- **Priority:** Highest - always selected first when available
- **Sorting:** Weight → LRU (Least Recently Used) - ensures fair rotation
- **Use Case:** Local nodes, premium dedicated endpoints
**Tier 2 (Premium Services | Weight 50–99)**
- **Priority:** Secondary fallback with cost-based optimization
- **Sorting:** Weight → Lowest total cost → LRU
- **Use Case:** Premium RPC providers (Alchemy, QuickNode, Infura) - balances performance and reliability
**Tier 3 (Free/Public | Weight 1–49)**
- **Priority:** Final fallback for maximum reliability
- **Sorting:** Weight → LRU (cost-agnostic) - ensures fair usage
- **Use Case:** Public RPCs, emergency backup
**Key Features:**
- **LRU Fair Distribution:** Prevents endpoint starvation across all tiers
- **Cost-Aware Tier 2:** Combines latency and error penalties for optimal premium endpoint selection
- **Graceful Degradation:** Automatic tier fallback during outages
- **Batch-aware Rate Limiting:** Respects RPC provider quotas for large requests
- **Environment Variable Support:** Secure configuration management via `.env` files
- **Endpoint Naming:** Human-readable names in metrics and logs for better observability
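
The tier rules above can be sketched in a few lines of Rust. This is an illustrative model rather than the project's implementation: the `Endpoint` fields, the numeric `cost` value, the logical `last_used` counter, and the assumption that a higher weight sorts first within a tier are all hypothetical.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier { One, Two, Three }

/// Hypothetical view of an endpoint; field names are illustrative.
#[derive(Debug, Clone)]
pub struct Endpoint {
    pub name: String,
    pub weight: u32,
    pub cost: u64,      // assumed combined latency/error penalty (Tier 2 only)
    pub last_used: u64, // assumed logical timestamp; smaller = used longer ago
    pub available: bool, // false while cooling down or rate limited
}

/// Maps a weight to its tier using the ranges documented above.
pub fn tier_of(weight: u32) -> Option<Tier> {
    match weight {
        0 => None,
        1..=49 => Some(Tier::Three),
        50..=99 => Some(Tier::Two),
        _ => Some(Tier::One),
    }
}

/// Orders two endpoints of the same tier; the "smaller" one is preferred.
fn compare(tier: Tier, a: &Endpoint, b: &Endpoint) -> Ordering {
    let by_weight = b.weight.cmp(&a.weight); // assumption: higher weight first
    let by_lru = a.last_used.cmp(&b.last_used); // least recently used first
    match tier {
        Tier::Two => by_weight.then(a.cost.cmp(&b.cost)).then(by_lru),
        _ => by_weight.then(by_lru),
    }
}

/// Picks the best available endpoint, falling back tier by tier.
pub fn select(endpoints: &[Endpoint]) -> Option<&Endpoint> {
    for tier in [Tier::One, Tier::Two, Tier::Three] {
        let best = endpoints
            .iter()
            .filter(|e| e.available && tier_of(e.weight) == Some(tier))
            .min_by(|a, b| compare(tier, a, b));
        if best.is_some() {
            return best;
        }
    }
    None
}
```

With two available Tier 1 endpoints of equal weight, `select` returns the one with the smaller `last_used`; once every Tier 1 endpoint is unavailable, it falls through to Tier 2, where the lower `cost` wins before LRU is considered.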
### Current Implementation (v0.3.1)
**Core Infrastructure:**
- Enhanced 3-tier priority routing with LRU-based fairness
- Environment variable support for secure configuration
- IP-based access control for configuration reloads
- Exponential backoff cooldown system
- Per-endpoint rate limiting with burst support
- **Endpoint naming system** with automatic domain-based generation
- Prometheus metrics integration (`/metrics`) with endpoint names
- Hot configuration reloading (`/reload`)
**Security & Configuration:**
- `.env` file support for environment variables
- `RELOAD_ALLOWED_IP` environment variable for endpoint security
- **Endpoint names instead of URLs** in logs and metrics for security
- Thread-safe state management (`Arc<RwLock<...>>`)
- Comprehensive test suite (30+ tests)
- Graceful shutdown handling
- Async Rust foundation (Tokio, Axum, Reqwest)
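
As a rough illustration of the exponential backoff cooldown listed above, the sketch below doubles the cooldown after each consecutive failure and caps it at a maximum. The `CooldownPolicy` type, the doubling factor, and the base and cap values are assumptions, not the project's actual parameters.

```rust
use std::time::Duration;

/// Hypothetical cooldown policy; base, cap, and doubling are assumptions.
pub struct CooldownPolicy {
    pub base: Duration,
    pub max: Duration,
}

impl CooldownPolicy {
    /// Cooldown after `consecutive_failures` failures:
    /// base * 2^(n - 1), capped at `max`. Zero failures means no cooldown.
    pub fn cooldown_for(&self, consecutive_failures: u32) -> Duration {
        if consecutive_failures == 0 {
            return Duration::ZERO;
        }
        // Bound the exponent so the shift below cannot overflow.
        let exp = (consecutive_failures - 1).min(16);
        let factor = 1u32 << exp;
        self.base
            .checked_mul(factor)
            .unwrap_or(self.max)
            .min(self.max)
    }
}
```

With a base of 2 seconds and a cap of 60 seconds, three consecutive failures give an 8-second cooldown, and any long failure streak settles at the 60-second cap.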
---
## Planned Enhancements
* **Transaction-Type-Aware Routing:** Route sensitive RPC methods (e.g., `eth_sendRawTransaction`) to secure endpoints (e.g., MEV Blocker)
* **Response Caching:** Cache common RPC calls (`eth_call`, `eth_getLogs`) to reduce latency
* **Enhanced Security:** HTTPS/TLS termination and optional API key authentication
* **Production Dashboards:** Grafana dashboards for monitoring performance and health
---
## Ecosystem Impact
* **Increase Solver Reliability:** Reduce infrastructure-related settlement failures
* **Lower Operational Costs:** Optimized routing for premium and free endpoints
* **Lower Barrier to Entry:** Enable new solver operators to run reliable infrastructure easily
* **Open-Source Contribution:** Reusable component for CoW ecosystem solvers and dApp builders
---
## Quick Start
---
### Installation via Cargo
```bash
cargo install azoth-balancer
```
After installing, copy the example config and environment template:
```bash
curl -O https://raw.githubusercontent.com/AzothSolver/azoth-balancer/main/example.config.toml
curl -O https://raw.githubusercontent.com/AzothSolver/azoth-balancer/main/.env.example
cp .env.example .env
# Edit .env with your RPC endpoints and RELOAD_ALLOWED_IP
```
Then run:
```bash
azoth-balancer --config example.config.toml
```
> This lets you try AzothBalancer without cloning the repo or building manually.
---
### Installation via Clone & Build
```bash
git clone https://github.com/AzothSolver/azoth-balancer.git
cd azoth-balancer
cp example.config.toml config.toml # Then edit your config
cp .env.example .env # Set up your environment variables
cargo build --release
./target/release/azoth-balancer --config config.toml
```
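By default the server listens on `0.0.0.0:8549`, which accepts connections on all network interfaces, so keep the port behind a firewall as described in the security notice.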
---
### Download Prebuilt Binary
```bash
curl -LO https://github.com/AzothSolver/azoth-balancer/releases/download/v0.3.1/azoth-balancer-v0.3.1-x86_64-unknown-linux-gnu.tar.gz
tar -xzvf azoth-balancer-v0.3.1-x86_64-unknown-linux-gnu.tar.gz
./azoth-balancer --config example.config.toml
```
> This lets you try AzothBalancer immediately, without building or installing via Cargo.
---
## Configuration with Endpoint Naming
### Endpoint Configuration with Names
You can now configure custom names for your endpoints or use automatic generation:
```toml
[[balancer.endpoints]]
name = "quicknode-premium" # Custom name for better observability
url = "${RPC_URL_QUICKNODE}"
rate_limit_per_sec = 25
weight = 100
[[balancer.endpoints]]
# No name specified - will auto-generate: "001_arbitrum_one_public_nodies_app"
url = "https://arbitrum-one-public.nodies.app"
rate_limit_per_sec = 10
weight = 50
```
**Benefits of Endpoint Naming:**
- **Readable metrics** - names like `quicknode-premium` instead of full URLs
- **Better logs** - easier debugging and monitoring
- **Security** - URLs aren't exposed in metrics or logs
- **Consistent identification** - endpoints tracked by name across the system
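
The automatic name in the example above (`001_arbitrum_one_public_nodies_app`) suggests a zero-padded index followed by the host with punctuation replaced by underscores. The sketch below reproduces that shape; `generate_name` is a hypothetical helper, and the exact rules AzothBalancer applies (how the index is assigned, how ports or credentials are handled) are assumptions.

```rust
/// Hypothetical helper: builds a name like "001_arbitrum_one_public_nodies_app"
/// from an index and a URL. IPv6 literals are not handled in this sketch.
pub fn generate_name(index: usize, url: &str) -> String {
    // Drop the scheme ("https://") if present.
    let without_scheme = url.split_once("://").map_or(url, |(_, rest)| rest);
    // Keep only the authority: cut at the first path, query, or fragment.
    let authority = without_scheme
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Strip any "user:pass@" prefix so credentials never reach the name.
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    // Strip a trailing ":port".
    let host = host_port.split(':').next().unwrap_or(host_port);
    // Replace everything that is not ASCII alphanumeric with '_'.
    let slug: String = host
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '_' })
        .collect();
    format!("{:03}_{}", index, slug)
}
```

Because only the host is kept, API keys embedded in the URL path or query string do not end up in the generated name.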
### Environment Variables & Security
AzothBalancer supports environment variables for secure configuration management:
#### Configuration Variables
Set these in your `.env` file or environment:
```bash
# RPC Endpoint URLs
RPC_URL_QUICKNODE="https://your-quicknode-endpoint"
RPC_URL_ALCHEMY="https://your-alchemy-endpoint"
RPC_URL_INFURA="https://your-infura-endpoint"
# ... and other RPC_URL_* variables
# Security
RELOAD_ALLOWED_IP="127.0.0.1" # IP address allowed to access /reload endpoint
```
#### Using Environment Variables in Config
Reference environment variables in your `config.toml`:
```toml
[[balancer.endpoints]]
name = "alchemy-mainnet"
url = "${RPC_URL_ALCHEMY}" # Will be resolved from environment
rate_limit_per_sec = 25
weight = 100
```
The balancer automatically resolves `${VARIABLE_NAME}` placeholders from the following sources:
1. The `.env` file in the current directory
2. System environment variables

If a variable is not found in either source, the balancer falls back with a warning.
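
The `${VARIABLE_NAME}` substitution described above can be sketched as a small string pass. This is an illustration only: `resolve_placeholders` is a hypothetical function, the lookup is passed in as a closure so the example does not depend on any particular `.env` crate, and leaving unresolved placeholders in place is an assumption about what the fallback does.

```rust
/// Replaces every `${NAME}` in `input` using `lookup`. Unresolved names are
/// left in place and returned so the caller can log a warning for each.
pub fn resolve_placeholders<F>(input: &str, lookup: F) -> (String, Vec<String>)
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut missing = Vec::new();
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the text before the placeholder unchanged.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    None => {
                        missing.push(name.to_string());
                        // Keep the original "${NAME}" text.
                        out.push_str(&rest[start..start + 2 + end + 1]);
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    (out, missing)
}
```

A caller could pass a closure that first checks values parsed from `.env` and then falls back to `std::env::var(name).ok()`, mirroring the two sources listed above.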
### Security Configuration
The `/reload` endpoint is protected by IP whitelisting. Set the allowed IP address:
```bash
export RELOAD_ALLOWED_IP="127.0.0.1" # Localhost only (recommended)
# or in your .env file:
# RELOAD_ALLOWED_IP="127.0.0.1"
```
Unauthorized reload attempts are logged and rejected with HTTP 403.
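
The check behind this behavior can be pictured as a comparison between the caller's address and the configured value. The sketch below is a simplified model using only the standard library: `reload_status` is a hypothetical function, and rejecting every request when `RELOAD_ALLOWED_IP` cannot be parsed is an assumption about how a misconfiguration would be handled.

```rust
use std::net::IpAddr;

/// Hypothetical check: returns the HTTP status a `/reload` request from
/// `peer` would receive, given the configured `RELOAD_ALLOWED_IP` value.
pub fn reload_status(allowed_ip: &str, peer: IpAddr) -> u16 {
    match allowed_ip.trim().parse::<IpAddr>() {
        // Only an exact match on the configured address is accepted.
        Ok(allowed) if allowed == peer => 200,
        // Non-matching peer, or an unparseable setting: reject.
        _ => 403,
    }
}
```

Parsing both sides into `IpAddr` rather than comparing strings avoids mismatches caused by whitespace or alternative textual forms of the same address.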
---
## CLI & Multiple Configs
* **Custom Config Path:** Specify a configuration file:
```bash
./target/release/azoth-balancer --config config.toml
# or
cargo run --release -- --config config.toml
```
* **Chain-Specific Configs:** Maintain separate configs for different networks:
```text
config-eth.toml # Ethereum RPC endpoints
config-arbitrum.toml # Arbitrum RPC endpoints
config-solana.toml # Solana RPC endpoints
```
* Start with a specific network config:
```bash
./target/release/azoth-balancer --config config-arbitrum.toml
```
> Note: Each config must contain RPC endpoints from the same chain.
---
## Docker
* Dockerfile included
* `docker-compose.yml` available:
```bash
docker-compose up --build
```
---
## Monitoring with Prometheus & Grafana
AzothBalancer exposes comprehensive metrics at the `/metrics` endpoint with **readable endpoint names**. Here's how to set up monitoring:
### 1. Prometheus Configuration
Add this job to your `prometheus.yml`:
```yaml
scrape_configs:
  - job_name: "azoth-balancer"
    static_configs:
      - targets: ["127.0.0.1:8549"] # Adjust host/port if different
    metrics_path: /metrics
    scrape_interval: 15s
    scrape_timeout: 10s
```
### 2. Grafana Dashboard Import
1. Download the dashboard JSON:
```bash
curl -O https://raw.githubusercontent.com/AzothSolver/azoth-balancer/main/grafana-dashboards/azoth-balancer-monitoring-dashboard-beta.json
```
2. In Grafana:
   - Navigate to **Create → Import**
   - Upload the JSON file or paste the contents
   - Select your Prometheus datasource
   - Click **Import**
### 3. Quick Start with Docker Compose (Optional)
Create `docker-compose.monitoring.yml`:
```yaml
version: "3.8"
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=200h'
      - '--web.enable-lifecycle'
    restart: unless-stopped
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana-dashboards:/etc/grafana/provisioning/dashboards
    restart: unless-stopped
    depends_on:
      - prometheus
volumes:
  prometheus_data:
  grafana_data:
```
And `prometheus.yml`:
```yaml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "azoth-balancer"
    static_configs:
      - targets: ["azoth-balancer:8549"]
    metrics_path: /metrics
```
Run with:
```bash
docker-compose -f docker-compose.monitoring.yml up -d
```
- **Grafana**: http://localhost:3000 (admin/admin)
- **Prometheus**: http://localhost:9090
### Available Metrics with Endpoint Names
The dashboard provides **readable endpoint names** in all metrics:
- **Real-time RPS & error rates** - track `quicknode-premium` instead of full URLs
- **Endpoint health & latency distributions** - with human-readable names
- **Rate limiting & cooldown events** - named endpoints for easy identification
- **Priority-based routing analytics** - tier-based performance monitoring
- **Concurrency & batch size monitoring**
The dashboard automatically detects your endpoints and provides tier-based analytics with the endpoint names you've configured.
---
## Load Testing
Validate your AzothBalancer configuration under realistic conditions with our comprehensive load testing tool.
**Features:**
- Realistic mixed single/batch request workloads
- Configurable RPS targets and duration
- Rate limiting detection and performance metrics
- Automated health status checking
**Quick Start:**
```bash
cd load-test
./load-test.sh --help
```
**[See Load Testing Documentation](./load-test/README.md)** for complete usage instructions and examples.
---
## License
**MIT or Apache 2.0**
* [LICENSE-MIT](LICENSE-MIT)
* [LICENSE-APACHE](LICENSE-APACHE)
---
## Repository
* GitHub: [AzothBalancer](https://github.com/AzothSolver/azoth-balancer)
## Contact
For questions, suggestions, or contributions, please open an issue on [GitHub Issues](https://github.com/AzothSolver/azoth-balancer/issues).
---
## Changelog
### [0.3.1] - 2025-09-30
#### Added
- **Environment Variable Support**: Added `.env` file handling for secure configuration management
- **IP Whitelist Security**: Implemented IP-based access control for `/reload` endpoint via `RELOAD_ALLOWED_IP` environment variable
- **Endpoint Naming System**:
  - Added `name` field to endpoint configuration structure
  - Configurable custom names in `config.toml` for better identification
  - Automatic domain-based name generation when name not specified
  - Names displayed in monitoring dashboard and metrics for improved observability
- **Load Testing Toolkit**: Added Bash load testing script with mixed single/batch request simulation, rate limiting detection, and performance analytics
#### Changed
- **Metrics & Logging**: Updated to use endpoint names instead of URLs for enhanced security and clarity
- **Configuration Loading**: Environment variables resolved during config finalization process
#### Fixed
- **Tier Routing Logic**: Fixed incorrect LRU tie-breaker implementation in endpoint selection
- **Security**: Endpoint URLs no longer exposed in metrics or status endpoints
- **Observability**: Human-readable names in Grafana dashboard legends and metric labels