runbound 0.4.0

RFC-compliant DNS resolver: drop-in Unbound with REST API, ACME auto-TLS, HMAC audit log, and master/slave HA

# Security Architecture

This document covers the security model, the defensive layers, and the audit findings
fixed in versions 0.2.0 through 0.3.3.

---

## Defensive layers

```
Internet / LAN
┌─────────────────────────────────────┐
│  ACL check (allow / deny / refuse)  │  ← per-subnet rules, IPv4+IPv6
│  Rate limiter (token bucket)        │  ← per-source-IP, DashMap+ahash
│  Inflight semaphore (max 4096)      │  ← hard OOM backstop
├─────────────────────────────────────┤
│  XDP fast path (optional)           │  ← same ACL + rate limit enforced
├─────────────────────────────────────┤
│  DNS engine (hickory-server)        │
│  Zone lookup / forwarding           │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│  REST API (port 8081, configurable) │
│  Bearer token (timing-safe cmp)     │  ← subtle::ConstantTimeEq
│  Entry limits (10k DNS, 100k BL)    │
│  zones_mutex (atomic write+swap)    │
└─────────────────────────────────────┘
```

---

## ACL (Access Control List)

Rules are evaluated in order; first match wins. Default if no rule matches: **REFUSE**.

```
access-control: 127.0.0.0/8    allow
access-control: 10.0.0.0/8     allow
access-control: 0.0.0.0/0      refuse   # secure default
```

**IPv4-mapped IPv6 normalisation (SEC-03):** Clients connecting via IPv6 as
`::ffff:10.0.0.1` are normalised to `10.0.0.1` before ACL matching, ensuring
IPv4 rules apply correctly regardless of transport.
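
The first-match evaluation and the IPv4-mapped normalisation described above can be sketched in a few lines of standard-library Rust. This is an illustration only, not Runbound's actual code: the `Rule`, `Action`, `normalise`, and `evaluate` names are hypothetical.

```rust
use std::net::IpAddr;

/// The three actions named in the ACL diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Allow,
    Deny,
    Refuse,
}

/// One `access-control:` line (hypothetical representation).
pub struct Rule {
    pub net: IpAddr,
    pub prefix: u8,
    pub action: Action,
}

/// Collapse `::ffff:a.b.c.d` into plain IPv4 so that IPv4 rules still apply.
pub fn normalise(ip: IpAddr) -> IpAddr {
    match ip {
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            Some(v4) => IpAddr::V4(v4),
            None => IpAddr::V6(v6),
        },
        other => other,
    }
}

/// True when `ip` falls inside `net/prefix`; address families must match.
fn in_net(ip: IpAddr, net: IpAddr, prefix: u8) -> bool {
    match (ip, net) {
        (IpAddr::V4(a), IpAddr::V4(n)) => {
            let p = u32::from(prefix.min(32));
            let mask = if p == 0 { 0 } else { u32::MAX << (32 - p) };
            (u32::from(a) & mask) == (u32::from(n) & mask)
        }
        (IpAddr::V6(a), IpAddr::V6(n)) => {
            let p = u32::from(prefix.min(128));
            let mask = if p == 0 { 0 } else { u128::MAX << (128 - p) };
            (u128::from(a) & mask) == (u128::from(n) & mask)
        }
        _ => false,
    }
}

/// First matching rule wins; no match falls back to REFUSE.
pub fn evaluate(rules: &[Rule], client: IpAddr) -> Action {
    let ip = normalise(client);
    rules
        .iter()
        .find(|r| in_net(ip, r.net, r.prefix))
        .map(|r| r.action)
        .unwrap_or(Action::Refuse)
}
```

With the three example rules above, `evaluate` would return `Allow` for a client seen as `::ffff:10.0.0.1`, because the address is normalised to `10.0.0.1` before matching.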

---

## Rate limiting

Token-bucket rate limiter, one bucket per source IP.

```
rate-limit: 500    # max queries per second per IP
```

- Implemented with `DashMap<IpAddr, IpBucket>` and `ahash` for low-contention
  concurrent access.
- Excess queries receive a REFUSED response — no amplification possible.
- Shared between the standard path and the XDP fast path.
- Disable with `rate-limit: 0` (not recommended for public-facing resolvers).
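
As a rough sketch of the per-IP token bucket described above: the version below uses a plain `HashMap` and an explicit `now` argument so that it stays self-contained and deterministic, whereas the text states that Runbound uses `DashMap` with `ahash`. The field names, the refill arithmetic, and the choice of bucket capacity equal to the per-second rate are assumptions.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Per-IP state: remaining tokens and the time of the last refill (seconds).
struct IpBucket {
    tokens: f64,
    last: f64,
}

pub struct RateLimiter {
    /// Tokens added per second. Also used as bucket capacity (assumption).
    rate: f64,
    buckets: HashMap<IpAddr, IpBucket>,
}

impl RateLimiter {
    pub fn new(rate: u32) -> Self {
        Self { rate: f64::from(rate), buckets: HashMap::new() }
    }

    /// Returns true if the query may proceed, false if it should get REFUSED.
    /// `now` is a monotonic timestamp in seconds, passed in by the caller.
    pub fn allow(&mut self, ip: IpAddr, now: f64) -> bool {
        if self.rate == 0.0 {
            return true; // `rate-limit: 0` disables limiting
        }
        let rate = self.rate;
        let bucket = self
            .buckets
            .entry(ip)
            .or_insert(IpBucket { tokens: rate, last: now });

        // Refill in proportion to elapsed time, capped at the bucket capacity.
        let elapsed = (now - bucket.last).max(0.0);
        bucket.tokens = (bucket.tokens + elapsed * rate).min(rate);
        bucket.last = now;

        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Each source IP gets its own bucket, so one noisy client exhausting its tokens does not affect the others.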

---

## Anti-OOM memory protection

Runbound has two independent, always-active defences against memory exhaustion:

### 1. Inflight concurrency semaphore

Hard cap of **4,096 concurrent in-flight requests**. When the semaphore is exhausted,
new requests receive REFUSED immediately, before any per-request state is allocated.
This bounds memory use even at line rate: a rejected request costs no additional
memory, so a flood cannot grow the process.
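
One way to picture this backstop is a non-blocking permit counter: a request either takes a permit or is refused on the spot. The sketch below uses an atomic counter from the standard library in place of an async semaphore; the `Inflight` and `Permit` names are hypothetical and the real implementation may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Counts requests currently being processed, up to a fixed maximum.
pub struct Inflight {
    current: AtomicUsize,
    max: usize,
}

/// Held for the lifetime of one request; releases its slot when dropped.
pub struct Permit(Arc<Inflight>);

impl Inflight {
    pub fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { current: AtomicUsize::new(0), max })
    }

    /// Never blocks: returns `None` when the cap is reached, in which case
    /// the caller would answer REFUSED without doing further work.
    pub fn try_acquire(this: &Arc<Self>) -> Option<Permit> {
        let mut cur = this.current.load(Ordering::Relaxed);
        loop {
            if cur >= this.max {
                return None;
            }
            match this.current.compare_exchange_weak(
                cur,
                cur + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(Permit(Arc::clone(this))),
                Err(actual) => cur = actual, // lost a race, retry with fresh value
            }
        }
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.0.current.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Tying the release to `Drop` means a slot is returned even when request handling exits early on an error path.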

### 2. Memory pressure guard

A background task reads `/proc/meminfo` every **30 seconds**. When system RAM usage
reaches **80 %**, two caches are purged atomically:

- **Rate-limiter DashMap**: all token buckets are cleared. Each IP rebuilds its bucket
  on the next query; no query is dropped, only the accumulated per-IP state is discarded.
- **hickory-resolver cache**: the resolver is rebuilt from config and atomically
  swapped via ArcSwap. In-flight queries hold their Arc reference and complete
  normally; new queries use the fresh resolver with an empty cache.

After the purge, the new usage level is logged. If usage is still above 50 %, a
second warning is emitted so the operator knows a permanent fix (more RAM, reduced
rate limit, fewer feed subscriptions) may be needed.

On non-Linux systems or containers without `/proc/meminfo`, the guard silently
skips its check — DNS service continues normally.

```
WARN Memory pressure — purging DNS caches  used_pct=82.3%  avail_mb=312  total_mb=1753
WARN DNS resolver cache flushed and rate limiter cleared  freed_buckets=8241
WARN Memory after purge  used_pct=44.1%  status="below 50% target"
```

**The memory guard is always active — no configuration required.**
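
The threshold check itself can be illustrated as follows. This sketch assumes that usage is computed as `(MemTotal - MemAvailable) / MemTotal`, which is roughly consistent with the sample log line above but is not confirmed by the source; the function names are hypothetical.

```rust
/// Extract `MemTotal` and `MemAvailable` (both in kB) from /proc/meminfo text.
pub fn parse_meminfo(text: &str) -> Option<(u64, u64)> {
    let field = |name: &str| -> Option<u64> {
        text.lines()
            .find_map(|line| line.strip_prefix(name))
            .and_then(|rest| rest.split_whitespace().next())
            .and_then(|value| value.parse().ok())
    };
    Some((field("MemTotal:")?, field("MemAvailable:")?))
}

/// Percentage of RAM in use, under the assumption stated above.
pub fn used_pct(total_kb: u64, avail_kb: u64) -> f64 {
    if total_kb == 0 {
        return 0.0;
    }
    100.0 * total_kb.saturating_sub(avail_kb) as f64 / total_kb as f64
}

/// True when the caches should be purged. Unparseable input means "skip".
pub fn should_purge(meminfo: &str, threshold_pct: f64) -> bool {
    match parse_meminfo(meminfo) {
        Some((total, avail)) => used_pct(total, avail) >= threshold_pct,
        None => false,
    }
}

/// Reads the real file; a missing /proc/meminfo silently skips the check.
pub fn check_system() -> bool {
    std::fs::read_to_string("/proc/meminfo")
        .map(|text| should_purge(&text, 80.0))
        .unwrap_or(false)
}
```

A background task would call something like `check_system()` on its 30-second tick and trigger the purge when it returns `true`.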

---

## REST API security

**Authentication:** Bearer token via `Authorization` header. Compared using
`subtle::ConstantTimeEq` — not vulnerable to timing attacks.
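
To show the idea behind a timing-safe comparison, here is a hand-written stand-in that uses only the standard library. It is not the `subtle` implementation, and an optimising compiler is free to transform it, so treat it as an illustration only and keep using `subtle::ConstantTimeEq` in real code. The `authorized` helper and its header handling are assumptions.

```rust
/// Compare two byte strings without returning at the first mismatch.
/// The length check does exit early; the length is not treated as secret here.
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences instead of branching on them
    }
    diff == 0
}

/// Check an `Authorization` header value against the configured API key.
pub fn authorized(header: Option<&str>, api_key: &str) -> bool {
    match header.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(token) => ct_eq(token.as_bytes(), api_key.as_bytes()),
        None => false,
    }
}
```

A naive `token == api_key` can stop at the first differing byte, which is what lets an attacker recover a key one byte at a time by measuring response latency.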

**API key management:**
```bash
# Set via environment variable; never store the key in config files
export RUNBOUND_API_KEY="$(openssl rand -hex 32)"
```

**Entry limits:** Enforced server-side to prevent authenticated DoS:
- DNS entries: max 10,000
- Blacklist entries: max 100,000

**Concurrent write safety (SEC-01):** The entire load → validate → write → ArcSwap
sequence is performed inside `zones_mutex`. Two concurrent API writes cannot
overwrite each other.
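
The write path can be sketched as below. To stay dependency-free, the sketch substitutes `RwLock<Arc<...>>` for ArcSwap and a `BTreeMap` for the real zone store; `Store`, `add`, and `snapshot` are hypothetical names, and persistence to disk is only marked by a comment. The `unwrap()` calls on the locks are a shortcut for the sketch, not something production code should keep.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, RwLock};

/// Server-side cap on DNS entries, as listed under "Entry limits".
pub const MAX_DNS_ENTRIES: usize = 10_000;

/// Simplified zone data: name -> record data.
type Zones = BTreeMap<String, String>;

pub struct Store {
    zones_mutex: Mutex<()>,   // serialises the whole write sequence
    live: RwLock<Arc<Zones>>, // stand-in for ArcSwap
}

impl Store {
    pub fn new() -> Self {
        Self {
            zones_mutex: Mutex::new(()),
            live: RwLock::new(Arc::new(Zones::new())),
        }
    }

    /// Readers take a cheap snapshot and never wait on writers' validation.
    pub fn snapshot(&self) -> Arc<Zones> {
        let guard = self.live.read().unwrap();
        Arc::clone(&*guard)
    }

    /// load -> validate -> write -> swap, all under `zones_mutex`.
    pub fn add(&self, name: &str, data: &str) -> Result<(), &'static str> {
        let _guard = self.zones_mutex.lock().unwrap();

        let mut next = (*self.snapshot()).clone(); // load
        if name.is_empty() {
            return Err("empty name"); // validate
        }
        if !next.contains_key(name) && next.len() >= MAX_DNS_ENTRIES {
            return Err("entry limit reached");
        }
        next.insert(name.to_string(), data.to_string());
        // write: persisting `next` to disk would happen here

        *self.live.write().unwrap() = Arc::new(next); // swap
        Ok(())
    }
}
```

Because every writer loads the current state only after taking the mutex, two concurrent writes are applied one after the other instead of both starting from the same stale copy.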

---

## Feed security

**SSRF protection (SEC-04):** A custom `reqwest` redirect policy blocks:
- HTTPS → HTTP downgrades
- Redirects to private or loopback IP addresses (`10.x`, `172.16.x`, `192.168.x`,
  `127.x`, `169.254.x`, `::1`, etc.)
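
The address check behind such a redirect policy might look like the sketch below. It covers the ranges listed above plus a few IPv6 equivalents; the exact set that Runbound blocks is not spelled out in this document, so the list and the function names are assumptions. Wiring the check into a `reqwest` redirect policy is left out.

```rust
use std::net::IpAddr;

/// True for addresses a feed fetch should never be redirected to.
pub fn is_forbidden_target(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private()            // 10/8, 172.16/12, 192.168/16
                || v4.is_loopback()    // 127/8
                || v4.is_link_local()  // 169.254/16
                || v4.is_unspecified() // 0.0.0.0
        }
        IpAddr::V6(v6) => {
            // Treat ::ffff:a.b.c.d like the IPv4 address it wraps.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_forbidden_target(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()                  // ::1
                || v6.is_unspecified()        // ::
                || (first & 0xfe00) == 0xfc00 // fc00::/7 unique local
                || (first & 0xffc0) == 0xfe80 // fe80::/10 link-local
        }
    }
}

/// Decide whether one redirect hop may be followed.
pub fn redirect_allowed(from_https: bool, to_https: bool, to_ip: IpAddr) -> bool {
    let downgrade = from_https && !to_https;
    !downgrade && !is_forbidden_target(to_ip)
}
```

The check has to run on the resolved IP address of each hop, since a public-looking hostname can resolve to a private address.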

**TOCTOU re-validation (SEC-05):** Feed URLs are re-validated on every fetch, not
just at subscription time. A compromised feed record cannot redirect to a private
address after being subscribed.

**HTTPS enforcement:** Plain `http://` feed URLs are **rejected with 400 Bad Request**;
only `https://` URLs are accepted. The check runs at the API layer, before any network
connection is made, and prevents man-in-the-middle injection of malicious block-list data.

**Embedded credentials (SEC-15, v0.3.3):** Feed URLs containing embedded credentials
(`user:pass@host`) are rejected before any network request is made, which prevents
credential leakage.
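
The two URL checks above (HTTPS only, no embedded credentials) can be approximated with plain string handling, as in this sketch. It is deliberately simplistic and the names are hypothetical; a production validator would more likely rely on a full URL parser.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FeedUrlError {
    NotHttps,
    EmbeddedCredentials,
    MissingHost,
}

/// Validate a feed URL before any network request is attempted.
pub fn validate_feed_url(url: &str) -> Result<(), FeedUrlError> {
    // Scheme must be https (schemes are case-insensitive).
    let rest = match url.get(..8) {
        Some(prefix) if prefix.eq_ignore_ascii_case("https://") => &url[8..],
        _ => return Err(FeedUrlError::NotHttps),
    };

    // The authority part ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");

    // `user:pass@host` or `user@host` puts an '@' inside the authority.
    if authority.contains('@') {
        return Err(FeedUrlError::EmbeddedCredentials);
    }
    if authority.is_empty() {
        return Err(FeedUrlError::MissingHost);
    }
    Ok(())
}
```

An API handler would map any `Err` from this function to a 400 Bad Request response.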

**File permissions (SEC-07):** Serialised feed files are written with mode `640`:
read/write for the owner, read-only for the group, no access for others.
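
On Unix, writing a file with a fixed mode might be done as sketched below. The `write_feed_file` name is hypothetical. The explicit `set_permissions` call is there because the mode passed at creation is filtered by the process umask and is ignored when the file already exists.

```rust
use std::fs::{OpenOptions, Permissions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Write `data` to `path` and make sure the file ends up with mode 640.
pub fn write_feed_file(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o640) // applied only when the file is created, minus umask bits
        .open(path)?;

    // Enforce the final mode regardless of umask or a pre-existing file.
    file.set_permissions(Permissions::from_mode(0o640))?;

    file.write_all(data)?;
    file.sync_all()
}
```

Setting the mode before writing any data avoids a window in which the contents are readable under looser permissions.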

---

## XDP path security

**ACL enforcement in XDP (SEC-02):** The AF_XDP fast path applies the full ACL before
answering any query. There is no bypass. `Deny` → silent drop; `Refuse` → REFUSED
frame crafted directly in the XDP worker.
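
At the DNS layer, turning a query into a REFUSED reply only requires touching the header flags. The sketch below shows that step and the mapping from ACL verdict to outcome. It is an assumption about how such a worker could be structured, with hypothetical names; the Ethernet, IP, and UDP header rewriting that a real XDP worker also needs is not shown.

```rust
/// RCODE value for REFUSED.
pub const RCODE_REFUSED: u8 = 5;

pub enum Verdict {
    Allow,
    Deny,
    Refuse,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Continue with normal query processing.
    Continue,
    /// Send nothing back.
    Drop,
    /// Send these DNS payload bytes back to the client.
    Reply(Vec<u8>),
}

/// Build a REFUSED response by reusing the query's ID and question section.
pub fn refused_from_query(query: &[u8]) -> Option<Vec<u8>> {
    if query.len() < 12 {
        return None; // shorter than a DNS header
    }
    if query[2] & 0x80 != 0 {
        return None; // QR already set: never answer a response
    }
    let mut resp = query.to_vec();
    resp[2] |= 0x80;         // QR = 1 (response)
    resp[2] &= !0x06;        // clear AA and TC, keep opcode and RD
    resp[3] = RCODE_REFUSED; // RA, Z, AD, CD cleared; RCODE = 5
    Some(resp)
}

/// Map an ACL verdict to what the fast path does with the packet.
pub fn apply_verdict(verdict: Verdict, query: &[u8]) -> Outcome {
    match verdict {
        Verdict::Allow => Outcome::Continue,
        Verdict::Deny => Outcome::Drop,
        Verdict::Refuse => match refused_from_query(query) {
            Some(resp) => Outcome::Reply(resp),
            None => Outcome::Drop, // malformed input: nothing sensible to send
        },
    }
}
```

Since the reply is the same length as the query, this path gives an attacker no amplification gain.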

---

## File permissions reference

| File | Recommended permissions | Owner |
|---|---|---|
| `/etc/runbound/runbound.conf` | `640` | `runbound:runbound` |
| `/etc/runbound/env` (API key) | `640` | `runbound:runbound` |
| `/etc/runbound/key.pem` (TLS key) | `640` | `runbound:runbound` |
| `/etc/runbound/cert.pem` | `644` | `runbound:runbound` |
| `/var/lib/runbound/*.json` (store) | `640` | `runbound:runbound` |
| `/var/log/runbound/` | `750` | `runbound:runbound` |

---

## Systemd hardening

The provided unit file applies:
- `NoNewPrivileges=yes`
- `PrivateTmp=yes`
- `ProtectSystem=strict`
- `ProtectHome=yes`
- `ProtectKernelTunables=yes`
- `CapabilityBoundingSet=CAP_NET_BIND_SERVICE` (port 53 only — no root)

See [systemd.md](systemd.md) for the full unit file.

---

## Audit findings

### v0.2.0 — v0.3.x

| ID | Severity | Title | Fixed in |
|---|---|---|---|
| SEC-01 | High | Race condition on concurrent API writes | v0.2.0 |
| SEC-02 | High | XDP fast path bypassed ACL entirely | v0.2.0 |
| SEC-03 | Medium | IPv4-mapped IPv6 skipped ACL rules | v0.2.0 |
| SEC-04 | Medium | SSRF via HTTP redirect in feed fetcher | v0.2.0 |
| SEC-05 | Medium | TOCTOU on feed URL validation | v0.2.0 |
| SEC-06 | Medium | Unbounded data-store growth | v0.2.0 |
| SEC-07 | Low | Feed data files world-readable | v0.2.0 |
| SEC-08 | Low | Plaintext HTTP feeds accepted silently | v0.2.0 |
| SEC-09 | High | `POST /rotate-key` was a silent no-op (read frozen env var) | v0.3.3 |
| SEC-10 | Medium | CHAOS class queries returned NOERROR instead of NOTIMP | v0.3.3 |
| SEC-11 | Medium | Body limit dropped TCP instead of returning HTTP 413 | v0.3.3 |
| SEC-12 | Medium | Negative TTL caused panic instead of HTTP 422 | v0.3.3 |
| SEC-13 | Medium | Production `unwrap()` / `expect()` could crash the process | v0.3.3 |
| SEC-14 | Medium | Sync Bearer comparison was timing-vulnerable | v0.3.3 |
| SEC-15 | Low | Feed URLs with embedded credentials were not rejected | v0.3.3 |
| SEC-16 | Low | `rate-limit: u64::MAX` silently disabled rate limiting | v0.3.3 |

See [security-audit.md](security-audit.md) for the full white-box audit report.

---

## Reporting a vulnerability

Send a report to **redlemonbe@codix.be** with subject line `[SECURITY] Runbound`.
Please include a description of the vulnerability, reproduction steps, and
your assessment of its impact. We aim to respond within 48 hours.

Do not open a public GitHub issue for security vulnerabilities.