eggrd 0.2.0

A drop-in Rust edge proxy that gives any app a secure front door: auth, rate limiting, and hardened response headers, with zero changes to the upstream app.
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
# EdgeGuard

A drop-in Rust edge proxy that gives any HTTP app a secure front door — **authentication,
rate limiting, TLS, and hardened response headers** — with secure-by-default config and
**zero code changes** to the upstream app.

It's the missing front door for apps that were generated (vibe-coded) without one.
EdgeGuard owns the request path (auth, rate-limit, validation) and the response path
(CSP/HSTS/cookie hardening) in a single static binary.

[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](#license)

> **Status: v0 shipped, v1 landed, v2 landed, v2.5 underway.** The barebone v0 slice is stable;
> the **v1** (self-hostable & production-usable) feature set below has landed and is tested
> in-process. The one exception is **ACME**, which is implemented and compiled but can only be
> proven against a live CA (see [Platform support]#tls--acme). The **v2 (WAF-lite)** phase has
> landed: [input inspection]#waf-lite-input-rules (off by default), a [shared-store rate
> limiter](#distributed-rate-limiting) for multi-replica deployments, and an optional
> [public/private split]#publicprivate-split for the ops endpoints. (The Redis limiter
> backend, like ACME, is compiled but proven only against a live store.) **v2.5 (static/edge
> surface)** is underway: an [`edgeguard generate`]#static--edge-hosts config generator for
> static hosts and a [Rust→WASM Cloudflare Worker]worker/README.md (the worker, like ACME and
> Redis, compiles but is proven only against a live deploy). Codename "EdgeGuard" is a working
> title — see the [roadmap]docs/ROADMAP.md.

## Where it fits

EdgeGuard is a **reverse proxy you put directly in front of one app** — the secure front door
between the public internet (or your platform's load balancer) and your application process. It
terminates/authenticates the request, forwards it to your app *unchanged*, and hardens the
response on the way back.

```text
         public internet
    (clients · bots · scanners)
               │   :443 / :8080
               │   TLS · auth · rate-limit · WAF · request validation
       ┌──────────────────┐
       │     EdgeGuard     │   ◀── this project (the secure front door)
       └──────────────────┘
               │   plain HTTP on APP_PORT, localhost only
               │   CSP · HSTS · cookie hardening · leaky-header stripping on the way back
       ┌──────────────────┐
       │      your app     │   ◀── unchanged (Node / Python / Go / Rust / …)
       └──────────────────┘
        DB · internal APIs
```

In the larger picture it sits **between your edge (CDN / platform LB / DNS) and your app** — one
hop, one upstream:

```text
  DNS ─▶ [ CDN / platform LB ] ─▶ [ EdgeGuard ] ─▶ [ your app ] ─▶ [ DB / internal APIs ]
          optional: caching, DDoS    this project     unchanged
```

Run it as the container **entrypoint that wraps your app**, or as a **separate front service**
pointing at an upstream URL — see [Two deployment modes](#two-deployment-modes).

### What it does *not* replace

EdgeGuard is a focused security front door, not a platform. It does **not** replace:

- **Your CDN / DDoS edge** (Cloudflare, Fastly, CloudFront) — it hardens one origin; no global
  caching, anycast, or volumetric DDoS absorption. Run it *behind* the CDN.
- **A full WAF** (ModSecurity, Coraza, AWS WAF) — the built-in [WAF-lite]#waf-lite-input-rules
  is heuristic and off by default: signatures, not a managed rule feed.
- **An identity provider** (Auth0, Keycloak, Cognito) — it *verifies* tokens (JWT/JWKS) and gates
  with Basic / API-key; it doesn't issue tokens, manage users, or run OAuth flows.
- **An API gateway / service mesh** (Kong, Istio, Envoy mesh) — one upstream, no service
  discovery, routing fabric, or request transformation beyond the security pipeline.
- **Your app's own authorization** — it's a coarse front-door gate (is this request allowed in at
  all); per-user / per-resource permissions still live in your app.
- **Platform-terminated TLS** — on most PaaS you leave TLS off and let the platform manage certs;
  TLS termination here is for the VPS / front-proxy path.

### Moving an existing app behind it

- **Exposed directly today, no proxy** (a Node/Python/Go server on a VPS or PaaS): make EdgeGuard
  the entrypoint and bind your app to `APP_PORT` (localhost). One Dockerfile change (see
  [`examples/`]examples/) and the app gains auth + rate-limit + headers with **zero code
  changes** — your app stops listening on the public port, EdgeGuard does.
  ```bash
  # before:  node server.js                      # app listens on $PORT, public
  # after:   EdgeGuard binds $PORT, runs the app on APP_PORT
  PORT=8080 APP_PORT=3000 edgeguard --config edgeguard.toml --wrap "node server.js"
  ```
- **Behind plain nginx / Caddy** (TLS + reverse proxy only, no auth/limits): drop EdgeGuard
  between that proxy and the app, or replace the proxy — point `UPSTREAM` at your app and let
  EdgeGuard own TLS too.
  ```bash
  UPSTREAM=http://127.0.0.1:3000 edgeguard --config edgeguard.toml
  ```
- **Coming off a hosted gate** (oauth2-proxy + nginx, or Cloudflare Access in front): EdgeGuard
  folds the auth gate + rate-limit + header hardening into one static binary next to the app —
  fewer moving parts, no extra network hop, self-hostable and portable across providers.
- **Static / edge host** (Vercel, Netlify, Cloudflare Pages) where you can't run a long-lived
  proxy: use [`edgeguard generate`]#static--edge-hosts to emit the hardening config, or the
  [Cloudflare Worker]worker/README.md for auth + hardening at the edge.

## What it does

- **Reverse proxy** to one upstream (a wrapped child process, or an external URL).
- **Streaming / LLM-proxy passthrough** (`validation.stream_passthrough`, off by default):
  forward `text/event-stream` (SSE) responses **unbuffered, frame-by-frame** — so EdgeGuard fronts
  **streaming LLM backends** (OpenAI-compatible token streams) and any SSE app without collapsing
  time-to-first-byte. Egress bytes are still counted as frames flow.
- **WebSocket / `Upgrade` passthrough** (`validation.websocket_passthrough`, off by default):
  forward an authenticated, rate-limited upgrade request intact and splice the connections into a
  raw bidirectional tunnel on the upstream's `101` — so EdgeGuard fronts WebSocket apps (chat,
  live dashboards, dev HMR). Off by default because the normal path strips the hop-by-hop
  `Upgrade` header.
- **IP access control** (`[access]`, allow-all by default): coarse CIDR allow/deny lists evaluated
  by client IP before auth and rate limiting — lock the app to an office/VPN range, or drop an
  abusive subnet.
- **Request IDs** (`X-Request-Id`): reuse a well-formed inbound id or mint a UUID v4, forward it
  upstream, echo it on every response, and tag the access log with it — one id correlates the
  client, EdgeGuard, and the app.
- **Per-path upstreams** (`[[upstreams]]`, single upstream by default): a static path-prefix map
  so `/api` can go to a backend and everything else to a static frontend (longest prefix wins).
- **Response compression** (`validation.compress_responses`, off by default): gzip for clients
  that ask, skipping already-compressed types and (always) SSE streams.
- **Co-process supervisor**: launches your app, restarts it on crash, and forwards
  termination signals on shutdown (acts as a tiny container init). *Full process-group
  signaling on Unix; best-effort child kill on Windows.*
- **Authentication** — pick one gate via `auth.mode`:
  - **HTTP Basic** (plaintext for dev, or `$argon2` PHC hashes);
  - **static API key / bearer token** (constant-time compare, `Authorization: Bearer` or
    `X-API-Key`);
  - **JWT** (HS/RS/ES/PS/EdDSA) with a static key or a fetched, cached **JWKS** (keys
    selected by `kid`; the configured algorithm is pinned to block `alg` substitution).
- **Rate limiting** (GCRA), returns `429`: a global **per-IP** limit, optional **per-route**
  overrides (longest-prefix match), and an optional **per-key** limit keyed by the authenticated
  principal — backed by the in-process `governor` limiter or, for multi-replica deployments, a
  shared **Redis** store so N instances enforce one global limit.
- **WAF-lite input inspection** (`[waf]`, **off by default**): heuristic **SQLi / XSS /
  path-traversal** rulesets plus operator-defined **deny patterns**, with a **report-only**
  rollout mode. A match is logged and counted (`edgeguard_waf_hits_total`); in `block` mode it
  returns `403`. Screens the URL path/query by default (also percent-decoded); headers and body
  are opt-in.
- **CORS** (`[cors]`, **off by default**): answer browser **preflight** `OPTIONS` requests (before
  auth — preflights carry no credentials) and decorate responses with the matching
  `Access-Control-*` headers, so a separate-origin frontend can call the app. A credentialed
  wildcard is rejected at startup.
- **TLS termination** via `rustls`, with optional **ACME / Let's Encrypt** (HTTP-01)
  automatic certificates.
- **Prometheus metrics** at `/__edgeguard/metrics` (request rate by outcome, rate-limit
  hits, WAF hits, latency histogram, CSP reports).
- **Public/private split** (optional): serve the internal `/__edgeguard/*` ops endpoints
  (health, readiness, metrics) on a separate **private listener** so they aren't exposed on the
  public port.
- **Config hot-reload**: edit the config file and policy swaps in atomically — no dropped
  connections, and a broken edit is rejected without taking the proxy down.
- **Response hardening**: injects CSP (with **report-only** mode + a violation **report
  sink**), HSTS, `X-Content-Type-Options`, `X-Frame-Options`, `Referrer-Policy`,
  `Permissions-Policy`; forces `Secure; HttpOnly; SameSite` on `Set-Cookie`; strips leaky
  headers (`Server`, `X-Powered-By`).
- **Static & edge output** (`edgeguard generate`): emit the response-hardening policy as a
  `_headers` file / `vercel.json` / edge-middleware snippet for static hosts, or run the
  [Cloudflare Worker]worker/README.md edge build (hardening **+** auth) where the proxy can't.
- **Body-size limit** (`413`), **header-size limit** (`431`), and **method allowlist**
  (`405`).
- **Env-first config** (`PORT` / `APP_PORT` / `UPSTREAM`) with an optional TOML overlay.
- **Structured JSON access logs** and `/__edgeguard/health` + `/__edgeguard/ready`
  endpoints.

## Two deployment modes

1. **Co-process (default for PaaS/VPS)** — EdgeGuard is the container entrypoint, binds the
   platform's `$PORT`, and runs your app on `APP_PORT`:
   ```bash
   edgeguard --config edgeguard.toml --wrap "node server.js"
   ```
2. **Front proxy (separate service)** — point EdgeGuard at an external upstream:
   ```bash
   UPSTREAM=http://app.internal:3000 edgeguard --config edgeguard.toml
   ```

## Quickstart (local)

The fastest path is to let EdgeGuard scaffold itself in your app's repo, then validate the config
before you run it:

```bash
edgeguard init                       # writes edgeguard.toml + Dockerfile.edgeguard (detects Node/Python/Go/Rust)
echo -n 'your-password' | edgeguard --hash   # paste the $argon2id$… into edgeguard.toml
edgeguard doctor --config edgeguard.toml     # validate + warn on foot-guns (exits non-zero on errors)
```

Or build and wrap an app directly:

```bash
cargo build --release

# Wrap any app that reads PORT from the env. EdgeGuard sets PORT=$APP_PORT for it.
PORT=8080 APP_PORT=3000 ./target/release/edgeguard \
  --config edgeguard.toml \
  --wrap "node server.js"

# now hit it (set the user/pass you configured in edgeguard.toml)
curl -u "$EDGEGUARD_USER:$EDGEGUARD_PASS" http://localhost:8080/
```

> ⚠️ The shipped config requires you to set a credential before it will authenticate —
> the default `users` value is a non-working placeholder. **Before exposing anything**,
> replace it with an `$argon2` hash (see [Configuration]#configuration).

## Onboard an app with `edgeguard init` and `edgeguard doctor`

These two commands take you from "an app with no front door" to "a validated, secure-by-default
config" without hand-writing TOML or reading this whole README. Run them from your app's repo root.

### 1. Scaffold the config — `edgeguard init`

`init` detects your app's runtime from the files in the directory (Node / Python / Go / Rust) and
writes two files:

```bash
$ edgeguard init
Detected runtime: Node.js
Wrote edgeguard.toml and Dockerfile.edgeguard.

Next steps:
1. Set a real credential — the shipped edgeguard.toml ships a non-working placeholder:
     echo -n 'your-password' | edgeguard --hash
     ...
```

- **`edgeguard.toml`** — the annotated, secure-by-default config reference (the *same* one
  documented below, embedded into the binary so it can't drift).
- **`Dockerfile.edgeguard`** — a wrap-your-app Dockerfile that copies the `edgeguard` binary from
  the published image and makes it your container entrypoint (binding `$PORT`, running your app on
  `APP_PORT`), tailored to the detected runtime.

`init` **never overwrites** an existing `edgeguard.toml` or `Dockerfile.edgeguard` — re-run with
`edgeguard init --force` if you actually want to regenerate them.

### 2. Set a real credential

The shipped config ships a deliberately **non-working** placeholder so nothing is exposed by
accident. Replace it with an Argon2id hash (the helper reads the password on stdin, so it never
lands in your shell history or the process list):

```bash
echo -n 'your-password' | edgeguard --hash
# → $argon2id$v=19$m=19456,t=2,p=1$<salt>$<hash>
```

Paste that string as the user's value in `edgeguard.toml`:

```toml
[auth]
mode = "basic"
users = { admin = "$argon2id$v=19$m=19456,t=2,p=1$..." }
```

### 3. Validate before you deploy — `edgeguard doctor`

`doctor` loads your config through the **exact same path the proxy uses** (so it can't drift from
real startup) and then lints for the foot-guns people actually ship. It exits non-zero on a hard
error, so you can gate a deploy with it in CI.

Run it against the freshly-scaffolded config and it will flag the still-placeholder credential:

```bash
$ edgeguard doctor --config edgeguard.toml
✗ [error] auth.users["admin"] is not a valid argon2 hash (the shipped placeholder?): no one can
  authenticate. Run `edgeguard --hash` and paste the result.
ℹ [info] tls.enabled = false: EdgeGuard serves plain HTTP. Fine when your platform terminates TLS
  in front of it; ...

1 error(s), 0 warning(s)
$ echo $?
1
```

Once you've pasted a real hash, it comes back clean:

```bash
$ edgeguard doctor --config edgeguard.toml
ℹ [info] tls.enabled = false: ...

0 error(s), 0 warning(s)
$ echo $?
0
```

`doctor` checks, among others: the placeholder/plaintext credential, `auth.mode = "none"` on a
public port, secrets committed to the file (prefer the env vars / `*_FILE`), an over-permissive or
credentialed-wildcard CORS policy, a `redis` limiter store with no URL, and `enforce_quota` without
managed mode. Wire it into CI to fail the build on a misconfiguration:

```yaml
# .github/workflows — fail the pipeline on a bad EdgeGuard config
- run: edgeguard doctor --config edgeguard.toml
```

### 4. Run it

```bash
# Co-process (the generated Dockerfile does this for you):
PORT=8080 APP_PORT=3000 edgeguard --config edgeguard.toml --wrap "npm start"

# now hit it with the credential you set
curl -u admin:your-password http://localhost:8080/
```

That's the whole loop: **`init` → `--hash` → `doctor` → run**. Editing `edgeguard.toml` while
EdgeGuard is running hot-reloads the policy in place (see [Configuration](#configuration)).

## Project layout

```text
.
├── Cargo.toml
├── edgeguard.toml        # annotated config reference (secure defaults)
├── README.md
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE               # Apache-2.0
├── src/
│   ├── main.rs           # CLI (serve / init / doctor / --hash / generate), bootstrap, shutdown
│   ├── config.rs         # env-first config + TOML overlay, size/rate parsing
│   ├── proxy.rs          # request/response pipeline (auth, limit, hardening, WS tunnel)
│   ├── access.rs         # IP allow/deny CIDR matching (`[access]`)
│   ├── cors.rs           # CORS preflight + response decoration (`[cors]`)
│   ├── doctor.rs         # config linter (`edgeguard doctor`)
│   ├── scaffold.rs       # project scaffolding (`edgeguard init`)
│   ├── generate.rs       # static-host / edge config generator (`edgeguard generate`)
│   └── supervisor.rs     # co-process supervisor (Unix signals / Windows fallback)
├── docs/
│   ├── REQUIREMENTS.md   # product requirements & scope
│   ├── DEPLOYMENT.md     # deployment & integration strategy
│   └── ROADMAP.md        # phased dev plan → OSS launch → product roadmap
├── examples/
│   ├── Dockerfile.node   # wrap-your-app template (Node)
│   ├── Dockerfile.python # wrap-your-app template (Python)
│   ├── docker-compose.yml # front EdgeGuard + app (file-mounted secrets)
│   ├── edgeguard.service # systemd unit (VPS / front-proxy path)
│   ├── render.yaml       # Render blueprint
│   └── fly.toml          # Fly.io config
└── worker/               # Rust→WASM Cloudflare Worker (edge build; detached crate)
```

## Configuration

All config is optional; EdgeGuard ships secure defaults. See
[`edgeguard.toml`](./edgeguard.toml) for the annotated reference. Environment overrides:

| Env | Meaning | Default |
|---|---|---|
| `PORT` | public listen port | `8080` |
| `APP_PORT` | internal port for the wrapped app | `3000` |
| `ADMIN_PORT` | private listener for the ops endpoints (overrides `server.admin_port`) | `0` (off) |
| `UPSTREAM` | external upstream URL (separate-service mode) | derived from `APP_PORT` |
| `WRAP_CMD` | start command (alternative to `--wrap`) ||
| `EDGEGUARD_CONFIG` | config path (alternative to `--config`) ||
| `EDGEGUARD_JWT_SECRET` | HS* JWT secret (overrides `auth.jwt.secret`) ||
| `EDGEGUARD_API_KEYS` | API keys, comma-separated (overrides `auth.api_keys`) ||
| `EDGEGUARD_REDIS_URL` | shared-store limiter URL (overrides `ratelimit.redis_url`) ||
| `EDGEGUARD_CP_URL` | managed-mode control-plane URL (overrides `control_plane.url`) ||
| `EDGEGUARD_CP_EDGE_TOKEN` | managed-mode per-tenant edge token (overrides `control_plane.edge_token`) ||
| `RUST_LOG` | log filter, e.g. `info`, `edgeguard=debug` | `info` |

**Secrets from files (`*_FILE`).** Every secret env var above also accepts a `*_FILE` variant
(`EDGEGUARD_JWT_SECRET_FILE`, `EDGEGUARD_API_KEYS_FILE`, `EDGEGUARD_REDIS_URL_FILE`,
`EDGEGUARD_CP_EDGE_TOKEN_FILE`) pointing at a file whose contents are the value — the convention
Docker / Kubernetes secret mounts and systemd `LoadCredential` use, so the secret never lands in
the config file or a `ps`-visible environment. The direct variable wins when both are set; a
`*_FILE` that can't be read is a hard startup error (a misconfigured mount fails loudly rather
than silently running with no secret). A trailing newline is trimmed.

Auth, rate limits, TLS/ACME, CSP reporting, the header/size limits, CORS, and IP access are
configured in the TOML file — see the annotated [`edgeguard.toml`](./edgeguard.toml). Editing that
file while EdgeGuard is running **hot-reloads** the policy in place (auth, limits, headers,
validation except `compress_responses`, CORS, access); changing compression, the listen port, or
TLS settings still needs a restart.

**Hashing a password** (do this before any real deployment — the shipped placeholder will
not authenticate). EdgeGuard ships a built-in helper so you don't need an external tool:

```bash
# reads the password on stdin, prints an argon2id PHC hash
echo -n 'your-password' | edgeguard --hash
# paste the $argon2id$... string as the user's value in edgeguard.toml

# (alternatively, any argon2 PHC tool works, e.g. the `argon2` CLI:)
echo -n 'your-password' | argon2 "$(openssl rand -base64 16)" -id -e
```

> ⚠️ A plaintext `users` value is a **dev-only convenience** — it's compared in constant
> time but never stored hashed. Always use an `$argon2` hash for anything reachable.

## Deploy

Drop one of the wrap-your-app templates into your app repo as `Dockerfile`:

- [`examples/Dockerfile.node`]./examples/Dockerfile.node
- [`examples/Dockerfile.python`]./examples/Dockerfile.python

Then deploy to a container PaaS (the v0 target — where a long-running proxy can sit in
front of your app):

- **Railway** — deploy the repo; set `APP_PORT=3000`. Railway injects `$PORT`.
- **Render** — see [`examples/render.yaml`]./examples/render.yaml; health check
  `/__edgeguard/health`.
- **Fly.io** — see [`examples/fly.toml`]./examples/fly.toml.
- **VPS / Coolify** — run the binary under systemd with the hardened
  [`examples/edgeguard.service`]./examples/edgeguard.service unit (sandboxed, binds 80/443 via
  `CAP_NET_BIND_SERVICE`, loads secrets via `LoadCredential` + `*_FILE`), or use
  [`examples/docker-compose.yml`]./examples/docker-compose.yml with EdgeGuard as the front
  container and the app only reachable through it.

The full strategy (interface patterns, platform matrix, rollout order) is in
[docs/DEPLOYMENT.md](docs/DEPLOYMENT.md).

> Static/edge hosts (Vercel, Netlify, Cloudflare Pages) can't run a long-lived proxy
> process — see [Static & edge hosts]#static--edge-hosts for the `edgeguard generate` config
> generator and the Rust→WASM Cloudflare Worker that cover that surface.

## Architecture

**Request:** client-IP resolution (honors `X-Forwarded-For` only when
`server.trust_forwarded_for = true`; disabled by default so clients cannot spoof
their IP) → IP access control (`[access]`, allow-all by default) → header-size limit →
rate-limit (per-IP / per-route) → CORS preflight
(answered before auth, when `[cors]` is on) → auth → per-key
rate-limit → method allowlist → WebSocket/`Upgrade` tunnel (when
`websocket_passthrough` is on) → body-size limit → WAF input inspection (off by default) →
forward to upstream (bounded by `validation.upstream_timeout`, default 30s — a stalled
upstream returns `504` instead of pinning the request).
**Response:** security-header injection (CSP/CSP-Report-Only) → cookie hardening → strip
leaky headers → CORS response headers (when `[cors]` is on) → `X-Request-Id` stamped on every
response → optional gzip (when `compress_responses` is on, never for SSE).

Built on `axum` + `hyper` (server), `hyper-util` (upstream client), `governor` + `redis` (rate
limit), `argon2` + `jsonwebtoken` (auth), `rustls` + `instant-acme` (TLS/ACME), `arc-swap`
+ `notify` (hot-reload), `regex` (WAF), `tracing` (logs).

Internal endpoints (dedicated routes, exempt from auth/limits — restrict access to
`/__edgeguard/*` at the network layer, or move them to a private listener with
`server.admin_port`; see [Public/private split](#publicprivate-split)). The `/__edgeguard/*`
namespace is reserved — it is never forwarded upstream:

| Endpoint | Purpose |
|---|---|
| `/__edgeguard/health` | **liveness** — is EdgeGuard up (always `200`) |
| `/__edgeguard/ready` | **readiness**`200` only when the upstream accepts a connection, else `503` |
| `/__edgeguard/metrics` | Prometheus metrics (text exposition v0.0.4) |
| `/__edgeguard/csp-report` | CSP violation report sink (`POST`, logs + counts, returns `204`) |

For a turnkey way to validate and visualize the `edgeguard_*` metrics, the
[`monitoring/`](monitoring/) directory ships a self-contained **Prometheus + Grafana** stack
(Podman/Docker Compose) plus a reusable **reference Grafana dashboard** you can import into your
own Grafana — `podman compose -f monitoring/compose.yaml up -d`, then open
http://localhost:3000. See [`monitoring/README.md`](monitoring/README.md). It also ships reusable
**Prometheus alert rules** ([`monitoring/prometheus/alerts.yml`](monitoring/prometheus/alerts.yml):
upstream-5xx ratio, p95 latency, auth-failure / WAF / rate-limit spikes, limiter-store errors) you
can drop into your own Prometheus.

## Platform support

**Unix (Linux/macOS) is the supported production path.** The co-process supervisor uses
POSIX process groups and signals (`setsid`, then `SIGTERM`/`SIGKILL` to the child's group)
to cleanly start, restart, and shut down the wrapped app together with any grandchildren.

**Windows is best-effort.** With no POSIX process groups or signals, the supervisor falls
back to a `cmd /C` launch with a plain child kill on shutdown (`kill_on_drop`); grandchild
processes the wrapped command spawns may not be reaped. The proxy itself — auth,
rate-limiting, validation, header hardening — works fine on Windows; only the `--wrap`
supervisor is degraded. On Windows, prefer **front-proxy mode** (point `UPSTREAM` at a
separately managed app) over `--wrap`.

## TLS & ACME

Set `tls.enabled = true` and point `tls.cert_path` / `tls.key_path` at a PEM certificate
chain and private key to serve HTTPS directly (rustls; HTTP/1.1 via ALPN).

To get certificates automatically, enable `[tls.acme]` with your `domains`, contact `email`,
and `accept_tos = true`. EdgeGuard runs the ACME **HTTP-01** challenge (it binds **port 80**
for the validation, so that must be reachable from the internet), writes the issued chain and
key to `tls.cert_path` / `tls.key_path`, and serves them.

> ⚠️ ACME requires a real public domain and inbound port 80, so it can't be exercised by the
> in-process test suite. The flow is implemented against `instant-acme` and compiled in CI,
> but is **only proven against a live CA**. The default directory is **Let's Encrypt staging**
> (`tls.acme.directory_url`) so a first run can't burn the strict production rate limits —
> switch to production explicitly once it works against staging.

For a managed-TLS platform (most PaaS) you typically leave TLS off and let the platform
terminate it; TLS termination here is for the VPS / front-proxy path.

## WAF-lite (input rules)

EdgeGuard can screen requests for common attack signatures before they reach your app. It is
**off by default** and built to roll out safely. Configure it in `[waf]`:

```toml
[waf]
mode = "off"            # "off" (default) | "report" | "block"
sqli = true             # built-in SQL-injection heuristics
xss = true              # built-in cross-site-scripting heuristics
path_traversal = true   # built-in path-traversal heuristics
inspect_path = true     # screen the URL path + query (matched raw AND percent-decoded)
inspect_headers = false # screen header values (noisy — opt in)
inspect_body = false    # screen the (size-capped) request body (opt in)

# Operator-defined deny patterns, evaluated alongside the built-ins:
# [[waf.rules]]
# id = "block-wp-probes"
# pattern = "(?i)/wp-(admin|login)"
# target = "path"       # "path" | "headers" | "body" | "all"
```

- **Modes / report-first.** `report` evaluates every rule and **logs + counts** matches
  (`edgeguard_waf_hits_total`) but still forwards the request; `block` returns **`403
  Forbidden`** on a match (also counted under the `forbidden` request outcome). Start in
  `report`, watch the metric for false positives, then switch to `block`.
- **Heuristics, honestly.** The built-in rules are signature heuristics, not a full WAF — they
  will miss novel payloads and occasionally false-positive on benign input. That is exactly why
  they default off and ship a report-first workflow; tune them per category (`sqli` / `xss` /
  `path_traversal`) or lean on your own `[[waf.rules]]`.
- **Custom patterns are ReDoS-safe.** Patterns use the `regex` crate's RE2 syntax, which
  matches in linear time and rejects backreferences/lookaround, so an operator pattern can't
  cause catastrophic backtracking. A pattern that fails to compile (or an unknown `target`) is
  rejected at startup, and on a bad **hot-reload** the previous policy is kept — just like any
  other config error.
- **Scope.** Only locations whose `inspect_*` flag is on are examined; `inspect_headers` and
  `inspect_body` default off because header/body bytes (cookies, tokens, opaque blobs) are
  noisy. The path is matched both raw and percent-decoded, so `%2e%2e%2f` is caught as `../`.
- **Where it runs.** After auth and the size/method checks, just before the request is
  forwarded — so the internal `/__edgeguard/*` endpoints are never inspected.

## CORS

A drop-in front door is often deployed in front of an app whose browser frontend lives on a
**different origin** — a separate static host, a preview deployment, `http://localhost:5173`
during development. Browsers block those cross-origin `fetch`/`XHR` calls unless the server answers
with the right `Access-Control-*` headers. EdgeGuard adds a small, explicit CORS policy
(`[cors]`, **off by default** — opening cross-origin access is a deliberate choice):

```toml
[cors]
enabled = true
allow_origins = ["https://app.example.com", "http://localhost:5173"]  # or ["*"] for any
allow_methods = []          # [] = GET/POST/PUT/PATCH/DELETE/OPTIONS/HEAD
allow_headers = []          # [] = reflect the browser's Access-Control-Request-Headers
expose_headers = []         # response headers the page may read
allow_credentials = false   # send Access-Control-Allow-Credentials; requires explicit origins
max_age = "600s"            # how long the browser may cache the preflight ("0" omits it)
```

- **Preflight, before auth.** A browser preflight (`OPTIONS` + `Origin` +
  `Access-Control-Request-Method`) is answered by EdgeGuard directly with `204` + the allow
  headers. This runs *before* the auth gate, because a preflight carries no credentials — gating
  it would make every cross-origin call fail. A plain `OPTIONS` (no preflight headers) falls
  through to normal handling.
- **Decoration.** Actual responses get `Access-Control-Allow-Origin` (echoing the request `Origin`
  with `Vary: Origin` for an explicit allow-list, or the cacheable `*` for a wildcard), plus the
  credentials / expose-headers directives.
- **Credentialed wildcard is rejected.** `allow_credentials = true` with `allow_origins = ["*"]` is
  refused at startup/reload — the Fetch spec forbids it and browsers ignore the combination, so
  EdgeGuard fails fast rather than emitting a policy that silently doesn't work (`edgeguard doctor`
  flags it too).

## IP access control

A coarse network gate by client IP, evaluated **before auth and rate limiting** (`[access]`,
allow-all by default). Both lists accept plain IPs and CIDR ranges (IPv4 and IPv6):

```toml
[access]
allow = ["10.0.0.0/8", "203.0.113.7"]   # only these may connect (empty = allow all)
deny  = ["198.51.100.0/24"]             # always rejected — wins over allow
```

`deny` takes precedence over `allow`; a non-empty `allow` is a whitelist (anything outside it gets
`403`). A denied client is dropped before consuming a limiter token or any auth work. It keys on
the **resolved** client IP — the same one rate limiting uses — so behind a trusted proxy set
`server.trust_forwarded_for = true` for this to see the real client rather than the proxy. A bad
CIDR fails at startup/reload.

## WebSocket passthrough

By default EdgeGuard strips the hop-by-hop `Upgrade`/`Connection` headers (per RFC 7230), so a
WebSocket handshake forwarded through it would fail — fine for the request/response apps that are
the common case, but a blocker for chat, live dashboards, or dev HMR. Turn on
`validation.websocket_passthrough` to tunnel them:

```toml
[validation]
websocket_passthrough = true
```

An upgrade request (`Connection: upgrade` + `Upgrade:`) is still **authenticated and
rate-limited**, then forwarded to the upstream intact; on the upstream's `101 Switching Protocols`,
EdgeGuard splices the client and upstream connections into a raw bidirectional byte tunnel for the
life of the socket. Response hardening and WAF body inspection don't apply to a tunneled connection
(there's no buffered response to rewrite). A non-`101` upstream reply is passed back unchanged, so a
rejected handshake surfaces normally.

## Request IDs

For every request EdgeGuard resolves an `X-Request-Id`: it reuses a well-formed inbound one (a
short, printable-ASCII token a CDN/LB already set — validated so a hostile client can't inject
control characters into the log) or mints a UUID v4. That id is **forwarded upstream**, **echoed on
the response** (including error responses), and added to the JSON access-log line — so a single id
ties together the client, EdgeGuard, and the app's own logs. Always on; no configuration.

## Per-path upstreams

EdgeGuard fronts one app by default, but the one split many setups want is "static frontend + an
`/api` backend". `[[upstreams]]` is a minimal static prefix map for exactly that:

```toml
[[upstreams]]
path = "/api/"
target = "http://api.internal:4000"
```

Longest matching prefix wins; anything unmatched goes to the default `server.upstream`/`app_port`.
This is deliberately **not** a gateway — no service discovery, load balancing, health-based
routing, or request rewriting (the path is forwarded unchanged). For those, keep EdgeGuard in front
of one app and put a real gateway/mesh behind it. The readiness probe still tracks the default
upstream.

## Response compression

`validation.compress_responses = true` gzip-compresses responses for clients that send
`Accept-Encoding: gzip`. It skips small and already-compressed responses (the standard predicate)
and **always** skips `text/event-stream`, so SSE streaming is never held back by the compressor.
It's a listener-level setting (applied when the server starts), so toggling it needs a restart
rather than a hot-reload.

## Distributed rate limiting

By default the limiter is in-process (`governor`), so each replica counts independently — three
instances behind a load balancer allow 3× the configured rate. For multi-replica deployments,
point the limiter at a shared store so all instances enforce **one global limit**:

```toml
[ratelimit]
enabled = true
rate = "60/min"
burst = 20
store = "redis"                     # "local" (default) | "redis" | "memory"
redis_url = "redis://10.0.0.5:6379" # or rediss:// for TLS; prefer EDGEGUARD_REDIS_URL
redis_prefix = "edgeguard"          # key namespace, so deployments can share one Redis
fail_open = false                   # store error -> 503 (closed). true -> allow (open).
```

The same `rate` / `burst`, `[[ratelimit.routes]]`, and `[ratelimit.per_key]` settings apply —
only the *storage* of the GCRA state changes. EdgeGuard runs the GCRA check-and-update
atomically as a Redis Lua script, so concurrent replicas can't race it. Replica clocks should be
roughly in sync (NTP).

- **`store = "local"`** (default): in-process `governor`. Fast, no dependency, per-replica.
- **`store = "redis"`**: shared across replicas via Redis (`redis://`, or TLS `rediss://`). The
  connection is established lazily and auto-reconnects.
- **`store = "memory"`**: the shared-store code path backed by an in-process map — equivalent to
  `local` for a single replica; mainly a reference/testing backend.
- **`fail_open`**: when the store is unreachable, fail **closed** (`503`, default — an outage
  can't silently disable limiting) or **open** (allow the request — availability over strict
  limiting).

### Why a shared store

A rate limit is only a real protection if it holds *in total*. The in-process limiter counts
per replica, so horizontal scaling silently multiplies your effective rate by the replica count —
scale a `"60/min"` policy to 10 pods and you've actually allowed `600/min`. That undermines the
exact things a limit is for:

- **Abuse / DoS throttling** — a brute-force or scraping cap that 10×'s under autoscale isn't a cap.
- **Protecting a fragile upstream** — keep total load on the origin under a known ceiling regardless
  of how many edges front it.
- **Staying inside an upstream API quota** — one global budget, not one-per-replica.

`store = "redis"` keeps the cap fixed no matter how many replicas run, by holding the GCRA state in
Redis and running the check-and-update atomically (Lua script). The cost is one shared dependency
and a Redis round-trip per limited request; `store = "local"` stays the right default for a single
instance.

### Run it

**Local — one Redis, one proxy:**

```sh
docker run --rm -d -p 6379:6379 redis:7-alpine
# edgeguard.toml has [ratelimit] store = "redis"
EDGEGUARD_REDIS_URL=redis://127.0.0.1:6379 edgeguard --config edgeguard.toml
```

**Multi-replica — three proxies share one Redis and enforce one global limit** (compose):

```yaml
services:
  redis:
    image: redis:7-alpine
  edge:
    image: mancube/eggrd:latest
    deploy: { replicas: 3 }          # three edges, one shared limit
    command: ["--config", "/etc/edgeguard.toml"]
    environment:
      EDGEGUARD_REDIS_URL: redis://redis:6379   # overrides ratelimit.redis_url
      UPSTREAM: http://app:3000
    volumes: ["./edgeguard.toml:/etc/edgeguard.toml:ro"]   # store = "redis"
  app:
    image: ghcr.io/example/app:latest
```

Drive more than `burst` requests through the load balancer: the **total** admitted across the three
replicas tracks the configured cap (~1×), versus ~3× with `store = "local"`. The [`loadtest/`](loadtest/README.md)
harness automates this comparison (`--scale edgeguard=3`).

> The Redis backend is exercised by `#[ignore]`d proof tests against a live server — the GCRA
> algorithm and the in-memory store are covered by the default suite. To run the live ones:
> `docker run -p 6379:6379 redis:7-alpine` then `cargo test --lib redis_ -- --ignored`. See
> [docs/ROADMAP.md]docs/ROADMAP.md, Phase 4.

## Public/private split

The internal `/__edgeguard/*` endpoints are normally served on the public port. To keep the ops
surface (health, readiness, **metrics**) off the public internet, give EdgeGuard a private admin
listener:

```toml
[server]
admin_port = 9090         # 0 (default) = keep internal endpoints on the public port
admin_addr = "127.0.0.1"  # loopback (same-host scraper); "0.0.0.0" for a private NIC
```

When `admin_port` is set, EdgeGuard binds a second, plain-HTTP listener serving
`/__edgeguard/health`, `/__edgeguard/ready`, and `/__edgeguard/metrics`. The public port then
serves only the proxy plus the browser-facing CSP report sink, and any other `/__edgeguard/*`
request to the public port returns `404` (never forwarded upstream). **Point your platform's
health check at the admin port** when you enable this, and keep the admin listener on a trusted
network (loopback, a private subnet, or a service-mesh interface) — it is plain HTTP with no auth.

## Static & edge hosts

Static/edge hosts (Netlify, Cloudflare Pages, Vercel) can't run EdgeGuard's long-lived proxy, but
you can still get EdgeGuard's hardening there in one of two ways.

**1. Generate platform config.** `edgeguard generate` renders the `[headers]` policy into the
native config a static host understands — from the *same* source of truth (`security_headers`) the
live proxy uses, so the generated output can't drift from runtime behavior:

```bash
edgeguard generate --target _headers            # Netlify / Cloudflare Pages `_headers` file
edgeguard generate --target vercel              # vercel.json headers block
edgeguard generate --target vercel-middleware   # Vercel Edge Middleware (middleware.ts)
edgeguard generate --target netlify-edge        # Netlify Edge Function
edgeguard generate --target _headers --out ./public/_headers   # write to a file (else stdout)
```

A static `_headers` file (or header-only middleware) can only *add* response headers — it can't do
EdgeGuard's cookie hardening, leaky-header stripping, auth, or rate limiting. For those at the
edge, use the worker.

**2. Deploy the Cloudflare Worker.** [`worker/`](worker/README.md) is a Rust→WASM build of the
response-hardening **and** edge-auth subset (HTTP Basic / static API key). It authenticates the
request, forwards to your origin, and hardens the response — the same security-header set and
cookie hardening as the proxy. Build and deploy with `worker-build` + `wrangler`; configure the
origin, auth, and hardening via `wrangler.toml` vars / secrets. Like ACME, it compiles to wasm but
is proven only against a live deploy; rate limiting and JWT are out of scope for the edge subset.

## Managed mode (optional)

By default each EdgeGuard runs standalone, configured from its local file/env. **Managed mode**
(`[control_plane]`, off by default) instead lets a fleet of edges pull their policy from — and
report to — a central **control plane**:

- **Policy pull + hot-reload.** The edge polls the control plane for its policy (a conditional
  `GET` with an ETag, so an unchanged policy is a cheap `304`) and applies it through the *same*
  hot-reload path a local file edit uses — no restart, no dropped connections. The control plane
  pushes the **policy** sections (auth, rate limits, validation, headers, WAF); the edge keeps its
  own local `[server]` / `[tls]` (its port, upstream, and certs are edge-specific).
- **Metrics reporting.** The edge accumulates per-period request and bandwidth counts and POSTs a
  delta to the control plane (for fleet visibility).
- **Quota enforcement** (`enforce_quota`, off by default). Beyond *reporting* metrics, the edge can
  *enforce* a configured quota: it polls the control plane's quota verdict and, while the
  edge is over its configured ceiling, rejects traffic with `429` + a `Retry-After` reset hint —
  the internal `/__edgeguard/*` endpoints stay served so health checks don't flap. A failed poll
  keeps the last verdict (fail-static). This is what turns the meter into a cap; leave it off to
  meter without blocking.
- **CSP forwarding.** Received CSP violation reports are forwarded to the control plane for
  aggregate analytics, in addition to the local count.

```toml
[control_plane]
enabled = true
url = "https://cp.example"        # control-plane base URL
tenant_id = "acme"                # this edge's tenant id
# edge_token via EDGEGUARD_CP_EDGE_TOKEN (a per-tenant bearer); poll/report intervals tunable.
```

It's a thin, generic "pull config / report usage to a URL" client (`src/cp.rs`) — with no
`[control_plane]` configured, the binary is byte-for-byte the standalone proxy.

## Build & test

```bash
cargo build              # debug build
cargo build --release    # optimized static-ish binary
cargo test               # unit tests + in-process integration tests (stub upstream)
cargo clippy --all-targets -- -D warnings   # lints (CI denies warnings)
cargo fmt --all -- --check                  # formatting (CI checks this)
```

This crate is a member of the parent workspace; from the repo root you can target it with
`cargo build -p eggrd` (the crates.io package id; the binary/lib stay `edgeguard`). The edge worker in [`worker/`](worker/README.md) is a separate
(detached) crate — test its pure logic with `cargo test` in that directory, and build the wasm
artifact with `worker-build`.

## Roadmap

| Phase | Scope | Outcome |
| --- | --- | --- |
| v0 | Proxy + Basic auth + per-IP limit + header injection + logs | OSS single binary (this) |
| v1 | JWT/API-key auth, per-route limits, TLS/ACME, metrics, hot reload, CSP report-only | Self-hostable, production-usable |
| v2 | Heuristic input rules, custom deny patterns, distributed limiter | WAF-lite |
| v2.5 | Static-host config generator (`_headers`/edge middleware) + Rust→WASM Cloudflare Worker | Static/edge surface |

The detailed, checklisted plan lives in [docs/ROADMAP.md](docs/ROADMAP.md).

## Contributing

Contributions are welcome — see [CONTRIBUTING.md](CONTRIBUTING.md). For design rationale
and scope, see [docs/REQUIREMENTS.md](docs/REQUIREMENTS.md).

## License

Licensed under the [Apache License, Version 2.0](LICENSE).

Unless you explicitly state otherwise, any contribution intentionally submitted for
inclusion in this project by you, as defined in the Apache-2.0 license, shall be
licensed as above, without any additional terms or conditions.