gephyr 1.16.8

Gephyr headless AI relay service for Google AI services
# Gephyr Architecture

## Runtime Component Map

Core runtime state is assembled in `src/proxy/server.rs` and carried via `AppState` (`src/proxy/state.rs`):

- `CoreServices`
  - `token_manager`: account pool, selection, sticky/session bindings, compliance guards (`src/proxy/token/manager.rs`)
  - `upstream`: outbound HTTP client (`src/proxy/upstream/client.rs`)
  - `monitor`: request logs + counters (`src/proxy/monitor.rs`)
  - `integration` + `account_service`: system integration and account management
- `ConfigState`
  - mutable runtime config mirrors (mapping, security, timeouts, etc.)
  - synchronized hot-apply entrypoint: `apply_proxy_config`
- `RuntimeState`
  - runtime-only controls (running flag, port, proxy-pool runtime state/manager)
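
The grouping above can be pictured as three shared sub-states behind one cloneable handle. The following is a minimal sketch, not the actual definitions in `src/proxy/state.rs`: the placeholder service types, the `ProxyConfig` fields, `snapshot`, `is_running`, and `build_state` are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};

// Placeholder service types; the real ones live in the modules listed above.
struct TokenManager;
struct UpstreamClient;
struct Monitor;

/// Hypothetical subset of the proxy config that is mirrored at runtime.
#[derive(Clone, Debug, PartialEq)]
struct ProxyConfig {
    request_timeout_secs: u64,
    sticky_enabled: bool,
}

#[allow(dead_code)]
struct CoreServices {
    token_manager: TokenManager,
    upstream: UpstreamClient,
    monitor: Monitor,
}

struct ConfigState {
    proxy: RwLock<ProxyConfig>,
}

impl ConfigState {
    /// Single synchronized entrypoint for hot-applying a new config.
    fn apply_proxy_config(&self, next: &ProxyConfig) {
        *self.proxy.write().unwrap() = next.clone();
    }

    fn snapshot(&self) -> ProxyConfig {
        self.proxy.read().unwrap().clone()
    }
}

struct RuntimeState {
    running: AtomicBool,
    port: u16,
}

impl RuntimeState {
    fn is_running(&self) -> bool {
        self.running.load(Ordering::SeqCst)
    }
}

/// Cheap to clone: every handler shares the same underlying state.
#[allow(dead_code)]
#[derive(Clone)]
struct AppState {
    core: Arc<CoreServices>,
    config: Arc<ConfigState>,
    runtime: Arc<RuntimeState>,
}

fn build_state(port: u16, initial: ProxyConfig) -> AppState {
    AppState {
        core: Arc::new(CoreServices {
            token_manager: TokenManager,
            upstream: UpstreamClient,
            monitor: Monitor,
        }),
        config: Arc::new(ConfigState { proxy: RwLock::new(initial) }),
        runtime: Arc::new(RuntimeState { running: AtomicBool::new(true), port }),
    }
}
```

Because each sub-state sits behind an `Arc`, a config change applied through one clone of `AppState` is visible to every other clone without rebuilding the router.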

## Routing and Middleware

Public and admin route composition:

- Public proxy routes: `src/proxy/routes/mod.rs` -> `build_proxy_routes`
- Admin routes: `src/proxy/routes/admin.rs` -> grouped builders in `src/proxy/routes/admin_groups.rs`
- Admin capabilities snapshot: `GET /api/version/routes` from `admin_version_route_capabilities`

Middleware order for proxy routes (`src/proxy/routes/mod.rs`):

1. IP filter
2. auth
3. monitor

Admin routes are guarded by `admin_auth_middleware` in `build_admin_routes`.

Server-level CORS is applied in `src/proxy/server.rs` via `cors_layer`, sourced from `proxy.cors` config:

- default: `strict` with localhost allowlist
- opt-in: `permissive` mode for local/dev compatibility

Client IP resolution for middleware (`src/proxy/middleware/client_ip.rs`) is trust-gated:

- default (`proxy.trusted_proxies` empty): use socket `ConnectInfo` only
- if peer socket IP matches trusted proxy IP/CIDR, forwarded headers may be used (`x-forwarded-for`, then `x-real-ip`)
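
The trust gate can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `src/proxy/middleware/client_ip.rs`: `TrustedNet` and `resolve_client_ip` are hypothetical names, and taking the leftmost `x-forwarded-for` entry is an assumption about which hop is used.

```rust
use std::net::IpAddr;

/// A trusted proxy entry: network address plus prefix length.
/// A single IP is expressed as /32 (IPv4) or /128 (IPv6).
#[derive(Clone, Copy, Debug)]
struct TrustedNet {
    addr: IpAddr,
    prefix: u8,
}

impl TrustedNet {
    fn contains(&self, ip: IpAddr) -> bool {
        match (self.addr, ip) {
            (IpAddr::V4(net), IpAddr::V4(ip)) => {
                let p = u32::from(self.prefix.min(32));
                let mask = if p == 0 { 0 } else { u32::MAX << (32 - p) };
                (u32::from(net) & mask) == (u32::from(ip) & mask)
            }
            (IpAddr::V6(net), IpAddr::V6(ip)) => {
                let p = u32::from(self.prefix.min(128));
                let mask = if p == 0 { 0 } else { u128::MAX << (128 - p) };
                (u128::from(net) & mask) == (u128::from(ip) & mask)
            }
            // Mixed address families never match.
            _ => false,
        }
    }
}

/// Returns the socket peer unless the peer is a trusted proxy, in which
/// case `x-forwarded-for` is consulted first, then `x-real-ip`.
fn resolve_client_ip(
    peer: IpAddr,
    trusted: &[TrustedNet],
    x_forwarded_for: Option<&str>,
    x_real_ip: Option<&str>,
) -> IpAddr {
    if !trusted.iter().any(|net| net.contains(peer)) {
        return peer; // empty list or untrusted peer: headers are ignored
    }
    // Assumption: the leftmost x-forwarded-for entry is the client.
    let from_xff = x_forwarded_for
        .and_then(|v| v.split(',').next())
        .and_then(|s| s.trim().parse::<IpAddr>().ok());
    let from_real_ip = || x_real_ip.and_then(|s| s.trim().parse::<IpAddr>().ok());
    from_xff.or_else(from_real_ip).unwrap_or(peer)
}
```

With an empty trusted list the headers are never read, so a client cannot spoof its address by sending `x-forwarded-for` directly.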

## Config Mutation Flow

### Full config path

- Endpoint: `POST /api/config` (`src/proxy/admin/runtime/config_pool.rs`)
- Flow:
  1. load submitted config
  2. validate
  3. persist (`save_app_config`)
  4. hot-apply to runtime (`ConfigState::apply_proxy_config` + token-manager sync + upstream Google outbound policy sync)
  5. emit structured admin audit
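
The ordering of these steps matters: nothing reaches the runtime unless validation and persistence both succeed. A minimal sketch of that ordering follows. `AppConfig`, `ConfigError`, `update_config`, the validation rule, and the audit message body are hypothetical; only the `[ADMIN_AUDIT]` prefix and the 400/500 split are taken from this document, and emitting the audit line only on success is an assumption.

```rust
#[derive(Clone, Debug, PartialEq)]
struct AppConfig {
    request_timeout_secs: u64,
}

#[derive(Debug, PartialEq)]
enum ConfigError {
    Invalid(String),
    Persist(String),
}

impl ConfigError {
    /// Validation failures map to 400, save/read failures to 500.
    fn status(&self) -> u16 {
        match self {
            ConfigError::Invalid(_) => 400,
            ConfigError::Persist(_) => 500,
        }
    }
}

// Hypothetical rule, used only to exercise the rejection path.
fn validate(cfg: &AppConfig) -> Result<(), ConfigError> {
    if cfg.request_timeout_secs == 0 {
        return Err(ConfigError::Invalid("request_timeout_secs must be > 0".into()));
    }
    Ok(())
}

/// Validate, persist, hot-apply, then audit, in that order.
fn update_config<P>(
    submitted: AppConfig,
    persist: P,
    runtime: &mut AppConfig,
    audit: &mut Vec<String>,
) -> Result<(), ConfigError>
where
    P: FnOnce(&AppConfig) -> Result<(), String>,
{
    validate(&submitted)?;
    persist(&submitted).map_err(ConfigError::Persist)?;
    // Hot-apply only after the new config is safely on disk.
    *runtime = submitted;
    audit.push(format!(
        "[ADMIN_AUDIT] config updated request_timeout_secs={}",
        runtime.request_timeout_secs
    ));
    Ok(())
}
```

Persisting before applying means a failed write leaves both the stored and the running config on the previous values.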

### Scoped proxy patch path

Scoped endpoints use a shared patch helper in `src/proxy/admin/runtime/config_patch.rs`:

1. resolve actor
2. load persisted config
3. apply endpoint patch closure
4. validate
5. persist
6. return `before/after + runtime_apply_policy`
7. endpoint applies runtime mutation and returns `runtime_apply` in API response
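
Steps 1 to 6 lend themselves to a helper that takes the endpoint-specific change as a closure. The sketch below illustrates that shape under assumptions. `patch_persisted_config`, `PatchOutcome`, the `ProxyConfig` fields, and the validation rule are hypothetical, and an in-memory `store` stands in for the persisted config file.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ProxyConfig {
    request_timeout_secs: u64,
    sticky_enabled: bool,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum RuntimeApplyPolicy {
    AlwaysHotApplied,
    HotAppliedWhenSafe,
    RequiresRestart,
}

#[derive(Debug)]
struct PatchOutcome {
    actor: String,
    before: ProxyConfig,
    after: ProxyConfig,
    runtime_apply_policy: RuntimeApplyPolicy,
}

// Hypothetical rule, used only to exercise the rejection path.
fn validate(cfg: &ProxyConfig) -> Result<(), String> {
    if cfg.request_timeout_secs == 0 {
        return Err("request_timeout_secs must be > 0".to_string());
    }
    Ok(())
}

/// Load, patch, validate, persist, and report before/after.
/// The caller (the endpoint) performs the runtime mutation afterwards.
fn patch_persisted_config<F>(
    store: &mut ProxyConfig, // stand-in for the persisted config
    actor: &str,
    policy: RuntimeApplyPolicy,
    patch: F,
) -> Result<PatchOutcome, String>
where
    F: FnOnce(&mut ProxyConfig),
{
    let before = store.clone(); // load persisted config
    let mut after = before.clone();
    patch(&mut after); // apply endpoint patch closure
    validate(&after)?; // reject before anything is written
    *store = after.clone(); // persist
    Ok(PatchOutcome {
        actor: actor.to_string(),
        before,
        after,
        runtime_apply_policy: policy,
    })
}
```

A request-timeout endpoint would then call something like `patch_persisted_config(&mut store, actor, RuntimeApplyPolicy::AlwaysHotApplied, |cfg| cfg.request_timeout_secs = 45)` and apply `outcome.after` to the running state itself.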

Scoped endpoints in this path:

- `POST /api/proxy/sticky`
- `POST /api/proxy/request-timeout`
- `POST /api/proxy/compliance`
- `POST /api/proxy/pool/strategy`
- `POST /api/proxy/pool/runtime`

## Hot-Reload Policy

Policy enum: `RuntimeApplyPolicy` in `src/proxy/admin/runtime/config_patch.rs`.

Operational visibility for Google outbound hardening:

- `GET /api/proxy/google/outbound-policy` returns effective runtime policy (mode, host-header behavior, metadata, passthrough allow/block contract, debug redaction contract)
- full `POST /api/config` updates hot-apply the upstream Google outbound policy without restart

Current scoped endpoint policy mapping:

- `always_hot_applied`
  - sticky config
  - request-timeout
  - compliance config
- `hot_applied_when_safe`
  - proxy-pool strategy
  - proxy-pool runtime knobs
- `requires_restart`
  - currently not assigned to scoped proxy update endpoints (reserved for fields that cannot be safely hot-applied)

API contract for scoped updates includes:

- `runtime_apply.policy`
- `runtime_apply.applied`
- `runtime_apply.requires_restart`

This makes hot-apply behavior explicit for operators and scripts.
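
The mapping and the three response fields can be sketched as follows. The policy names and endpoint paths come from this document; `policy_for`, `runtime_apply`, and the `safe_now` flag are hypothetical, and treating `requires_restart` as the negation of `applied` is an assumption about the contract rather than documented behavior.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RuntimeApplyPolicy {
    AlwaysHotApplied,
    HotAppliedWhenSafe,
    RequiresRestart,
}

impl RuntimeApplyPolicy {
    fn as_str(self) -> &'static str {
        match self {
            RuntimeApplyPolicy::AlwaysHotApplied => "always_hot_applied",
            RuntimeApplyPolicy::HotAppliedWhenSafe => "hot_applied_when_safe",
            RuntimeApplyPolicy::RequiresRestart => "requires_restart",
        }
    }
}

/// Policy assigned to each scoped endpoint, per the list above.
fn policy_for(path: &str) -> Option<RuntimeApplyPolicy> {
    match path {
        "/api/proxy/sticky" | "/api/proxy/request-timeout" | "/api/proxy/compliance" => {
            Some(RuntimeApplyPolicy::AlwaysHotApplied)
        }
        "/api/proxy/pool/strategy" | "/api/proxy/pool/runtime" => {
            Some(RuntimeApplyPolicy::HotAppliedWhenSafe)
        }
        _ => None, // not a scoped proxy update endpoint
    }
}

/// Mirrors the `runtime_apply` object in scoped update responses.
#[derive(Debug, PartialEq)]
struct RuntimeApply {
    policy: &'static str,
    applied: bool,
    requires_restart: bool,
}

fn runtime_apply(policy: RuntimeApplyPolicy, safe_now: bool) -> RuntimeApply {
    let applied = match policy {
        RuntimeApplyPolicy::AlwaysHotApplied => true,
        RuntimeApplyPolicy::HotAppliedWhenSafe => safe_now,
        RuntimeApplyPolicy::RequiresRestart => false,
    };
    RuntimeApply {
        policy: policy.as_str(),
        applied,
        // Assumption: anything not applied now takes effect on restart.
        requires_restart: !applied,
    }
}
```

A deployment script can branch on `requires_restart` alone and fall back to `policy` when it needs to explain why a change was deferred.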

## Token Manager Data Flow

Main selection and control paths:

- request -> account selection (`manager_runtime.rs` + selection/rotation modules)
- sticky bindings:
  - in-memory map + optional persisted `session_bindings.json`
  - debug snapshot via `GET /api/proxy/session-bindings`
- compliance controls:
  - global/account RPM windows
  - in-flight concurrency
  - cooldown windows
  - debug snapshot via `GET /api/proxy/compliance`
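
To make the RPM window concrete, here is one common way to implement such a guard: a sliding 60-second window over request timestamps. This is a sketch of the general technique, not the token manager's implementation. `RpmWindow` and `try_acquire` are hypothetical names, and the in-flight and cooldown guards are not modeled.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window requests-per-minute guard.
struct RpmWindow {
    limit: usize,
    window: Duration,
    hits: VecDeque<Instant>,
}

impl RpmWindow {
    fn new(limit: usize) -> Self {
        Self {
            limit,
            window: Duration::from_secs(60),
            hits: VecDeque::new(),
        }
    }

    /// Records a request at `now` and returns true if it fits the budget.
    /// `now` is a parameter so the behavior can be tested deterministically.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = self.hits.front() {
            if now.duration_since(oldest) >= self.window {
                self.hits.pop_front();
            } else {
                break;
            }
        }
        if self.hits.len() < self.limit {
            self.hits.push_back(now);
            true
        } else {
            false
        }
    }
}
```

One instance can serve as the global window and one per account as the account window; a request proceeds only if both accept it.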

Operational snapshot endpoint:

- `GET /api/proxy/metrics` aggregates runtime/monitor/sticky/proxy-pool/compliance data (including `runtime.tls_backend`).
- proxy-pool metrics include shared-fallback usage and strict fail-closed rejection counters.
- `GET /api/proxy/metrics` also exposes `runtime_apply_policies_supported` for machine-readable policy discovery.

## Observability and Audit

- Request monitor:
  - stats/logs in `src/proxy/monitor.rs`
  - admin endpoints in `src/proxy/admin/runtime/logs.rs`
- Structured admin audit:
  - actor resolution/logging in `src/proxy/admin/runtime/audit.rs`
  - event model in `src/proxy/admin/runtime/audit_event.rs`
  - emitted with `[ADMIN_AUDIT]` prefix for grep compatibility

## Failure Domains and Recovery

- Persisted config errors:
  - validation failure -> request rejected (400)
  - save/read failure -> internal error (500)
- Secret migration path:
  - run binary with `--reencrypt-secrets` to rewrite encrypted config/account fields into current ciphertext format
  - command exits after migration; normal proxy service startup is skipped in this mode
- Runtime drift risks:
  - minimized via the shared scoped patch helper and an explicit runtime apply policy
- Session stickiness restart behavior:
  - persistence controlled by `persist_session_bindings`
  - validated by restart smoke tests/scripts
- Admin auth lockout risk:
  - `POST /api/config` preserves the existing API key when blank input is submitted
- OAuth linkage contract:
  - callback success means `authorization received`, not `account linked`
  - account linking is only successful when OAuth flow reaches terminal `linked`
  - if token exchange/user-info/account-save fails, terminal state must be explicit (`failed`, `rejected`, or `cancelled`) and never presented as linked
  - containerized/runtime prerequisite for reliable encrypted persistence: set `ENCRYPTION_KEY` (machine UID may be unavailable in some containers)
  - operator guidance: use a high-entropy `ENCRYPTION_KEY` (`>= 32` random characters); weak/short values produce a startup warning
- Graceful shutdown path:
  - Ctrl+C signals accept-loop shutdown
  - optional admin stop hook can also signal graceful shutdown when `ADMIN_STOP_SHUTDOWN=true` and `POST /api/proxy/stop` is called
  - listener stops accepting new sockets
  - active connections are drained with a bounded timeout (`SHUTDOWN_DRAIN_TIMEOUT_SECS`, default 10s, range 1-600), then aborted if needed
  - long-running streams may be aborted on shutdown once drain timeout is exceeded
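
The drain timeout bounds can be illustrated with a small parser. This sketch assumes that out-of-range values are clamped into `1..=600` and that missing or unparsable values fall back to the default. The document states only the default and the range, so the real handling (for example, rejecting bad values) may differ. `drain_timeout` is a hypothetical name.

```rust
use std::time::Duration;

const DEFAULT_DRAIN_SECS: u64 = 10;
const MIN_DRAIN_SECS: u64 = 1;
const MAX_DRAIN_SECS: u64 = 600;

/// Turns the raw `SHUTDOWN_DRAIN_TIMEOUT_SECS` value into a bounded timeout.
fn drain_timeout(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|v| v.trim().parse::<u64>().ok())
        // Assumption: out-of-range values are clamped, not rejected.
        .map(|s| s.clamp(MIN_DRAIN_SECS, MAX_DRAIN_SECS))
        // Assumption: missing or unparsable values use the default.
        .unwrap_or(DEFAULT_DRAIN_SECS);
    Duration::from_secs(secs)
}
```

At startup this would be called as `drain_timeout(std::env::var("SHUTDOWN_DRAIN_TIMEOUT_SECS").ok().as_deref())`.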

## HTTP/2 Evaluation

- Current server runtime is HTTP/1.1 (`hyper::server::conn::http1::Builder`) by design.
- Local concurrency benchmark (`src/proxy/server.rs` test `http1_health_concurrency_smoke_benchmark`) measured:
  - `1500` requests
  - concurrency `64`
  - elapsed `~894ms`
  - throughput `~1676.90 req/s`
- Decision: no immediate HTTP/2 implementation; revisit only if real workloads show sustained multiplexing bottlenecks.
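
The reported throughput is simply requests divided by elapsed time. The helper below, with the hypothetical name `throughput_rps`, shows the arithmetic. Note that the rounded `~894ms` figure gives a slightly higher rate than the reported `~1676.90 req/s`, which suggests the original calculation used sub-millisecond precision.

```rust
use std::time::Duration;

/// Requests per second for a completed run.
fn throughput_rps(requests: u64, elapsed: Duration) -> f64 {
    requests as f64 / elapsed.as_secs_f64()
}

/// Elapsed time implied by a request count and a measured rate.
fn implied_elapsed(requests: u64, rps: f64) -> Duration {
    Duration::from_secs_f64(requests as f64 / rps)
}
```

`throughput_rps(1500, Duration::from_millis(894))` comes out a little under 1678 req/s, and `implied_elapsed(1500, 1676.90)` is roughly 894.5 ms, so the two reported numbers agree to within rounding.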

## Test Strategy Map

Unit-heavy coverage in `src/proxy/tests` with a focused admin runtime suite:

- `src/proxy/tests/admin_runtime_endpoints.rs`
  - scoped config updates
  - auth regression
  - restart-like reinit persistence flow
  - route capability and group parity checks
  - metrics schema stability

Operator smoke scripts in `scripts/` validate live runtime behavior:

- admin restart + health + version routes (minimal readiness)
- session binding persistence
- compliance counters
- proxy pool strategy/runtime endpoints