gephyr 1.16.7

Gephyr headless AI relay service for Google AI services

Gephyr


Gephyr is a headless local API relay/proxy service for Google AI services. It routes multiple client-facing API surfaces, including OpenAI-compatible, Claude-compatible, and Google AI (Gemini)-compatible endpoints, to Google's AI backends.

Features

  • 🔒 Secure API Authentication — Bearer token auth with configurable modes
  • 🔄 Multi-Account Support — Link multiple Google accounts and rotate between them
  • 🌐 Multi-Protocol API Compatibility — OpenAI-compatible, Claude-compatible, and Google AI (Gemini)-compatible endpoints
  • 🐳 Docker Native — One-command deployment
  • High Performance — Built with Rust + Axum for low latency

Quick Start

Prerequisites

1. Clone & Configure

git clone https://github.com/softerist/gephyr.git
cd gephyr

# Copy the example env file and edit with your values
cp .env.example .env.local

Edit .env.local with your API key and OAuth credentials:

API_KEY=gph_your_secure_api_key
GOOGLE_OAUTH_CLIENT_ID=your_client_id.apps.googleusercontent.com
GOOGLE_OAUTH_CLIENT_SECRET=GOCSPX-your_secret

# Optional: restrict accepted Google Workspace domains for identity verification
ALLOWED_GOOGLE_DOMAINS=example.com,subsidiary.example.com

# Optional: scheduler jitter window in seconds (defaults shown)
SCHEDULER_REFRESH_JITTER_MIN_SECONDS=30
SCHEDULER_REFRESH_JITTER_MAX_SECONDS=120

# Optional: deterministic per-account stagger before each batch refresh task
ACCOUNT_REFRESH_STAGGER_MIN_MS=250
ACCOUNT_REFRESH_STAGGER_MAX_MS=1500

# Optional: startup health-check smoothing (boot-time token refresh)
STARTUP_HEALTH_MAX_CONCURRENT_REFRESHES=5
STARTUP_HEALTH_JITTER_MIN_MS=150
STARTUP_HEALTH_JITTER_MAX_MS=1200

# Optional runtime TLS backend override when binary includes both stacks
TLS_BACKEND=rustls

# Optional startup TLS canary probe (recommended when changing TLS backend)
TLS_CANARY_URL=https://oauth2.googleapis.com/token
TLS_CANARY_TIMEOUT_SECS=5
TLS_CANARY_REQUIRED=false
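
The jitter and stagger settings above can be pictured with a small sketch. This is not Gephyr's implementation. It is one plausible reading, assuming random jitter is drawn uniformly from the configured window and the deterministic stagger is derived from a hash of the account ID. The function names and the hashing scheme are hypothetical.

```python
import hashlib
import random


def scheduler_jitter_seconds(min_s: int = 30, max_s: int = 120, rng=random) -> float:
    """Random delay before a scheduled refresh batch (assumed uniform)."""
    lo, hi = sorted((min_s, max_s))  # tolerate swapped bounds
    return rng.uniform(lo, hi)


def account_stagger_ms(account_id: str, min_ms: int = 250, max_ms: int = 1500) -> int:
    """Deterministic per-account delay: same account, same delay (hypothetical scheme)."""
    lo, hi = sorted((min_ms, max_ms))
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return lo + bucket % (hi - lo + 1)
```

The point of the split is that random jitter spreads whole batches over time, while a deterministic stagger keeps each account's offset within a batch repeatable.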

2. Build & Run

# Build Docker image
docker build -t gephyr:latest -f docker/Dockerfile .

# Start the service
.\console.ps1 start

# Check status
.\console.ps1 status

3. Link Google Account (OAuth)

.\console.ps1 login
# Browser opens → Complete Google OAuth → Account linked

4. Test the API

.\console.ps1 api-test

API Reference

Base URL

http://127.0.0.1:8045

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /healthz | GET | Health check |
| /v1/chat/completions | POST | OpenAI-compatible chat completions |
| /v1/messages | POST | Claude-compatible messages API |
| /v1beta/models/:model:generateContent | POST | Google AI (Gemini)-compatible generation API |
| /api/accounts | GET | List linked accounts (admin API) |
| /api/auth/url | GET | Get OAuth login URL (admin API) |

Example: OpenAI-Compatible Chat Completion

curl http://127.0.0.1:8045/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
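
For clients that do not shell out to curl, the same request can be assembled with Python's standard library. This is a minimal sketch: the helper name build_chat_request is hypothetical, and the base URL and model name are simply copied from the curl example above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8045"


def build_chat_request(api_key: str, prompt: str, model: str = "gpt-5.3-codex") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# To send it against a running instance:
#   with urllib.request.urlopen(build_chat_request(key, "Hello!")) as resp:
#       print(json.load(resp))
```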

Example: OpenAI-Compatible Streaming

curl -N http://127.0.0.1:8045/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "stream": true,
    "messages": [{"role": "user", "content": "Stream a short response."}]
  }'
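
Streaming responses in the OpenAI wire format arrive as server-sent events. The sketch below assumes Gephyr follows that convention (`data: <json>` lines, terminated by `data: [DONE]`), which this README does not spell out. The function name and the sample chunks are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Invented sample chunks, shaped like OpenAI streaming deltas.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In practice the lines would come from the decoded response body of the streaming request rather than from a list.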

Example: Claude-Compatible Messages

curl http://127.0.0.1:8045/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 256,
    "messages": [
      { "role": "user", "content": "Write a one-line haiku about Rust." }
    ]
  }'

Example: Claude-Compatible Streaming

curl -N http://127.0.0.1:8045/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "stream": true,
    "max_tokens": 256,
    "messages": [
      { "role": "user", "content": "Stream a brief poem about latency." }
    ]
  }'

Example: Google AI (Gemini)-Compatible generateContent

curl http://127.0.0.1:8045/v1beta/models/gemini-2.5-flash:generateContent \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{ "text": "Summarize what this proxy does in one sentence." }]
      }
    ]
  }'

Example: Google AI (Gemini)-Compatible streamGenerateContent

curl -N http://127.0.0.1:8045/v1beta/models/gemini-2.5-flash:streamGenerateContent \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{ "text": "Stream 3 short bullet points about this project." }]
      }
    ]
  }'
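
The three surfaces accept different body shapes for the same prompt. As a comparison aid, the sketch below builds each body from one prompt string. The function names are hypothetical. The field layouts are copied from the curl examples above, and nothing beyond those fields is implied.

```python
def build_body(protocol: str, prompt: str, model: str = "", max_tokens: int = 256) -> dict:
    """Return a request body for one of the three client-facing surfaces."""
    if protocol == "openai":      # POST /v1/chat/completions
        return {"model": model, "messages": [{"role": "user", "content": prompt}]}
    if protocol == "claude":      # POST /v1/messages
        return {
            "model": model,
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        }
    if protocol == "gemini":      # POST /v1beta/models/<model>:generateContent
        # The model is part of the URL path, not the body.
        return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    raise ValueError(f"unknown protocol: {protocol}")


def gemini_path(model: str, stream: bool = False) -> str:
    """Build the Gemini-compatible path used in the examples above."""
    method = "streamGenerateContent" if stream else "generateContent"
    return f"/v1beta/models/{model}:{method}"
```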

Configuration

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| API_KEY | | | API key for console scripts and runtime auth |
| AUTH_MODE | | strict | Auth mode: strict, off, all_except_health, auto |
| ALLOW_LAN_ACCESS | | false | Bind to 0.0.0.0 instead of 127.0.0.1 |
| ENABLE_ADMIN_API | | false | Enable /api/* admin routes |
| GOOGLE_OAUTH_CLIENT_ID | | | Google OAuth Client ID |
| GOOGLE_OAUTH_CLIENT_SECRET | | | Google OAuth Client Secret |
| TLS_BACKEND | | compiled default | Runtime TLS backend override (native-tls/rustls) when build includes both |
| TLS_CANARY_URL | | | Optional startup TLS canary probe URL |
| TLS_CANARY_TIMEOUT_SECS | | 5 | Startup TLS canary timeout seconds (clamped 1..60) |
| TLS_CANARY_REQUIRED | | false | If true, startup fails when TLS canary probe fails |
| ALLOWED_GOOGLE_DOMAINS | | | Optional comma-separated Workspace domain allowlist for identity verification |
| DATA_DIR | | ~/.gephyr | Data directory path |
| PUBLIC_URL | | | Public URL for OAuth callbacks (hosted deployments) |
| MAX_BODY_SIZE | | 104857600 | Max request body size in bytes |
| SCHEDULER_REFRESH_JITTER_MIN_SECONDS | | 30 | Min random delay before each scheduled quota-refresh batch |
| SCHEDULER_REFRESH_JITTER_MAX_SECONDS | | 120 | Max random delay before each scheduled quota-refresh batch |
| ACCOUNT_REFRESH_STAGGER_MIN_MS | | 250 | Min deterministic per-account delay before each batch refresh task |
| ACCOUNT_REFRESH_STAGGER_MAX_MS | | 1500 | Max deterministic per-account delay before each batch refresh task |
| STARTUP_HEALTH_MAX_CONCURRENT_REFRESHES | | 5 | Max concurrent token refreshes during startup health-check (clamped 1..32) |
| STARTUP_HEALTH_JITTER_MIN_MS | | 150 | Min random per-account delay before startup health refresh |
| STARTUP_HEALTH_JITTER_MAX_MS | | 1200 | Max random per-account delay before startup health refresh |
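
Two of the variables above are documented as clamped. For an operator, clamping means an out-of-range value is pulled to the nearest bound rather than rejected. The helper below is a hypothetical sketch of that behavior. How Gephyr treats unparsable values is an assumption (here they fall back to the default).

```python
import os


def clamped_env_int(name: str, default: int, low: int, high: int, env=None) -> int:
    """Read an integer setting and clamp it into [low, high]."""
    env = os.environ if env is None else env
    try:
        value = int(env.get(name, default))
    except ValueError:
        value = default  # assumption: unparsable input falls back to the default
    return max(low, min(high, value))


settings = {"TLS_CANARY_TIMEOUT_SECS": "300", "STARTUP_HEALTH_MAX_CONCURRENT_REFRESHES": "0"}
timeout = clamped_env_int("TLS_CANARY_TIMEOUT_SECS", 5, 1, 60, env=settings)
refreshes = clamped_env_int("STARTUP_HEALTH_MAX_CONCURRENT_REFRESHES", 5, 1, 32, env=settings)
print(timeout, refreshes)  # 60 1
```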

Proxy-pool isolation knobs are config/API settings (not env vars):

  • proxy.proxy_pool.allow_shared_proxy_fallback
  • proxy.proxy_pool.require_proxy_for_account_requests

Persistent Session Bindings (Sticky Sessions Across Restart)

persist_session_bindings is a config-file setting (not an env var).
It controls whether sticky session bindings (session_id -> account_id) survive process/container restarts.

  • Default: true
  • Config file: config.json under your data dir (for example ~/.gephyr/config.json or %USERPROFILE%\.gephyr\config.json)

Example:

{
  "proxy": {
    "persist_session_bindings": true,
    "scheduling": {
      "mode": "balance",
      "max_wait_seconds": 60
    }
  }
}
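
To check the setting from a script, the file can be read directly. The sketch below assumes the layout shown above (proxy.persist_session_bindings) and treats a missing file or key as the documented default of true. The function names are hypothetical.

```python
import json
import os
from pathlib import Path


def config_path(env=None) -> Path:
    """config.json under DATA_DIR if set, otherwise under ~/.gephyr."""
    env = os.environ if env is None else env
    override = env.get("DATA_DIR")
    base = Path(override) if override else Path.home() / ".gephyr"
    return base / "config.json"


def load_config(path: Path) -> dict:
    """Parse config.json, or return an empty dict when the file is absent."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def persist_session_bindings(config: dict) -> bool:
    """Read proxy.persist_session_bindings, defaulting to True."""
    return bool(config.get("proxy", {}).get("persist_session_bindings", True))
```

A typical call would be `persist_session_bindings(load_config(config_path()))`.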

Admin visibility:

  • GET /api/version/routes returns running version + key route capabilities (useful to detect old images).
  • GET /api/proxy/sticky returns sticky runtime config (persist_session_bindings, scheduling, preferred account).
  • POST /api/proxy/sticky updates sticky settings only (avoids full /api/config round-trip).
  • GET /api/proxy/request-timeout returns configured/effective runtime timeout.
  • POST /api/proxy/request-timeout updates timeout only (avoids full /api/config round-trip).
  • GET /api/proxy/pool/runtime returns proxy-pool runtime knobs (enabled, auto_failover, allow_shared_proxy_fallback, require_proxy_for_account_requests, health_check_interval) plus strategy snapshot.
  • POST /api/proxy/pool/runtime updates only proxy-pool runtime knobs (avoids full /api/config round-trip).
  • GET /api/proxy/pool/strategy returns current proxy-pool strategy snapshot.
  • POST /api/proxy/pool/strategy updates proxy-pool strategy only (avoids full /api/config round-trip).
  • GET /api/proxy/metrics returns runtime/monitor/sticky/proxy-pool/compliance aggregates (including TLS diagnostics: backend/requested/compiled/canary snapshot) and supported runtime-apply policy values.
  • GET /api/proxy/google/outbound-policy returns the effective Google outbound header policy snapshot (mode, host-header behavior, metadata shape, passthrough allow/block policy, debug redaction contract).
  • GET /api/proxy/tls-canary returns latest TLS canary probe snapshot.
  • POST /api/proxy/tls-canary/run runs TLS canary probe on demand and returns the latest canary snapshot.
  • GET /api/proxy/compliance returns live compliance counters/cooldowns (requires admin API enabled).
  • POST /api/proxy/compliance updates only compliance settings (avoids full /api/config round-trip).
  • Scoped POST /api/proxy/* update responses include runtime_apply (policy, applied, requires_restart).
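
A deployment script can use runtime_apply to decide whether a restart is needed after a scoped update. The sketch below assumes runtime_apply is a JSON object carrying the three fields named above. The function name and the sample response are invented for illustration.

```python
def needs_restart(response: dict) -> bool:
    """True when a scoped /api/proxy/* update was not fully applied at runtime."""
    info = response.get("runtime_apply") or {}
    if info.get("requires_restart"):
        return True
    # Treat a missing or false "applied" flag as "not live yet".
    return not info.get("applied", False)


# Hypothetical response body; only the runtime_apply field names come from the docs.
sample = {
    "runtime_apply": {
        "policy": "always_hot_applied",
        "applied": True,
        "requires_restart": False,
    }
}
print(needs_restart(sample))  # False
```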

Example update call:

curl -X POST http://127.0.0.1:8045/api/proxy/compliance \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "max_global_requests_per_minute": 120,
    "max_account_requests_per_minute": 10,
    "max_account_concurrency": 1,
    "risk_cooldown_seconds": 300,
    "max_retry_attempts": 2
  }'

Sticky-only update call:

curl -X POST http://127.0.0.1:8045/api/proxy/sticky \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "persist_session_bindings": true,
    "scheduling": {
      "mode": "Balance",
      "max_wait_seconds": 60
    }
  }'

Request-timeout-only update call:

curl -X POST http://127.0.0.1:8045/api/proxy/request-timeout \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request_timeout": 120
  }'

Proxy-pool-strategy-only update call:

curl -X POST http://127.0.0.1:8045/api/proxy/pool/strategy \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "strategy": "round_robin"
  }'

Proxy-pool-runtime-only update call:

curl -X POST http://127.0.0.1:8045/api/proxy/pool/runtime \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "auto_failover": true,
    "allow_shared_proxy_fallback": false,
    "require_proxy_for_account_requests": true,
    "health_check_interval": 120
  }'

Google Outbound Policy (Config + Runtime)

Google-bound calls now use a shared policy with explicit defaults:

  • Always set: authorization, user-agent, accept-encoding: gzip
  • JSON requests additionally set: content-type: application/json
  • Passthrough policy: deny-by-default (only explicit allowlist keys are forwarded)
  • Optional explicit Host header: compat mode only (codeassist_compat + send_host_header=true)
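
A deny-by-default passthrough can be sketched as follows. This illustrates the policy described above and is not Gephyr's code: the function name, the allowlist contents, the Bearer scheme, and the user-agent string are all placeholders.

```python
def build_outbound_headers(incoming: dict, token: str, allowlist: set, json_body: bool = True) -> dict:
    """Apply a deny-by-default passthrough, then set the always-present headers."""
    allowed = {name.lower() for name in allowlist}
    # Forward only explicitly allowlisted client headers.
    headers = {k.lower(): v for k, v in incoming.items() if k.lower() in allowed}
    # Always-set headers override anything a client supplied.
    headers["authorization"] = f"Bearer {token}"   # scheme is an assumption
    headers["user-agent"] = "example-agent/0.0"    # placeholder value
    headers["accept-encoding"] = "gzip"
    if json_body:
        headers["content-type"] = "application/json"
    return headers
```

With an allowlist of `{"x-custom"}`, a client-supplied `Cookie` or `Authorization` header never reaches the upstream call, while `X-Custom` does.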

Config example (config.json):

{
  "proxy": {
    "google": {
      "mode": "public_google",
      "headers": {
        "send_host_header": false
      },
      "identity_metadata": {
        "ide_type": "ANTIGRAVITY",
        "platform": "PLATFORM_UNSPECIFIED",
        "plugin_type": "GEMINI"
      }
    },
    "debug_logging": {
      "log_google_outbound_headers": false
    }
  }
}

Runtime note:

  • Saving config via POST /api/config hot-applies Google outbound policy to live upstream calls (no restart required).
  • Verify effective runtime policy with GET /api/proxy/google/outbound-policy.

Staging trace validation runbook:

  • See docs/agents/GOOGLE_TRACE_VALIDATION.md for a step-by-step manual diff workflow.

Manual TLS canary run:

curl -X POST http://127.0.0.1:8045/api/proxy/tls-canary/run \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

Practical notes:

  • Keep scheduling.mode as balance or cache_first to use sticky session behavior.
  • performance_first intentionally disables sticky session reuse.
  • Runtime apply policy mapping:
      • sticky / request-timeout / compliance updates: always_hot_applied
      • proxy-pool strategy / runtime updates: hot_applied_when_safe
  • scheduling.max_wait_seconds keeps sticky binding during short bound-account rate-limit windows; long windows release/rebind.
  • allow_shared_proxy_fallback=false prevents borrowing an already-bound proxy for unbound accounts.
  • require_proxy_for_account_requests=true makes account requests fail closed when no eligible proxy is available (instead of app-upstream/direct fallback).
  • For maximum stickiness from clients, send a stable explicit session id:
      • Header: x-session-id (or x-client-session-id, x-gephyr-session-id, x-conversation-id, x-thread-id)
      • Payload: session_id / sessionId (also conversation_id / conversationId, thread_id / threadId)
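
One way for a client to produce a stable session id is to derive it from something it already has, such as a conversation name. The sketch below uses a UUIDv5 for that. The namespace string and helper names are hypothetical, and only the x-session-id header name comes from the list above.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
SESSION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "sessions.example.invalid")


def stable_session_id(conversation_key: str) -> str:
    """Same conversation key in, same session id out, across client restarts."""
    return str(uuid.uuid5(SESSION_NAMESPACE, conversation_key))


def with_session_header(headers: dict, conversation_key: str) -> dict:
    """Return a copy of headers with x-session-id set."""
    merged = dict(headers)
    merged["x-session-id"] = stable_session_id(conversation_key)
    return merged
```

Because the id is derived rather than random, a restarted client keeps sending the same value for the same conversation.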

Compliance Guardrails (Low-Risk Account Traffic Profile)

proxy.compliance applies runtime guardrails to reduce bursty/account-risky traffic patterns.

  • enabled: turns guardrails on/off (default: false)
  • max_global_requests_per_minute: global request budget across all accounts
  • max_account_requests_per_minute: per-account request budget (default 10)
  • max_account_concurrency: max in-flight requests per account (default 1)
  • risk_cooldown_seconds: temporary cooldown applied after risky upstream statuses (401, 403, 429, 500, 503, 529)
  • max_retry_attempts: hard cap for handler retry loops when compliance mode is enabled

Example:

{
  "proxy": {
    "compliance": {
      "enabled": true,
      "max_global_requests_per_minute": 120,
      "max_account_requests_per_minute": 10,
      "max_account_concurrency": 1,
      "risk_cooldown_seconds": 300,
      "max_retry_attempts": 2
    }
  }
}
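
To make the knobs concrete, here is a toy model of a per-account budget with a risk cooldown. It is a sketch under stated assumptions, not Gephyr's implementation: it uses a sliding 60-second window, ignores the global budget and the concurrency limit, and the class and method names are hypothetical. The status list is the one given above.

```python
from collections import defaultdict, deque

RISKY_STATUSES = {401, 403, 429, 500, 503, 529}


class AccountGuard:
    """Toy per-account request budget with a cooldown (illustrative only)."""

    def __init__(self, max_per_minute: int = 10, risk_cooldown_seconds: int = 300):
        self.max_per_minute = max_per_minute
        self.cooldown = risk_cooldown_seconds
        self.sent = defaultdict(deque)   # account -> timestamps of recent requests
        self.cooldown_until = {}         # account -> time the cooldown ends

    def allow(self, account: str, now: float) -> bool:
        """Record and permit a request if the account is within budget."""
        if now < self.cooldown_until.get(account, 0.0):
            return False
        window = self.sent[account]
        while window and now - window[0] >= 60:
            window.popleft()             # drop requests older than one minute
        if len(window) >= self.max_per_minute:
            return False
        window.append(now)
        return True

    def record_status(self, account: str, status: int, now: float) -> None:
        """Start a cooldown after a risky upstream status."""
        if status in RISKY_STATUSES:
            self.cooldown_until[account] = now + self.cooldown
```

Timestamps are passed in explicitly so the behavior can be exercised without waiting on a real clock.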

One-IP Presets (Runbook)

Choose one profile and monitor /api/proxy/metrics proxy-pool counters.

  • Availability-first:
      • allow_shared_proxy_fallback=true
      • require_proxy_for_account_requests=false
      • Keeps traffic flowing when the pool is saturated, but allows shared-proxy reuse.

  • Isolation-first:
      • allow_shared_proxy_fallback=false
      • require_proxy_for_account_requests=true
      • Fails closed when no eligible proxy exists, reducing shared routing but increasing hard failures.

Keep these in both modes:

  • max_account_requests_per_minute=10
  • max_account_concurrency=1
  • scheduler jitter enabled (SCHEDULER_REFRESH_JITTER_MIN_SECONDS, SCHEDULER_REFRESH_JITTER_MAX_SECONDS)
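
The two profiles can be kept as data and turned into a body for POST /api/proxy/pool/runtime. The preset values come from the lists above. The dictionary name, the function name, and the choice to send only these two fields are assumptions (the endpoint may expect the full set of runtime knobs).

```python
import json

PRESETS = {
    "availability-first": {
        "allow_shared_proxy_fallback": True,
        "require_proxy_for_account_requests": False,
    },
    "isolation-first": {
        "allow_shared_proxy_fallback": False,
        "require_proxy_for_account_requests": True,
    },
}


def pool_runtime_body(preset: str, **extra_knobs) -> str:
    """Serialize a preset, plus any extra runtime knobs, as a JSON request body."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    body = dict(PRESETS[preset])
    body.update(extra_knobs)  # e.g. enabled=True, health_check_interval=120
    return json.dumps(body)
```

Keeping the presets in one place makes it harder to apply a mixed configuration, such as shared fallback enabled together with a required proxy, by accident.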

Console Commands

.\console.ps1 <command> [options]

| Command | Description |
| --- | --- |
| start | Start the container |
| stop | Stop and remove container |
| restart | Restart container |
| status | Show container and API status |
| logs | Show container logs |
| health | Check /healthz endpoint |
| login | Start OAuth flow (opens browser) |
| accounts | List linked accounts |
| api-test | Run a test API completion |
| rotate-key | Generate new API key |
| docker-repair | Repair Docker builder cache for snapshot/export errors |
| logout | Remove all linked accounts |

Admin API Mode

The admin API (/api/* routes) is disabled by default for security. Enable it when you need to:

  • Bootstrap OAuth login
  • Manage accounts
  • Access admin configuration

# Start with admin API enabled
.\console.ps1 start -EnableAdminApi

# Or restart with admin API
.\console.ps1 restart -EnableAdminApi

Recommendation: Enable the admin API only during setup or maintenance, and run the proxy with it disabled the rest of the time.


Multiple Accounts

Gephyr supports linking multiple Google accounts:

.\console.ps1 restart -EnableAdminApi
.\console.ps1 login  # Link account A
.\console.ps1 login  # Link account B
.\console.ps1 login  # Link account C
.\console.ps1 accounts  # Verify all linked
.\console.ps1 restart  # Restart with admin API disabled

OAuth Setup

See OAUTH_SETUP.md for detailed Google Cloud OAuth configuration.

Quick summary:

  • Local/Docker: Use "Desktop app" OAuth client type
  • Hosted deployment: Use "Web application" client with PUBLIC_URL set

Data Directory

| Platform | Default Path |
| --- | --- |
| Linux/macOS | ~/.gephyr |
| Windows | %USERPROFILE%\.gephyr |

Override with DATA_DIR environment variable.


Docker Build Troubleshooting

A Docker build may fail with an error like:

failed to prepare extraction snapshot ... parent snapshot ... does not exist

This is typically a Docker BuildKit/builder cache issue on the host (not a Gephyr code issue).

Fast recovery

# PowerShell
.\console.ps1 docker-repair
docker build -t gephyr:latest -f docker/Dockerfile .

# Bash
./console.sh docker-repair
docker build -t gephyr:latest -f docker/Dockerfile .

If it still fails

Use aggressive mode (clears more builder cache; next build will be slower):

# PowerShell
.\console.ps1 docker-repair -Aggressive

# Bash
./console.sh docker-repair --aggressive

Preventive tips

  • Avoid force-closing Docker Desktop during builds.
  • Keep sufficient free disk space for image layers and cache.
  • If Docker was updated/restarted mid-build, rerun docker-repair before rebuilding.

Development

Build from Source

cargo build --release

TLS backend build profiles:

  • Default build uses native-tls.
  • Rustls build profile:

cargo build --release --no-default-features --features tls-rustls

Run Tests

cargo test

Code Quality

cargo fmt --check
cargo clippy
cargo audit

License

This project is licensed under CC-BY-NC-SA-4.0.