ferryllm 0.1.4

Universal LLM protocol middleware for OpenAI, Anthropic, Claude Code, and OpenAI-compatible backends.
Documentation
# Deployment Guide

This document describes the recommended deployment model for ferryllm as a standalone LLM middleware server.

The current repository provides example servers. A production release should expose a dedicated binary such as:

```bash
ferryllm serve --config /etc/ferryllm/config.toml
```

## Local Development

For local testing with the current example server:

```bash
export CODX_API_KEY="your-api-key"
RUST_LOG=info cargo run --example codexapis_server --features http
```

Health check:

```bash
curl http://127.0.0.1:3000/health
```

Claude Code through ferryllm:

```bash
ANTHROPIC_API_KEY=dummy \
ANTHROPIC_BASE_URL=http://127.0.0.1:3000 \
claude --bare --print --model claude-opus-4-6 \
  "Reply with exactly one short word: pong"
```

## Environment Variables

Use environment variables for secrets.

Example:

```bash
export CODX_API_KEY="..."
export RUST_LOG=info
```

Do not store provider keys directly in committed config files.

## Logging

Recommended local logging:

```bash
RUST_LOG=info ferryllm serve --config ferryllm.toml
```

Recommended debugging:

```bash
RUST_LOG=debug ferryllm serve --config ferryllm.toml
```

Recommended deep protocol troubleshooting:

```bash
RUST_LOG=trace ferryllm serve --config ferryllm.toml
```

For local Claude Code against codexapis with full trace logs:

```bash
export CODX_API_KEY="..."
RUST_LOG=ferryllm=trace,tower_http=debug,reqwest=debug \
  cargo run --features http --bin ferryllm -- serve --config examples/config/codexapis.toml
```

Then in another shell:

```bash
export ANTHROPIC_BASE_URL=http://127.0.0.1:3000
export ANTHROPIC_API_KEY=local-test-token
claude --model cc-gpt55
```

Production deployments should prefer JSON logs:

```toml
[logging]
level = "info"
format = "json"
```

## Docker

Build the image locally:

```bash
docker build -t ferryllm:local .
```

Run with the example Codex APIs config:

```bash
docker run --rm \
  -p 3000:3000 \
  -e CODX_API_KEY="..." \
  ferryllm:local
```

Run with a custom config:

```bash
docker run --rm \
  -p 3000:3000 \
  -e CODX_API_KEY="..." \
  -v ./ferryllm.toml:/etc/ferryllm/config.toml:ro \
  ferryllm:local \
  serve --config /etc/ferryllm/config.toml
```

Recommended container behavior:

- Run as a non-root user.
- Expose only the configured listen port.
- Emit logs to stdout/stderr.
- Support graceful shutdown on `SIGTERM`.
- Avoid writing secrets to disk.

## systemd Plan

Example service:

```ini
[Unit]
Description=ferryllm LLM protocol middleware
After=network-online.target
Wants=network-online.target

[Service]
User=ferryllm
Group=ferryllm
Environment=RUST_LOG=info
EnvironmentFile=/etc/ferryllm/secrets.env
ExecStart=/usr/local/bin/ferryllm serve --config /etc/ferryllm/config.toml
Restart=always
RestartSec=3
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target
```

## Kubernetes Plan

Recommended Kubernetes endpoints:

- Liveness probe: `/healthz`
- Readiness probe: `/readyz`
- Metrics endpoint: `/metrics` once implemented

Example probe shape:

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /readyz
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
```

Secrets should be injected through Kubernetes Secrets and referenced by environment variables.

## Operational Checklist

Before exposing ferryllm to users:

- Confirm provider keys are supplied through environment variables.
- Confirm model routes are explicit and reviewed.
- Confirm logs show incoming model and backend model.
- Confirm request body logging is disabled or redacted.
- Confirm health and readiness probes work.
- Confirm `/metrics` works when metrics are enabled.
- Confirm timeouts are configured.
- Confirm max body size is configured.
- Confirm API authentication is enabled for public deployments.
- Confirm rate limits are configured if serving untrusted clients.

## Public Relay Authentication

Enable API key authentication for public deployments:

```toml
[auth]
enabled = true
api_keys_env = "FERRYLLM_API_KEYS"
```

Then set keys at runtime:

```bash
export FERRYLLM_API_KEYS="key-one,key-two"
```

Clients should send one of:

```text
Authorization: Bearer key-one
x-api-key: key-one
```

## Security Notes

- Never log `Authorization`, `x-api-key`, or provider API keys.
- Do not log full prompts by default.
- Redact request bodies unless explicitly debugging in a safe environment.
- Use TLS at the ingress layer for production deployments.
- Restrict access to local-only deployments with firewall rules or `127.0.0.1` binding.