mirage-proxy

A fast, invisible sensitive data filter for LLM APIs. Written in Rust.

Your coding agent reads your .env, your codebase, your credentials — and sends all of it to the cloud. Mirage sits between your client and the provider, silently replacing sensitive data with plausible fakes. The LLM never knows. Your secrets never leave.

How it works

You:     "Email sam@acme.com, key is AKIAIOSFODNN7EXAMPLE"
         ↓
Mirage:    "Email jordan.walker3@proton.me, key is AKIAHQ7RN2XK5M3B9Y1T"
         ↓
Provider: (sees only fake data, responds normally)
         ↓
Mirage:    (swaps fakes back to originals)
         ↓
You:     "Done! I've drafted the email to sam@acme.com"

No [REDACTED]. No [[PERSON_1]]. No brackets. The provider sees a completely normal request with completely fake data. Responses are rehydrated transparently.

Install

cargo install mirage-proxy

Or build from source:

git clone https://github.com/chandika/mirage-proxy
cd mirage-proxy
cargo build --release

Quick start

# Point mirage at your LLM provider
mirage-proxy --target https://api.openai.com

# Your client talks to localhost:8686 instead
export OPENAI_BASE_URL=http://localhost:8686

Claude Code

mirage-proxy --target https://api.anthropic.com
# Set ANTHROPIC_BASE_URL=http://localhost:8686 in your environment

Codex / GPT

mirage-proxy --target https://api.openai.com
export OPENAI_BASE_URL=http://localhost:8686

Cursor / Continue / Aider / OpenCode

Point the provider base URL to http://localhost:8686. Everything else works as before.

What makes this different

Invisible substitution. Most PII tools replace sensitive data with ugly tokens like [EMAIL_1] or <REDACTED>. The LLM sees these, knows data was removed, and produces worse output. Mirage replaces PII with plausible fakes that match the format and length of the original. The LLM has no idea redaction happened.
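
A minimal sketch of the idea in Rust (illustrative only, not mirage's actual generator; assumes the rand crate): the fake keeps the recognizable prefix and the exact length of the original, so it still parses like the real thing.

// Illustrative format-matching substitution for an AWS access key id:
// keep the "AKIA" prefix, randomize the body from the same character
// set, and preserve the original length.
use rand::Rng;

fn fake_aws_key(original: &str) -> String {
    const CHARSET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    let mut rng = rand::thread_rng();
    let body: String = (4..original.len())
        .map(|_| CHARSET[rng.gen_range(0..CHARSET.len())] as char)
        .collect();
    format!("AKIA{body}")
}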

Session-aware consistency. Within a conversation, the same email always maps to the same fake. Across conversations, the same email maps to different fakes. History messages stay consistent — if sam@acme.com became jordan.walker3@proton.me in message 1, it stays that way through message 50.
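
The bookkeeping behind that guarantee looks roughly like this (a sketch, not the actual implementation): one map per session in each direction, so substitution is a lookup-or-create and rehydration is a reverse lookup.

use std::collections::HashMap;

// Illustrative session state: the same original always yields the same
// fake within a session; the reverse map drives response rehydration.
struct SessionMap {
    forward: HashMap<String, String>, // original -> fake
    reverse: HashMap<String, String>, // fake -> original
}

impl SessionMap {
    fn substitute(&mut self, original: &str, make_fake: impl Fn(&str) -> String) -> String {
        if let Some(fake) = self.forward.get(original) {
            return fake.clone(); // message 50 reuses the fake from message 1
        }
        let fake = make_fake(original);
        self.forward.insert(original.to_owned(), fake.clone());
        self.reverse.insert(fake.clone(), original.to_owned());
        fake
    }
}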

Encrypted vault. Mappings persist across restarts in an AES-256-GCM encrypted file. You hold the key. The vault file is useless without it.

Sub-millisecond overhead. Pure Rust, pattern-based detection. No ML models, no Docker, no Python, no 500MB spaCy download. A single static binary under 5MB.
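
Pattern-based detection here means a fixed set of compiled regexes, roughly like the sketch below (the AKIA shape is AWS's documented access-key format; the exact rule set mirage ships is not shown here). Assumes the regex crate.

use regex::Regex;

// Illustrative detectors; the real rule set is broader and tuned.
fn detectors() -> Vec<(&'static str, Regex)> {
    vec![
        ("AWS_KEY",      Regex::new(r"\bAKIA[0-9A-Z]{16}\b").unwrap()),
        ("GITHUB_TOKEN", Regex::new(r"\bgh[ps]_[A-Za-z0-9]{36}\b").unwrap()),
        ("EMAIL",        Regex::new(r"[\w.+-]+@[\w-]+\.[\w.-]+").unwrap()),
    ]
}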

What it catches

Secrets & credentials

  • AWS keys (AKIA...)
  • GitHub tokens (ghp_..., ghs_...)
  • OpenAI / API keys (sk-..., sk-proj-...)
  • Slack tokens (xoxb-..., xoxp-...)
  • Google API keys (AIza...)
  • Bearer tokens
  • Connection strings (Postgres, MySQL, MongoDB, Redis)
  • PEM private keys (RSA, EC, DSA, OpenSSH)
  • High-entropy strings (Shannon entropy scanner catches unknown secret formats)
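
The entropy scanner is what catches secrets with no known prefix. A minimal sketch of the computation (the threshold below is illustrative, not mirage's tuned value):

use std::collections::HashMap;

// Shannon entropy in bits per character; random base64-ish secrets
// score noticeably higher than English words or identifiers.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    let len = s.chars().count() as f64;
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / len;
            -p * p.log2()
        })
        .sum()
}

// Flag long tokens with high per-character entropy as likely secrets.
fn looks_like_secret(token: &str) -> bool {
    token.len() >= 20 && shannon_entropy(token) > 4.0
}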

Personal data

  • Email addresses
  • Phone numbers (US/international, all common formats)
  • Social Security Numbers
  • Credit card numbers (Visa, MC, Amex, Discover)
  • IP addresses

Every detected value is replaced with a format-matching fake:

Original                      Fake
sam@acme.com                  jordan.walker3@proton.me
(555) 123-4567                (237) 153-1071
AKIAIOSFODNN7EXAMPLE          AKIAHQ7RN2XK5M3B9Y1T
sk-proj-abc123def456...       sk-proj-hR4kM9nQ2wX7...
postgres://user:pass@host/db  postgres://alex:kR4m9Q2w@db37.internal:5432/app_12
123-45-6789                   537-28-4071
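
For credit cards, a convincing format-matching fake should also pass the Luhn checksum, or a validating consumer would reject it on sight. The check itself is the standard algorithm (generic Luhn, not mirage-specific code):

// Standard Luhn checksum: from the right, double every second digit,
// subtract 9 on overflow, and require a sum divisible by 10.
fn luhn_valid(number: &str) -> bool {
    let digits: Vec<u32> = number.chars().filter_map(|c| c.to_digit(10)).collect();
    if digits.is_empty() {
        return false;
    }
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}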

Configuration

Mirage works with zero config. For fine-tuning, create a mirage.yaml:

target: "https://api.openai.com"
sensitivity: medium   # low | medium | high | paranoid

rules:
  # Always redact — LLM never needs the real value
  always_redact:
    - SSN
    - CREDIT_CARD
    - PRIVATE_KEY
    - AWS_KEY
    - GITHUB_TOKEN
    - API_KEY
    - BEARER_TOKEN

  # Replace with plausible fakes
  mask:
    - EMAIL
    - PHONE

  # Log but don't touch (too context-dependent)
  warn_only:
    - IP_ADDRESS
    - CONNECTION_STRING
    - SECRET

# Never redact these values
allowlist:
  - "192.168.1.*"
  - "sk-test-*"

audit:
  enabled: true
  path: "./mirage-audit.jsonl"

dry_run: false

Sensitivity levels

Level     Behavior
low       Only always_redact categories
medium    always_redact + mask (default)
high      Everything including warn_only
paranoid  Redact all detected PII regardless of category

Encrypted vault

Mappings persist across restarts so your conversations stay consistent:

# With vault (encrypted persistence)
mirage-proxy --target https://api.openai.com --vault-key "my-passphrase"

# Or via environment variable
MIRAGE_VAULT_KEY="my-passphrase" mirage-proxy --target https://api.openai.com

The vault file (mirage-vault.enc) is AES-256-GCM encrypted. Without the passphrase, it's random bytes. Mappings are scoped per conversation session.

Without --vault-key, mappings live in memory only and reset on restart.
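
In outline, a vault write is a single AES-256-GCM seal under a key derived from the passphrase. A sketch using the aes-gcm and sha2 crates (illustrative: a bare hash stands in for key derivation here, where a real tool would use a slow KDF such as Argon2 or PBKDF2; mirage's actual scheme is not shown):

use aes_gcm::aead::{Aead, AeadCore, KeyInit, OsRng};
use aes_gcm::Aes256Gcm;
use sha2::{Digest, Sha256};

// Seal the serialized mappings under a fresh 96-bit nonce, storing
// the nonce alongside the ciphertext so the vault can be reopened.
fn seal_vault(passphrase: &str, mappings: &[u8]) -> Vec<u8> {
    let key = Sha256::digest(passphrase.as_bytes());
    let cipher = Aes256Gcm::new_from_slice(key.as_slice()).expect("32-byte key");
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // unique per write
    let mut out = nonce.as_slice().to_vec();
    out.extend(cipher.encrypt(&nonce, mappings).expect("encryption failure"));
    out
}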

Dry run

See what would be redacted without actually changing anything:

mirage-proxy --target https://api.openai.com --dry-run

Requests pass through unmodified. Detections are logged to the audit file. Use this to tune your config before going live.

Audit log

Every detection is logged to mirage-audit.jsonl:

{"timestamp":"2026-02-19T14:30:00Z","kind":"EMAIL","action":"masked","confidence":1.0,"value_hash":"a3b2c1...","context_snippet":"Email sam@... about the deal"}

Original values are never logged by default. Enable audit.log_values: true only for debugging.
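
The value_hash field lets you correlate repeat detections without ever storing the secret. How it is computed is not specified here; assuming a truncated SHA-256 digest (a hypothetical choice for illustration, using the sha2 and hex crates):

use sha2::{Digest, Sha256};

// Hypothetical value_hash: a truncated digest of the detected value,
// enough to spot the same value recurring across audit entries.
fn value_hash(value: &str) -> String {
    let digest = Sha256::digest(value.as_bytes());
    hex::encode(&digest[..6])
}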

Streaming

Full SSE streaming support. Mirage handles text/event-stream responses from providers, rehydrating fakes in real-time as chunks arrive.
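
The subtle case is a fake that straddles two chunks. One way to handle it (a sketch of the general technique, not necessarily mirage's implementation) is to hold back a tail no longer than the longest known fake until more text arrives:

use std::collections::HashMap;

// Illustrative streaming rehydrator: substitute known fakes with their
// originals and withhold a suffix that might be a fake split across
// chunk boundaries.
struct Rehydrator {
    reverse: HashMap<String, String>, // fake -> original
    buffer: String,
    max_fake_len: usize,
}

impl Rehydrator {
    // Feed one decoded chunk; returns the text that is safe to emit now.
    fn push(&mut self, chunk: &str) -> String {
        self.buffer.push_str(chunk);
        for (fake, original) in &self.reverse {
            self.buffer = self.buffer.replace(fake, original);
        }
        let keep = self.max_fake_len.saturating_sub(1).min(self.buffer.len());
        let mut emit_to = self.buffer.len() - keep;
        while !self.buffer.is_char_boundary(emit_to) {
            emit_to -= 1;
        }
        self.buffer.drain(..emit_to).collect()
    }

    // Flush whatever is still buffered once the stream ends.
    fn finish(&mut self) -> String {
        std::mem::take(&mut self.buffer)
    }
}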

CLI reference

mirage-proxy [OPTIONS]

Options:
  -t, --target <URL>              Target LLM API base URL (required)
  -p, --port <PORT>               Listen port [default: 8686]
  -b, --bind <ADDR>               Bind address [default: 127.0.0.1]
  -c, --config <PATH>             Config file path
      --vault-key <PASSPHRASE>    Vault encryption passphrase (or MIRAGE_VAULT_KEY env)
      --vault-path <PATH>         Vault file path [default: ./mirage-vault.enc]
      --vault-flush-threshold <N> Auto-flush after N new mappings [default: 50]
      --dry-run                   Log detections without redacting
      --sensitivity <LEVEL>       low | medium | high | paranoid
      --log-level <LEVEL>         trace | debug | info | warn | error [default: info]
  -h, --help                      Print help
  -V, --version                   Print version

Why not PasteGuard / LLM Guard / Presidio?

                 mirage-proxy     PasteGuard           LLM Guard           LiteLLM+Presidio
Install          cargo install    Docker               pip + models        pip + Docker + spaCy
Binary size      ~5MB             ~500MB+              ~2GB+               ~500MB+
Overhead         <1ms             10-50ms              50-200ms            10-50ms
Substitution     Plausible fakes  [[PERSON_1]] tokens  [REDACTED]          <PERSON> tokens
LLM knows?       No               Yes                  Yes                 Yes
Session-aware    Yes              No                   No                  No
Encrypted vault  Yes              No                   No                  No
Rehydration      Yes              Yes                  No                  No
Streaming        Yes              Yes                  No                  Partial
Dependencies     None             Node + Presidio      Python + ML models  Python + spaCy + Docker

Roadmap

  • Pattern + entropy detection
  • Invisible plausible fake substitution
  • Session-scoped consistency
  • Encrypted vault persistence
  • Streaming rehydration
  • Audit log + dry-run
  • YAML config with sensitivity levels
  • Custom pattern definitions
  • Allowlist/blocklist glob matching
  • Optional ONNX NER for name/org detection
  • Multi-provider native format support (Anthropic Messages API)
  • Route mode (sensitive requests → local Ollama)
  • npm/brew/scoop distribution
  • Pre-built binaries for all platforms

License

MIT