doppel 0.0.2

Intercept secrets in byte payloads, replace them with structurally-equivalent fakes, and transparently restore originals in streaming responses.

doppel

Swaps secrets in arbitrary payloads for structurally-equivalent fakes, then transparently restores the originals in the response.

The name comes from doppelgänger: each fake replacing a secret is its structural twin — same format, different value.

See SPEC.md for the behavioral contract.

How it works

              secrets.toml
     ┌─────────────────────────────┐
     │ [[pattern]] anthropic, …    │
     │ [[pattern]] db-password     │
     └──────────────┬──────────────┘
                 patterns
                    │
               ┌────▼────┐
  payload ───▶ │  swap   │── swapped payload ────▶ External
  sk-ant-REAL  └────┬────┘     sk-ant-FAKE         (eg. LLM)
                    │                                 │
                entries +                             │
               session_key                     response stream
                    │                        (may contain fakes)
 restored      ┌────▼────┐                            │
  payload ◀─── │ restore │◀───────────────────────────┘
  sk-ant-REAL  └─────────┘     sk-ant-FAKE

swap(payload, patterns)  →  (swapped_payload, entries, session_key)
restore(response_stream, entries, session_key)  →  restored_stream

You supply the patterns. swap applies exactly the patterns you pass — nothing more. Secrets matching those patterns are replaced with structurally-equivalent fakes before the payload leaves. restore reverses the substitution in the response stream using the encrypted entries and the session key.

Patterns

You decide what gets swapped. A pattern describes how to detect and replace one secret or one class of secrets. Every pattern is a [[pattern]] entry in the TOML file. Whether a pattern detects by shape or by value is determined by its segment definitions, which gives two kinds of pattern.

Structural patterns

A structural pattern describes the shape of a secret class: an ordered sequence of Literal segments (fixed byte sequences), Variable segments (a character set with a length range), and optionally Opaque segments (fixed bytes used for detection but re-derived during fake generation). Detection fires on any byte sequence in the payload that matches that shape; no prior knowledge of the actual secret value is required.
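To make the segment model concrete, here is a minimal sketch of shape matching, assuming a simplified Segment enum with only Literal and Variable variants. The names Segment, match_at and find_first are illustrative and are not doppel's API. doppel builds a multi-pattern search structure, whereas this sketch uses a naive scan over every offset.

```rust
/// One piece of a structural pattern (hypothetical layout, not doppel's types).
#[derive(Debug, Clone)]
enum Segment {
    /// Fixed bytes that must appear verbatim.
    Literal(Vec<u8>),
    /// Between `min` and `max` bytes, each drawn from `charset`.
    Variable { charset: Vec<u8>, min: usize, max: usize },
}

/// Returns the matched length if `input` starts with the shape, else None.
/// Variable segments try the longest run first and back off on failure.
fn match_at(segments: &[Segment], input: &[u8]) -> Option<usize> {
    match segments.split_first() {
        None => Some(0),
        Some((Segment::Literal(lit), rest)) => {
            if input.starts_with(lit) {
                match_at(rest, &input[lit.len()..]).map(|n| n + lit.len())
            } else {
                None
            }
        }
        Some((Segment::Variable { charset, min, max }, rest)) => {
            // Count leading bytes that belong to the charset, capped at `max`.
            let run = input
                .iter()
                .take(*max)
                .take_while(|&&b| charset.contains(&b))
                .count();
            if run < *min {
                return None;
            }
            (*min..=run)
                .rev()
                .find_map(|len| match_at(rest, &input[len..]).map(|n| n + len))
        }
    }
}

/// Scans every offset and returns (start, length) of the first match.
fn find_first(segments: &[Segment], payload: &[u8]) -> Option<(usize, usize)> {
    (0..payload.len()).find_map(|i| match_at(segments, &payload[i..]).map(|len| (i, len)))
}
```

The matcher needs only the shape, never a secret value, which is the property that separates structural patterns from registered secrets.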

The library ships built-in structural pattern definitions for 27 providers (Anthropic, OpenAI, AWS, GitHub, GCP, Stripe, Clerk, and more). These are available as a starting set — you opt into them; they are not applied automatically.

Registered secrets

A registered pattern covers a secret that does not conform to any known structural class: you know the actual value and want it swapped wherever it appears. You register the full secret bytes; the library derives a detection fingerprint and generates a fake deterministically from a salt. The original value is never stored.

// Simple registration (default options: 3-byte detection anchor)
let pat = register(b"my-super-secret-api-token")?;

// With options: longer anchor for lower false-positive rate
let pat = register_with_options(b"my-super-secret-api-token", &SecretOptions {
    anchor_len: 6,           // store 6 leading bytes as the detection anchor
    tail_anchor_len: 0,      // no trailing anchor
    restrict_charset: false, // fake uses wide charset by default
    force: false,            // reject secrets below 83-bit entropy
})?;

SecretOptions controls the detection anchor length (anchor_len, default 3), an optional trailing anchor (tail_anchor_len), fake charset restriction, and an entropy override (force). register is shorthand for register_with_options with all defaults.

Source: doppel/src/secrets.rs.

Salt — stable fakes across runs

Every pattern carries a salt: a 32-byte random value generated once when the pattern is first registered. The salt is the stability guarantee:

same secret + same pattern + same salt → same fake, every run

Without a fixed salt, each process restart generates a new one and the same secret gets a different fake each time — correct within a single cycle but inconsistent across runs. For the CLI patterns file the salt is written into the file on first use and stays fixed forever; you own it along with the rest of the pattern definition.
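The stability property can be illustrated with a toy derivation. This sketch is not doppel's algorithm: it uses FNV-1a and splitmix64, both non-cryptographic, as stand-ins for the library's keyed derivation, and the function names are hypothetical.

```rust
/// FNV-1a over salt then secret. NON-cryptographic stand-in for a keyed hash.
fn seed_from(salt: &[u8; 32], secret: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in salt.iter().chain(secret.iter()) {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Expands the seed into `len` alphanumeric bytes using splitmix64 steps.
/// The same (salt, secret, len) input always yields the same output.
fn derive_fake(salt: &[u8; 32], secret: &[u8], len: usize) -> Vec<u8> {
    const ALNUM: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    let mut state = seed_from(salt, secret);
    (0..len)
        .map(|_| {
            state = state.wrapping_add(0x9e37_79b9_7f4a_7c15);
            let mut z = state;
            z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
            z ^= z >> 31;
            ALNUM[(z % ALNUM.len() as u64) as usize]
        })
        .collect()
}
```

Because nothing in the derivation depends on time or process state, persisting the salt is enough to reproduce the fake after a restart. Regenerating the salt changes the seed and therefore the fake.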

Patterns file

The CLI reads patterns from a TOML file (version 3). You create it with init and extend it with register and define. Each entry embeds its salt, so fakes are stable across process restarts.

Library users can load a patterns file programmatically:

use doppel::{SecretsFile, swap};

let data = std::fs::read("secrets.toml")?;
let sf = SecretsFile::deserialize(&data)?;
let patterns = sf.to_patterns()?;
let result = swap(&payload, &patterns)?;

For long-running processes that call swap on every incoming request, use Detector. The free swap function rebuilds an internal multi-pattern search structure on every call; Detector builds it once at startup and reuses it across all requests, which makes a measurable difference at hundreds of requests per second.

Detector is Send + Sync, so you can store it in an Arc and share it across threads or async tasks. The full swap→restore cycle with Detector:

use doppel::{Detector, SecretsFile, restore};
use std::sync::Arc;

// At startup — build once:
let data = std::fs::read("secrets.toml")?;
let patterns = SecretsFile::deserialize(&data)?.to_patterns()?;
let detector = Arc::new(Detector::new(patterns));

// Per request — swap outgoing payload:
let result = detector.swap(&outgoing_payload)?;
// result.payload      — send to external service (secrets replaced with fakes)
// result.entries      — keep locally
// result.session_key  — keep locally

// Per response — restore incoming stream:
let mut restored = Vec::new();
restore(
    &mut response_stream,
    &mut restored,
    &result.entries,
    &result.session_key,
)?;
// restored now contains the original secret bytes

Create a new patterns file:

doppel init --patterns secrets.toml

This writes a self-describing TOML file with all built-in structural pattern definitions and freshly generated salts. The registered secrets list starts empty.

Patterns file structure:

version = 3

[[pattern]]
identifier = "anthropic"
salt = "47abb6fb..."   # 64 hex chars (32 bytes); generated by `doppel init`

[[pattern.segments]]
type = "literal"
value = "sk-ant-api03-"

[[pattern.segments]]
type = "variable"
charset = "url_safe_base64"
min = 93
max = 93

[[pattern.segments]]
type = "literal"
value = "AA"

# ... more [[pattern]] entries for other built-in providers ...

# Instance pattern (registered secret — added by `doppel register`):
[[pattern]]
identifier = "my-api-key"
salt = "ff3c005b..."         # 64 hex chars; unique per registration
digests = [
  "8a5843ef...",             # HMAC-SHA256(salt, secret)
]

[[pattern.segments]]
type = "opaque"              # detection anchor: first anchor_len bytes of the secret
value = "my-"

[[pattern.segments]]
type = "variable"
charset = "alphanumeric"
min = 33
max = 33

Valid charset names for structural pattern segments: alphanumeric, url_safe_base64, uppercase_alphanumeric, digits, hex_lower, wide.

(wide = 92 printable ASCII bytes: 0x21–0x7E excluding " and \; used by default for registered-secret variable segments.)
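As a quick check of the arithmetic, the wide charset can be rebuilt from its description. The function name is hypothetical; only the byte range and the two exclusions come from the text above.

```rust
/// Printable ASCII 0x21..=0x7E (94 bytes) minus the double quote and the backslash.
fn wide_charset() -> Vec<u8> {
    (0x21u8..=0x7E)
        .filter(|&b| b != b'"' && b != b'\\')
        .collect()
}
```

The range 0x21 to 0x7E holds 94 bytes, and removing the two excluded bytes leaves the 92 stated above. The space character (0x20) lies outside the range and so is not part of the charset.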

The file MUST be treated with the same sensitivity as the secrets it detects — it contains detection fragments. On Unix systems, all write operations (init, register, define) create or update the file with mode 0600.

CLI reference

init — create a patterns file

doppel init --patterns secrets.toml [--force]

Creates a new TOML patterns file with all built-in structural pattern definitions and freshly generated salts. Fails if the file already exists; use --force to overwrite (warning: regenerates all salts — existing fakes become invalid).

swap — swap a payload

doppel swap \
  --patterns secrets.toml \
  --entries  entries.json \
  --key-out  session.key \
  < request_body.json > swapped_body.json

Reads the complete payload from stdin, writes the swapped payload to stdout, writes the entries (ciphertext; not sensitive on its own) to --entries, and writes the session key (sensitive; mode 0600) to --key-out.

restore — restore a response stream

export DOPPEL_KEY=$(cat session.key)
doppel restore --entries entries.json < response_stream > restored.txt

Reads the response stream from stdin incrementally and writes restored output to stdout as each chunk resolves. The session key is supplied only via the DOPPEL_KEY environment variable — no --key flag exists (command-line arguments are visible in process listings and shell history).

register — register a secret

printf '%s' 'my-secret-value' | doppel register \
  --patterns    secrets.toml \
  --identifier  my-api-key \
  [--anchor-len N] \
  [--tail-anchor-len M] \
  [--restrict-charset] \
  [--force]

Reads the secret from stdin (raw bytes, no trimming), appends a new instance-pattern entry to the patterns file, and writes it back atomically. The secret never appears in command-line arguments. --identifier must be unique within the file and is required unless --group is given instead.

--anchor-len controls how many leading bytes of the secret become the detection anchor. The minimum is 2 (0 and 1 are a hard failure), and the default of 3 is recommended. A value of 2 is accepted but emits a warning, because shorter anchors generate more false Aho-Corasick candidates per payload byte.
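The anchor length rules reduce to a three-way decision, sketched below. The enum, function name and message texts are hypothetical. Any upper bound the CLI may enforce is not modelled.

```rust
/// Outcome of validating an anchor length (illustrative type, not doppel's).
#[derive(Debug, PartialEq)]
enum AnchorCheck {
    Accepted,
    AcceptedWithWarning(String),
}

/// Applies the documented thresholds: below 2 fails, exactly 2 warns, 3 or more passes.
fn check_anchor_len(anchor_len: usize) -> Result<AnchorCheck, String> {
    match anchor_len {
        0 | 1 => Err(format!(
            "anchor length {} is below the minimum of 2",
            anchor_len
        )),
        2 => Ok(AnchorCheck::AcceptedWithWarning(
            "anchor length 2 produces more false candidates; 3 is recommended".to_string(),
        )),
        _ => Ok(AnchorCheck::Accepted),
    }
}
```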

Alternatively, use --group <id> instead of --identifier to add this secret as an additional digest to an existing group pattern (for grouping multiple secrets under one detection rule).

Source: doppel/src/secrets.rs (registration logic) · doppel-cli/src/main.rs (run_register).

define — add a user-defined structural pattern

doppel define \
  --patterns   secrets.toml \
  --identifier MY_PATTERN \
  --segment    literal:MY_PREFIX_ \
  --segment    variable:alphanumeric:32:32

Adds a structural pattern. --segment is repeatable; pass it once per segment in order. Segment specs:

  • literal:<value> — fixed byte sequence
  • variable:<charset>:<min>:<max> — variable-length field from named charset

Valid charset names: alphanumeric, url_safe_base64, uppercase_alphanumeric, digits, hex_lower, wide.

At least one Variable segment is required. The identifier must be unique in the file. The first segment value must be at least 2 bytes (hard fail for shorter); values below 4 bytes emit a warning — short prefixes match too many positions in the payload.
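The segment spec grammar and the hard-fail constraints above can be sketched as a small parser. This is an illustration, not the CLI's parser: SegmentSpec, parse_segment and validate are hypothetical names. Two behaviours are assumptions: splitting on the first colon only, so that a literal value may itself contain a colon, and applying the 2-byte minimum only when the first segment is a literal. The below-4-bytes warning is not modelled.

```rust
#[derive(Debug, PartialEq)]
enum SegmentSpec {
    Literal(String),
    Variable { charset: String, min: usize, max: usize },
}

/// Charset names listed in this README.
const CHARSETS: [&str; 6] = [
    "alphanumeric", "url_safe_base64", "uppercase_alphanumeric",
    "digits", "hex_lower", "wide",
];

/// Parses `literal:<value>` or `variable:<charset>:<min>:<max>`.
fn parse_segment(spec: &str) -> Result<SegmentSpec, String> {
    let (kind, rest) = spec
        .split_once(':')
        .ok_or_else(|| format!("missing ':' in {:?}", spec))?;
    match kind {
        "literal" => {
            if rest.is_empty() {
                return Err("literal value is empty".to_string());
            }
            Ok(SegmentSpec::Literal(rest.to_string()))
        }
        "variable" => {
            let parts: Vec<&str> = rest.split(':').collect();
            if parts.len() != 3 {
                return Err(format!("expected charset:min:max, got {:?}", rest));
            }
            if !CHARSETS.iter().any(|&c| c == parts[0]) {
                return Err(format!("unknown charset {:?}", parts[0]));
            }
            let min: usize = parts[1].parse().map_err(|_| format!("bad min {:?}", parts[1]))?;
            let max: usize = parts[2].parse().map_err(|_| format!("bad max {:?}", parts[2]))?;
            if min > max {
                return Err(format!("min {} exceeds max {}", min, max));
            }
            Ok(SegmentSpec::Variable { charset: parts[0].to_string(), min, max })
        }
        other => Err(format!("unknown segment kind {:?}", other)),
    }
}

/// Checks the whole-pattern rules: one Variable segment, first literal of 2+ bytes.
fn validate(segments: &[SegmentSpec]) -> Result<(), String> {
    let has_variable = segments
        .iter()
        .any(|s| matches!(s, SegmentSpec::Variable { .. }));
    if !has_variable {
        return Err("at least one variable segment is required".to_string());
    }
    if let Some(SegmentSpec::Literal(v)) = segments.first() {
        if v.len() < 2 {
            return Err("first segment value must be at least 2 bytes".to_string());
        }
    }
    Ok(())
}
```

Parsing each --segment argument in order and then calling validate on the collected list mirrors how the flag is passed once per segment.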

list — list all patterns

doppel list --patterns secrets.toml

Prints a human-readable summary: each [[pattern]] entry's identifier, kind (family for structural patterns, instance for registered secrets), segment description, and digest count. Does not modify the file.

inspect — show detail for one pattern

doppel inspect --patterns secrets.toml --identifier anthropic
doppel inspect --patterns secrets.toml --identifier my-api-key

--identifier is required. Accepts any pattern kind (family or instance). Prints full detail for the matched entry: all segments, salt fingerprint (first 8 hex chars), kind, and digest count. Does not modify the file.

remove — remove a pattern

doppel remove --patterns secrets.toml --identifier anthropic
doppel remove --patterns secrets.toml --identifier my-api-key

--identifier is required. Removes the specified entry and writes the file back atomically. Removing a built-in structural pattern identifier emits a warning but succeeds; swap will no longer detect that secret class.

Streaming

restore processes a stream incrementally. It uses suspicion-driven buffering: chunks are held only while a potential match is in flight, bounded by the longest secret length across active patterns (typically 100–200 bytes).
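The buffering idea can be sketched with a toy restorer that withholds bytes only while they could still be the start of a fake. This is not doppel's implementation: the Restorer type is hypothetical, it takes plaintext (fake, original) pairs instead of encrypted entries and a session key, and it compares byte by byte without a multi-pattern search structure.

```rust
struct Restorer {
    /// (fake, original) pairs. Plaintext here for illustration only.
    pairs: Vec<(Vec<u8>, Vec<u8>)>,
    /// Bytes withheld because they might be the start of a fake.
    held: Vec<u8>,
}

impl Restorer {
    fn new(mut pairs: Vec<(Vec<u8>, Vec<u8>)>) -> Self {
        // An empty fake would match everywhere and never advance; drop such pairs.
        pairs.retain(|(fake, _)| !fake.is_empty());
        Restorer { pairs, held: Vec::new() }
    }

    /// Feeds one chunk and appends every byte that is safe to release to `out`.
    fn push(&mut self, chunk: &[u8], out: &mut Vec<u8>) {
        self.held.extend_from_slice(chunk);
        let mut i = 0;
        while i < self.held.len() {
            let tail = &self.held[i..];
            if let Some((fake, real)) = self.pairs.iter().find(|(f, _)| tail.starts_with(f)) {
                // A complete fake is present: emit the original instead.
                out.extend_from_slice(real);
                i += fake.len();
            } else if self.pairs.iter().any(|(f, _)| f.starts_with(tail)) {
                // The tail is a proper prefix of some fake: wait for more input.
                break;
            } else {
                out.push(tail[0]);
                i += 1;
            }
        }
        self.held.drain(..i);
    }

    /// Flushes whatever is still held once the stream has ended.
    fn finish(mut self, out: &mut Vec<u8>) {
        out.append(&mut self.held);
    }
}
```

After each push the held buffer is always shorter than the longest fake, which corresponds to the bound described above. A fake split across several chunks is restored once its last byte arrives, and a false alarm is released as soon as the next byte rules the match out.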

Async streaming (async feature)

[dependencies]
doppel = { version = "0.0.2", features = ["async"] }

With the async feature, RestoreStream wraps entries and session key into a futures_core::Stream adapter. Pass it any Stream<Item = Result<Bytes, E>> and it yields restored Bytes chunks as they arrive — no runtime dependency beyond futures-core and bytes.

For the paranoid

Registered secrets are stored as: a 32-byte salt, an opaque segment holding the first anchor_len bytes of the secret (default 3), a variable segment encoding the remaining byte count, and one or more HMAC-SHA256 digests (HMAC(salt, secret)) in the digests array — never as the plaintext value. The source of truth is doppel/src/secrets.rs.
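The way a secret splits into stored fields can be sketched as follows, assuming no trailing anchor (tail_anchor_len of 0). The struct and function names are hypothetical, and the HMAC-SHA256 digest is omitted because it needs a crypto crate.

```rust
/// The non-digest parts of a registered entry (illustrative, not doppel's type).
#[derive(Debug, PartialEq)]
struct RegisteredShape {
    /// Leading bytes kept verbatim in the opaque segment.
    anchor: Vec<u8>,
    /// Byte count recorded as min = max in the variable segment.
    remaining_len: usize,
}

/// Splits a secret into its anchor and the length of the rest.
/// Returns None when the anchor would be longer than the secret.
fn split_secret(secret: &[u8], anchor_len: usize) -> Option<RegisteredShape> {
    if anchor_len > secret.len() {
        return None;
    }
    let (anchor, rest) = secret.split_at(anchor_len);
    Some(RegisteredShape {
        anchor: anchor.to_vec(),
        remaining_len: rest.len(),
    })
}
```

A 36-byte secret beginning with my- and registered with the default anchor length gives the anchor my- and a remaining length of 33, matching the instance entry in the patterns file example. Only the anchor bytes and the length survive in plaintext; the rest of the secret is represented by the digest alone.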

You can verify any registered entry against its original secret using only openssl, python3, and standard POSIX utilities, and independently reproduce the fake doppel will generate. See docs/for-the-paranoid.md for the full audit script and fake-derivation walkthrough.