mitm2openapi
Convert mitmproxy flow dumps and HAR files to OpenAPI 3.0 — fast, single binary, no Python.
A Rust rewrite of mitmproxy2swagger.
Why?
mitmproxy2swagger (the Python original by @alufers) works well, but requires Python, pip, and mitmproxy installed in the environment. For CI pipelines, slim Docker images, security audits, and one-off usage, that's friction.
mitm2openapi ships as a single ~5 MB static binary — drop it into any environment, no runtime, no venv, no pip install. Same OpenAPI 3.0 output as the original, plus first-class HAR support and glob-based filters for fully unattended pipelines.
Credit to @alufers for the original tool that pioneered this workflow.
Features
- Fast: pure Rust, single-threaded, processes captures in milliseconds
- Single static binary: no Python, no venv, no pip, no runtime dependencies
- Two-format support: mitmproxy flow dumps (v19/v20/v21) and HAR 1.2
- Two-step workflow: discover finds endpoints, you curate, generate emits clean OpenAPI 3.0
- Glob filters: --exclude-patterns and --include-patterns for automated pipelines
- Error recovery: skips corrupt flows, continues processing
- Auto-detection: heuristic format detection from file content
- Battle-tested: integration tests against Swagger Petstore and OWASP crAPI with oasdiff verification
- Cross-platform: Linux, macOS, Windows pre-built binaries
Installation
From binary releases
Download a pre-built binary from GitHub Releases.
From source
Quick Start
1. Capture traffic with mitmproxy.
2. Discover API endpoints.
3. Edit templates.yaml: remove the 'ignore:' prefix from paths you want to include.
4. Generate the OpenAPI spec.
Skip the manual edit
If you know which paths you care about up front, use --exclude-patterns
and --include-patterns to let discover do the curation.
Paths matching --include-patterns are auto-activated (emitted without
the ignore: prefix). Paths matching --exclude-patterns are dropped
entirely. Everything else still gets ignore: for manual review.
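
The three outcomes described above can be sketched in Rust. This is a minimal illustration under stated assumptions, not mitm2openapi's actual matcher: the names Decision, classify, glob_matches, and segment_matches are hypothetical, matching is assumed to work segment by segment (so * stays inside one segment and ** spans any number of segments), and exclude patterns are assumed to be checked before include patterns.

```rust
/// Outcome of classifying one discovered path (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decision {
    Dropped, // matched an exclude pattern: not written to the templates file
    Active,  // matched an include pattern: written without the ignore: prefix
    Ignored, // everything else: written with ignore: for manual review
}

/// Match one path segment; `*` stands for any run of characters in it.
fn segment_matches(pat: &[u8], seg: &[u8]) -> bool {
    match pat.split_first() {
        None => seg.is_empty(),
        Some((&b'*', rest)) => (0..=seg.len()).any(|i| segment_matches(rest, &seg[i..])),
        Some((&c, rest)) => seg.first() == Some(&c) && segment_matches(rest, &seg[1..]),
    }
}

/// Match whole paths; a `**` pattern segment consumes zero or more segments.
fn glob_matches(pat: &[&str], path: &[&str]) -> bool {
    match pat.split_first() {
        None => path.is_empty(),
        Some((&"**", rest)) => (0..=path.len()).any(|i| glob_matches(rest, &path[i..])),
        Some((p, rest)) => match path.split_first() {
            Some((s, tail)) => {
                segment_matches(p.as_bytes(), s.as_bytes()) && glob_matches(rest, tail)
            }
            None => false,
        },
    }
}

/// Split on '/' and discard empty segments (leading or doubled slashes).
fn split(path: &str) -> Vec<&str> {
    path.split('/').filter(|s| !s.is_empty()).collect()
}

/// Assumption: exclude wins over include when both match.
fn classify(path: &str, exclude: &[&str], include: &[&str]) -> Decision {
    let segs = split(path);
    let hit = |globs: &[&str]| globs.iter().any(|g| glob_matches(&split(g), &segs));
    if hit(exclude) {
        Decision::Dropped
    } else if hit(include) {
        Decision::Active
    } else {
        Decision::Ignored
    }
}
```

Under these assumptions, /api/users/42 matches the include glob /api/users/* and is auto-activated, while /api/users/42/orders does not (the * covers one segment only) and keeps its ignore: prefix.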
discover
Scan captured traffic and produce a templates file listing all observed endpoints.
mitm2openapi discover [OPTIONS] -i <INPUT> -o <OUTPUT> -p <PREFIX>
| Option | Description |
|---|---|
| -i, --input <PATH> | Input file (flow dump or HAR) |
| -o, --output <PATH> | Output YAML templates file |
| -p, --prefix <URL> | API prefix URL to filter requests |
| --format <FORMAT> | Input format: auto, har, mitmproxy (default: auto) |
| --exclude-patterns <GLOBS> | Comma-separated globs; matching paths are dropped entirely. * = single segment, ** = any subtree. E.g. /static/**,*.css |
| --include-patterns <GLOBS> | Comma-separated globs; matching paths are emitted without ignore: (auto-activated for generate) |
| --max-input-size <BYTES> | Maximum input file size (default: 2GiB). Accepts suffixes: KiB, MiB, GiB |
| --allow-symlinks | Allow symlinked input files (default: rejected for safety) |
| --strict | Treat warnings as errors; exit code 2 if any cap fires, flow is rejected, or parse error occurs |
| --report <PATH> | Write a structured JSON processing report to the given path |
generate
Generate an OpenAPI 3.0 spec from captured traffic using a curated templates file.
mitm2openapi generate [OPTIONS] -i <INPUT> -t <TEMPLATES> -o <OUTPUT> -p <PREFIX>
| Option | Description |
|---|---|
| -i, --input <PATH> | Input file (flow dump or HAR) |
| -t, --templates <PATH> | Templates YAML file (from discover) |
| -o, --output <PATH> | Output OpenAPI YAML file |
| -p, --prefix <URL> | API prefix URL |
| --format <FORMAT> | Input format: auto, har, mitmproxy (default: auto) |
| --openapi-title <TITLE> | Custom title for the spec |
| --openapi-version <VER> | Custom spec version (default: 1.0.0) |
| --exclude-headers <LIST> | Comma-separated headers to exclude |
| --exclude-cookies <LIST> | Comma-separated cookies to exclude |
| --include-headers | Include headers in the spec |
| --ignore-images | Ignore image content types |
| --suppress-params | Suppress parameter suggestions |
| --tags-overrides <JSON> | JSON string for tag overrides |
| --max-input-size <BYTES> | Maximum input file size (default: 2GiB). Accepts suffixes: KiB, MiB, GiB |
| --max-payload-size <BYTES> | Maximum tnetstring payload size (default: 256MiB) |
| --max-depth <N> | Maximum tnetstring nesting depth (default: 256) |
| --max-body-size <BYTES> | Maximum request/response body size (default: 64MiB) |
| --allow-symlinks | Allow symlinked input files (default: rejected for safety) |
| --strict | Treat warnings as errors; exit code 2 if any cap fires, flow is rejected, or parse error occurs |
| --report <PATH> | Write a structured JSON processing report to the given path |
Resource Limits
To prevent denial-of-service when processing untrusted captures, mitm2openapi
enforces several configurable limits:
| Flag | Default | Purpose |
|---|---|---|
| --max-input-size | 2 GiB | Reject files larger than this before reading |
| --max-payload-size | 256 MiB | Cap on individual tnetstring payload allocation |
| --max-depth | 256 | Recursion depth limit for nested tnetstring structures |
| --max-body-size | 64 MiB | Maximum request/response body considered during schema inference |
| --allow-symlinks | off | By default, symlinked inputs are rejected to prevent path-traversal on shared CI runners |
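
To show how a payload cap and a depth cap can interact in a recursive tnetstring walk, here is a hedged Rust sketch. It is not the tool's parser: Limits, LimitError, and check_value are hypothetical names, the walk only validates framing (it decodes nothing), and it treats dictionaries as a flat sequence of nested values.

```rust
#[derive(Debug, PartialEq)]
enum LimitError {
    Malformed,       // bad length prefix or missing type tag
    PayloadTooLarge, // declared length exceeds the payload cap
    TooDeep,         // containers nested beyond the depth cap
}

/// Hypothetical container for the two caps discussed above.
struct Limits {
    max_payload: usize,
    max_depth: usize,
}

/// Validate the value at the start of `buf`; returns the bytes it occupies.
/// A tnetstring value has the shape LEN:PAYLOAD followed by a one-byte tag.
fn check_value(buf: &[u8], depth: usize, limits: &Limits) -> Result<usize, LimitError> {
    if depth > limits.max_depth {
        return Err(LimitError::TooDeep);
    }
    let colon = buf
        .iter()
        .position(|&b| b == b':')
        .ok_or(LimitError::Malformed)?;
    let len = std::str::from_utf8(&buf[..colon])
        .ok()
        .filter(|d| !d.is_empty() && d.bytes().all(|b| b.is_ascii_digit()))
        .and_then(|d| d.parse::<usize>().ok())
        .ok_or(LimitError::Malformed)?;
    // Check the declared length before touching (or allocating for) the payload.
    if len > limits.max_payload {
        return Err(LimitError::PayloadTooLarge);
    }
    let start = colon + 1;
    let end = start.checked_add(len).ok_or(LimitError::Malformed)?;
    let tag = *buf.get(end).ok_or(LimitError::Malformed)?;
    // Lists (']') and dicts ('}') hold concatenated nested values.
    if tag == b']' || tag == b'}' {
        let mut inner = &buf[start..end];
        while !inner.is_empty() {
            let used = check_value(inner, depth + 1, limits)?;
            inner = &inner[used..];
        }
    }
    Ok(end + 1)
}
```

In this sketch the top-level value sits at depth 0, so a depth cap of 256 admits 256 levels of nesting below it; whether the real flag counts the same way is not stated here.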
In addition to the configurable limits above, the following per-field caps are applied unconditionally to prevent data corruption:
| Field | Cap | Behaviour |
|---|---|---|
| Header name | 8 KiB | Dropped (other headers still processed) |
| Header value | 64 KiB | Truncated to cap |
| Form fields per request | 1 000 | Excess fields ignored |
| URL scheme | http / https only | Non-HTTP flows silently skipped |
| Port number | 1–65 535 | Out-of-range port drops the request |
| HTTP status code | 100–599 | Invalid codes treated as no response |
Identity fields (scheme, host, path, method, header names) require valid UTF-8. Flows with non-UTF-8 identity bytes are skipped to prevent data aliasing through replacement-character collisions. Control characters in paths are stripped automatically.
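
A few of the per-field rules above can be sketched in Rust as follows. This is an illustration only: sanitize_headers, valid_port, and valid_status are hypothetical helpers, and the order of the checks (size before UTF-8 validation) is an assumption.

```rust
const MAX_HEADER_NAME: usize = 8 * 1024; // 8 KiB
const MAX_HEADER_VALUE: usize = 64 * 1024; // 64 KiB

/// Returns None when the whole flow should be skipped (non-UTF-8 header
/// name); otherwise returns the headers that survive the caps.
fn sanitize_headers(raw: Vec<(Vec<u8>, Vec<u8>)>) -> Option<Vec<(String, Vec<u8>)>> {
    let mut kept = Vec::new();
    for (name, mut value) in raw {
        if name.len() > MAX_HEADER_NAME {
            continue; // oversized name: drop this header, keep processing the rest
        }
        // Header names are identity fields, so invalid UTF-8 rejects the flow
        // instead of being replaced with U+FFFD (which could alias names).
        let name = String::from_utf8(name).ok()?;
        value.truncate(MAX_HEADER_VALUE); // oversized value: cut to the cap
        kept.push((name, value));
    }
    Some(kept)
}

/// Ports outside 1..=65535 yield None (the request is dropped).
fn valid_port(port: i64) -> Option<u16> {
    if (1..=65_535).contains(&port) {
        Some(port as u16)
    } else {
        None
    }
}

/// Status codes outside 100..=599 yield None (treated as no response).
fn valid_status(code: i64) -> Option<u16> {
    if (100..=599).contains(&code) {
        Some(code as u16)
    } else {
        None
    }
}
```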
Increase --max-input-size if you work with captures larger than 2 GiB (e.g.
--max-input-size 8GiB). The other limits rarely need tuning.
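
For reference, the binary suffixes translate to bytes as in the Rust sketch below. The function parse_size is hypothetical and is not the tool's argument parser; accepting a bare byte count and optional whitespace before the suffix are assumptions made for the example.

```rust
/// Parse a size such as "8GiB" or "512MiB" into bytes.
/// Returns None for unknown suffixes, missing digits, or overflow.
fn parse_size(input: &str) -> Option<u64> {
    let s = input.trim();
    // Split at the first character that is not an ASCII digit.
    let digits_end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (number, suffix) = s.split_at(digits_end);
    let value: u64 = number.parse().ok()?;
    let multiplier: u64 = match suffix.trim() {
        "" => 1, // assumption: a bare number means bytes
        "KiB" => 1 << 10,
        "MiB" => 1 << 20,
        "GiB" => 1 << 30,
        _ => return None,
    };
    value.checked_mul(multiplier)
}
```

With these multipliers, the 2 GiB default is 2 147 483 648 bytes and 8GiB is 8 589 934 592 bytes.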
Both mitmproxy flow files and HAR files are processed incrementally — memory usage stays bounded regardless of input size.
Diagnostics
When the tnetstring parser encounters corruption in a mitmproxy flow file, it halts and emits a warn-level log with the byte offset, number of successfully parsed entries, and an error classification. No resync is attempted — binary payloads can contain bytes that mimic valid tnetstring length prefixes, so scanning forward would produce phantom flows.
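
The halt-without-resync behaviour can be illustrated with a small Rust sketch that walks top-level tnetstring frames and stops at the first one that does not parse. The names FrameError, ScanOutcome, read_frame, and scan are hypothetical, the error classes are invented for the example, and the tag set assumes the standard tnetstring tags plus the ';' tag that mitmproxy's variant uses for text strings.

```rust
#[derive(Debug, PartialEq)]
enum FrameError {
    BadLength,  // length prefix missing, non-numeric, or implausibly long
    Truncated,  // fewer bytes available than the prefix declares
    BadTypeTag, // trailing type byte is not a known tag
}

#[derive(Debug, PartialEq)]
struct ScanOutcome {
    parsed: usize,                       // frames read successfully
    halted: Option<(usize, FrameError)>, // byte offset and class of the first failure
}

/// Read one LEN:PAYLOAD<tag> frame; returns the number of bytes it occupies.
fn read_frame(buf: &[u8]) -> Result<usize, FrameError> {
    // Only look for ':' in the first 20 bytes so garbage is rejected quickly.
    let colon = buf
        .iter()
        .take(20)
        .position(|&b| b == b':')
        .ok_or(FrameError::BadLength)?;
    let digits = std::str::from_utf8(&buf[..colon]).map_err(|_| FrameError::BadLength)?;
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(FrameError::BadLength);
    }
    let len: usize = digits.parse().map_err(|_| FrameError::BadLength)?;
    let tag_at = (colon + 1).checked_add(len).ok_or(FrameError::BadLength)?;
    let tag = *buf.get(tag_at).ok_or(FrameError::Truncated)?;
    if !b",#^!~}];".contains(&tag) {
        return Err(FrameError::BadTypeTag);
    }
    Ok(tag_at + 1)
}

/// Walk frames from the start and stop at the first corruption. No attempt
/// is made to scan forward for the next plausible length prefix.
fn scan(data: &[u8]) -> ScanOutcome {
    let mut offset = 0;
    let mut parsed = 0;
    while offset < data.len() {
        match read_frame(&data[offset..]) {
            Ok(consumed) => {
                offset += consumed;
                parsed += 1;
            }
            Err(e) => return ScanOutcome { parsed, halted: Some((offset, e)) },
        }
    }
    ScanOutcome { parsed, halted: None }
}
```

Given two valid frames followed by garbage, scan reports two parsed entries and the offset where the garbage begins, which mirrors the three pieces of information in the warn-level log.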
Structured report (--report)
Pass --report <PATH> to either discover or generate to write a JSON
processing summary. This is useful for CI pipelines that need structured data
instead of log scraping.
Strict mode
Pass --strict to either discover or generate to treat any warning-level
event as a hard failure. The process exits with code 2 if any resource cap
fired, a flow was rejected, or a parse error was encountered.
This is designed for CI gates where silent degradation is unacceptable.
Without --strict, the same conditions are logged at warn level and processing
continues (exit code 0).
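
The exit-code rule can be summarised in a few lines of Rust. The Warnings struct and exit_code function are hypothetical and only restate the behaviour described above; they are not taken from the tool's source.

```rust
/// Hypothetical tally of the warning-level events named above.
#[derive(Default)]
struct Warnings {
    caps_fired: u32,
    flows_rejected: u32,
    parse_errors: u32,
}

/// Exit code 2 only when strict mode is on and at least one warning-level
/// event occurred; otherwise 0 (the events are merely logged).
fn exit_code(w: &Warnings, strict: bool) -> i32 {
    let any_warning = w.caps_fired > 0 || w.flows_rejected > 0 || w.parse_errors > 0;
    if strict && any_warning {
        2
    } else {
        0
    }
}
```

A CI job can then fail the build on any non-zero exit status without parsing log output.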
Supported Formats
| Format | Versions | Extension |
|---|---|---|
| mitmproxy flow dumps | v19, v20, v21 | .flow |
| HAR (HTTP Archive) | 1.2 (incrementally parsed) | .har |
Format is auto-detected from file content. Use --format to override.
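
The detection heuristic itself is not documented here, so the following Rust sketch shows only one plausible approach: a HAR file is a JSON document and therefore starts with '{' after optional whitespace, while a tnetstring stream starts with ASCII digits followed by ':'. The Format enum and detect_format function are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Format {
    Har,
    Mitmproxy,
    Unknown,
}

/// Guess the format from the first bytes of the file.
fn detect_format(head: &[u8]) -> Format {
    // Skip leading ASCII whitespace, which is legal before a JSON document.
    let trimmed = match head.iter().position(|b| !b.is_ascii_whitespace()) {
        Some(i) => &head[i..],
        None => return Format::Unknown,
    };
    if trimmed[0] == b'{' {
        return Format::Har;
    }
    // A tnetstring begins with a decimal length prefix and a colon.
    let digits = trimmed.iter().take_while(|b| b.is_ascii_digit()).count();
    if digits > 0 && trimmed.get(digits) == Some(&b':') {
        return Format::Mitmproxy;
    }
    Format::Unknown
}
```

When a heuristic like this returns Unknown or guesses wrong, passing --format har or --format mitmproxy sidesteps detection entirely.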
Migration from Python mitmproxy2swagger
| Python (mitmproxy2swagger) | Rust (mitm2openapi) |
|---|---|
| pip install mitmproxy2swagger | Single binary, no runtime |
| mitmproxy2swagger -i <file> -o <spec> -p <prefix> | Two-step: discover then generate |
| Edits spec file in-place | Separate templates file for curation |
| Requires Python 3.x + mitmproxy | Standalone binary |
| Supports mitmproxy only | Supports mitmproxy flow dumps + HAR |
Key differences
- Two-step workflow: discover produces a templates file; you curate it; generate produces the final spec. This separates endpoint selection from spec generation.
- Templates file: Discovered endpoints are prefixed with ignore:. Remove the prefix to include an endpoint. This replaces editing the output spec directly.
- No Python dependency: Ships as a single static binary for Linux, macOS, and Windows.
- HAR support: Process HAR exports from browser DevTools or other HTTP tools.
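
The ignore: convention from the list above amounts to a simple filter, sketched here in Rust. The function active_paths is hypothetical and works on already-extracted entry strings; the actual YAML layout of the templates file is not shown in this README and is not assumed.

```rust
/// Keep only the entries a user has activated by removing the
/// ignore: prefix; entries that still carry it are left out.
fn active_paths<'a>(entries: &[&'a str]) -> Vec<&'a str> {
    entries
        .iter()
        .copied()
        .filter(|e| !e.starts_with("ignore:"))
        .collect()
}
```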
Benchmarks
Automated CI benchmark runs weekly against the Python original
(mitmproxy2swagger). See
docs/benchmarks.md for the latest timing and memory
comparison on a ~80 MB synthetic capture, or
trigger a fresh run via
Actions → Benchmark.
Reproduce locally with the commands documented in the workflow file.
Contributing
Contributions welcome! See CONTRIBUTING.md for local testing setup (unit tests, Petstore golden test, crAPI integration, demo GIF pipeline).
License
MIT