mitm2openapi 0.5.0

Convert mitmproxy flow dumps and HAR files to OpenAPI 3.0 specs — fast Rust rewrite of mitmproxy2swagger

mitm2openapi

Convert mitmproxy flow dumps and HAR files to OpenAPI 3.0 — fast, single binary, no Python.

A Rust rewrite of mitmproxy2swagger.


Why?

mitmproxy2swagger (the Python original by @alufers) works well, but it requires Python, pip, and mitmproxy to be installed in the environment. For CI pipelines, slim Docker images, security audits, and one-off usage, that's friction.

mitm2openapi ships as a single ~5 MB static binary — drop it into any environment, no runtime, no venv, no pip install. Same OpenAPI 3.0 output as the original, plus first-class HAR support and glob-based filters for fully unattended pipelines.

Credit to @alufers for the original tool that pioneered this workflow.

Features

  • Fast: pure Rust, single-threaded, processes captures in milliseconds
  • Single static binary: no Python, no venv, no pip, no runtime dependencies
  • Two-format support: mitmproxy flow dumps (v19/v20/v21) and HAR 1.2
  • Two-step workflow: discover finds endpoints, you curate, generate emits clean OpenAPI 3.0
  • Glob filters: --exclude-patterns and --include-patterns for automated pipelines
  • Error recovery: skips corrupt flows, continues processing
  • Auto-detection: heuristic format detection from file content
  • Battle-tested: integration tests against Swagger Petstore and OWASP crAPI with oasdiff verification
  • Cross-platform: Linux, macOS, Windows pre-built binaries

Installation

From binary releases

Download a pre-built binary from GitHub Releases.

From source

cargo install --git https://github.com/Arkptz/mitm2openapi

Quick Start

# 1. Capture traffic with mitmproxy
mitmdump -w capture.flow

# 2. Discover API endpoints
mitm2openapi discover -i capture.flow -o templates.yaml -p "https://api.example.com"

# 3. Edit templates.yaml — remove 'ignore:' prefix from paths you want to include

# 4. Generate OpenAPI spec
mitm2openapi generate -i capture.flow -t templates.yaml -o openapi.yaml -p "https://api.example.com"

Skip the manual edit

If you know which paths you care about up front, use --exclude-patterns and --include-patterns to let discover do the curation:

mitm2openapi discover \
  -i capture.flow -o templates.yaml -p "https://api.example.com" \
  --exclude-patterns '/static/**,/images/**,*.css,*.js,*.svg' \
  --include-patterns '/api/**,/v2/**'

mitm2openapi generate \
  -i capture.flow -t templates.yaml -o openapi.yaml -p "https://api.example.com"

Paths matching --include-patterns are auto-activated (emitted without the ignore: prefix). Paths matching --exclude-patterns are dropped entirely. Everything else still gets ignore: for manual review.
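
The three outcomes described above can be sketched in a few lines of Rust. This is an illustration of the documented behaviour, not the tool's actual matcher. The type and function names are hypothetical, and three details are assumptions: patterns are matched against the whole path segment by segment, a slash-free pattern such as *.css is tested against the final segment only, and an exclude match takes precedence over an include match.

```rust
/// What happens to one discovered path (illustrative names, not the tool's own types).
#[derive(Debug, PartialEq)]
enum Disposition {
    Dropped, // matched an exclude pattern: omitted from templates.yaml
    Active,  // matched an include pattern: written without `ignore:`
    Ignored, // matched neither: written with `ignore:` for manual review
}

/// Match a single segment; `*` stands for any run of characters within it.
fn segment_matches(pat: &str, seg: &str) -> bool {
    let (p, s): (Vec<char>, Vec<char>) = (pat.chars().collect(), seg.chars().collect());
    let (mut pi, mut si) = (0, 0);
    let mut star: Option<(usize, usize)> = None; // last `*` seen and where it began matching
    while si < s.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, si));
            pi += 1;
        } else if pi < p.len() && p[pi] == s[si] {
            pi += 1;
            si += 1;
        } else if let Some((sp, ss)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            si = ss + 1;
            star = Some((sp, ss + 1));
        } else {
            return false;
        }
    }
    p[pi..].iter().all(|&c| c == '*')
}

/// Match whole segment lists; a `**` segment absorbs zero or more path segments.
fn glob_matches(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((head, rest)) if *head == "**" => {
            (0..=path.len()).any(|skip| glob_matches(rest, &path[skip..]))
        }
        Some((head, rest)) => match path.split_first() {
            Some((seg, tail)) => segment_matches(head, seg) && glob_matches(rest, tail),
            None => false,
        },
    }
}

fn path_matches(pattern: &str, path: &str) -> bool {
    if !pattern.contains('/') {
        // Assumption: slash-free patterns (e.g. `*.css`) look at the last segment only.
        let last = path.rsplit('/').next().unwrap_or("");
        return segment_matches(pattern, last);
    }
    let pat: Vec<&str> = pattern.split('/').collect();
    let segs: Vec<&str> = path.split('/').collect();
    glob_matches(&pat, &segs)
}

/// `exclude` and `include` are comma-separated lists, as passed on the command line.
fn classify(path: &str, exclude: &str, include: &str) -> Disposition {
    let any_match = |list: &str| {
        list.split(',').map(str::trim).filter(|p| !p.is_empty()).any(|p| path_matches(p, path))
    };
    if any_match(exclude) {
        Disposition::Dropped // assumption: exclude wins when both lists match
    } else if any_match(include) {
        Disposition::Active
    } else {
        Disposition::Ignored
    }
}
```

With the patterns from the command above, classify("/api/v1/users", ...) yields Active, classify("/static/app.css", ...) yields Dropped, and a path such as /healthz that matches neither list yields Ignored.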

discover

Scan captured traffic and produce a templates file listing all observed endpoints.

mitm2openapi discover [OPTIONS] -i <INPUT> -o <OUTPUT> -p <PREFIX>

Option                      Description
-i, --input <PATH>          Input file (flow dump or HAR)
-o, --output <PATH>         Output YAML templates file
-p, --prefix <URL>          API prefix URL to filter requests
--format <FORMAT>           Input format: auto, har, mitmproxy (default: auto)
--exclude-patterns <GLOBS>  Comma-separated globs; matching paths are dropped entirely. * = single segment, ** = any subtree. E.g. /static/**,*.css
--include-patterns <GLOBS>  Comma-separated globs; matching paths are emitted without ignore: (auto-activated for generate)
--max-input-size <BYTES>    Maximum input file size (default: 2GiB). Accepts suffixes: KiB, MiB, GiB
--allow-symlinks            Allow symlinked input files (default: rejected for safety)
--strict                    Treat warnings as errors; exit code 2 if any cap fires, flow is rejected, or parse error occurs
--report <PATH>             Write a structured JSON processing report to the given path

generate

Generate an OpenAPI 3.0 spec from captured traffic using a curated templates file.

mitm2openapi generate [OPTIONS] -i <INPUT> -t <TEMPLATES> -o <OUTPUT> -p <PREFIX>

Option                      Description
-i, --input <PATH>          Input file (flow dump or HAR)
-t, --templates <PATH>      Templates YAML file (from discover)
-o, --output <PATH>         Output OpenAPI YAML file
-p, --prefix <URL>          API prefix URL
--format <FORMAT>           Input format: auto, har, mitmproxy (default: auto)
--openapi-title <TITLE>     Custom title for the spec
--openapi-version <VER>     Custom spec version (default: 1.0.0)
--exclude-headers <LIST>    Comma-separated headers to exclude
--exclude-cookies <LIST>    Comma-separated cookies to exclude
--include-headers           Include headers in the spec
--ignore-images             Ignore image content types
--suppress-params           Suppress parameter suggestions
--tags-overrides <JSON>     JSON string for tag overrides
--max-input-size <BYTES>    Maximum input file size (default: 2GiB). Accepts suffixes: KiB, MiB, GiB
--max-payload-size <BYTES>  Maximum tnetstring payload size (default: 256MiB)
--max-depth <N>             Maximum tnetstring nesting depth (default: 256)
--max-body-size <BYTES>     Maximum request/response body size (default: 64MiB)
--allow-symlinks            Allow symlinked input files (default: rejected for safety)
--strict                    Treat warnings as errors; exit code 2 if any cap fires, flow is rejected, or parse error occurs
--report <PATH>             Write a structured JSON processing report to the given path

Resource Limits

To prevent denial-of-service when processing untrusted captures, mitm2openapi enforces several configurable limits:

Flag                Default  Purpose
--max-input-size    2 GiB    Reject files larger than this before reading
--max-payload-size  256 MiB  Cap on individual tnetstring payload allocation
--max-depth         256      Recursion depth limit for nested tnetstring structures
--max-body-size     64 MiB   Maximum request/response body considered during schema inference
--allow-symlinks    off      By default, symlinked inputs are rejected to prevent path-traversal on shared CI runners

In addition to the configurable limits above, the following per-field caps are applied unconditionally to prevent data corruption:

Field                    Cap                Behaviour
Header name              8 KiB              Dropped (other headers still processed)
Header value             64 KiB             Truncated to cap
Form fields per request  1 000              Excess fields ignored
URL scheme               http / https only  Non-HTTP flows silently skipped
Port number              1–65 535           Out-of-range port drops the request
HTTP status code         100–599            Invalid codes treated as no response

Identity fields (scheme, host, path, method, header names) require valid UTF-8. Flows with non-UTF-8 identity bytes are skipped to prevent data aliasing through replacement-character collisions. Control characters in paths are stripped automatically.
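
The aliasing risk is easy to reproduce with the Rust standard library. The sketch below is an illustration only and the function names are hypothetical. It shows two distinct byte sequences that collapse to the same string under lossy decoding, while strict decoding rejects both, which corresponds to skipping the flow. The control-character filter is one plausible way to implement the stripping mentioned above, not necessarily the tool's own.

```rust
use std::str;

/// Strict decoding: invalid UTF-8 yields `None`, so the caller can skip the flow.
fn decode_identity(raw: &[u8]) -> Option<&str> {
    str::from_utf8(raw).ok()
}

/// Lossy decoding, shown only to demonstrate the collision it can cause:
/// every invalid byte becomes U+FFFD, so different inputs can become equal.
fn decode_lossy(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}

/// Assumed approach to stripping control characters from an already valid path.
fn strip_control(path: &str) -> String {
    path.chars().filter(|c| !c.is_control()).collect()
}
```

For example, the byte strings b"/users/\xff" and b"/users/\xfe" differ, yet decode_lossy returns the same text for both, so two different endpoints would be merged into one. decode_identity returns None for each of them.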

Increase --max-input-size if you work with captures larger than 2 GiB (e.g. --max-input-size 8GiB). The other limits rarely need tuning.
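
As a rough illustration of how a value such as 8GiB maps to a byte count, here is a minimal parser sketch. It is not the tool's implementation. The function name is hypothetical, and the handling of a bare number (treated as bytes), the case-sensitive suffixes, and the overflow check are assumptions.

```rust
/// Parse a size such as "8GiB", "256MiB", or "1024" into bytes.
/// Returns `None` for an unknown suffix, a missing number, or overflow.
fn parse_size(input: &str) -> Option<u64> {
    let s = input.trim();
    // Split at the first character that is not an ASCII digit.
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (digits, suffix) = s.split_at(split);
    let value: u64 = digits.parse().ok()?;
    let multiplier: u64 = match suffix.trim() {
        "" => 1, // assumption: a bare number means bytes
        "KiB" => 1 << 10,
        "MiB" => 1 << 20,
        "GiB" => 1 << 30,
        _ => return None,
    };
    value.checked_mul(multiplier)
}
```

Under these assumptions, parse_size("8GiB") gives Some(8589934592) and the default of 2GiB corresponds to 2147483648 bytes.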

Both mitmproxy flow files and HAR files are processed incrementally — memory usage stays bounded regardless of input size.

Diagnostics

When the tnetstring parser encounters corruption in a mitmproxy flow file, it halts and emits a warn-level log with the byte offset, number of successfully parsed entries, and an error classification. No resync is attempted — binary payloads can contain bytes that mimic valid tnetstring length prefixes, so scanning forward would produce phantom flows.
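
To make the halt-without-resync behaviour concrete, the sketch below reads top-level tnetstring frames (decimal length, a colon, the payload, then a one-byte type tag) and stops at the first problem, reporting the byte offset and the number of frames parsed so far. It is a simplified illustration, not the tool's parser. The type names and error classes are hypothetical, the type tag is not validated, and nested values are not decoded.

```rust
#[derive(Debug, PartialEq)]
enum FrameError {
    BadLength,       // missing, malformed, or overflowing length prefix
    PayloadTooLarge, // declared length exceeds the configured cap
    UnexpectedEof,   // input ends before the payload and type tag
}

#[derive(Debug, PartialEq)]
struct Halt {
    offset: usize, // byte offset of the frame that failed
    parsed: usize, // frames read successfully before the failure
    error: FrameError,
}

/// Read one frame starting at `start`; returns the payload and the next offset.
fn read_one(data: &[u8], start: usize, max_payload: usize) -> Result<(&[u8], usize), FrameError> {
    let rest = &data[start..];
    let digits = rest.iter().take_while(|b| b.is_ascii_digit()).count();
    if digits == 0 {
        return Err(FrameError::BadLength);
    }
    let len: usize = std::str::from_utf8(&rest[..digits])
        .ok()
        .and_then(|s| s.parse().ok())
        .ok_or(FrameError::BadLength)?;
    // Check the cap before touching the payload.
    if len > max_payload {
        return Err(FrameError::PayloadTooLarge);
    }
    match rest.get(digits).copied() {
        Some(b':') => {}
        Some(_) => return Err(FrameError::BadLength),
        None => return Err(FrameError::UnexpectedEof),
    }
    let body_start = digits + 1;
    let body_end = body_start.checked_add(len).ok_or(FrameError::BadLength)?;
    // The payload must be followed by one type-tag byte.
    if rest.len() <= body_end {
        return Err(FrameError::UnexpectedEof);
    }
    Ok((&rest[body_start..body_end], start + body_end + 1))
}

/// Read frames until the input ends or the first error; never scans forward to resync.
fn read_frames(data: &[u8], max_payload: usize) -> (Vec<&[u8]>, Option<Halt>) {
    let mut frames = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        match read_one(data, pos, max_payload) {
            Ok((payload, next)) => {
                frames.push(payload);
                pos = next;
            }
            Err(error) => {
                let parsed = frames.len();
                return (frames, Some(Halt { offset: pos, parsed, error }));
            }
        }
    }
    (frames, None)
}
```

Given the input 5:hello,3:abc,9:trunc this sketch returns the two complete payloads and a Halt at offset 14 with an UnexpectedEof error. Everything after the failure point is left unread, which is the trade-off the paragraph above describes.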

Structured report (--report)

Pass --report <PATH> to either discover or generate to write a JSON processing summary. This is useful for CI pipelines that need structured data instead of log scraping.

{
  "report_version": 1,
  "tool_version": "0.2.3",
  "input": {
    "path": "capture.flow",
    "format": "Auto",
    "size_bytes": 102400
  },
  "result": {
    "flows_read": 150,
    "flows_emitted": 148,
    "paths_in_spec": 12
  },
  "events": {
    "parse_error": {
      "TNetString parse error at byte 98304: unexpected end of input": 1
    }
  }
}

Strict mode

Pass --strict to either discover or generate to treat any warning-level event as a hard failure. The process exits with code 2 if any resource cap fired, a flow was rejected, or a parse error was encountered.

This is designed for CI gates where silent degradation is unacceptable:

mitm2openapi discover -i capture.flow -o templates.yaml -p https://api.example.com --strict \
  || echo "FAIL: corrupt or over-limit flows detected"

Without --strict, the same conditions are logged at warn level and processing continues (exit code 0).
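
If the gate is driven from a Rust build script or test harness rather than a shell, the same check could look like the sketch below. The function and type names are hypothetical. Only exit codes 0 and 2 are documented above, so treating every other outcome as an unrelated failure is an assumption. The binary is assumed to be on PATH.

```rust
use std::process::Command;

#[derive(Debug, PartialEq)]
enum GateResult {
    Clean,           // exit code 0
    StrictViolation, // exit code 2: a cap fired, a flow was rejected, or a parse error occurred
    OtherFailure,    // assumption: anything else, including termination by signal
}

/// Map a process exit code to a gate decision.
fn classify_exit(code: Option<i32>) -> GateResult {
    match code {
        Some(0) => GateResult::Clean,
        Some(2) => GateResult::StrictViolation,
        _ => GateResult::OtherFailure,
    }
}

/// Run `mitm2openapi discover ... --strict` and classify the result.
fn run_discover_strict(input: &str, templates: &str, prefix: &str) -> std::io::Result<GateResult> {
    let status = Command::new("mitm2openapi")
        .args(["discover", "-i", input, "-o", templates, "-p", prefix, "--strict"])
        .status()?;
    Ok(classify_exit(status.code()))
}
```

A caller would typically fail the pipeline on StrictViolation and surface OtherFailure separately, since the two may need different follow-up.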

Supported Formats

Format                Versions                    Extension
mitmproxy flow dumps  v19, v20, v21               .flow
HAR (HTTP Archive)    1.2 (incrementally parsed)  .har

Format is auto-detected from file content. Use --format to override.
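
The exact heuristic is not documented here, but content-based detection between these two formats can be sketched from their shapes: a HAR file is a JSON object, while a mitmproxy flow dump begins with a tnetstring length prefix (decimal digits followed by a colon). The function below is a hypothetical illustration of that idea, not the tool's detection logic.

```rust
#[derive(Debug, PartialEq)]
enum Format {
    Har,
    Mitmproxy,
    Unknown,
}

/// Guess the format from the first bytes of a file.
fn sniff_format(head: &[u8]) -> Format {
    // Skip an optional UTF-8 byte order mark.
    let head = if head.starts_with(&[0xEF, 0xBB, 0xBF]) { &head[3..] } else { head };
    // Skip leading ASCII whitespace.
    let trimmed = match head.iter().position(|b| !b.is_ascii_whitespace()) {
        Some(i) => &head[i..],
        None => return Format::Unknown,
    };
    // A JSON object suggests HAR.
    if trimmed[0] == b'{' {
        return Format::Har;
    }
    // Digits followed by ':' suggest a tnetstring length prefix.
    let digits = trimmed.iter().take_while(|b| b.is_ascii_digit()).count();
    if digits > 0 && trimmed.get(digits) == Some(&b':') {
        return Format::Mitmproxy;
    }
    Format::Unknown
}
```

A heuristic like this can misfire on unusual inputs, which is why an explicit --format override is useful.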

Migration from Python mitmproxy2swagger

Python (mitmproxy2swagger)                         Rust (mitm2openapi)
pip install mitmproxy2swagger                      Single binary, no runtime
mitmproxy2swagger -i <file> -o <spec> -p <prefix>  Two-step: discover then generate
Edits spec file in-place                           Separate templates file for curation
Requires Python 3.x + mitmproxy                    Standalone binary
Supports mitmproxy only                            Supports mitmproxy flow dumps + HAR

Key differences

  • Two-step workflow: discover produces a templates file; you curate it; generate produces the final spec. This separates endpoint selection from spec generation.
  • Templates file: Discovered endpoints are prefixed with ignore:. Remove the prefix to include an endpoint. This replaces editing the output spec directly.
  • No Python dependency: Ships as a single static binary for Linux, macOS, and Windows.
  • HAR support: Process HAR exports from browser DevTools or other HTTP tools.
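
Curation can also be scripted. The sketch below removes the ignore: prefix from template entries whose path starts with a chosen prefix and leaves every other line untouched. It works on one line of text at a time and assumes only that an entry contains ignore: immediately followed by the path. The function name is hypothetical and the surrounding YAML layout is not assumed.

```rust
/// Activate one templates-file line if its ignored path starts with `wanted_prefix`.
fn activate_line(line: &str, wanted_prefix: &str) -> String {
    const MARKER: &str = "ignore:";
    match line.find(MARKER) {
        Some(idx) => {
            let path = &line[idx + MARKER.len()..];
            if path.starts_with(wanted_prefix) {
                // Keep whatever preceded the marker (indentation, list dash) and drop the marker.
                format!("{}{}", &line[..idx], path)
            } else {
                line.to_string()
            }
        }
        None => line.to_string(),
    }
}
```

For instance, activate_line("- ignore:/api/users", "/api/") returns "- /api/users", while a line for /static/app.js is returned unchanged.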

Benchmarks

A GitHub Actions workflow runs hyperfine against the release binary on every push to main. Results are uploaded as build artifacts for manual inspection. No automated regression gate is enforced yet — the artifacts provide a historical record for eyeballing trends.

Contributing

Contributions welcome! See CONTRIBUTING.md for local testing setup (unit tests, Petstore golden test, crAPI integration, demo GIF pipeline).

License

MIT