dw2md
Grab an entire DeepWiki and compile it into a single markdown file — ready to drop into an LLM context window.
DeepWiki generates excellent structured documentation for open-source repositories, but it's spread across dozens of client-rendered pages with no export button. dw2md talks directly to DeepWiki's MCP server to pull the full wiki structure and contents, then assembles everything into one clean document.
Install
From crates.io (recommended)
Homebrew (macOS/Linux)
Coming soon! Homebrew tap is in progress.
From source
Pre-built binaries
Download from GitHub Releases for Linux, macOS (x86_64/ARM64), and Windows.
Produces a single static binary (~6MB, no OpenSSL dependency).
Usage
dw2md [OPTIONS] <REPO>
<REPO> accepts any of:
| Format | Example |
|---|---|
| `owner/repo` | `tinygrad/tinygrad` |
| Full URL | https://deepwiki.com/tinygrad/tinygrad |
| Page URL (extracts repo) | https://deepwiki.com/tokio-rs/tokio/3.1-runtime |
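As a rough illustration of how the three accepted forms reduce to the same `owner/repo` pair, here is a minimal Python sketch. The helper name `parse_repo` is hypothetical and is not taken from dw2md's source; it only mirrors the behavior described in the table.

```python
from urllib.parse import urlparse


def parse_repo(arg: str) -> str:
    """Normalize a <REPO> argument to 'owner/repo' (hypothetical helper)."""
    arg = arg.strip()
    if "://" in arg:
        # Full URL or page URL: only the first two path segments matter.
        parts = [p for p in urlparse(arg).path.split("/") if p]
    else:
        # Bare owner/repo shorthand.
        parts = [p for p in arg.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"cannot extract owner/repo from {arg!r}")
    return f"{parts[0]}/{parts[1]}"
```

Any trailing page slug in a page URL is simply discarded, which matches the "extracts repo" note in the table.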
Options
| Flag | Short | Default | Description |
|---|---|---|---|
| `--output <FILE>` | `-o` | stdout | Write to file instead of stdout |
| `--format <FMT>` | `-f` | `markdown` | Output format: `markdown` or `json` |
| `--timeout <SECS>` | `-t` | `30` | Per-request timeout in seconds |
| `--pages <FILTER>` | `-p` | all | Comma-separated page slugs to include |
| `--exclude <FILTER>` | `-x` | none | Comma-separated page slugs to exclude |
| `--no-toc` | | | Omit the structure tree from output |
| `--no-metadata` | | | Omit the metadata header |
| `--list` | `-l` | | Print the table of contents and exit |
| `--interactive` | `-i` | | Interactively select which sections to include |
| `--quiet` | `-q` | | Suppress progress on stderr |
| `--verbose` | `-v` | | Show debug info |
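To make the `--pages` and `--exclude` filters concrete, the sketch below applies comma-separated slug lists to an ordered list of page slugs. It assumes that the include filter is applied first and the exclude filter second; that ordering, and the function name `select_pages`, are assumptions for illustration rather than documented dw2md behavior.

```python
def select_pages(slugs, pages=None, exclude=None):
    """Filter an ordered list of slugs the way --pages / --exclude might."""
    # None means "no include filter", i.e. keep every page.
    include = {s.strip() for s in pages.split(",") if s.strip()} if pages else None
    drop = {s.strip() for s in exclude.split(",") if s.strip()} if exclude else set()
    # Preserve the wiki's original page order.
    return [s for s in slugs if (include is None or s in include) and s not in drop]
```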
Examples
Dump an entire wiki to a file:
Pipe straight to clipboard (macOS):
See what sections are available before downloading:
├── 1 Overview [1-overview]
│ └── 1.1 Repository Structure and Packages [1-1-repository-structure-and-packages]
├── 2 Feature Flags System [2-feature-flags-system]
├── 3 Build System and Package Distribution [3-build-system-and-package-distribution]
│ ...
└── 7 Developer Tools and Debugging [7-developer-tools-and-debugging]
Interactively pick which sections to include:
Shows a multi-select prompt where you can toggle sections on/off with space, then press enter to fetch only what you selected. All sections are selected by default.
Only grab specific sections (if you know the slugs):
Exclude sections you don't need:
JSON output for programmatic use:
From a DeepWiki URL you already have open:
Minimal output — no metadata, no TOC, just content:
Output
Markdown (default)
The output format is designed for LLM and agent workflows — a tree-structured table of contents for fast orientation, and grep-friendly section delimiters so agents can selectively extract the sections they need.
├── 1 Overview
│ └── 1.1 Repository Structure and Packages
├── 2 Feature Flags System
├── 3 Build System and Package Distribution
│ ├── 3.1 Setup.py and Package Config
│ └── 3.2 CI/CD Pipeline
└── 4 Core Architecture
<<< SECTION: 1 Overview [1-overview] >>>
[page content with original heading levels preserved]
<<< SECTION: 1.1 Repository Structure and Packages [1-1-repository-structure-and-packages] >>>
[page content]
<<< SECTION: 2 Feature Flags System [2-feature-flags-system] >>>
[page content]
Why this format?
The <<< SECTION: Title [slug] >>> delimiter is designed to be trivially grep-able:
# List all sections in a compiled wiki
# Extract a specific section (content between two delimiters)
# Regex captures both title and slug in one match
# ^<<< SECTION: (.+?) \[(.+?)\] >>>$
This matters because LLMs and agents working with large documents need to scan structure cheaply, then pull in only the sections relevant to their current task — rather than stuffing the entire document into context.
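The same scan-then-extract workflow can be sketched in Python for agents that do not shell out to grep. This is a minimal sketch that reuses the delimiter regex shown above; the function names are hypothetical and are not part of dw2md.

```python
import re

# Same pattern as the delimiter regex above, applied line by line.
SECTION_RE = re.compile(r"^<<< SECTION: (.+?) \[(.+?)\] >>>$")


def list_sections(text):
    """Return (title, slug) pairs in document order."""
    return [m.groups() for line in text.splitlines() if (m := SECTION_RE.match(line))]


def extract_section(text, slug):
    """Return the content between the delimiter for `slug` and the next delimiter."""
    body, inside = [], False
    for line in text.splitlines():
        m = SECTION_RE.match(line)
        if m:
            if inside:
                break  # reached the next section: stop collecting
            inside = m.group(2) == slug
            continue
        if inside:
            body.append(line)
    return "\n".join(body).strip()
```

Calling `list_sections` first gives a cheap outline of the compiled wiki; `extract_section` then pulls a single page by slug and returns an empty string if the slug is not present.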
Other format choices:
- Tree TOC: `├──`/`└──` characters show hierarchy at a glance (same as the `tree` command), scannable faster than indented bullet lists
- Original heading levels preserved: no heading-level bumping; the delimiter handles section boundaries, so page content keeps its source structure
- Token efficient: no repeated horizontal rules (`---`), no anchor link markup, no extra `#` characters from heading bumping
- HTML comment on line 1: machine-parseable metadata invisible to most renderers
- Mermaid blocks preserved as fenced code blocks
- Source annotations preserved (`Sources: file.py:1-50`) for code location context
JSON
With --format json:
Useful for feeding individual pages into separate context windows, building retrieval indexes, etc.
How It Works
dw2md is a minimal MCP client that talks to DeepWiki's public JSON-RPC endpoint (https://mcp.deepwiki.com/mcp). No API key, no auth, no browser automation.
- Initialize: MCP handshake with the server
- Fetch structure: `read_wiki_structure` returns the table of contents
- Fetch content: `read_wiki_contents` returns all pages in one response
- Match & compile: split content by page markers, match to structure, assemble output
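The first three steps correspond roughly to three JSON-RPC 2.0 requests. The sketch below only builds the request bodies and sends nothing over the network. The tool names come from the steps above; the `repoName` argument key, the protocol version string, and the client info are assumptions and may not match what dw2md actually sends.

```python
def build_requests(repo):
    """Build JSON-RPC 2.0 request bodies for the handshake and the two tool calls."""
    calls = [
        ("initialize", {
            "protocolVersion": "2024-11-05",  # assumed MCP protocol revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.0"},  # hypothetical
        }),
        # Tool invocations go through the generic MCP "tools/call" method.
        ("tools/call", {"name": "read_wiki_structure", "arguments": {"repoName": repo}}),
        ("tools/call", {"name": "read_wiki_contents", "arguments": {"repoName": repo}}),
    ]
    return [
        {"jsonrpc": "2.0", "id": i, "method": method, "params": params}
        for i, (method, params) in enumerate(calls, start=1)
    ]
```

A real MCP client would also send a `notifications/initialized` notification after the handshake; it is left out here to keep the sketch short.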
Failed requests are retried 3 times with exponential backoff (1s, 2s, 4s).
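The retry policy can be sketched as follows, reading "retried 3 times" as one initial attempt plus up to three retries. The function name and the injectable `sleep` parameter are illustrative assumptions; dw2md's own implementation is not shown in this README.

```python
import time


def with_retries(request, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `request()`; on failure, retry up to `retries` times with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return request()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.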
Contributing
Contributions welcome! Please ensure:
- All tests pass: `cargo test`
- Code is formatted: `cargo fmt`
- Clippy is happy: `cargo clippy`
License
MIT