undocx
Fast, accurate DOCX to Markdown converter written in Rust with Python bindings.
Conversion Demo
Headings, bold/italic/underline, tables, nested lists, footnotes, code blocks, and track changes are all converted automatically.
Install
# Rust library
[dependencies]
undocx = "0.3"
Quick Start
CLI
Python
The Python bindings accept a DOCX file either as a path or as in-memory bytes.
Rust
Create a ConvertOptions value, build a converter from it with new, and call convert on the converter. The call returns a Result, so propagate errors with ?.
Supported Features
| Category | Elements |
|---|---|
| Text | Bold, italic, underline, strikethrough, superscript/subscript |
| Structure | Heading 1-9, Title, Subtitle, alignment (center/right) |
| Lists | Ordered (decimal, letter, roman, Korean, circled), unordered, nested |
| Tables | Colspan, rowspan, nested tables, multi-paragraph cells |
| Links | External, internal bookmarks, TOC anchors |
| Images | Inline, floating, VML legacy -- base64 embed, save to dir, or skip |
| Notes | Footnotes, endnotes, comments (as Markdown [^ref]) |
| Track changes | Insertions (<ins>), deletions (~~strikethrough~~) |
| Other | Page/column/line breaks, SDT, field codes, bookmarks, symbols |
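To make the Notes and Track changes rows concrete, here is a minimal sketch of how such runs could be written out as Markdown. The `Run` enum and `render_runs` function are hypothetical names invented for this illustration and are not taken from undocx; only the output conventions (`<ins>`, `~~`, `[^ref]`) come from the table above.

```rust
/// Hypothetical model of a text run; undocx's real types may differ.
#[derive(Debug, Clone, PartialEq)]
enum Run {
    Plain(String),
    Inserted(String), // tracked insertion
    Deleted(String),  // tracked deletion
    NoteRef(String),  // footnote, endnote, or comment reference id
}

/// Render runs using the conventions listed in the feature table:
/// insertions as <ins>, deletions as ~~strikethrough~~, notes as [^ref].
fn render_runs(runs: &[Run]) -> String {
    let mut out = String::new();
    for run in runs {
        match run {
            Run::Plain(text) => out.push_str(text),
            Run::Inserted(text) => {
                out.push_str("<ins>");
                out.push_str(text);
                out.push_str("</ins>");
            }
            Run::Deleted(text) => {
                out.push_str("~~");
                out.push_str(text);
                out.push_str("~~");
            }
            Run::NoteRef(id) => {
                out.push_str("[^");
                out.push_str(id);
                out.push(']');
            }
        }
    }
    out
}
```

Under these assumptions, a paragraph where "old" was replaced by "new" and followed by footnote 1 would come out as `The ~~old~~<ins>new</ins> text[^1]`.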
Options
| Field | Default | Description |
|---|---|---|
| image_handling | Inline | Inline / SaveToDir(path) / Skip |
| preserve_whitespace | false | Keep original spacing |
| html_underline | true | <u> tags for underline |
| html_strikethrough | false | <s> tags instead of ~~ |
| strict_reference_validation | false | Fail on broken note/comment refs |
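The defaults in the table can be pictured as a `Default` implementation. The sketch below is a local mirror written for illustration, not the crate's actual definition: the field names and default values come from the table, while the field types (for example `PathBuf` inside `SaveToDir`) and the `render_strikethrough` helper are assumptions.

```rust
use std::path::PathBuf;

/// Local mirror of the image options from the table; the payload type is assumed.
#[derive(Debug, Clone, PartialEq)]
enum ImageHandling {
    Inline,
    SaveToDir(PathBuf),
    Skip,
}

/// Local mirror of the options table; not undocx's real struct.
#[derive(Debug, Clone, PartialEq)]
struct ConvertOptions {
    image_handling: ImageHandling,
    preserve_whitespace: bool,
    html_underline: bool,
    html_strikethrough: bool,
    strict_reference_validation: bool,
}

impl Default for ConvertOptions {
    /// Defaults copied from the Options table.
    fn default() -> Self {
        ConvertOptions {
            image_handling: ImageHandling::Inline,
            preserve_whitespace: false,
            html_underline: true,
            html_strikethrough: false,
            strict_reference_validation: false,
        }
    }
}

/// Hypothetical helper showing what html_strikethrough toggles:
/// <s> tags when enabled, Markdown ~~ otherwise.
fn render_strikethrough(text: &str, opts: &ConvertOptions) -> String {
    if opts.html_strikethrough {
        format!("<s>{text}</s>")
    } else {
        format!("~~{text}~~")
    }
}
```

With a struct shaped like this, overriding a single field would use struct update syntax, for example `ConvertOptions { html_strikethrough: true, ..ConvertOptions::default() }`.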
Advanced: Custom Pipeline
Replace the default extractor or renderer by constructing the converter with with_components and your own implementations of the corresponding traits.
See docs/API_POLICY.md for stability guarantees on these traits.
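The general shape of a pluggable extractor/renderer pipeline can be sketched as follows. Every name here apart from `with_components` (the `Extractor` and `Renderer` traits, `Document`, `Converter`, and both example implementations) is hypothetical, and the signatures are assumptions; consult the crate documentation for the real traits.

```rust
/// Hypothetical intermediate representation: a flat list of paragraphs.
struct Document {
    paragraphs: Vec<String>,
}

/// Hypothetical extraction stage: raw input to intermediate document.
trait Extractor {
    fn extract(&self, input: &str) -> Document;
}

/// Hypothetical rendering stage: intermediate document to Markdown.
trait Renderer {
    fn render(&self, doc: &Document) -> String;
}

/// Converter assembled from two swappable stages.
struct Converter {
    extractor: Box<dyn Extractor>,
    renderer: Box<dyn Renderer>,
}

impl Converter {
    fn with_components(extractor: Box<dyn Extractor>, renderer: Box<dyn Renderer>) -> Self {
        Converter { extractor, renderer }
    }

    fn convert(&self, input: &str) -> String {
        self.renderer.render(&self.extractor.extract(input))
    }
}

/// Example extractor: every non-blank line becomes a trimmed paragraph.
struct LineExtractor;

impl Extractor for LineExtractor {
    fn extract(&self, input: &str) -> Document {
        let paragraphs = input
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(str::to_string)
            .collect();
        Document { paragraphs }
    }
}

/// Example renderer: emits each paragraph as a bullet item.
struct BulletRenderer;

impl Renderer for BulletRenderer {
    fn render(&self, doc: &Document) -> String {
        doc.paragraphs
            .iter()
            .map(|p| format!("- {p}"))
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Boxed trait objects keep the converter type free of generic parameters, at the cost of dynamic dispatch; a generic `Converter<E, R>` would be an equally valid design for the same idea.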
Development
License
MIT