dm2xcod
A high-performance DOCX to Markdown converter written in Rust, with Python bindings.
Features
- Fast & Efficient: Written in Rust for maximum performance.
- Rich Formatting: Preserves bold, italic, underline, strikethrough, and more.
  - Uses HTML tags (<strong>, <em>) for better cross-parser compatibility.
- Structure Preservation: Handles heading hierarchy, lists (ordered/unordered), and tables.
- Image Support: Extracts and embeds images.
- Cross-Platform: Pre-built wheels for macOS (Intel/Apple Silicon), Windows, and Linux.
- Simple API: Native Python bindings provided via PyO3.
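To illustrate why HTML tags can be more portable than Markdown emphasis markers, here is a minimal sketch of the output style described in the list above. It is not the converter's actual code: `render_run` and its parameters are hypothetical names chosen for this illustration. DOCX text runs often carry trailing spaces, and under CommonMark a closing `**` preceded by whitespace does not close the emphasis, so `**Total: **42` would not render as bold, whereas the tagged form does.

```python
from html import escape


def render_run(text: str, bold: bool = False, italic: bool = False) -> str:
    """Wrap one text run in HTML emphasis tags (hypothetical helper, illustration only)."""
    # Escape &, < and > so the run text cannot be mistaken for markup.
    out = escape(text, quote=False)
    if italic:
        out = f"<em>{out}</em>"
    if bold:
        out = f"<strong>{out}</strong>"
    return out


# A bold run that ends with a space, followed by a plain run.
print(render_run("Total: ", bold=True) + render_run("42"))
# prints: <strong>Total: </strong>42
```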
Requirements
- Rust: 1.75+ (for building from source)
- Python: 3.12+ (ABI3 wheels, so one build works on 3.12, 3.13, 3.14, and later)
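If a script needs to fail early on an unsupported interpreter, a small guard along these lines may help. This is only a sketch: `MIN_PYTHON` and `python_is_supported` are hypothetical names, and the 3.12 floor is taken from the requirement listed above.

```python
import sys

# Minimum interpreter version, taken from the Requirements list above.
MIN_PYTHON = (3, 12)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    print("Warning: this interpreter is older than Python 3.12")
```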
Installation
Python
Install via pip:
pip install dm2xcod
CLI
Install via cargo:
cargo install dm2xcod
Rust Library
Add to your Cargo.toml:
[dependencies]
dm2xcod = "0.3"
Usage
CLI
Python
import dm2xcod
# Basic conversion
markdown = dm2xcod.convert_docx("document.docx")
# With options (if applicable in future versions)
# markdown = dm2xcod.convert_docx("document.docx", image_dir="images")
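For converting a whole folder, the `convert_docx` call can be wrapped in a small helper. The following is a sketch under stated assumptions rather than part of the library: `convert_folder` and its parameters are hypothetical, and it assumes `dm2xcod.convert_docx(path)` returns the Markdown as a string. The converter is passed in as a callable so the helper can also be tried with a stand-in function.

```python
from pathlib import Path
from typing import Callable, List, Optional


def convert_folder(
    src_dir: str,
    out_dir: str,
    convert: Optional[Callable[[str], str]] = None,
) -> List[Path]:
    """Convert every .docx directly inside src_dir into a .md file in out_dir."""
    if convert is None:
        # Imported lazily so the helper can be exercised with a stub converter.
        import dm2xcod

        # Assumption: convert_docx(path) returns the Markdown text as a str.
        convert = dm2xcod.convert_docx

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    for docx in sorted(Path(src_dir).glob("*.docx")):
        # Skip the "~$..." lock files Word creates while a document is open.
        if docx.name.startswith("~$"):
            continue
        target = out / (docx.stem + ".md")
        target.write_text(convert(str(docx)), encoding="utf-8")
        written.append(target)
    return written
```

A call such as `convert_folder("reports", "markdown")` would then write one `.md` file per document and return the list of paths it created.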
Rust
use ;
Development
Build from Source
# Build Rust library/CLI
cargo build --release
# Development with Python (assuming a maturin-based PyO3 build)
maturin develop
License
MIT