markitdown-rs
markitdown-rs is a Rust library that converts a variety of document formats to Markdown text. It is a Rust implementation of the original Python markitdown library.
Features
It supports:
- Excel (.xlsx)
- Word (.docx)
- PowerPoint
- Images
- Audio
- HTML
- CSV (UTF-8)
- Text-based formats (.xml, .rss, .atom)
- ZIP
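To give a feel for what converting one of the formats above to Markdown involves, here is a minimal, standard-library-only sketch for the CSV case. It is not the library's implementation: the function name `csv_to_markdown_table` is hypothetical, and the parsing is deliberately naive (it splits on commas and does not handle quoted fields).

```rust
/// Convert simple comma-separated text into a Markdown table.
/// Hypothetical helper for illustration only; not part of markitdown-rs.
/// Naive: splits on ',' and does not understand quoted fields.
fn csv_to_markdown_table(csv: &str) -> String {
    // Skip blank lines so a trailing newline does not produce an empty row.
    let mut lines = csv.lines().filter(|l| !l.trim().is_empty());

    // The first non-empty line is treated as the header row.
    let header = match lines.next() {
        Some(h) => h,
        None => return String::new(),
    };

    // Render one CSV line as one Markdown table row,
    // escaping '|' so cell text cannot break the table.
    let row = |line: &str| -> String {
        let cells: Vec<String> = line
            .split(',')
            .map(|c| c.trim().replace('|', "\\|"))
            .collect();
        format!("| {} |", cells.join(" | "))
    };

    let columns = header.split(',').count();
    let mut out = Vec::new();
    out.push(row(header));
    // Separator row required by Markdown tables: one "---" per column.
    out.push(format!("|{}", " --- |".repeat(columns)));
    for line in lines {
        out.push(row(line));
    }
    out.join("\n")
}
```

Calling `csv_to_markdown_table("name,qty\napple,3\n")` yields a three-line table: a header row, a `---` separator row, and one data row. A real converter would also need to handle quoting, embedded newlines, and rows of uneven length.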
Usage
Command-Line
Installation
cargo install markitdown
Convert a File
markitdown path-to-file.pdf
Or use -o to specify the output file:
markitdown path-to-file.pdf -o document.md
Rust API
Installation
Add the following to your Cargo.toml:
[dependencies]
markitdown = "0.1.11"
Initialize MarkItDown
use markitdown::MarkItDown;
let mut md = MarkItDown::new();
Convert a File
use ;
// Basic conversion - file type is auto-detected
let result = md.convert?;
// Or explicitly specify options
let options = ConversionOptions ;
let result = md.convert?;
// To use Large Language Models for image descriptions
let options = ConversionOptions ;
let result = md.convert?;
if let Some = result else
Convert from Bytes
use ;
let file_bytes = read?;
// Auto-detect file type from bytes
let result = md.convert_bytes?;
// Or specify options explicitly
let options = ConversionOptions ;
let result = md.convert_bytes?;
if let Some = result
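As background on detecting a file type from raw bytes, the sketch below checks a few well-known file signatures ("magic bytes"). It is an illustration under assumptions, not the detection logic markitdown-rs actually uses; `SniffedKind` and `sniff_kind` are hypothetical names.

```rust
/// Coarse file kinds recognizable from leading bytes.
/// Hypothetical type for illustration; not part of markitdown-rs.
#[derive(Debug, PartialEq)]
enum SniffedKind {
    Zip, // also the container for .docx, .xlsx and .pptx
    Pdf,
    Png,
    Jpeg,
    Unknown,
}

/// Guess a file kind from its first few bytes.
fn sniff_kind(bytes: &[u8]) -> SniffedKind {
    if bytes.starts_with(b"PK\x03\x04") {
        SniffedKind::Zip
    } else if bytes.starts_with(b"%PDF-") {
        SniffedKind::Pdf
    } else if bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
        SniffedKind::Png
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        SniffedKind::Jpeg
    } else {
        SniffedKind::Unknown
    }
}
```

Because .docx, .xlsx and .pptx files are all ZIP containers, signature checks alone cannot tell them apart. This is one reason it can be useful to supply explicit options instead of relying only on auto-detection.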
Register a Custom Converter
You can extend MarkItDown by implementing the DocumentConverter trait for your custom converters and registering them:
use ;
use MarkitdownError;
;
let mut md = MarkItDown::new();
md.register_converter;
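The general pattern behind registering converters can be sketched with the standard library alone. Everything in this sketch is hypothetical: the `ToMarkdown` trait, the `PlainText` converter, and the `Registry` type are stand-ins chosen for illustration, and their signatures are assumptions that do not reflect the crate's actual `DocumentConverter` trait or `register_converter` method.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a converter trait.
/// NOT the signature of the crate's `DocumentConverter`.
trait ToMarkdown {
    fn convert(&self, bytes: &[u8]) -> Result<String, String>;
}

/// A trivial converter that passes UTF-8 text through unchanged.
struct PlainText;

impl ToMarkdown for PlainText {
    fn convert(&self, bytes: &[u8]) -> Result<String, String> {
        String::from_utf8(bytes.to_vec()).map_err(|e| e.to_string())
    }
}

/// Hypothetical registry mapping file extensions to converters.
struct Registry {
    converters: HashMap<String, Box<dyn ToMarkdown>>,
}

impl Registry {
    fn new() -> Self {
        Registry { converters: HashMap::new() }
    }

    /// Register a converter for an extension (matched case-insensitively).
    fn register(&mut self, extension: &str, converter: Box<dyn ToMarkdown>) {
        self.converters
            .insert(extension.to_ascii_lowercase(), converter);
    }

    /// Returns `None` when no converter is registered for the extension,
    /// otherwise the converter's own success or error result.
    fn convert(&self, extension: &str, bytes: &[u8]) -> Option<Result<String, String>> {
        self.converters
            .get(&extension.to_ascii_lowercase())
            .map(|c| c.convert(bytes))
    }
}
```

With this sketch, `registry.register(".txt", Box::new(PlainText))` followed by `registry.convert(".txt", b"hello")` returns `Some(Ok(..))` with the text, while an unregistered extension returns `None`. Storing converters as boxed trait objects is what lets callers add new formats without changing the registry itself.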
License
MarkItDown is licensed under the MIT License. See LICENSE for more details.