markitdown-rs
markitdown-rs is a Rust library that converts a variety of document formats to Markdown. It is a Rust implementation of the original markitdown Python library.
Features
It supports:
- Excel (.xlsx)
- Word (.docx)
- PowerPoint
- PDF
- Images
- Audio
- HTML
- CSV (UTF-8)
- Text-based formats (.xml, .rss, .atom)
- ZIP
Usage
Command-Line
Installation
cargo install markitdown
Convert a File
markitdown path-to-file.pdf
Or use -o to specify the output file:
markitdown path-to-file.pdf -o document.md
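To convert several files in one go, the CLI can be driven from a small Rust program. The following is a minimal sketch, assuming the `markitdown` binary is on your PATH and relying only on the `-o` flag shown above. The helper names (`markdown_path`, `markitdown_command`, `convert_all`) are hypothetical and not part of markitdown-rs.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Output path for a converted file: same location, `.md` extension.
/// (Hypothetical helper, not part of markitdown-rs.)
fn markdown_path(input: &Path) -> PathBuf {
    input.with_extension("md")
}

/// Build, but do not run, `markitdown <input> -o <output>`.
fn markitdown_command(input: &Path) -> Command {
    let mut cmd = Command::new("markitdown");
    cmd.arg(input).arg("-o").arg(markdown_path(input));
    cmd
}

/// Run the CLI once per input and return the inputs that failed.
/// A spawn error (for example, the binary is not installed) counts as a failure.
fn convert_all(inputs: &[PathBuf]) -> Vec<PathBuf> {
    let mut failed = Vec::new();
    for input in inputs {
        let ok = markitdown_command(input)
            .status()
            .map(|status| status.success())
            .unwrap_or(false);
        if !ok {
            failed.push(input.clone());
        }
    }
    failed
}
```

Calling `convert_all(&[PathBuf::from("report.pdf")])` would run `markitdown report.pdf -o report.md` and return an empty list if the command exits successfully.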
Rust API
Installation
Add the following to your Cargo.toml:
[dependencies]
markitdown = "0.1.8"
Initialize MarkItDown
use markitdown::MarkItDown;
let mut md = MarkItDown::new();
Convert a File
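use markitdown::ConversionOptions;
let options = ConversionOptions { /* ... */ };
let result: Option<_> = md.convert(/* ... */);
// To use Large Language Models for image descriptions, provide llm_client and llm_model, like:
let options = ConversionOptions { /* ..., llm_client, llm_model */ };
let result: Option<_> = md.convert(/* ... */);
if let Some(converted) = result {
    // Conversion succeeded; use the converted document.
} else {
    // No result: the conversion did not produce output.
}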
Register a Custom Converter
You can extend MarkItDown by implementing the DocumentConverter trait for your custom converters and registering them:
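use markitdown::{DocumentConverter, MarkItDown};
struct MyConverter; // hypothetical converter type
impl DocumentConverter for MyConverter { /* ... */ }
let mut md = MarkItDown::new();
md.register_converter(/* ... */);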
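To illustrate the general idea behind registering converters, here is a self-contained sketch of a trait-object registry that dispatches on file extension. It does not use markitdown-rs: the `Converter` trait, `CsvConverter`, and `Registry` are hypothetical stand-ins, and the actual `DocumentConverter` trait may have different methods and signatures, so check the crate documentation before implementing it.

```rust
/// Hypothetical stand-in for a converter trait; NOT markitdown's `DocumentConverter`.
trait Converter {
    /// File extensions (without the dot) this converter handles.
    fn extensions(&self) -> &[&str];
    /// Convert raw input text to Markdown-style text.
    fn convert(&self, input: &str) -> String;
}

/// Toy converter: renders each CSV line as a pipe-delimited row.
struct CsvConverter;

impl Converter for CsvConverter {
    fn extensions(&self) -> &[&str] {
        &["csv"]
    }

    fn convert(&self, input: &str) -> String {
        input
            .lines()
            .map(|line| format!("| {} |", line.split(',').collect::<Vec<_>>().join(" | ")))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

/// Registry that owns boxed converters and picks one by file extension.
#[derive(Default)]
struct Registry {
    converters: Vec<Box<dyn Converter>>,
}

impl Registry {
    /// Add a converter; earlier registrations take priority on lookup.
    fn register(&mut self, converter: Box<dyn Converter>) {
        self.converters.push(converter);
    }

    /// Returns `None` when no registered converter claims the extension.
    fn convert(&self, extension: &str, input: &str) -> Option<String> {
        self.converters
            .iter()
            .find(|c| c.extensions().contains(&extension))
            .map(|c| c.convert(input))
    }
}
```

Storing converters as `Box<dyn Converter>` lets the registry hold types it knows nothing about at compile time, which is what makes this kind of extension point possible. Returning `Option` mirrors the `if let Some` handling in the conversion example above.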
License
MarkItDown is licensed under the MIT License. See LICENSE for more details.