ebook
A comprehensive Rust tool for reading, writing, and operating on various ebook formats. Available as a CLI, MCP server (via rmcp, the Rust Model Context Protocol SDK), and a Rust library.
Why this project (and the MCP server) exists
Most long-form knowledge still lives in ebook containers, not in clean Markdown or HTML on disk. EPUB, Kindle (MOBI/AZW/KF8), FB2, CBZ, and PDF package text, structure, fonts, images, and metadata in ways that general-purpose file tools and LLM context windows do not understand out of the box. Assistants and automation therefore hit a wall: they cannot reliably open, navigate, validate, convert, or summarize those files without a dedicated format layer.
This crate exists to be that layer: one API (and one MCP surface) over many ebook formats so tools and agents can treat books like first-class data—whether you run it as ebook … in a shell, embed it in Rust, or attach the MCP server to a client so the model can call read_ebook, convert_ebook, validate_ebook, and friends on real paths.
What you get
- Format detection - Identify EPUB, MOBI, AZW, PDF, CBZ, FB2, TXT, and more from structure and extension
- Metadata and TOC - Titles, authors, chapters, and navigation where the format supports it
- Content and assets - Text plus image extraction where applicable
- Conversion and repair - Pipeline between supported formats and basic healing of damaged files
- Agent-ready MCP - Standard protocol and tool schemas so clients do not reimplement ZIP/XML/PDF/MOBI stacks
This crate ties these capabilities together for CLI use, library use, and MCP-hosted assistants.
Supported Formats
- EPUB (2.0 & 3.0) - Electronic Publication format
- MOBI - Mobipocket format
- AZW - Kindle format with DRM detection
- AZW3 (KF8) - Kindle Format 8
- FB2 - FictionBook 2.0
- CBZ - Comic Book Archive with ComicInfo.xml support
- TXT - Plain text files with encoding detection
- PDF - Portable Document Format
Features
Core Operations
- ✅ Read ebook metadata, content, and table of contents
- ✅ Write/create ebooks in all supported formats
- ✅ Extract images from ebooks (EPUB, CBZ, PDF)
- ✅ Validate ebook file structure and integrity
- ✅ Repair corrupted ebook files
- ✅ Convert between formats (TXT ↔ EPUB, TXT ↔ PDF, TXT ↔ MOBI, EPUB → PDF, etc.)
Advanced Features
- ✅ Image optimization - Resize and compress images in EPUB/CBZ files
- ✅ Streaming support - Handle large files efficiently (10MB+ TXT, 50MB+ EPUB)
- ✅ Progress indicators - Visual feedback for long operations
- ✅ Encoding detection - Automatic character encoding detection for TXT files
- ✅ Format auto-detection - Works based on file extension
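As a rough illustration of what encoding detection for TXT files can involve, the sketch below checks for a byte-order mark and falls back to a strict UTF-8 validity check. It is a standalone example, not this crate's detector: the `GuessedEncoding` enum and the `guess_encoding` function are hypothetical names, and a real detector would also handle legacy single-byte encodings.

```rust
/// Encodings this sketch can tell apart (illustrative only, not the crate's type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum GuessedEncoding {
    Utf8Bom,
    Utf16Le,
    Utf16Be,
    Utf8,
    Unknown,
}

/// Guess the encoding of a byte buffer from its byte-order mark,
/// falling back to a strict UTF-8 validity check when no BOM is present.
pub fn guess_encoding(bytes: &[u8]) -> GuessedEncoding {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        GuessedEncoding::Utf8Bom
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        GuessedEncoding::Utf16Le
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        GuessedEncoding::Utf16Be
    } else if std::str::from_utf8(bytes).is_ok() {
        GuessedEncoding::Utf8
    } else {
        // Probably a legacy encoding (Latin-1, Windows-125x, ...); a real
        // detector would apply statistical heuristics at this point.
        GuessedEncoding::Unknown
    }
}
```

Calling `guess_encoding` on the first few kilobytes of a file is usually enough for the BOM checks, although the UTF-8 fallback is only reliable when the slice does not end in the middle of a multi-byte character.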
Integration
- ✅ MCP Server - AI assistant integration via Model Context Protocol
- ✅ Library API - Use as a Rust library in your projects
- ✅ CLI - Full-featured command-line interface
Installation
From source
The binary will be available at target/release/ebook (repository root).
As a library
Add to your Cargo.toml:
[dependencies]
ebook = "0.1.2"
Usage
CLI Examples
Read an ebook
# Display full content
# Show metadata only (title, author, etc.)
# Show table of contents
# Extract images to a directory
# Read specific format (auto-detected by extension)
Write/Create an ebook
# Create from a text file
# Create an EPUB with all metadata
# Create a PDF
# Create a CBZ comic archive
Get ebook information
# Quick info display
# Output example:
# Format: EPUB
# Title: The Great Book
# Author: John Doe
# Size: 1.2 MB
# Valid: Yes
Validate an ebook
# Validate file structure
# Returns detailed validation results
Repair an ebook
# Repair in place (creates backup)
# Repair and save to new file
Convert between formats
# TXT to EPUB (for e-readers)
# EPUB to PDF (for printing/sharing)
# MOBI to TXT (extract text)
# FB2 to EPUB
Optimize images in ebooks
# Optimize all images in an EPUB (reduces file size)
# Custom dimensions and quality
# Optimize without resizing (compression only)
MCP server (Model Context Protocol)
The MCP server uses rmcp on stdio (newline-delimited JSON-RPC), matching what mainstream MCP clients expect. It exposes the same ebook operations as tools with JSON Schema arguments generated from Rust types—no hand-maintained protocol loop.
Starting the server
Clients must complete the normal MCP handshake: send initialize with a valid params object, then send notifications/initialized after receiving the initialize result, before tools/list or tools/call. Hosted clients (Claude Desktop, Cursor, etc.) do this automatically.
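For anyone driving the server by hand rather than through a hosted client, the message order described above can be sketched as follows. This is a minimal illustration using only the standard library: `handshake_messages` is a hypothetical helper, the `protocolVersion` value is one published MCP revision chosen as an assumption (the server may negotiate a different one), and the client name and version are inserted without JSON escaping.

```rust
/// Build the three JSON-RPC messages a client sends, in order, before it can
/// call tools. Each string is one line, as required by newline-delimited stdio.
///
/// Assumption: `client_name` and `client_version` contain no characters that
/// would need JSON escaping.
pub fn handshake_messages(client_name: &str, client_version: &str) -> Vec<String> {
    // 1. `initialize` request with a params object.
    let initialize = format!(
        concat!(
            r#"{{"jsonrpc":"2.0","id":1,"method":"initialize","params":{{"#,
            r#""protocolVersion":"2024-11-05","capabilities":{{}},"#,
            r#""clientInfo":{{"name":"{}","version":"{}"}}}}}}"#
        ),
        client_name, client_version
    );
    // 2. Notification (no `id`), sent only after the initialize result arrives.
    let initialized =
        r#"{"jsonrpc":"2.0","method":"notifications/initialized"}"#.to_string();
    // 3. Only now may the client list or call tools.
    let list_tools = r#"{"jsonrpc":"2.0","id":2,"method":"tools/list"}"#.to_string();
    vec![initialize, initialized, list_tools]
}
```

A real client would write the first message followed by `\n`, read one line containing the `initialize` result, and only then write the remaining two; the helper builds the payloads but does not enforce that wait.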
Available MCP tools
| Tool | Description |
|---|---|
| read_ebook | Read content, metadata, and table of contents |
| write_ebook | Create new ebooks in any supported format |
| extract_images | Extract images from ebooks |
| validate_ebook | Validate ebook file structure |
| get_ebook_info | Get detailed ebook information |
| convert_ebook | Convert between formats |
| optimize_images | Optimize images in EPUB/CBZ files |
Quick Setup for Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
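A minimal sketch of such an entry, assuming the binary was built from source and is started through the `ebook mcp` subcommand mentioned in this README. The server key `ebook` is an arbitrary label, and the `command` path is a placeholder to be replaced with the absolute path to your own build:

```json
{
  "mcpServers": {
    "ebook": {
      "command": "/absolute/path/to/target/release/ebook",
      "args": ["mcp"]
    }
  }
}
```

Claude Desktop reads this file at startup, so restart the application after editing it.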
Example AI workflows
Summarize a book:
User: Read the ebook at ~/Documents/book.epub and summarize chapter 1
Claude: [Uses read_ebook tool, analyzes content, provides summary]
Convert a document:
User: Convert ~/Downloads/novel.txt to EPUB format
Claude: [Uses convert_ebook tool, creates novel.epub]
Extract images:
User: Extract all images from the comic book at ~/comics/issue1.cbz
Claude: [Uses extract_images tool, returns images with metadata]
See docs/MCP.md for tool parameters and examples.
Library usage
Use the ebook crate as a Rust library for formats, conversion, and MCP hosting.
Embed the MCP server (rmcp)
You can run the same tool surface from your own binary using EbookMcp and rmcp’s ServiceExt (see the rmcp crate for transports other than stdio):
use EbookMcp;
use ;
async
McpServer in ebook::mcp is the thin wrapper used by the ebook mcp CLI subcommand.
Basic example (formats API)
use TxtHandler;
use ;
Working with different formats
use ;
use EbookReader;
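// Read EPUB (the path arguments in this example are placeholders)
let mut epub = EpubHandler::new();
epub.read_from_file("book.epub")?;
let toc = epub.get_toc()?;
println!("{:?}", toc);
// Read MOBI
let mut mobi = MobiHandler::new();
mobi.read_from_file("book.mobi")?;
let metadata = mobi.get_metadata()?;
// Read PDF
let mut pdf = PdfHandler::new();
pdf.read_from_file("book.pdf")?;
let content = pdf.get_content()?;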
Format detection
use detect_format;
let format = detect_format?;
assert_eq!;
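To give a feel for what detection "from structure and extension" can mean, here is a standalone sketch that inspects well-known magic bytes and then uses the extension to disambiguate. It is not the crate's `detect_format`: `SniffedFormat`, `sniff_format`, and `detect` are hypothetical names, and only a few formats are covered.

```rust
use std::path::Path;

/// Formats this sketch can recognise (illustrative only, not the crate's enum).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SniffedFormat {
    Epub,
    Cbz,
    Zip,
    Pdf,
    Mobi,
    Txt,
    Unknown,
}

/// Classify a file from its leading bytes alone.
pub fn sniff_format(header: &[u8]) -> SniffedFormat {
    // PDF files begin with "%PDF-".
    if header.starts_with(b"%PDF-") {
        return SniffedFormat::Pdf;
    }
    // MOBI files are PalmDB containers: type + creator "BOOKMOBI" at bytes 60..68.
    if header.get(60..68) == Some(&b"BOOKMOBI"[..]) {
        return SniffedFormat::Mobi;
    }
    // ZIP local file header signature.
    if header.starts_with(b"PK\x03\x04") {
        // In an EPUB the first entry is an uncompressed file named "mimetype"
        // (name at offset 30) whose content (offset 38) is the media type.
        let is_epub = header.get(30..38) == Some(&b"mimetype"[..])
            && header.get(38..58) == Some(&b"application/epub+zip"[..]);
        return if is_epub { SniffedFormat::Epub } else { SniffedFormat::Zip };
    }
    SniffedFormat::Unknown
}

/// Combine the structural result with the file extension for ambiguous cases.
pub fn detect(header: &[u8], path: &Path) -> SniffedFormat {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match (sniff_format(header), ext.as_deref()) {
        // A CBZ is a plain ZIP of images, so only the extension identifies it.
        (SniffedFormat::Zip, Some("cbz")) => SniffedFormat::Cbz,
        // Plain text has no signature at all.
        (SniffedFormat::Unknown, Some("txt")) => SniffedFormat::Txt,
        (found, _) => found,
    }
}
```

Checking structure first means a mislabeled file (for example a PDF saved with an `.epub` extension) is still classified by its real contents, while the extension only settles cases the bytes cannot.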
Conversion
use Converter;
convert?;
See ARCHITECTURE.md for detailed library documentation.
Architecture
The project follows a trait-based architecture so that every format exposes a consistent API:
Core Traits
- EbookReader - Read operations: content, metadata, table of contents, images
- EbookWriter - Write operations: create ebooks with content and metadata
- EbookOperator - Advanced operations: convert, validate, repair
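The value of splitting the API into traits is that calling code can be written once against the trait and reused for every handler. The sketch below illustrates that idea with toy definitions. Only the trait names and the `get_content` / `get_metadata` method names come from this README; the signatures, the `Metadata` alias, `set_content`, `set_metadata`, `validate`, and the `InMemoryTxt` handler are assumptions made for the example, and the crate's real traits are richer.

```rust
use std::collections::BTreeMap;

/// Hypothetical metadata representation for this sketch.
pub type Metadata = BTreeMap<String, String>;

pub trait EbookReader {
    fn get_content(&self) -> Result<String, String>;
    fn get_metadata(&self) -> Result<Metadata, String>;
}

pub trait EbookWriter {
    fn set_content(&mut self, content: &str);
    fn set_metadata(&mut self, key: &str, value: &str);
}

pub trait EbookOperator {
    fn validate(&self) -> Result<(), String>;
}

/// A toy handler that keeps everything in memory.
#[derive(Default)]
pub struct InMemoryTxt {
    content: String,
    metadata: Metadata,
}

impl EbookReader for InMemoryTxt {
    fn get_content(&self) -> Result<String, String> {
        Ok(self.content.clone())
    }
    fn get_metadata(&self) -> Result<Metadata, String> {
        Ok(self.metadata.clone())
    }
}

impl EbookWriter for InMemoryTxt {
    fn set_content(&mut self, content: &str) {
        self.content = content.to_string();
    }
    fn set_metadata(&mut self, key: &str, value: &str) {
        self.metadata.insert(key.to_string(), value.to_string());
    }
}

impl EbookOperator for InMemoryTxt {
    fn validate(&self) -> Result<(), String> {
        if self.content.is_empty() {
            Err("empty document".to_string())
        } else {
            Ok(())
        }
    }
}

/// Written against the trait, so it works for any handler that can be read.
pub fn describe(book: &dyn EbookReader) -> Result<String, String> {
    let meta = book.get_metadata()?;
    let title = meta.get("title").map(String::as_str).unwrap_or("(untitled)");
    Ok(format!("{} ({} bytes of text)", title, book.get_content()?.len()))
}
```

A handler that cannot write, such as a read-only format, simply does not implement `EbookWriter`, which is what the ❌ cells in a capability table express.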
Format Handlers
Each format has a dedicated handler implementing all applicable traits:
| Handler | Read | Write | Metadata | TOC | Convert | Images |
|---|---|---|---|---|---|---|
| EpubHandler | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MobiHandler | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| AzwHandler | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ |
| Fb2Handler | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| CbzHandler | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| TxtHandler | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ |
| PdfHandler | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
Key Features
- Streaming - Large files are processed in chunks (10MB+ TXT, 50MB+ EPUB)
- Progress bars - Visual feedback for long-running operations
- Error recovery - Helpful error messages with suggestions
- Thread-safe - Safe for concurrent use
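The chunked-processing idea behind the streaming bullet can be sketched with the standard library alone. This is not the crate's streaming code: `TXT_STREAM_THRESHOLD`, `should_stream`, and `count_lines_streaming` are hypothetical names, and treating "10MB+" as an inclusive threshold of 10 MiB is an assumption.

```rust
use std::io::{self, Read};

/// Assumed reading of the "10MB+" TXT threshold: 10 MiB, inclusive.
pub const TXT_STREAM_THRESHOLD: u64 = 10 * 1024 * 1024;

/// Decide whether a file of `size` bytes should take the streaming path.
pub fn should_stream(size: u64) -> bool {
    size >= TXT_STREAM_THRESHOLD
}

/// Count newline bytes by reading fixed-size chunks, so memory use stays
/// bounded by `chunk_size` no matter how large the input is.
pub fn count_lines_streaming<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<u64> {
    let mut buf = vec![0u8; chunk_size.max(1)];
    let mut lines = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        lines += buf[..n].iter().filter(|&&b| b == b'\n').count() as u64;
    }
    Ok(lines)
}
```

Because the function accepts any `Read`, it can be exercised with an in-memory `std::io::Cursor` as easily as with a `std::fs::File`.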
Project Status
Version: 0.1.1
License: Apache-2.0
Test Status: ✅ All 103 tests passing
Supported Platforms: macOS, Linux, Windows (Rust-supported platforms)
Recent Updates:
- MCP server implemented with rmcp (stdio, spec handshake, schema-derived tools); library exposes EbookMcp for embedding
- AZW format support with DRM detection
- Image optimization for EPUB/CBZ files
- EPUB 3.0 support (nav.xhtml, semantic markup, version switching)
- Streaming for large file handling (10MB+ TXT, 50MB+ EPUB thresholds)
- Comprehensive format conversion with CLI and MCP integration
- Progress indicators for long operations
- 103 comprehensive tests with full coverage
Planned Features:
- DJVU and CHM format support
- OCR for scanned PDFs
- Enhanced metadata editing
- Web service API
- Batch processing
See TODO.md for complete roadmap and known issues.
Documentation
- ARCHITECTURE.md - Detailed architecture documentation
- SPEC.md - Original specification document
- docs/MCP.md - MCP server integration guide
- TODO.md - Development roadmap and known issues
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
Licensed under the Apache License, Version 2.0 (LICENSE or http://www.apache.org/licenses/LICENSE-2.0)
Development
Build
# Debug build
cargo build
# Release build (optimized)
cargo build --release
Run tests
# Run all tests
# Run specific test
# Run with output
# Run tests in parallel
Test Coverage: 103 tests covering:
- Format handlers (EPUB, MOBI, AZW, FB2, CBZ, TXT, PDF)
- CLI integration tests
- MCP integration tests
- Conversion tests
- Streaming tests
- Image optimization tests
- EPUB 3.0 features
- Error handling
Run benchmarks
# Performance benchmarks (requires criterion)
cargo bench
Benchmarks available for:
- EPUB read/write performance
- CBZ read/write performance
- Image optimization performance
Example files
# Run with example file
# Create an EPUB
Enable logging
# Info level
RUST_LOG=info
# Debug level (verbose)
RUST_LOG=debug
# Trace level (very verbose)
RUST_LOG=trace