epubie-lib
A Rust library for parsing EPUB files. It extracts metadata, chapters, the table of contents, and file contents from EPUB documents. I wrote it mostly for my own use, where I needed content grouped by chapter rather than by TOC entry.
Features
- ✅ Parse EPUB metadata (title, author, language, etc.)
- ✅ Extract chapters and their content
- ✅ Generate table of contents
- ✅ Access individual files within the EPUB
- ✅ HTML content parsing and extraction
- ✅ Support for EPUB 2.0 and 3.0 formats
Installation
Add this to your Cargo.toml:
    [dependencies]
    epubie-lib = "0.1.0"
Quick Start
    use epubie_lib::Epub; // assumes Epub is exported at the crate root
Examples
Extracting Metadata
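    use epubie_lib::Epub;

    // Replace the placeholder path with a real EPUB file.
    let epub = Epub::new("path/to/book.epub".to_string())?;
    println!("Title: {}", epub.get_title());
    println!("Creator: {}", epub.get_creator());
    println!("Language: {}", epub.get_language());
    println!("Identifier: {}", epub.get_identifier());
    println!("Date: {}", epub.get_date());
    // The description is optional, so print it only when present.
    if let Some(description) = epub.get_description() {
        println!("Description: {}", description);
    }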
Working with Chapters
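    use epubie_lib::Epub;

    // Replace the placeholder path with a real EPUB file.
    let epub = Epub::new("path/to/book.epub".to_string())?;
    println!("Chapters: {}", epub.get_chapter_count());
    for (index, chapter) in epub.get_chapters().iter().enumerate() {
        println!("{}: {} ({} files)", index + 1, chapter.get_title(), chapter.get_file_count());
    }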
Table of Contents
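    use epubie_lib::Epub;

    // Replace the placeholder path with a real EPUB file.
    let epub = Epub::new("path/to/book.epub".to_string())?;
    let toc = epub.get_table_of_contents();
    println!("TOC entries: {}", toc.get_entry_count());
    for entry in toc.get_entries() {
        println!("{} -> {}", entry.get_title(), entry.get_href());
    }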
Accessing File Contents
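    use epubie_lib::Epub;

    // Replace the placeholder path with a real EPUB file.
    let epub = Epub::new("path/to/book.epub".to_string())?;
    for file in epub.get_all_files() {
        println!("{} ({})", file.get_href(), file.get_media_type());
    }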
API Reference
Epub
The main struct for working with EPUB files.
Methods
- new(file_path: String) -> Result<Epub, Box<dyn std::error::Error>> - Create a new EPUB instance
- get_title() -> &str - Get the book title
- get_creator() -> &str - Get the book author/creator
- get_language() -> &str - Get the book language
- get_identifier() -> &str - Get the book identifier
- get_date() -> &str - Get the publication date
- get_publisher() -> Option<String> - Get the publisher
- get_description() -> Option<String> - Get the book description
- get_rights() -> Option<String> - Get the rights information
- get_cover() -> Option<String> - Get the cover image path
- get_tags() -> Option<Vec<String>> - Get book tags
- get_chapters() -> &Vec<Chapter> - Get all chapters
- get_chapter_count() -> usize - Get the number of chapters
- get_table_of_contents() -> &TableOfContents - Get the table of contents
- get_all_files() -> &Vec<EpubFile> - Get all files in the EPUB
- get_file_count() -> usize - Get the total number of files
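Several of the getters above return Option, so callers need to decide what to show when a field is missing. The following is a minimal sketch of one way to handle that. The OptionalMetadata trait, the FakeBook struct, and the summarize function are hypothetical names introduced here so the example compiles without the crate; only the three getter signatures are taken from the list above.

```rust
/// Hypothetical stand-in that mirrors three of the documented getter
/// signatures so this sketch compiles without the crate.
trait OptionalMetadata {
    fn get_publisher(&self) -> Option<String>;
    fn get_description(&self) -> Option<String>;
    fn get_tags(&self) -> Option<Vec<String>>;
}

/// Build a three-line summary, substituting a fallback for missing fields.
fn summarize<M: OptionalMetadata>(meta: &M) -> String {
    let publisher = meta.get_publisher().unwrap_or_else(|| "unknown".to_string());
    let description = meta.get_description().unwrap_or_else(|| "none".to_string());
    // Tags arrive as an optional list; join them for display when present.
    let tags = meta
        .get_tags()
        .map(|tags| tags.join(", "))
        .unwrap_or_else(|| "none".to_string());
    format!("publisher: {}\ndescription: {}\ntags: {}", publisher, description, tags)
}

/// Hypothetical test double standing in for a parsed book.
struct FakeBook {
    publisher: Option<String>,
    tags: Option<Vec<String>>,
}

impl OptionalMetadata for FakeBook {
    fn get_publisher(&self) -> Option<String> { self.publisher.clone() }
    fn get_description(&self) -> Option<String> { None }
    fn get_tags(&self) -> Option<Vec<String>> { self.tags.clone() }
}
```

With the real crate, the same unwrap_or_else and map calls could presumably be applied directly to the values returned by an Epub instance.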
Chapter
Represents a chapter in the EPUB.
Methods
- get_title() -> &str - Get the chapter title
- get_files() -> &Vec<EpubFile> - Get files in this chapter
- get_file_count() -> usize - Get the number of files in this chapter
EpubFile
Represents a file within the EPUB.
Methods
- get_id() -> &str - Get the file ID
- get_href() -> &str - Get the file href/path
- get_title() -> Option<&str> - Get the file title
- get_content() -> &str - Get the file content as string
- get_media_type() -> &str - Get the MIME type
- get_html_bytes() -> &[u8] - Get raw HTML content as bytes
- is_html() -> bool - Check if the file is HTML
- get_parsable_html() -> Option<String> - Get parsable HTML content
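To illustrate how a MIME type such as the one returned by get_media_type() can be used to tell content documents apart from images or stylesheets, here is a small sketch. It shows one plausible rule and is not necessarily what is_html() does internally; the function name is_html_media_type is an assumption made for this example.

```rust
/// Returns true when a manifest media type denotes HTML or XHTML content.
/// Assumption: this is one plausible rule, not the crate's own logic.
fn is_html_media_type(media_type: &str) -> bool {
    // Drop parameters such as "; charset=utf-8" before comparing.
    let essence = media_type.split(';').next().unwrap_or("").trim();
    // MIME types are case-insensitive, so compare without regard to case.
    essence.eq_ignore_ascii_case("application/xhtml+xml")
        || essence.eq_ignore_ascii_case("text/html")
}
```

EPUB content documents are normally declared as application/xhtml+xml; text/html is accepted here only to be lenient with non-conforming files.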
TableOfContents
Represents the table of contents.
Methods
- get_entries() -> &Vec<TocEntry> - Get all TOC entries
- get_entry_count() -> usize - Get the number of TOC entries
TocEntry
Represents an entry in the table of contents.
Methods
- get_title() -> &str - Get the entry title
- get_href() -> &str - Get the entry href/link
- get_level() -> u32 - Get the nesting level
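The nesting level lends itself to printing an indented outline. The sketch below shows the idea with a hypothetical Entry struct that stands in for TocEntry, so it compiles without the crate. It assumes top-level entries report level 0, which this README does not confirm; if levels start at 1, subtract one before indenting.

```rust
/// Hypothetical stand-in for a TOC entry: a title plus its nesting level.
struct Entry {
    title: String,
    level: u32,
}

/// Render entries as an outline, indenting two spaces per nesting level.
/// Assumption: top-level entries have level 0.
fn render_toc(entries: &[Entry]) -> String {
    let mut out = String::new();
    for entry in entries {
        for _ in 0..entry.level {
            out.push_str("  ");
        }
        out.push_str(&entry.title);
        out.push('\n');
    }
    out
}
```

With the real types, the loop would presumably read get_title() and get_level() from each item returned by get_entries().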
Running Examples
The library includes example code demonstrating various use cases:
    # Run the basic usage example (substitute the example's file name)
    cargo run --example <example-name>
    # Run tests
    cargo test
Dependencies
- chrono - Date and time handling
- uuid - UUID generation and parsing
- zip - ZIP file handling (EPUB files are ZIP archives)
- regex - Regular expression support
- serde - Serialization framework
- serde-xml-rs - XML parsing
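Because EPUB files are ZIP archives, a cheap sanity check is possible before handing a file to a full parser. The EPUB container format requires the first archive entry to be an uncompressed file named mimetype whose content is application/epub+zip, so both strings sit at fixed offsets after the 30-byte ZIP local file header. The sketch below uses only the standard library; looks_like_epub is a hypothetical helper, not part of this crate, and it skips the header's length fields, so it is a heuristic rather than a validator.

```rust
/// Heuristic check that a byte buffer starts like an EPUB container.
/// Hypothetical helper for illustration; not part of the library's API.
fn looks_like_epub(bytes: &[u8]) -> bool {
    // Signature of a ZIP local file header.
    const ZIP_MAGIC: &[u8] = b"PK\x03\x04";
    // The first entry's name begins right after the 30-byte header.
    const NAME: &[u8] = b"mimetype";
    // The entry is stored uncompressed, so its content follows the name.
    const MIME: &[u8] = b"application/epub+zip";

    bytes.starts_with(ZIP_MAGIC)
        && bytes.get(30..38) == Some(NAME)
        && bytes.get(38..58) == Some(MIME)
}
```

Reading the first 58 bytes of a file is enough to run this check; anything that passes should still go through real ZIP parsing.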
Supported EPUB Features
- ✅ EPUB 2.0 and 3.0 formats
- ✅ OCF (Open Container Format) parsing
- ✅ OPF (Open Packaging Format) metadata extraction
- ✅ Navigation document parsing
- ✅ NCX (Navigation Control XML) support
- ✅ HTML content extraction
- ✅ Chapter organization and grouping
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Changelog
0.1.0
- Initial release
- Basic EPUB parsing functionality
- Metadata extraction
- Chapter and file organization
- Table of contents generation