Skip to main content

Crate rbook

Crate rbook 

Source
Expand description

A fast, format-agnostic, ergonomic ebook library with a current focus on EPUB.

The primary goal of rbook is to provide an easy-to-use high-level API for reading, creating, and modifying ebooks. Most importantly, this library is designed with future formats in mind (CBZ, FB2, MOBI, etc.) via core traits defined within the ebook and reader module, allowing all formats to share the same “base” API.

Traits such as Ebook allow formats to be handled generically. For example, retrieving the data of a cover image agnostic to the concrete format (e.g., Epub):

// Here `ebook` may be of any supported format.
fn cover_image_bytes<E: Ebook>(ebook: &E) -> Option<Vec<u8>> {
    // 1 - An ebook may not have a `cover_image` entry, hence the try operator (`?`).
    // 2 - `read_bytes` returns a `Result`; `ok()` converts the result into `Option`.
    ebook.manifest().cover_image()?.read_bytes().ok()
}

§Features

Here is a non-exhaustive list of the features rbook provides:

FeatureOverview
EPUB 2 and 3EPUB builder and parser, providing simple and advanced read/write views for EPUB 2 and 3.
Streaming ReaderRandom‐access or sequential iteration over readable content.
Detailed TypesAbstractions built on expressive traits and types.
MetadataTyped access to titles, creators, publishers, languages, tags, roles, attributes, and more.
ManifestLookup and traverse contained resources such as readable content (XHTML) and images.
SpineChronological reading order and preferred page direction.
Table of Contents (ToC)Navigation points, including the EPUB 2 guide and EPUB 3 landmarks.
ResourcesLazy retrieval of bytes or strings for any manifest resource; data is not loaded up-front until requested.

§Default crate features

These are toggleable features for rbook that are enabled by default in a project’s Cargo.toml file:

FeatureDescription
writeCreation and modification of EPUB 2 and 3 formats.
preludeConvenience prelude only including common traits.
threadsafeEnables Send + Sync constraint for Epub.

Default features can be disabled and toggled selectively. For example, only retaining the threadsafe default feature:

[dependencies]
rbook = { version = "0.7.5", default-features = false, features = ["threadsafe"] }

§Opening an Ebook

rbook supports several methods to open an ebook (Epub):

  • A directory containing the contents of an unzipped ebook:
    let epub = Epub::open("/ebooks/unzipped_epub_dir");
  • A file path:
    let epub = Epub::open("/ebooks/zipped.epub");
  • Or any implementation of Read + Seek (and Send + Sync if the threadsafe feature is enabled):
    let cursor = std::io::Cursor::new(bytes);
    let epub = Epub::read(cursor);

Aside from how the contents of an ebook are stored, options can also be given to control parser behavior, such as strictness:

let epub = Epub::options()
    .strict(true) // Enable strict checks (`false` by default)
    // If only metadata is needed, skipping helps quicken parsing time and reduce space.
    .skip_toc(true) // Skips ToC-related parsing, such as toc.ncx (`false` by default)
    .skip_manifest(true) // Skips manifest-related parsing (`false` by default)
    .skip_spine(true) // Skips spine-related parsing (`false` by default)
    .open("tests/ebooks/example_epub")
    .unwrap();

§Reading an Ebook

Reading the contents of an ebook is handled by a Reader, which traverses end-user-readable resources in canonical order:

// Create a reader instance
let mut reader = epub.reader();

// Print the readable content
while let Some(Ok(data)) = reader.read_next() {
    let kind = data.manifest_entry().kind();

    assert_eq!("application/xhtml+xml", kind.as_str());
    assert_eq!("xhtml", kind.subtype());
    println!("{}", data.content());
}

Prior to creation, a reader can receive options to control its behavior, such as linearity:

use rbook::epub::reader::LinearBehavior;

let mut reader = epub.reader_builder()
                     // Make a reader omit non-linear content
                     .linear_behavior(LinearBehavior::LinearOnly)
                     .create();

§Resource retrieval from an Ebook

All files such as text, images, and video are accessible within an ebook programmatically.

The simplest way to access and retrieve resources from an ebook is through the Manifest, specifically through its entries via:

let manifest_entry = epub.manifest().cover_image().unwrap();
let cover_image_bytes = manifest_entry.read_bytes()?;

// process bytes //

For finer grain control, the Ebook trait provides two methods that accept a Resource as an argument:

let manifest_entry = epub.manifest().cover_image().unwrap();

let bytes_a = epub.read_resource_bytes(manifest_entry)?;
let bytes_b = epub.read_resource_bytes("/EPUB/img/cover.webm")?;

assert_eq!(bytes_a, bytes_b);

All resource retrieval methods are fallible, and attempts to access malformed or missing resources will return an EbookError::Archive error.

§See Also

§Examples

§Accessing Metadata: Retrieving the main title

// Retrieve the main title (all titles retrievable via `titles()`)
let title = epub.metadata().title().unwrap();
assert_eq!("Example EPUB", title.value());
assert_eq!(TitleKind::Main, title.kind());

// Retrieve the first alternate script of a title
let alternate_script = title.alternate_scripts().next().unwrap();
assert_eq!("サンプルEPUB", alternate_script.value());
assert_eq!("ja", alternate_script.language().scheme().code());
assert_eq!(LanguageKind::Bcp47, alternate_script.language().kind());

§Accessing Metadata: Retrieving the date and first creator

// Retrieve the publication year
let published = epub.metadata().published().unwrap();
assert_eq!(2023, published.date().year());
assert_eq!(1, published.date().month());
assert_eq!(25, published.date().day());
assert!(published.time().is_local());

// Retrieve the first creator
let creator = epub.metadata().creators().next().unwrap();
assert_eq!("John Doe", creator.value());
assert_eq!(Some("Doe, John"), creator.file_as());
assert_eq!(0, creator.order());

// Retrieve the main role of a creator (all roles retrievable via `roles()`)
let role = creator.main_role().unwrap();
assert_eq!("aut", role.code());
assert_eq!(Some("marc:relators"), role.source());

// Retrieve the first alternate script of a creator
let alternate_script = creator.alternate_scripts().next().unwrap();
assert_eq!("山田太郎", alternate_script.value());
assert_eq!("ja", alternate_script.language().scheme().code());
assert_eq!(LanguageKind::Bcp47, alternate_script.language().kind());

§Extracting images from the Manifest

use std::fs::{self, File};
use std::path::Path;

// Create an output directory for the extracted images
let out = Path::new("extracted_images");
fs::create_dir_all(&out).unwrap();

for image in epub.manifest().images() {
    // Extract the filename from the href and write to disk
    let filename = image.href().name().decode(); // Decode EPUB hrefs are percent-encoded

    // Copy the raw image bytes
    let mut file = File::create(out.join(&*filename)).unwrap();
    image.copy_bytes(&mut file).unwrap();
}

§Accessing EpubManifest fallbacks

// Fallbacks
let webm_cover = epub.manifest().cover_image().unwrap();
let kind = webm_cover.kind();
assert_eq!(("image", "webm"), (kind.maintype(), kind.subtype()));

// If the program does not support `webm`, fallback
let avif_cover = webm_cover.fallback().unwrap();
assert_eq!("image/avif", avif_cover.media_type());

// If the program does not support `avif`, fallback
let png_cover = avif_cover.fallback().unwrap();
assert_eq!("image/png", png_cover.media_type());

// No fallbacks remaining
assert_eq!(None, png_cover.fallback());

§Editing an Epub

use rbook::epub::EpubChapter;

Epub::open("old.epub")?
    .edit()
    // Appending a creator
    .author("Jane Doe")
    // Appending a chapter
    .chapter(EpubChapter::new("Chapter 1337").xhtml_body("1337"))
    // Setting the modified date to now
    .modified_now()
    .write()
    .compression(9)
    .save("new.epub")

§Creating a backwards-compatible EPUB 3 file

This example uses the high-level builder API. See the epub module for lower-level control over the manifest, spine, etc.

use rbook::ebook::toc::TocEntryKind;
use rbook::epub::EpubChapter;
use std::path::Path;

const XHTML: &[u8] = b"<xhtml>...</xhtml>"; // Example data

Epub::builder()
    .identifier("urn:example")
    .title("Doe Story")
    .author(["John Doe", "Jane Doe"])
    .language("en")
    // Reference a file stored on disk or provide in-memory bytes
    .cover_image(("cover.png", Path::new("local/file/cover.png")))
    .chapter([
        // Standard Chapter (Auto-generates href/filename "volume_i.xhtml")
        EpubChapter::new("Volume I").xhtml(XHTML).children(
            // Providing an explicit href (v1c1.xhtml)
            EpubChapter::new("I: Intro")
                .kind(TocEntryKind::Introduction)
                .href("v1c1.xhtml")
                .xhtml_body("<p>Basic text</p>"),
        ),
        EpubChapter::new("Volume II").children([
            // Referencing an XHTML file stored on the OS file system
            EpubChapter::new("I").href("v2/c1.xhtml").xhtml(Path::new("path/to/c1.xhtml")),
            // Navigation-only entry linking to a fragment in another chapter
            EpubChapter::new("Section 1").href("v2/c1.xhtml#section-1"),
            // Resource included in spine/manifest but omitted from ToC
            EpubChapter::unlisted("v3extras.xhtml").xhtml(XHTML),
        ]),
    ])
    .write()
    .compression(0)
    // Save to disk or alternatively write to memory
    .save("doe_story.epub")

Re-exports§

pub use ebook::Ebook;
pub use epub::Epub;

Modules§

ebook
Core format-agnostic Ebook module and implementations.
epub
The Electronic Publication (Epub) module.
input
Flexible input traits for Ebook operations.
preludeprelude
The rbook prelude for convenient imports of the core ebook and reader traits.
reader
Sequential + random‐access Ebook Reader module.