EpubDoc

Struct EpubDoc 

pub struct EpubDoc<R: Read + Seek> {
    pub package_path: PathBuf,
    pub base_path: PathBuf,
    pub version: EpubVersion,
    pub unique_identifier: String,
    pub metadata: Vec<MetadataItem>,
    pub metadata_link: Vec<MetadataLinkItem>,
    pub manifest: HashMap<String, ManifestItem>,
    pub spine: Vec<SpineItem>,
    pub encryption: Option<Vec<EncryptionData>>,
    pub catalog: Vec<NavPoint>,
    pub catalog_title: String,
    /* private fields */
}

EPUB document parser, representing a loaded and parsed EPUB publication.

EpubDoc is the core structure of the EPUB parsing library. It encapsulates the parsing logic and the data access interfaces for EPUB files: it parses the components of an EPUB (metadata, manifest, reading order, table-of-contents navigation, and encryption information) and provides methods for accessing that data.

It offers a unified data access interface that hides the underlying file structure and parsing details. The parsing logic is implemented in strict accordance with the EPUB specification to ensure compatibility with the standard.

§Usage

use lib_epub::epub::EpubDoc;

let doc = EpubDoc::new("./test_case/epub-33.epub");
assert!(doc.is_ok());

§Notes

  • EpubDoc is thread-safe if and only if it is used immutably.
  • Mutating an EpubDoc has no lasting effect: changes to the structure's data are not written back to the EPUB file.

Fields§

§package_path: PathBuf

The path to the OPF file

§base_path: PathBuf

The path to the directory containing the OPF file

§version: EpubVersion

The EPUB version

§unique_identifier: String

The unique identifier of the EPUB file

This identifier is the actual value of the unique-identifier attribute of the package.

§metadata: Vec<MetadataItem>

EPUB metadata extracted from the OPF file

§metadata_link: Vec<MetadataLinkItem>

Data in metadata that points to external files

§manifest: HashMap<String, ManifestItem>

A list of the resources contained in the EPUB, extracted from the OPF file

All resources in the EPUB file are declared here; undeclared resources should not be stored in the EPUB file and cannot be retrieved from it.

§spine: Vec<SpineItem>

The physical reading order of the publication, extracted from the OPF file

This field declares the order in which the files containing the publication's content should be displayed.
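
The relationship between the spine and the manifest can be sketched as follows. This is only an illustration under stated assumptions: `Item`, `SpineRef`, and `reading_order` are hypothetical stand-ins, and the fields of the real `ManifestItem` and `SpineItem` types are not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a manifest entry (not the library's type).
struct Item {
    href: String,
}

/// Hypothetical stand-in for a spine entry: `idref` names a manifest key.
struct SpineRef {
    idref: String,
}

/// Returns the hrefs of the spine entries in reading order, skipping
/// entries whose idref has no matching key in the manifest.
fn reading_order<'a>(manifest: &'a HashMap<String, Item>, spine: &[SpineRef]) -> Vec<&'a str> {
    spine
        .iter()
        .filter_map(|s| manifest.get(&s.idref))
        .map(|item| item.href.as_str())
        .collect()
}
```

The manifest is keyed by ID and has no inherent order, so the order of the result comes entirely from the spine.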

§encryption: Option<Vec<EncryptionData>>

The encryption.xml extracted from the META-INF directory

§catalog: Vec<NavPoint>

The navigation data of the epub file

§catalog_title: String

The title of the catalog

Implementations§

impl<R: Read + Seek> EpubDoc<R>

pub fn from_reader(reader: R, epub_path: PathBuf) -> Result<Self, EpubError>

Creates a new EPUB document instance from a reader

This function is responsible for the core logic of parsing EPUB files, including verifying the file format, parsing container information, loading the OPF package document, and extracting metadata, manifest, reading order, and other core information.

§Parameters
  • reader: The data source that implements the Read and Seek traits, usually a file or memory buffer
  • epub_path: The path to the EPUB file, used for path resolution and validation
§Return
  • Ok(EpubDoc<R>): The successfully parsed EPUB document object
  • Err(EpubError): Errors encountered during parsing
§Notes
  • This function assumes the EPUB file structure is valid

pub fn has_encryption(&self) -> bool

Checks whether the EPUB file contains encryption.xml

This function determines whether a publication contains encrypted resources by checking if a META-INF/encryption.xml file exists in the EPUB package. According to the EPUB specification, when resources in a publication are encrypted, the corresponding encryption information must be declared in the META-INF/encryption.xml file.

§Return
  • true if the package contains META-INF/encryption.xml, which indicates encrypted resources
  • false if it does not
§Notes
  • This function only checks that the encryption file exists; it does not verify the validity of the encryption information.

pub fn get_metadata(&self, key: &str) -> Option<Vec<MetadataItem>>

Retrieves a list of metadata items

This function retrieves all matching metadata items from the EPUB metadata based on the specified attribute name (key). Metadata items may come from the DC (Dublin Core) namespace or the OPF namespace and contain basic information about the publication, such as title, author, identifier, etc.

§Parameters
  • key: The name of the metadata attribute to retrieve
§Return
  • Some(Vec<MetadataItem>): A vector containing all matching metadata items
  • None: If no matching metadata items are found

pub fn get_metadata_value(&self, key: &str) -> Option<Vec<String>>

Retrieves a list of values for specific metadata items

This function retrieves the values of all matching metadata items from the EPUB metadata based on the given property name (key).

§Parameters
  • key: The name of the metadata attribute to retrieve
§Return
  • Some(Vec<String>): A vector containing all matching metadata item values
  • None: If no matching metadata items are found
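
The Some/None contract shared by get_metadata and get_metadata_value can be sketched with a simple filter. The `Meta` struct and both function names below are hypothetical stand-ins; the real `MetadataItem` may carry more fields and the library may match keys differently.

```rust
/// Hypothetical stand-in for a metadata entry.
#[derive(Debug, Clone, PartialEq)]
struct Meta {
    property: String,
    value: String,
}

/// Returns every entry whose property equals `key`, or `None` when
/// nothing matches.
fn find_metadata(items: &[Meta], key: &str) -> Option<Vec<Meta>> {
    let hits: Vec<Meta> = items
        .iter()
        .filter(|m| m.property == key)
        .cloned()
        .collect();
    if hits.is_empty() {
        None
    } else {
        Some(hits)
    }
}

/// Same lookup, but keeps only the values of the matching entries.
fn find_metadata_values(items: &[Meta], key: &str) -> Option<Vec<String>> {
    find_metadata(items, key).map(|hits| hits.into_iter().map(|m| m.value).collect())
}
```

Returning `None` instead of an empty vector lets callers distinguish "no such metadata" with a single pattern match.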

pub fn get_title(&self) -> Result<Vec<String>, EpubError>

Retrieves the title of the publication

This function retrieves all title information from the EPUB metadata. According to the EPUB specification, a publication can have multiple titles, which are returned in the order they appear in the metadata.

§Return
  • Ok(Vec<String>): A vector containing all title information
  • Err(EpubError): If and only if the OPF file does not contain <dc:title>
§Notes
  • The EPUB specification requires each publication to have at least one title.

pub fn get_language(&self) -> Result<Vec<String>, EpubError>

Retrieves the language used in the publication

This function retrieves the language information of a publication from the EPUB metadata. According to the EPUB specification, this information identifies the primary language of the publication, and a publication may declare more than one language identifier.

§Return
  • Ok(Vec<String>): A vector containing all language identifiers
  • Err(EpubError): If and only if the OPF file does not contain <dc:language>
§Notes
  • The EPUB specification requires that each publication specify at least one primary language.
  • Language identifiers should conform to RFC 3066 or later standards.

pub fn get_identifier(&self) -> Result<Vec<String>, EpubError>

Retrieves the identifier of a publication

This function retrieves the identifier information of a publication from the EPUB metadata. According to the EPUB specification, each publication must have an identifier, typically an ISBN, UUID, or other unique identifier.

§Return
  • Ok(Vec<String>): A vector containing all identifier information
  • Err(EpubError): If and only if the OPF file does not contain <dc:identifier>
§Notes
  • The EPUB specification requires each publication to have at least one identifier.
  • In the OPF file, the unique-identifier attribute of the <package> element should reference the <dc:identifier> element that uniquely identifies the publication. The attribute is therefore a reference to that element, not a <dc:identifier> value itself.
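
The note above can be illustrated with a small lookup that resolves the id reference to the matching element's text. `Identifier` and `resolve_unique_identifier` are hypothetical names used only for this sketch; they are not part of the library.

```rust
/// Hypothetical stand-in for a `<dc:identifier>` element.
struct Identifier {
    /// The element's `id` attribute, if present.
    id: Option<String>,
    /// The element's text content.
    value: String,
}

/// Resolves the package's `unique-identifier` attribute (an id reference)
/// to the text of the `<dc:identifier>` element carrying that id.
fn resolve_unique_identifier<'a>(
    unique_id_ref: &str,
    identifiers: &'a [Identifier],
) -> Option<&'a str> {
    identifiers
        .iter()
        .find(|i| i.id.as_deref() == Some(unique_id_ref))
        .map(|i| i.value.as_str())
}
```

A publication with several `<dc:identifier>` elements therefore still has exactly one that the package designates as its unique identifier.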

pub fn get_manifest_item(&self, id: &str) -> Result<(Vec<u8>, String), EpubError>

Retrieves resource data by resource ID

This function will find the resource with the specified ID in the manifest. If the resource is encrypted, it will be automatically decrypted.

§Parameters
  • id: The ID of the resource to retrieve
§Return
  • Ok((Vec<u8>, String)): Successfully retrieved and decrypted resource data and the MIME type
  • Err(EpubError): Errors that occurred during the retrieval process
§Notes
  • This function will automatically decrypt the resource if it is encrypted.
  • For unsupported encryption methods, the corresponding error will be returned.

pub fn get_manifest_item_by_path(&self, path: &str) -> Result<(Vec<u8>, String), EpubError>

Retrieves resource item data by resource path

This function retrieves a resource from the manifest by its path. The path must be relative to the root directory of the EPUB container; an absolute path, or a path relative to any other location, results in an error.

§Parameters
  • path: The path of the resource to retrieve
§Return
  • Ok((Vec<u8>, String)): Successfully retrieved and decrypted resource data and the MIME type
  • Err(EpubError): Errors that occurred during the retrieval process
§Notes
  • This function will automatically decrypt the resource if it is encrypted.
  • For unsupported encryption methods, the corresponding error will be returned.
  • Paths relative to any location other than the root directory of the EPUB container are not supported.
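
A caller may want to pre-check a path before passing it in. The helper below is a minimal sketch of such a check using only the standard library; `is_container_relative` is a hypothetical name, and the validation the library itself performs may differ.

```rust
use std::path::{Component, Path};

/// Returns true when `path` is a non-empty relative path that stays inside
/// the container root: no root or prefix component and no `..` segments.
fn is_container_relative(path: &str) -> bool {
    !path.is_empty()
        && Path::new(path)
            .components()
            .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

For example, `OEBPS/text/ch1.xhtml` passes this check, while `/OEBPS/text/ch1.xhtml` and `../outside.xhtml` do not.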

pub fn get_manifest_item_with_fallback(&self, id: &str, supported_format: Vec<&str>) -> Result<(Vec<u8>, String), EpubError>

Retrieves a resource in a supported format by resource ID, using the fallback mechanism

This function attempts to retrieve the resource item with the specified ID and checks if its MIME type is in the list of supported formats. If the current resource format is not supported, it searches for a supported resource format along the fallback chain according to the fallback mechanism defined in the EPUB specification.

§Parameters
  • id: The ID of the resource to retrieve
  • supported_format: A vector of supported MIME types
§Return
  • Ok((Vec<u8>, String)): Successfully retrieved and decrypted resource data and the MIME type
  • Err(EpubError): Errors that occurred during the retrieval process
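
The fallback walk described above can be sketched as a loop over manifest entries. This is an illustration, not the library's implementation: `Entry` and `resolve_fallback` are hypothetical, and the cycle check is an added safeguard that the source does not mention.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a manifest entry with an optional fallback id.
struct Entry {
    mime: String,
    fallback: Option<String>,
}

/// Walks the fallback chain starting at `id` and returns the id of the first
/// entry whose MIME type is in `supported`. Returns `None` when the chain
/// ends, references a missing id, or loops back on itself.
fn resolve_fallback<'a>(
    manifest: &'a HashMap<String, Entry>,
    id: &'a str,
    supported: &[&str],
) -> Option<&'a str> {
    let mut seen = HashSet::new();
    let mut current = id;
    loop {
        // `insert` returns false if the id was already visited: a cycle.
        if !seen.insert(current) {
            return None;
        }
        let entry = manifest.get(current)?;
        if supported.contains(&entry.mime.as_str()) {
            return Some(current);
        }
        // Follow the fallback link, or stop if there is none.
        current = entry.fallback.as_deref()?;
    }
}
```

The sketch returns the ID of the chosen resource rather than its bytes, which keeps the chain-walking logic separate from data retrieval and decryption.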

pub fn get_cover(&self) -> Option<(Vec<u8>, String)>

Retrieves the cover of the EPUB document

This function searches for the cover of the EPUB document by examining the items in the manifest. It looks for items whose ID or attribute contains “cover” (case-insensitive) and attempts to retrieve the content of the first match.

§Return
  • Some((Vec<u8>, String)): Successfully retrieved and decrypted cover data and the MIME type
  • None: No cover resource was found
§Notes
  • This function only returns the first successfully retrieved cover resource, even if multiple matches exist
  • The retrieved cover may not be an image resource; users need to pay attention to the resource’s MIME type.
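
The case-insensitive match can be sketched as below. `Entry` and `find_cover` are hypothetical, and treating the matched attribute as a `properties` string is an assumption made for this example; the source only says "ID or attribute".

```rust
/// Hypothetical stand-in for a manifest entry.
struct Entry {
    id: String,
    /// Assumed attribute to inspect; the real field may differ.
    properties: Option<String>,
}

/// Returns the first entry whose id or properties contain "cover",
/// compared case-insensitively.
fn find_cover(entries: &[Entry]) -> Option<&Entry> {
    entries.iter().find(|e| {
        let in_id = e.id.to_lowercase().contains("cover");
        let in_props = e
            .properties
            .as_deref()
            .map_or(false, |p| p.to_lowercase().contains("cover"));
        in_id || in_props
    })
}
```

Because a substring match can also hit non-image items such as a cover page document, checking the MIME type of the result remains the caller's job.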

pub fn navigate_by_spine_index(&mut self, index: usize) -> Option<(Vec<u8>, String)>

Navigates to a specified chapter using the spine index

This function retrieves the content data of the corresponding chapter based on the index position in the EPUB spine. The spine defines the linear reading order of the publication’s content documents, and each spine item references resources in the manifest.

§Parameters
  • index: The index position in the spine, starting from 0
§Return
  • Some((Vec<u8>, String)): Successfully retrieved chapter content data and the MIME type
  • None: Index out of range or data retrieval error
§Notes
  • The index must be less than the total number of spine items.
  • If the resource is encrypted, it will be automatically decrypted before returning. (TODO)
  • This function does not check whether the spine item is part of the linear reading order.
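
A bounds-checked jump of this kind can be sketched as follows. The sketch assumes that the method records the new position (which would explain the `&mut self` receiver); `jump_to` and the plain `String` spine are hypothetical simplifications.

```rust
/// Moves a reading cursor to `index` if it lies within the spine and returns
/// the idref at that position. Out-of-range indices leave the cursor as is.
fn jump_to<'a>(spine: &'a [String], cursor: &mut usize, index: usize) -> Option<&'a str> {
    // `get` returns None instead of panicking when the index is out of range.
    let idref = spine.get(index)?;
    *cursor = index;
    Some(idref.as_str())
}
```

Note that the sketch, like the function described above, accepts any valid index without consulting a linear flag.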

pub fn spine_prev(&self) -> Option<(Vec<u8>, String)>

Navigates to the previous linear reading chapter

This function searches backwards in the EPUB spine for the previous linear reading chapter and returns the content data of that chapter. It only navigates to chapters marked as linear reading.

§Return
  • Some((Vec<u8>, String)): Successfully retrieved previous chapter content data and the MIME type
  • None: Already at the first chapter, the current chapter is not linear, or data retrieval failed

pub fn spine_next(&mut self) -> Option<(Vec<u8>, String)>

Navigates to the next linear reading chapter

This function searches forwards in the EPUB spine for the next linear reading chapter and returns the content data of that chapter. It only navigates to chapters marked as linear reading.

§Return
  • Some((Vec<u8>, String)): Successfully retrieved next chapter content data and the MIME type
  • None: Already at the last chapter, the current chapter is not linear, or data retrieval failed
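
The search for the neighbouring linear chapter, in either direction, can be sketched as an index search over the spine. `SpineRef`, `next_linear`, and `prev_linear` are hypothetical; the sketch models only the documented None cases (no further linear item, or a non-linear current item) and leaves out data retrieval.

```rust
/// Hypothetical stand-in for a spine entry; only the linear flag is modeled.
struct SpineRef {
    linear: bool,
}

/// Index of the nearest linear entry after `current`, if any.
fn next_linear(spine: &[SpineRef], current: usize) -> Option<usize> {
    // No move from an invalid or non-linear position.
    if !spine.get(current)?.linear {
        return None;
    }
    (current + 1..spine.len()).find(|&i| spine[i].linear)
}

/// Index of the nearest linear entry before `current`, if any.
fn prev_linear(spine: &[SpineRef], current: usize) -> Option<usize> {
    if !spine.get(current)?.linear {
        return None;
    }
    (0..current).rev().find(|&i| spine[i].linear)
}
```

Non-linear items between two linear chapters are skipped in both directions.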

pub fn spine_current(&self) -> Option<(Vec<u8>, String)>

Retrieves the content data of the current chapter

This function returns the content data of the chapter at the current index position in the EPUB spine.

§Return
  • Some((Vec<u8>, String)): Successfully retrieved current chapter content data and the MIME type
  • None: Data retrieval failed

impl EpubDoc<BufReader<File>>

pub fn new<P: AsRef<Path>>(path: P) -> Result<Self, EpubError>

Creates a new EPUB document instance

This function is a convenience constructor for EpubDoc, used to create an EPUB parser instance directly from a file path.

§Parameters
  • path: The path to the EPUB file
§Return
  • Ok(EpubDoc): The created EPUB document instance
  • Err(EpubError): An error occurred during initialization
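
The split between a generic reader-based constructor and a file-path convenience constructor is a common Rust pattern, sketched below with a hypothetical `Doc` type. It is not the library's code: `Doc`, `head`, and the use of `io::Result` instead of `EpubError` are assumptions made to keep the example self-contained.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read, Seek};
use std::path::Path;

/// Hypothetical stand-in for a document type that is generic over its reader.
struct Doc<R: Read + Seek> {
    reader: R,
}

impl<R: Read + Seek> Doc<R> {
    /// General constructor: accepts any seekable reader.
    fn from_reader(reader: R) -> io::Result<Self> {
        Ok(Doc { reader })
    }

    /// Reads up to the first `n` bytes, only to show the reader is usable.
    fn head(&mut self, n: usize) -> io::Result<Vec<u8>> {
        self.reader.rewind()?;
        let mut buf = Vec::new();
        self.reader.by_ref().take(n as u64).read_to_end(&mut buf)?;
        Ok(buf)
    }
}

impl Doc<BufReader<File>> {
    /// Convenience constructor: opens the file and delegates to `from_reader`.
    fn new<P: AsRef<Path>>(path: P) -> io::Result<Self> {
        Self::from_reader(BufReader::new(File::open(path)?))
    }
}
```

With this layout, an in-memory buffer such as `std::io::Cursor<Vec<u8>>` can be passed to `from_reader`, while `new` covers the common case of a file on disk.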

Auto Trait Implementations§

impl<R> !Freeze for EpubDoc<R>

impl<R> RefUnwindSafe for EpubDoc<R>

impl<R> Send for EpubDoc<R>
where R: Send,

impl<R> Sync for EpubDoc<R>
where R: Send,

impl<R> Unpin for EpubDoc<R>

impl<R> UnwindSafe for EpubDoc<R>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.