pub struct EpubDoc<R: Read + Seek> {
pub package_path: PathBuf,
pub base_path: PathBuf,
pub version: EpubVersion,
pub unique_identifier: String,
pub metadata: Vec<MetadataItem>,
pub metadata_link: Vec<MetadataLinkItem>,
pub manifest: HashMap<String, ManifestItem>,
pub spine: Vec<SpineItem>,
pub encryption: Option<Vec<EncryptionData>>,
pub catalog: Vec<NavPoint>,
pub catalog_title: String,
/* private fields */
}
EPUB document parser, representing a loaded and parsed EPUB publication.
The EpubDoc structure is the core of the EPUB parsing library. It encapsulates the parsing logic and data access interfaces for EPUB files: it parses the components of an EPUB (metadata, manifest, reading order, table of contents navigation, and encryption information) and provides methods for accessing this data.
It exposes a unified data access interface that hides the underlying file structure and parsing details, and its parsing logic follows the EPUB specification to ensure compatibility with the standard.
§Usage
use lib_epub::epub::EpubDoc;
let doc = EpubDoc::new("./test_case/epub-33.epub");
assert!(doc.is_ok());
§Notes
- The EpubDoc structure is thread-safe if and only if it is immutable.
- Mutating an EpubDoc has no practical effect on the file; modifications to the structure's data are not written back to the EPUB file.
Fields§
package_path: PathBuf
The path to the OPF file
base_path: PathBuf
The path to the directory where the OPF file is located
version: EpubVersion
The EPUB version
unique_identifier: String
The unique identifier of the EPUB file
This identifier is the actual value of the unique-identifier attribute of the package.
metadata: Vec<MetadataItem>
EPUB metadata extracted from the OPF
metadata_link: Vec<MetadataLinkItem>
Metadata entries that point to external files
manifest: HashMap<String, ManifestItem>
A list of the resources contained in the EPUB, extracted from the OPF
All resources in the EPUB file are declared here; undeclared resources should not be stored in the EPUB file and cannot be obtained from it.
spine: Vec<SpineItem>
The reading order of the publication, extracted from the OPF
This field declares the order in which the files containing the publication's content should be displayed.
encryption: Option<Vec<EncryptionData>>
The encryption data extracted from encryption.xml in the META-INF directory
catalog: Vec<NavPoint>
The navigation data of the EPUB file
catalog_title: String
The title of the catalog
Implementations§
impl<R: Read + Seek> EpubDoc<R>
pub fn from_reader(reader: R, epub_path: PathBuf) -> Result<Self, EpubError>
Creates a new EPUB document instance from a reader
This function implements the core logic of parsing EPUB files: verifying the file format, parsing container information, loading the OPF package document, and extracting the metadata, manifest, reading order, and other core information.
§Parameters
- reader: The data source that implements the Read and Seek traits, usually a file or memory buffer
- epub_path: The path to the EPUB file, used for path resolution and validation
§Return
- Ok(EpubDoc<R>): The successfully parsed EPUB document object
- Err(EpubError): Errors encountered during parsing
§Notes
- This function assumes the EPUB file structure is valid.
pub fn has_encryption(&self) -> bool
Checks whether the EPUB file contains encryption.xml
This function determines whether a publication contains encrypted resources by checking whether a META-INF/encryption.xml file exists in the EPUB package.
According to the EPUB specification, when resources in a publication are encrypted, the corresponding encryption information must be declared in the META-INF/encryption.xml file.
§Return
- true if the publication contains encrypted resources
- false if the publication does not contain encrypted resources
§Notes
- This function only checks for the existence of the encryption file; it does not verify the validity of the encryption information.
pub fn get_metadata(&self, key: &str) -> Option<Vec<MetadataItem>>
Retrieves a list of metadata items
This function retrieves all matching metadata items from the EPUB metadata based on the specified property name (key). Metadata items may come from the DC (Dublin Core) namespace or the OPF namespace and contain basic information about the publication, such as title, author, and identifier.
§Parameters
- key: The name of the metadata property to retrieve
§Return
- Some(Vec<MetadataItem>): A vector containing all matching metadata items
- None: If no matching metadata items are found
pub fn get_metadata_value(&self, key: &str) -> Option<Vec<String>>
Retrieves a list of values for specific metadata items
This function retrieves the values of all matching metadata items from the EPUB metadata based on the given property name (key).
§Parameters
- key: The name of the metadata property to retrieve
§Return
- Some(Vec<String>): A vector containing the values of all matching metadata items
- None: If no matching metadata items are found
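The return contract described here (Some with every matching value, None when nothing matches) can be illustrated on a simplified model. This is only a sketch: the Meta struct and the values_for function are hypothetical stand-ins introduced for illustration and are not part of lib_epub.

```rust
/// Hypothetical, simplified stand-in for a metadata entry
/// (an assumption for illustration, not the library's MetadataItem).
#[derive(Debug, Clone, PartialEq)]
pub struct Meta {
    pub property: String,
    pub value: String,
}

/// Collects the values of every entry whose property equals `key`,
/// preserving document order. Returns `None` rather than an empty
/// vector when nothing matches.
pub fn values_for(metadata: &[Meta], key: &str) -> Option<Vec<String>> {
    let values: Vec<String> = metadata
        .iter()
        .filter(|m| m.property == key)
        .map(|m| m.value.clone())
        .collect();

    if values.is_empty() {
        None
    } else {
        Some(values)
    }
}
```

Returning None instead of Some(vec![]) lets a caller distinguish "no such property" with a single pattern match.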
pub fn get_title(&self) -> Result<Vec<String>, EpubError>
Retrieves the titles of the publication
This function retrieves all title information from the EPUB metadata. According to the EPUB specification, a publication can have multiple titles, which are returned in the order they appear in the metadata.
§Return
- Ok(Vec<String>): A vector containing all titles
- Err(EpubError): If and only if the OPF file does not contain <dc:title>
§Notes
- The EPUB specification requires each publication to have at least one title.
pub fn get_language(&self) -> Result<Vec<String>, EpubError>
Retrieves the languages used in the publication
This function retrieves the language information of a publication from the EPUB metadata. According to the EPUB specification, language information identifies the primary language of the publication, and a publication can declare multiple language identifiers.
§Return
- Ok(Vec<String>): A vector containing all language identifiers
- Err(EpubError): If and only if the OPF file does not contain <dc:language>
§Notes
- The EPUB specification requires that each publication specify at least one primary language.
- Language identifiers should conform to RFC 3066 or later standards.
pub fn get_identifier(&self) -> Result<Vec<String>, EpubError>
Retrieves the identifiers of a publication
This function retrieves the identifier information of a publication from the EPUB metadata. According to the EPUB specification, each publication must have an identifier, typically an ISBN, UUID, or other unique identifier.
§Return
- Ok(Vec<String>): A vector containing all identifiers
- Err(EpubError): If and only if the OPF file does not contain <dc:identifier>
§Notes
- The EPUB specification requires each publication to have at least one identifier.
- In the OPF file, the unique-identifier attribute of the <package> element should point to a <dc:identifier> element used to uniquely identify the publication. This means that the value of unique-identifier is not the same as the content of <dc:identifier>.
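The relationship between the unique-identifier attribute and <dc:identifier> can be sketched as an ID lookup: the attribute names an element by its id, and the element's text is the identifier itself. The DcIdentifier struct and resolve_unique_identifier function below are hypothetical and only model that relationship; they are not lib_epub APIs.

```rust
/// Hypothetical model of a <dc:identifier> element: its optional `id`
/// attribute and its text content (an assumption for illustration).
pub struct DcIdentifier {
    pub id: Option<String>,
    pub value: String,
}

/// Treats the `unique-identifier` attribute of <package> as an ID reference
/// and returns the text of the <dc:identifier> element carrying that id.
/// Returns `None` when no element has a matching id.
pub fn resolve_unique_identifier<'a>(
    unique_identifier_attr: &str,
    identifiers: &'a [DcIdentifier],
) -> Option<&'a str> {
    identifiers
        .iter()
        .find(|ident| ident.id.as_deref() == Some(unique_identifier_attr))
        .map(|ident| ident.value.as_str())
}
```

With an attribute value of "pub-id" and an element <dc:identifier id="pub-id">urn:uuid:1234</dc:identifier>, the attribute is "pub-id" while the resolved identifier is "urn:uuid:1234".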
pub fn get_manifest_item(&self, id: &str) -> Result<(Vec<u8>, String), EpubError>
Retrieves resource data by resource ID
This function finds the resource with the specified ID in the manifest. If the resource is encrypted, it is automatically decrypted.
§Parameters
- id: The ID of the resource to retrieve
§Return
- Ok((Vec<u8>, String)): The retrieved (and, if necessary, decrypted) resource data and its MIME type
- Err(EpubError): Errors that occurred during the retrieval process
§Notes
- This function automatically decrypts the resource if it is encrypted.
- For unsupported encryption methods, the corresponding error is returned.
pub fn get_manifest_item_by_path(&self, path: &str) -> Result<(Vec<u8>, String), EpubError>
Retrieves resource data by resource path
This function retrieves resources from the manifest based on the input path. The input path must be relative to the root directory of the EPUB container; using an absolute path or a path relative to another location results in an error.
§Parameters
- path: The path of the resource to retrieve
§Return
- Ok((Vec<u8>, String)): The retrieved (and, if necessary, decrypted) resource data and its MIME type
- Err(EpubError): Errors that occurred during the retrieval process
§Notes
- This function automatically decrypts the resource if it is encrypted.
- For unsupported encryption methods, the corresponding error is returned.
- Paths relative to anything other than the root directory of the EPUB container are not supported.
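A caller may want to reject obviously unsuitable paths before passing them in. The check below is a minimal sketch of one way to do that with std::path; the is_container_relative function is hypothetical, and the exact rules lib_epub applies are not documented here, so treat the accepted forms as an assumption.

```rust
use std::path::{Component, Path};

/// Returns true when `path` is non-empty and consists only of normal
/// segments: no root or drive prefix, no leading `./`, and no `..`.
/// This is a caller-side pre-check, not the library's own validation.
pub fn is_container_relative(path: &str) -> bool {
    let mut components = Path::new(path).components().peekable();

    // An empty path has no components and cannot name a resource.
    if components.peek().is_none() {
        return false;
    }

    components.all(|c| matches!(c, Component::Normal(_)))
}
```

Under this check a path such as OEBPS/chapter1.xhtml passes, while /OEBPS/chapter1.xhtml and ../chapter1.xhtml are rejected.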
pub fn get_manifest_item_with_fallback(&self, id: &str, supported_format: Vec<&str>) -> Result<(Vec<u8>, String), EpubError>
Retrieves a supported resource item by resource ID, following the fallback chain if necessary
This function attempts to retrieve the resource item with the specified ID and checks whether its MIME type is in the list of supported formats. If the format of the current resource is not supported, it searches along the fallback chain for a resource in a supported format, according to the fallback mechanism defined in the EPUB specification.
§Parameters
- id: The ID of the resource to retrieve
- supported_format: A vector of supported MIME types
§Return
- Ok((Vec<u8>, String)): The retrieved (and, if necessary, decrypted) resource data and its MIME type
- Err(EpubError): Errors that occurred during the retrieval process
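The fallback search described above can be sketched as a walk along a linked chain of manifest entries. The Item struct and resolve_fallback function are hypothetical simplifications, and the cycle guard is an added assumption; this is not the library's implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified manifest entry
/// (an assumption for illustration, not the library's ManifestItem).
pub struct Item {
    pub mime: String,
    /// ID of the next item in the fallback chain, if any.
    pub fallback: Option<String>,
}

/// Walks the fallback chain starting at `id` and returns the ID of the first
/// item whose MIME type is listed in `supported`. Returns `None` if the chain
/// ends, references a missing ID, or loops back on itself.
pub fn resolve_fallback<'a>(
    manifest: &'a HashMap<String, Item>,
    id: &'a str,
    supported: &[&str],
) -> Option<&'a str> {
    let mut visited: HashSet<&str> = HashSet::new();
    let mut current = id;

    loop {
        // `insert` returns false if the ID was already seen: a cycle.
        if !visited.insert(current) {
            return None;
        }

        let item = manifest.get(current)?;
        if supported.contains(&item.mime.as_str()) {
            return Some(current);
        }

        // Follow the chain, or stop when there is no further fallback.
        current = item.fallback.as_deref()?;
    }
}
```

For a WebP image whose fallback is a PNG, a caller that supports only image/png gets the PNG item's ID back, while a caller that supports image/webp gets the original item.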
pub fn get_cover(&self) -> Option<(Vec<u8>, String)>
Retrieves the cover of the EPUB document
This function searches for the cover of the EPUB document by examining the items in the manifest. It looks for manifest items whose ID or attribute contains “cover” (case-insensitive) and attempts to retrieve the content of the first match.
§Return
- Some((Vec<u8>, String)): The retrieved (and, if necessary, decrypted) cover data and its MIME type
- None: No cover resource was found
§Notes
- This function only returns the first successfully retrieved cover resource, even if multiple matches exist.
- The retrieved cover may not be an image resource; users need to pay attention to the resource’s MIME type.
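The matching rule described above (ID or attribute contains “cover”, ignoring case) can be sketched on a simplified model. The Entry struct and find_cover function are hypothetical; the sketch covers only the matching step, not data retrieval or decryption, and it assumes the attribute in question is a properties string.

```rust
/// Hypothetical, simplified manifest entry
/// (an assumption for illustration, not the library's ManifestItem).
pub struct Entry {
    pub id: String,
    pub properties: Option<String>,
    pub mime: String,
}

/// Returns the first entry whose ID or properties contain "cover",
/// compared without regard to ASCII case.
pub fn find_cover(entries: &[Entry]) -> Option<&Entry> {
    entries.iter().find(|e| {
        e.id.to_ascii_lowercase().contains("cover")
            || e.properties
                .as_deref()
                .map_or(false, |p| p.to_ascii_lowercase().contains("cover"))
    })
}
```

Because a match may be an XHTML cover page rather than an image, a caller would still inspect the mime field of the result before treating it as image data.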
Navigates to a specified chapter using the spine index
This function retrieves the content data of the corresponding chapter based on the index position in the EPUB spine. The spine defines the linear reading order of the publication’s content documents, and each spine item references a resource in the manifest.
§Parameters
- index: The index position in the spine, starting from 0
§Return
- Some((Vec<u8>, String)): The retrieved chapter content data and its MIME type
- None: Index out of range or data retrieval error
§Notes
- The index must be less than the total number of spine items.
- If the resource is encrypted, it will be automatically decrypted before returning. (TODO)
- This function does not check whether the spine item is part of the linear reading order.
pub fn spine_prev(&self) -> Option<(Vec<u8>, String)>
Navigates to the previous linear reading chapter
This function searches backwards in the EPUB spine for the previous linear reading chapter and returns the content data of that chapter. It only navigates to chapters marked as linear reading.
§Return
- Some((Vec<u8>, String)): The retrieved content data of the previous chapter and its MIME type
- None: Already at the first chapter, the current chapter is not linear, or data retrieval failed
pub fn spine_next(&mut self) -> Option<(Vec<u8>, String)>
Navigates to the next linear reading chapter
This function searches forwards in the EPUB spine for the next linear reading chapter and returns the content data of that chapter. It only navigates to chapters marked as linear reading.
§Return
- Some((Vec<u8>, String)): The retrieved content data of the next chapter and its MIME type
- None: Already at the last chapter, the current chapter is not linear, or data retrieval failed
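The backward and forward searches over linear spine items can be sketched as index lookups. The SpineEntry struct and the two functions below are hypothetical and model only the index selection, including the documented None case for a non-linear current chapter; they do not load chapter data and are not the library's implementation.

```rust
/// Hypothetical, simplified spine entry
/// (an assumption for illustration, not the library's SpineItem).
pub struct SpineEntry {
    pub idref: String,
    pub linear: bool,
}

/// Index of the nearest linear entry after `current`.
/// Returns `None` if `current` is out of range, is itself non-linear,
/// or no linear entry follows it.
pub fn next_linear(spine: &[SpineEntry], current: usize) -> Option<usize> {
    if !spine.get(current)?.linear {
        return None;
    }
    spine
        .iter()
        .enumerate()
        .skip(current + 1)
        .find(|(_, entry)| entry.linear)
        .map(|(index, _)| index)
}

/// Index of the nearest linear entry before `current`.
/// Returns `None` if `current` is out of range, is itself non-linear,
/// or no linear entry precedes it.
pub fn prev_linear(spine: &[SpineEntry], current: usize) -> Option<usize> {
    if !spine.get(current)?.linear {
        return None;
    }
    spine[..current].iter().rposition(|entry| entry.linear)
}
```

Non-linear entries between two linear chapters are skipped in both directions, which matches the description that navigation only lands on chapters marked as linear reading.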
pub fn spine_current(&self) -> Option<(Vec<u8>, String)>
Retrieves the content data of the current chapter
This function returns the content data of the chapter at the current index position in the EPUB spine.
§Return
- Some((Vec<u8>, String)): The retrieved content data of the current chapter and its MIME type
- None: Data retrieval failed
impl EpubDoc<BufReader<File>>
pub fn new<P: AsRef<Path>>(path: P) -> Result<Self, EpubError>
Creates a new EPUB document instance
This function is a convenience constructor for EpubDoc, used to create an EPUB parser instance directly from a file path.
§Parameters
- path: The path to the EPUB file
§Return
- Ok(EpubDoc): The created EPUB document instance
- Err(EpubError): An error occurred during initialization