pub struct Document {
    pub version: String,
    pub binary_mark: Vec<u8>,
    pub trailer: Dictionary,
    pub reference_table: Xref,
    pub objects: BTreeMap<ObjectId, Object>,
    pub max_id: u32,
    pub max_bookmark_id: u32,
    pub bookmarks: Vec<u32>,
    pub bookmark_table: HashMap<u32, Bookmark>,
    pub xref_start: usize,
    pub encryption_state: Option<EncryptionState>,
}
A PDF document.
This can be either a combination of multiple incremental updates or just one (the last) incremental update in a PDF file.
Fields
version: String
The version of the PDF specification to which the file conforms.
binary_mark: Vec<u8>
The binary mark, important for PDF/A-2 and PDF/A-3, tells software tools to classify the file as containing 8-bit binary data that should be preserved during processing.
trailer: Dictionary
The trailer gives the location of the cross-reference table and of certain special objects.
reference_table: Xref
The cross-reference table contains locations of the indirect objects.
objects: BTreeMap<ObjectId, Object>
The objects that make up the document contained in the file.
max_id: u32
Current maximum object id within the document.
max_bookmark_id: u32
Current maximum object id within Bookmarks.
bookmarks: Vec<u32>
The bookmarks in the document. Rendered at the very end of the document, after renumbering objects.
bookmark_table: HashMap<u32, Bookmark>
Used to locate a stored Bookmark by its id so that children can be appended to it. Without this table, we would need recursive lookups through the bookmarks' internal layout Vec.
xref_start: usize
The byte the cross-reference table starts at.
This value is only set during reading, but not when writing the file.
It is used to support incremental updates in PDFs.
Default value is 0.
encryption_state: Option<EncryptionState>
The encryption state stores the parameters that were used to decrypt this document if the document has been decrypted.
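The relationship between bookmarks, bookmark_table and max_bookmark_id can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not lopdf's implementation: the `Outline` and `Bookmark` types below are hypothetical stand-ins, and falling back to the top level when the parent id is unknown is a choice made for this example.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced bookmark: a title plus the ids of its children.
struct Bookmark {
    title: String,
    children: Vec<u32>,
}

/// Hypothetical container mirroring the three bookmark-related fields.
#[derive(Default)]
struct Outline {
    max_bookmark_id: u32,
    bookmarks: Vec<u32>, // ids of top-level bookmarks
    bookmark_table: HashMap<u32, Bookmark>,
}

impl Outline {
    /// Store a bookmark and attach it to `parent` with a single table lookup.
    fn add_bookmark(&mut self, title: &str, parent: Option<u32>) -> u32 {
        self.max_bookmark_id += 1;
        let id = self.max_bookmark_id;

        // Direct lookup by id: no recursive walk over a nested layout.
        let parent_entry = match parent {
            Some(p) => self.bookmark_table.get_mut(&p),
            None => None,
        };
        match parent_entry {
            Some(entry) => entry.children.push(id),
            // Assumption of this sketch: unknown or absent parent means top level.
            None => self.bookmarks.push(id),
        }

        self.bookmark_table.insert(
            id,
            Bookmark {
                title: title.to_string(),
                children: Vec::new(),
            },
        );
        id
    }
}
```

Because every bookmark is reachable by id in the table, appending a child costs one hash lookup regardless of how deeply the parent is nested.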
Implementations
impl Document
pub fn new_from_prev(prev: &Document) -> Self
Create a new PDF document that is an incremental update to a previous document.
pub fn adjust_zero_pages(&mut self)
Adjusts the parents that have an ObjectId of (0, _) to that of their first child. Recurses through all entries until all parents of children are set. This should be run before building the final bookmark objects but after renumbering objects.
pub fn dereference<'a>(&'a self, object: &'a Object) -> Result<(Option<ObjectId>, &'a Object)>
Follow references if the supplied object is a reference.
Returns a tuple of an optional object id and final object. The object id will be None if the object was not a reference. Otherwise, it will be the last object id in the reference chain.
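The return contract described above can be sketched with a toy object store. This is an illustrative model only: the `Object` enum is a hypothetical two-variant stand-in, the error type is a plain `String`, and the hop limit used to detect cycles is an assumption of this sketch rather than a statement about how lopdf guards against them.

```rust
use std::collections::BTreeMap;

/// (object number, generation number), the same shape as an `ObjectId`.
type ObjectId = (u32, u16);

/// Hypothetical, heavily reduced stand-in for a PDF object.
#[derive(Debug, PartialEq)]
enum Object {
    Integer(i64),
    Reference(ObjectId),
}

/// Follow `object` through `objects` until a non-reference is reached.
/// The returned id is the last one in the chain, or `None` if `object`
/// was not a reference to begin with.
fn dereference<'a>(
    objects: &'a BTreeMap<ObjectId, Object>,
    object: &'a Object,
) -> Result<(Option<ObjectId>, &'a Object), String> {
    let mut current = object;
    let mut last_id = None;
    // A cycle-free chain visits each stored object at most once, so more
    // than `objects.len()` hops means the references loop.
    for _ in 0..=objects.len() {
        match current {
            Object::Reference(id) => {
                last_id = Some(*id);
                current = objects
                    .get(id)
                    .ok_or_else(|| format!("missing object {:?}", id))?;
            }
            _ => return Ok((last_id, current)),
        }
    }
    Err("reference cycle".to_string())
}
```

Passing a non-reference yields `(None, object)` unchanged, while a chain such as `1 0 R -> 2 0 R -> 7` yields the id of the last reference followed together with the final value.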
pub fn get_object(&self, id: ObjectId) -> Result<&Object>
Get an object by object id; iteratively dereferences a referenced object.
pub fn has_object(&self, id: ObjectId) -> bool
Determines if an object with the given ObjectId exists in the current document (or incremental update). Returns true if the object exists, false if it does not.
pub fn get_object_mut(&mut self, id: ObjectId) -> Result<&mut Object>
Get a mutable reference to an object by object ID; iteratively dereferences a referenced object.
pub fn get_object_page(&self, id: ObjectId) -> Result<ObjectId>
Get the object ID of the page that contains id.
pub fn get_dictionary(&self, id: ObjectId) -> Result<&Dictionary>
Get dictionary object by id.
pub fn get_dictionary_mut(&mut self, id: ObjectId) -> Result<&mut Dictionary>
Get a mutable dictionary object by id.
pub fn get_dict_in_dict<'a>(&'a self, node: &'a Dictionary, key: &[u8]) -> Result<&'a Dictionary>
Get dictionary in dictionary by key.
pub fn traverse_objects<A: Fn(&mut Object)>(&mut self, action: A) -> Vec<ObjectId>
Traverse objects from trailer recursively, return all referenced object IDs.
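The idea of collecting every object id reachable from the trailer can be sketched on a toy object graph. The following is a simplified model, not lopdf's code: the `Object` enum is a hypothetical stand-in, the `action` callback of the real method is left out, and returning the ids in ascending order is a choice made for this example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// (object number, generation number), the same shape as an `ObjectId`.
type ObjectId = (u32, u16);

/// Hypothetical, heavily reduced stand-in for a PDF object.
enum Object {
    Integer(i64),
    Array(Vec<Object>),
    Reference(ObjectId),
}

/// Push every reference found directly inside `object` onto `out`.
fn collect_refs(object: &Object, out: &mut Vec<ObjectId>) {
    match object {
        Object::Reference(id) => out.push(*id),
        Object::Array(items) => {
            for item in items {
                collect_refs(item, out);
            }
        }
        Object::Integer(_) => {}
    }
}

/// Return the ids of all objects referenced from `trailer`, directly or
/// indirectly, in ascending order.
fn referenced_ids(objects: &BTreeMap<ObjectId, Object>, trailer: &Object) -> Vec<ObjectId> {
    let mut seen = BTreeSet::new();
    let mut pending = Vec::new();
    collect_refs(trailer, &mut pending);
    while let Some(id) = pending.pop() {
        // `insert` returns false for ids already visited, which also breaks cycles.
        if seen.insert(id) {
            if let Some(object) = objects.get(&id) {
                collect_refs(object, &mut pending);
            }
        }
    }
    seen.into_iter().collect()
}
```

An object that is stored but never referenced from the trailer does not appear in the result.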
pub fn get_encrypted(&self) -> Result<&Dictionary>
Return dictionary with encryption information
pub fn is_encrypted(&self) -> bool
Return true if the PDF document is encrypted.
pub fn authenticate_raw_owner_password<P>(&self, password: P) -> Result<()>
Authenticate the provided owner password directly as bytes without sanitization
pub fn authenticate_raw_user_password<P>(&self, password: P) -> Result<()>
Authenticate the provided user password directly as bytes without sanitization
pub fn authenticate_raw_password<P>(&self, password: P) -> Result<()>
Authenticate the provided owner/user password as bytes without sanitization
pub fn authenticate_owner_password(&self, password: &str) -> Result<()>
Authenticate the provided owner password
pub fn authenticate_user_password(&self, password: &str) -> Result<()>
Authenticate the provided user password
pub fn authenticate_password(&self, password: &str) -> Result<()>
Authenticate the provided owner/user password
pub fn get_crypt_filters(&self) -> BTreeMap<Vec<u8>, Arc<dyn CryptFilter>>
Returns a BTreeMap of the crypt filters available in the PDF document, if any.
pub fn encrypt(&mut self, state: &EncryptionState) -> Result<()>
Replaces all Strings and Streams with their encrypted contents.
pub fn decrypt(&mut self, password: &str) -> Result<()>
Replaces all encrypted Strings and Streams with their decrypted contents
pub fn decrypt_raw<P>(&mut self, password: P) -> Result<()>
Replaces all encrypted Strings and Streams with their decrypted contents, using the password provided directly as bytes without sanitization.
pub fn catalog(&self) -> Result<&Dictionary>
Return the PDF document catalog, which is the root of the document’s object graph.
pub fn catalog_mut(&mut self) -> Result<&mut Dictionary>
Return a mutable reference to the PDF document catalog, which is the root of the document’s object graph.
pub fn get_pages(&self) -> BTreeMap<u32, ObjectId>
Get page numbers and corresponding object ids.
pub fn page_iter(&self) -> impl Iterator<Item = ObjectId> + '_
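The map returned by get_pages and the iterator returned by page_iter are two views of the same page sequence. The sketch below shows one way such a map could be derived from an iterator of page object ids. It assumes that page numbers start at 1 and follow iteration order, which is not stated on this page; `number_pages` is a hypothetical helper, not part of the API.

```rust
use std::collections::BTreeMap;

/// (object number, generation number), the same shape as an `ObjectId`.
type ObjectId = (u32, u16);

/// Number pages 1, 2, 3, ... in the order the iterator yields them.
/// Assumption of this sketch: numbering is 1-based.
fn number_pages<I: IntoIterator<Item = ObjectId>>(page_ids: I) -> BTreeMap<u32, ObjectId> {
    page_ids
        .into_iter()
        .zip(1u32..)
        .map(|(id, number)| (number, id))
        .collect()
}
```

Page numbers are positions in the document, whereas object ids are arbitrary, so the first page need not have the lowest object id.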
pub fn get_page_contents(&self, page_id: ObjectId) -> Vec<ObjectId>
Get content stream object ids of a page.
pub fn add_page_contents(&mut self, page_id: ObjectId, content: Vec<u8>) -> Result<()>
Add content to a page. All existing content will be unchanged.
pub fn get_page_resources(&self, page_id: ObjectId) -> Result<(Option<&Dictionary>, Vec<ObjectId>)>
Get resources used by a page.
pub fn get_page_fonts(&self, page_id: ObjectId) -> Result<BTreeMap<Vec<u8>, &Dictionary>>
Get fonts used by a page.
pub fn get_page_annotations(&self, page_id: ObjectId) -> Result<Vec<&Dictionary>>
Get the PDF annotations of a page. The /Subtype of each annotation dictionary defines the annotation type (Text, Link, Highlight, Underline, Ink, Popup, Widget, etc.). The /Rect of an annotation dictionary defines its location on the page.
pub fn get_page_images(&self, page_id: ObjectId) -> Result<Vec<PdfImage<'_>>>
pub fn decode_text(encoding: &Encoding<'_>, bytes: &[u8]) -> Result<String>
pub fn encode_text(encoding: &Encoding<'_>, text: &str) -> Vec<u8>
impl Document
pub fn add_bookmark(&mut self, bookmark: Bookmark, parent: Option<u32>) -> u32
pub fn build_outline(&mut self) -> Option<ObjectId>
impl Document
pub fn with_version<S: Into<String>>(version: S) -> Document
Create new PDF document with version.
pub fn new_object_id(&mut self) -> ObjectId
Create an object ID.
pub fn add_object<T: Into<Object>>(&mut self, object: T) -> ObjectId
Add PDF object into document’s object list.
pub fn set_object<T: Into<Object>>(&mut self, id: ObjectId, object: T)
pub fn remove_object(&mut self, object_id: &ObjectId) -> Result<()>
Remove PDF object from document’s object list.
pub fn get_or_create_resources(&mut self, page_id: ObjectId) -> Result<&mut Object>
Get the page’s resource dictionary.
Get Object that has the key “Resources”.
impl Document
pub fn get_named_destinations(&self, tree: &Dictionary, named_destinations: &mut IndexMap<Vec<u8>, Destination>) -> Result<()>
impl Document
pub fn get_outline(&self, node: &Dictionary, named_destinations: &mut IndexMap<Vec<u8>, Destination>) -> Result<Option<Outline>>
pub fn get_outlines(&self, node: Option<Object>, outlines: Option<Vec<Outline>>, named_destinations: &mut IndexMap<Vec<u8>, Destination>) -> Result<Option<Vec<Outline>>>
impl Document
pub fn change_producer(&mut self, producer: &str)
Change producer of document information dictionary.
pub fn decompress(&mut self)
Decompress PDF stream objects.
pub fn delete_pages(&mut self, page_numbers: &[u32])
Delete pages.
pub fn prune_objects(&mut self) -> Vec<ObjectId>
Prune all unused objects.
pub fn delete_object(&mut self, id: ObjectId) -> Option<Object>
Delete object by object ID.
pub fn delete_zero_length_streams(&mut self) -> Vec<ObjectId>
Delete zero length stream objects.
pub fn renumber_objects(&mut self)
Renumber objects, normally called after delete_unused_objects.
pub fn renumber_bookmarks(&mut self, old: &ObjectId, new: &ObjectId)
pub fn renumber_objects_with(&mut self, starting_id: u32)
Renumber objects with a custom starting id. This is useful when objects from multiple documents are inserted into a single main document, since each document can be given its own non-overlapping id range.
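Renumbering from a custom starting id can be sketched on a toy object store: every object receives a new sequential id and every reference is rewritten to match. This is an illustration under stated assumptions, not lopdf's implementation. The `Object` enum and `renumber_with` are hypothetical, new ids are assigned in ascending order of the old ids, and generation numbers are kept as they are.

```rust
use std::collections::BTreeMap;

/// (object number, generation number), the same shape as an `ObjectId`.
type ObjectId = (u32, u16);

/// Hypothetical, heavily reduced stand-in for a PDF object.
#[derive(Debug, PartialEq)]
enum Object {
    Integer(i64),
    Reference(ObjectId),
}

/// Assign ids `starting_id`, `starting_id + 1`, ... and rewrite references.
fn renumber_with(
    objects: BTreeMap<ObjectId, Object>,
    starting_id: u32,
) -> BTreeMap<ObjectId, Object> {
    // Old id -> new id, in ascending order of the old ids.
    let mapping: BTreeMap<ObjectId, ObjectId> = objects
        .keys()
        .zip(starting_id..)
        .map(|(old, new)| (*old, (new, old.1)))
        .collect();

    objects
        .into_iter()
        .map(|(old, object)| {
            let object = match object {
                // References to unknown ids are left untouched in this sketch.
                Object::Reference(target) => {
                    Object::Reference(*mapping.get(&target).unwrap_or(&target))
                }
                other => other,
            };
            (mapping[&old], object)
        })
        .collect()
}
```

When merging, a second document could be renumbered to start just above the first document's highest id so that the two id ranges cannot collide.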
pub fn change_content_stream(&mut self, stream_id: ObjectId, content: Vec<u8>)
pub fn change_page_content(&mut self, page_id: ObjectId, content: Vec<u8>) -> Result<()>
pub fn extract_stream(&self, stream_id: ObjectId, decompress: bool) -> Result<()>
impl Document
pub fn get_and_decode_page_content(&self, page_id: ObjectId) -> Result<Content<Vec<Operation>>>
Get decoded page content.
pub fn add_to_page_content(&mut self, page_id: ObjectId, content: Content<Vec<Operation>>) -> Result<()>
Add content to a page. All existing content will be unchanged.
pub fn extract_text(&self, page_numbers: &[u32]) -> Result<String>
pub fn extract_text_chunks(&self, page_numbers: &[u32]) -> Vec<Result<String>>
pub fn replace_text(&mut self, page_number: u32, text: &str, other_text: &str) -> Result<()>
pub fn insert_image(&mut self, page_id: ObjectId, img_object: Stream, position: (f32, f32), size: (f32, f32)) -> Result<()>
pub fn insert_form_object(&mut self, page_id: ObjectId, form_obj: Stream) -> Result<()>
impl Document
pub fn load<P: AsRef<Path>>(path: P) -> Result<Document>
Load a PDF document from a specified file path.
pub fn load_filtered<P: AsRef<Path>>(path: P, filter_func: fn((u32, u16), &mut Object) -> Option<((u32, u16), Object)>) -> Result<Document>
Trait Implementations
impl TryFrom<&Document> for PasswordAlgorithm
Auto Trait Implementations
impl Freeze for Document
impl !RefUnwindSafe for Document
impl Send for Document
impl Sync for Document
impl Unpin for Document
impl !UnwindSafe for Document
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.