Struct Document

pub struct Document {
    pub version: String,
    pub binary_mark: Vec<u8>,
    pub trailer: Dictionary,
    pub reference_table: Xref,
    pub objects: BTreeMap<ObjectId, Object>,
    pub max_id: u32,
    pub max_bookmark_id: u32,
    pub bookmarks: Vec<u32>,
    pub bookmark_table: HashMap<u32, Bookmark>,
    pub xref_start: usize,
    pub encryption_state: Option<EncryptionState>,
}

A PDF document.

This can be either a combination of multiple incremental updates or just one (the last) incremental update in a PDF file.

Fields

version: String

The version of the PDF specification to which the file conforms.

binary_mark: Vec<u8>

The binary mark, which is important for PDF/A-2 and PDF/A-3, tells software tools to classify the file as containing 8-bit binary data that should be preserved during processing.
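
As a rough illustration of what makes a mark “binary”: the PDF specification recommends that the header be followed by a comment line containing at least four bytes with values of 128 or greater. The sketch below checks that property with the standard library only. `is_valid_binary_mark` is a hypothetical helper, not part of this crate, and it is stricter than the recommendation because it requires every byte in the mark to qualify.

```rust
/// Hypothetical helper (not part of the crate): returns true when the mark
/// holds at least four bytes and every byte is 128 or greater, the property
/// that leads transfer tools to treat the file as binary rather than text.
fn is_valid_binary_mark(mark: &[u8]) -> bool {
    mark.len() >= 4 && mark.iter().all(|&b| b >= 0x80)
}
```

For example, the four bytes 0xBB 0xAD 0xC0 0xDE pass this check, while the ASCII bytes of “%PDF” do not.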

trailer: Dictionary

The trailer gives the location of the cross-reference table and of certain special objects.

reference_table: Xref

The cross-reference table contains locations of the indirect objects.

objects: BTreeMap<ObjectId, Object>

The objects that make up the document contained in the file.

max_id: u32

Current maximum object id within the document.

max_bookmark_id: u32

Current maximum id among the bookmarks.

bookmarks: Vec<u32>

The bookmarks in the document. They are rendered at the very end of the document, after objects have been renumbered.

bookmark_table: HashMap<u32, Bookmark>

Used to locate a stored Bookmark by its id so that children can be appended to it. Without this table, appending a child would require recursive lookups through the bookmarks’ internal layout Vec.
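
The id-indexed table can be pictured with a small standard-library model. This is a sketch under stated assumptions, not the crate’s implementation: `BookmarkModel` and the two `Bookmark` fields shown here are simplified stand-ins, and falling back to the top level when the parent id is unknown is a choice made for this example.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's Bookmark type (fields are assumptions).
#[derive(Debug, Clone)]
struct Bookmark {
    title: String,
    children: Vec<u32>,
}

/// Minimal model of the three bookmark-related fields of Document.
#[derive(Default)]
struct BookmarkModel {
    max_bookmark_id: u32,
    bookmarks: Vec<u32>,                    // top-level bookmark ids, in order
    bookmark_table: HashMap<u32, Bookmark>, // id -> stored bookmark
}

impl BookmarkModel {
    /// Store a bookmark and attach it to `parent`, or to the top level when
    /// `parent` is None or unknown. Returns the id assigned to the bookmark.
    fn add_bookmark(&mut self, bookmark: Bookmark, parent: Option<u32>) -> u32 {
        self.max_bookmark_id += 1;
        let id = self.max_bookmark_id;
        // A single hash lookup finds the parent; no recursive walk is needed.
        let parent_entry = match parent {
            Some(p) => self.bookmark_table.get_mut(&p),
            None => None,
        };
        match parent_entry {
            Some(parent_bookmark) => parent_bookmark.children.push(id),
            None => self.bookmarks.push(id),
        }
        self.bookmark_table.insert(id, bookmark);
        id
    }
}
```

Because every bookmark is reachable by id, appending a child costs one lookup regardless of how deeply the parent is nested.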

xref_start: usize

The byte offset at which the cross-reference table starts. This value is set only when reading a file, not when writing one. It is used to support incremental updates in PDFs. The default value is 0.

encryption_state: Option<EncryptionState>

The encryption state stores the parameters that were used to decrypt this document if the document has been decrypted.

Implementations

impl Document

pub fn new() -> Self

Create a new PDF document.

pub fn new_from_prev(prev: &Document) -> Self

Create a new PDF document that is an incremental update to a previous document.

pub fn adjust_zero_pages(&mut self)

Adjusts parents that have an ObjectId of (0, _) to that of their first child, recursing through all entries until every parent is set. This should be run before building the final bookmark objects but after renumbering objects.

pub fn dereference<'a>( &'a self, object: &'a Object, ) -> Result<(Option<ObjectId>, &'a Object)>

Follow references if the supplied object is a reference.

Returns a tuple of an optional object id and final object. The object id will be None if the object was not a reference. Otherwise, it will be the last object id in the reference chain.
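
The return convention can be illustrated with a small standard-library model. This is a sketch, not the crate’s code: the two-variant `Object` enum, the `String` error type, and the `MAX_HOPS` limit that guards against reference cycles are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// (object number, generation number), mirroring the (u32, u16) id shape.
type ObjectId = (u32, u16);

/// Tiny stand-in for the crate's Object enum; only two variants are modeled.
#[derive(Debug, PartialEq)]
enum Object {
    Integer(i64),
    Reference(ObjectId),
}

/// Assumed cap on chain length so that a reference cycle cannot loop forever.
const MAX_HOPS: usize = 128;

/// Follow `object` through `objects` while it is a reference. Returns the id
/// of the last reference followed (None if `object` was not a reference) and
/// the final object, or an error for a dangling reference or an overlong chain.
fn dereference<'a>(
    objects: &'a BTreeMap<ObjectId, Object>,
    mut object: &'a Object,
) -> Result<(Option<ObjectId>, &'a Object), String> {
    let mut last_id = None;
    for _ in 0..MAX_HOPS {
        match object {
            Object::Reference(id) => {
                let id = *id;
                object = objects
                    .get(&id)
                    .ok_or_else(|| format!("missing object {:?}", id))?;
                last_id = Some(id);
            }
            _ => return Ok((last_id, object)),
        }
    }
    Err("reference chain too long".to_string())
}
```

With object (1, 0) holding a reference to (2, 0) and (2, 0) holding an integer, dereferencing a reference to (1, 0) yields the id (2, 0) together with the integer, while dereferencing a plain integer yields None and the integer itself.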

pub fn get_object(&self, id: ObjectId) -> Result<&Object>

Get an object by object ID, iteratively dereferencing it if it is a reference.

pub fn has_object(&self, id: ObjectId) -> bool

Determines whether an object with the given ObjectId exists in the current document (or incremental update). Returns true if the object exists and false otherwise.

pub fn get_object_mut(&mut self, id: ObjectId) -> Result<&mut Object>

Get a mutable reference to an object by object ID, iteratively dereferencing it if it is a reference.

pub fn get_object_page(&self, id: ObjectId) -> Result<ObjectId>

Get the object ID of the page that contains id.

pub fn get_dictionary(&self, id: ObjectId) -> Result<&Dictionary>

Get dictionary object by id.

pub fn get_dictionary_mut(&mut self, id: ObjectId) -> Result<&mut Dictionary>

Get a mutable dictionary object by id.

pub fn get_dict_in_dict<'a>( &'a self, node: &'a Dictionary, key: &[u8], ) -> Result<&'a Dictionary>

Get a dictionary nested inside another dictionary by key.

pub fn traverse_objects<A: Fn(&mut Object)>( &mut self, action: A, ) -> Vec<ObjectId>

Traverse objects recursively, starting from the trailer, and return all referenced object IDs.

pub fn get_encrypted(&self) -> Result<&Dictionary>

Return the dictionary with encryption information.

pub fn is_encrypted(&self) -> bool

Return true if the PDF document is encrypted.

pub fn authenticate_raw_owner_password<P>(&self, password: P) -> Result<()>
where P: AsRef<[u8]>,

Authenticate the provided owner password directly as bytes, without sanitization.

pub fn authenticate_raw_user_password<P>(&self, password: P) -> Result<()>
where P: AsRef<[u8]>,

Authenticate the provided user password directly as bytes, without sanitization.

pub fn authenticate_raw_password<P>(&self, password: P) -> Result<()>
where P: AsRef<[u8]>,

Authenticate the provided owner/user password as bytes, without sanitization.

pub fn authenticate_owner_password(&self, password: &str) -> Result<()>

Authenticate the provided owner password.

pub fn authenticate_user_password(&self, password: &str) -> Result<()>

Authenticate the provided user password.

pub fn authenticate_password(&self, password: &str) -> Result<()>

Authenticate the provided owner/user password.

pub fn get_crypt_filters(&self) -> BTreeMap<Vec<u8>, Arc<dyn CryptFilter>>

Returns a BTreeMap of the crypt filters available in the PDF document if any.

pub fn encrypt(&mut self, state: &EncryptionState) -> Result<()>

Replaces all Strings and Streams with their encrypted contents.

pub fn decrypt(&mut self, password: &str) -> Result<()>

Replaces all encrypted Strings and Streams with their decrypted contents.

pub fn decrypt_raw<P>(&mut self, password: P) -> Result<()>
where P: AsRef<[u8]>,

Replaces all encrypted Strings and Streams with their decrypted contents, using the password provided directly as bytes without sanitization.

pub fn catalog(&self) -> Result<&Dictionary>

Return the PDF document catalog, which is the root of the document’s object graph.

pub fn catalog_mut(&mut self) -> Result<&mut Dictionary>

Return a mutable reference to the PDF document catalog, which is the root of the document’s object graph.

pub fn get_pages(&self) -> BTreeMap<u32, ObjectId>

Get page numbers and corresponding object ids.

pub fn page_iter(&self) -> impl Iterator<Item = ObjectId> + '_

pub fn get_page_contents(&self, page_id: ObjectId) -> Vec<ObjectId>

Get content stream object ids of a page.

pub fn add_page_contents( &mut self, page_id: ObjectId, content: Vec<u8>, ) -> Result<()>

Add content to a page. All existing content will be unchanged.

pub fn get_page_content(&self, page_id: ObjectId) -> Result<Vec<u8>>

Get content of a page.

pub fn get_page_resources( &self, page_id: ObjectId, ) -> Result<(Option<&Dictionary>, Vec<ObjectId>)>

Get resources used by a page.

pub fn get_page_fonts( &self, page_id: ObjectId, ) -> Result<BTreeMap<Vec<u8>, &Dictionary>>

Get fonts used by a page.

pub fn get_page_annotations( &self, page_id: ObjectId, ) -> Result<Vec<&Dictionary>>

Get the PDF annotations of a page. The /Subtype of each annotation dictionary defines the annotation type (Text, Link, Highlight, Underline, Ink, Popup, Widget, etc.). The /Rect of an annotation dictionary defines its location on the page.

pub fn get_page_images(&self, page_id: ObjectId) -> Result<Vec<PdfImage<'_>>>

pub fn decode_text(encoding: &Encoding<'_>, bytes: &[u8]) -> Result<String>

pub fn encode_text(encoding: &Encoding<'_>, text: &str) -> Vec<u8>

impl Document

pub fn add_bookmark(&mut self, bookmark: Bookmark, parent: Option<u32>) -> u32

pub fn build_outline(&mut self) -> Option<ObjectId>

impl Document

pub fn with_version<S: Into<String>>(version: S) -> Document

Create a new PDF document with the given version.

pub fn new_object_id(&mut self) -> ObjectId

Create an object ID.

pub fn add_object<T: Into<Object>>(&mut self, object: T) -> ObjectId

Add a PDF object to the document’s object list.

pub fn set_object<T: Into<Object>>(&mut self, id: ObjectId, object: T)

pub fn remove_object(&mut self, object_id: &ObjectId) -> Result<()>

Remove a PDF object from the document’s object list.

pub fn get_or_create_resources( &mut self, page_id: ObjectId, ) -> Result<&mut Object>

Get the page’s resource dictionary, creating it if it does not exist.

Returns the Object stored under the key “Resources”.

pub fn add_xobject<N: Into<Vec<u8>>>( &mut self, page_id: ObjectId, xobject_name: N, xobject_id: ObjectId, ) -> Result<()>

Add an XObject to a page.

The XObject is registered by name under the page’s Resources -> XObject dictionary.

pub fn add_graphics_state<N: Into<Vec<u8>>>( &mut self, page_id: ObjectId, gs_name: N, gs_id: ObjectId, ) -> Result<()>

Add a graphics state to a page.

The graphics state is registered by name under the page’s Resources -> ExtGState dictionary.

impl Document

pub fn get_named_destinations( &self, tree: &Dictionary, named_destinations: &mut IndexMap<Vec<u8>, Destination>, ) -> Result<()>

impl Document

pub fn get_outline( &self, node: &Dictionary, named_destinations: &mut IndexMap<Vec<u8>, Destination>, ) -> Result<Option<Outline>>

pub fn get_outlines( &self, node: Option<Object>, outlines: Option<Vec<Outline>>, named_destinations: &mut IndexMap<Vec<u8>, Destination>, ) -> Result<Option<Vec<Outline>>>

impl Document

pub fn change_producer(&mut self, producer: &str)

Change producer of document information dictionary.

pub fn compress(&mut self)

Compress PDF stream objects.

pub fn decompress(&mut self)

Decompress PDF stream objects.

pub fn delete_pages(&mut self, page_numbers: &[u32])

Delete pages.

pub fn prune_objects(&mut self) -> Vec<ObjectId>

Prune all unused objects.

pub fn delete_object(&mut self, id: ObjectId) -> Option<Object>

Delete object by object ID.

pub fn delete_zero_length_streams(&mut self) -> Vec<ObjectId>

Delete zero length stream objects.

pub fn renumber_objects(&mut self)

Renumber objects, normally called after prune_objects.

pub fn renumber_bookmarks(&mut self, old: &ObjectId, new: &ObjectId)

pub fn renumber_objects_with(&mut self, starting_id: u32)

Renumber objects with a custom starting ID. This is useful when objects from multiple documents are inserted into a single main document.
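
The idea of renumbering from a chosen starting ID can be sketched with the standard library alone. The model below is illustrative and not the crate’s implementation: the three-variant `Object` enum, the free function `renumber_with`, and the choice to reset generation numbers to 0 are assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// (object number, generation number), mirroring the (u32, u16) id shape.
type ObjectId = (u32, u16);

/// Tiny stand-in for the crate's Object enum.
#[derive(Debug, PartialEq, Clone)]
enum Object {
    Integer(i64),
    Reference(ObjectId),
    Array(Vec<Object>),
}

/// Rewrite every reference inside `object` according to `mapping`.
fn remap(object: &mut Object, mapping: &BTreeMap<ObjectId, ObjectId>) {
    match object {
        Object::Reference(id) => {
            if let Some(new_id) = mapping.get(&*id) {
                *id = *new_id;
            }
        }
        Object::Array(items) => {
            for item in items.iter_mut() {
                remap(item, mapping);
            }
        }
        Object::Integer(_) => {}
    }
}

/// Assign consecutive object numbers beginning at `starting_id` (generation
/// reset to 0 in this sketch) and update all references to match.
fn renumber_with(
    objects: BTreeMap<ObjectId, Object>,
    starting_id: u32,
) -> BTreeMap<ObjectId, Object> {
    // Old ids are visited in sorted order, so the new numbering is stable.
    let mapping: BTreeMap<ObjectId, ObjectId> = objects
        .keys()
        .zip(starting_id..)
        .map(|(old, new)| (*old, (new, 0)))
        .collect();
    objects
        .into_iter()
        .map(|(old, mut object)| {
            remap(&mut object, &mapping);
            (mapping[&old], object)
        })
        .collect()
}
```

When merging, each incoming document can be renumbered to start just above the highest ID already in use, so that object IDs from different documents never collide.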

pub fn change_content_stream(&mut self, stream_id: ObjectId, content: Vec<u8>)

pub fn change_page_content( &mut self, page_id: ObjectId, content: Vec<u8>, ) -> Result<()>

pub fn extract_stream( &self, stream_id: ObjectId, decompress: bool, ) -> Result<()>

impl Document

pub fn get_toc(&self) -> Result<Toc>

impl Document

pub fn save<P: AsRef<Path>>(&mut self, path: P) -> Result<File>

Save PDF document to specified file path.

pub fn save_to<W: Write>(&mut self, target: &mut W) -> Result<()>

Save the PDF document to an arbitrary target.

impl Document

pub fn get_and_decode_page_content( &self, page_id: ObjectId, ) -> Result<Content<Vec<Operation>>>

Get decoded page content.

pub fn add_to_page_content( &mut self, page_id: ObjectId, content: Content<Vec<Operation>>, ) -> Result<()>

Add content to a page. All existing content will be unchanged.

pub fn extract_text(&self, page_numbers: &[u32]) -> Result<String>

pub fn extract_text_chunks(&self, page_numbers: &[u32]) -> Vec<Result<String>>

pub fn replace_text( &mut self, page_number: u32, text: &str, other_text: &str, ) -> Result<()>

pub fn insert_image( &mut self, page_id: ObjectId, img_object: Stream, position: (f32, f32), size: (f32, f32), ) -> Result<()>

pub fn insert_form_object( &mut self, page_id: ObjectId, form_obj: Stream, ) -> Result<()>

impl Document

pub fn load<P: AsRef<Path>>(path: P) -> Result<Document>

Load a PDF document from a specified file path.

pub fn load_filtered<P: AsRef<Path>>( path: P, filter_func: fn((u32, u16), &mut Object) -> Option<((u32, u16), Object)>, ) -> Result<Document>

pub fn load_from<R: Read>(source: R) -> Result<Document>

Load a PDF document from an arbitrary source.

pub fn load_mem(buffer: &[u8]) -> Result<Document>

Load a PDF document from a memory slice.

Trait Implementations

impl Clone for Document

fn clone(&self) -> Document

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Document

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for Document

fn default() -> Self

Returns the “default value” for a type.

impl TryFrom<&Document> for PasswordAlgorithm

type Error = Error

The type returned in the event of a conversion error.

fn try_from(value: &Document) -> Result<Self, Self::Error>

Performs the conversion.

impl TryInto<Document> for &[u8]

type Error = Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<Document>

Performs the conversion.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V