pub struct ObjectStore<S: PdfSource = Arc<[u8]>> { /* private fields */ }
The central PDF object store with thread-safe lazy parsing.
Generic over the source data type via PdfSource.
Implementations
impl<S: PdfSource> ObjectStore<S>
pub fn open(source: S, mode: ParsingMode) -> Result<Self, PdfError>
Open a PDF source, parsing the header, xref table, and trailer.
This constructs the ObjectStore with all object slots ready for
lazy parsing on demand. For encrypted PDFs, use Self::open_with_password.
pub fn open_with_password(
    source: S,
    mode: ParsingMode,
    password: Option<&str>,
) -> Result<Self, PdfError>
Open a PDF source, optionally providing a password for encrypted documents.
After parsing the header, xref, and trailer, if the trailer contains
an /Encrypt entry, the encryption dictionary is resolved and a
SecurityHandler is created. The password is verified against both
user and owner password hashes.
pub fn resolve(&self, id: ObjectId) -> Result<&Object, PdfError>
Resolve an indirect object by its ID. Returns a reference to the lazily-parsed object.
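The store's private fields are not shown on this page, so the following is only a minimal sketch of how a lazily parsed slot can hand out shared references safely across threads. It assumes std::sync::OnceLock as the cell type; ToyStore, Slot, ToyObject, and the parse_calls counter are hypothetical names for illustration, not part of this crate.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a parsed PDF object.
#[derive(Debug, PartialEq)]
enum ToyObject {
    Integer(i64),
    Null,
}

/// One slot per object: the raw text plus a cell that is filled on first access.
struct Slot {
    raw: &'static str,
    parsed: OnceLock<ToyObject>,
}

/// Hypothetical store; this is NOT the real ObjectStore layout.
struct ToyStore {
    slots: HashMap<u32, Slot>,
    /// Counts how often the (toy) parser actually ran.
    parse_calls: AtomicUsize,
}

impl ToyStore {
    fn new(entries: &[(u32, &'static str)]) -> Self {
        let slots = entries
            .iter()
            .map(|&(num, raw)| (num, Slot { raw, parsed: OnceLock::new() }))
            .collect();
        ToyStore { slots, parse_calls: AtomicUsize::new(0) }
    }

    /// Parses the slot on first use, then returns the cached object by reference.
    fn resolve(&self, num: u32) -> Option<&ToyObject> {
        let slot = self.slots.get(&num)?;
        Some(slot.parsed.get_or_init(|| {
            self.parse_calls.fetch_add(1, Ordering::Relaxed);
            // Toy parser: an integer, or Null for anything else.
            slot.raw
                .trim()
                .parse::<i64>()
                .map(ToyObject::Integer)
                .unwrap_or(ToyObject::Null)
        }))
    }
}
```

Because OnceLock initializes its value at most once, resolve can take &self and still return a plain reference; repeated calls for the same number reuse the cached object.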
pub fn deep_resolve<'a>(
    &'a self,
    obj: &'a Object,
) -> Result<&'a Object, PdfError>
Follow a reference chain to the concrete object.
Uses an iterative loop with circular reference detection
via SmallVec. Never recurses on the call stack.
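The iterative loop described above might look roughly like the sketch below. It is a simplified model, not the crate's code: ToyObject, ToyError, and the HashMap-backed store are hypothetical, and a plain Vec stands in for SmallVec to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical object model: a concrete value or a reference to another object.
#[derive(Debug, PartialEq)]
enum ToyObject {
    Integer(i64),
    Reference(u32),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ToyError {
    CircularReference(u32),
    MissingObject(u32),
}

/// Follows references until a non-reference object is reached.
/// Uses a loop, so the depth of the chain never grows the call stack.
fn deep_resolve<'a>(
    store: &'a HashMap<u32, ToyObject>,
    mut obj: &'a ToyObject,
) -> Result<&'a ToyObject, ToyError> {
    // Assumption: Vec replaces SmallVec here; the lookup logic is the same.
    let mut visited: Vec<u32> = Vec::new();
    while let ToyObject::Reference(num) = obj {
        // Seeing the same object number twice means the chain loops back on itself.
        if visited.contains(num) {
            return Err(ToyError::CircularReference(*num));
        }
        visited.push(*num);
        obj = store.get(num).ok_or(ToyError::MissingObject(*num))?;
    }
    Ok(obj)
}
```

A linear search over the visited list is a reasonable trade-off when reference chains are short, which is the usual motivation for a small inline vector over a hash set.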
pub fn dict_resolve<'a>(
    &'a self,
    dict: &'a HashMap<Name, Object>,
    key: &Name,
) -> Result<Option<&'a Object>, PdfError>
Resolve a dictionary value by key, following references.
pub fn decode_stream(&self, stream: &Object) -> Result<Vec<u8>, PdfError>
Decode stream data on demand using the filter chain from the stream dictionary.
Resolves /Filter and /DecodeParms from the stream dictionary,
then applies the codec pipeline via rpdfium_codec::apply_filter_chain.
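To illustrate what applying a filter chain means, here is a self-contained sketch that runs a stream's bytes through a list of filter names in order. It does not call rpdfium_codec; apply_chain, FilterError, and both decoder functions are hypothetical, and only two simple PDF filters (ASCIIHexDecode and RunLengthDecode) are modeled, without /DecodeParms.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum FilterError {
    Unsupported(String),
    Malformed(&'static str),
}

/// ASCIIHexDecode: hex digit pairs, whitespace ignored, '>' marks end of data.
fn ascii_hex_decode(input: &[u8]) -> Result<Vec<u8>, FilterError> {
    let mut digits = Vec::new();
    for &b in input {
        match b {
            b'>' => break,
            b if b.is_ascii_whitespace() => continue,
            b => {
                let d = (b as char)
                    .to_digit(16)
                    .ok_or(FilterError::Malformed("non-hex byte"))?;
                digits.push(d as u8);
            }
        }
    }
    // A trailing odd digit is treated as if followed by 0.
    if digits.len() % 2 == 1 {
        digits.push(0);
    }
    Ok(digits.chunks(2).map(|p| (p[0] << 4) | p[1]).collect())
}

/// RunLengthDecode: 0..=127 copies n+1 literal bytes, 129..=255 repeats the
/// next byte 257-n times, 128 marks end of data.
fn run_length_decode(input: &[u8]) -> Result<Vec<u8>, FilterError> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let n = input[i] as usize;
        i += 1;
        match n {
            128 => break,
            0..=127 => {
                let end = i + n + 1;
                let run = input
                    .get(i..end)
                    .ok_or(FilterError::Malformed("truncated literal run"))?;
                out.extend_from_slice(run);
                i = end;
            }
            _ => {
                let b = *input
                    .get(i)
                    .ok_or(FilterError::Malformed("truncated repeat run"))?;
                out.extend(std::iter::repeat(b).take(257 - n));
                i += 1;
            }
        }
    }
    Ok(out)
}

/// Applies each named filter in order; the output of one feeds the next.
fn apply_chain(filters: &[&str], raw: &[u8]) -> Result<Vec<u8>, FilterError> {
    let mut data = raw.to_vec();
    for name in filters {
        data = match *name {
            "ASCIIHexDecode" => ascii_hex_decode(&data)?,
            "RunLengthDecode" => run_length_decode(&data)?,
            other => return Err(FilterError::Unsupported(other.to_string())),
        };
    }
    Ok(data)
}
```

The order of the names matters: filters are applied in the order they are listed, so a hex-encoded run-length stream lists ASCIIHexDecode first.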
pub fn parsing_mode(&self) -> ParsingMode
Get the parsing mode.
pub fn trailer(&self) -> &TrailerInfo
Get the trailer info.
pub fn get_trailer(&self) -> &TrailerInfo
ADR-019 alias for trailer().
Corresponds to CPDF_Parser::GetTrailer() in PDFium.
pub fn file_version(&self) -> PdfVersion
Get the PDF version from the header.
Corresponds to CPDF_Parser::GetFileVersion() in PDFium.
pub fn get_file_version(&self) -> PdfVersion
ADR-019 alias for file_version().
Corresponds to CPDF_Parser::GetFileVersion() in PDFium.
pub fn version(&self) -> PdfVersion
Deprecated since 0.0.0: use file_version() or get_file_version()
Rust-idiomatic alias for file_version().
pub fn object_count(&self) -> usize
Get the number of object slots.
pub fn object_ids(&self) -> impl Iterator<Item = &ObjectId>
Get all known object IDs.
pub fn security_handler(&self) -> Option<&SecurityHandler>
Returns a reference to the security handler, if the document is encrypted.
pub fn get_security_handler(&self) -> Option<&SecurityHandler>
ADR-019 alias for security_handler().
Corresponds to CPDF_Parser::GetSecurityHandler() in PDFium.
pub fn permissions(&self) -> Option<Permissions>
Returns the document access permissions, if the document is encrypted.
Delegates to SecurityHandler::permissions(). Returns None for
unencrypted documents (all permissions implicitly granted).
Corresponds to CPDF_Parser::GetPermissions() in PDFium.
pub fn get_permissions(&self) -> Option<Permissions>
ADR-019 alias for permissions().
Corresponds to CPDF_Parser::GetPermissions() in PDFium.
pub fn encoded_password(&self) -> Option<&[u8]>
Returns the encoded password bytes used during authentication, if the document is encrypted.
Delegates to SecurityHandler::encoded_password(). Returns None for
unencrypted documents.
Corresponds to CPDF_Parser::GetEncodedPassword() in PDFium.
pub fn get_encoded_password(&self) -> Option<&[u8]>
ADR-019 alias for encoded_password().
Corresponds to CPDF_Parser::GetEncodedPassword() in PDFium.
pub fn xref_table_rebuilt(&self) -> bool
Returns true if the cross-reference table was rebuilt (Lenient mode recovery).
When true, the original xref table could not be parsed and the parser
fell back to a linear scan for N G obj markers.
Corresponds to CPDF_Parser::IsXRefTableRebuilt() in PDFium.
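As a rough illustration of the fallback, a linear scan for N G obj markers could be sketched as follows. This is a deliberately simplified model and not the crate's recovery code: scan_obj_markers and parse_marker are hypothetical helpers, and the sketch assumes each marker sits on its own line with whitespace-separated tokens.

```rust
/// Tries to read `N G obj` from the start of a line.
/// Assumption: tokens are separated by ASCII whitespace, so `obj<<` is not matched.
fn parse_marker(line: &[u8]) -> Option<(u32, u16)> {
    let mut tokens = line
        .split(|b| b.is_ascii_whitespace())
        .filter(|t| !t.is_empty());
    let num = std::str::from_utf8(tokens.next()?).ok()?.parse::<u32>().ok()?;
    let generation = std::str::from_utf8(tokens.next()?).ok()?.parse::<u16>().ok()?;
    (tokens.next()? == &b"obj"[..]).then_some((num, generation))
}

/// Walks the whole buffer line by line and records
/// (object number, generation, byte offset of the marker).
fn scan_obj_markers(data: &[u8]) -> Vec<(u32, u16, usize)> {
    let mut found = Vec::new();
    let mut offset = 0;
    for line in data.split(|&b| b == b'\n') {
        if let Some((num, generation)) = parse_marker(line) {
            // Point at the first non-whitespace byte of the marker.
            let indent = line.iter().take_while(|b| b.is_ascii_whitespace()).count();
            found.push((num, generation, offset + indent));
        }
        // +1 accounts for the '\n' that split() consumed.
        offset += line.len() + 1;
    }
    found
}
```

A recovery pass built on such a scan would typically let later occurrences of the same object number override earlier ones, since incremental updates append newer versions at the end of the file.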
pub fn is_xref_table_rebuilt(&self) -> bool
ADR-019 alias for xref_table_rebuilt().
Corresponds to CPDF_Parser::IsXRefTableRebuilt() in PDFium.
pub fn xref_rebuilt(&self) -> bool
Deprecated since 0.0.0: use xref_table_rebuilt() or is_xref_table_rebuilt()
Abbreviated alias for xref_table_rebuilt().
pub fn is_xref_stream(&self) -> bool
Returns true if the newest cross-reference section is an xref stream
(PDF 1.5+ compressed xref), false if it is a traditional xref table.
Corresponds to CPDF_Parser::IsXRefStream() in PDFium.
pub fn object_position_or_zero(&self, id: ObjectId) -> Option<u64>
Returns the byte offset of the given object in the source data, if the
object is stored directly in the file body rather than inside an object
stream. Returns None for objects embedded in object streams (ObjStm).
Corresponds to CPDF_Parser::GetObjectPositionOrZero() in PDFium.
pub fn get_object_position_or_zero(&self, id: ObjectId) -> Option<u64>
ADR-019 alias for object_position_or_zero().
Corresponds to CPDF_Parser::GetObjectPositionOrZero() in PDFium.
pub fn object_offset(&self, id: ObjectId) -> Option<u64>
Deprecated since 0.0.0: use object_position_or_zero() or get_object_position_or_zero()
Rust-idiomatic alias for object_position_or_zero().
pub fn source_data(&self) -> &S
Returns a reference to the raw source data.
pub fn last_obj_num(&self) -> u32
Returns the maximum object number in the store.
Corresponds to CPDF_Parser::GetLastObjNum() in PDFium.
pub fn get_last_obj_num(&self) -> u32
ADR-019 alias for last_obj_num().
Corresponds to CPDF_Parser::GetLastObjNum() in PDFium.
pub fn max_object_number(&self) -> u32
Deprecated since 0.0.0: use last_obj_num() or get_last_obj_num()
Rust-idiomatic alias for last_obj_num().
pub fn last_xref_offset(&self) -> u64
Returns the byte offset of the last startxref value.
This is needed for incremental saves to set the /Prev trailer key.
Corresponds to CPDF_Parser::GetLastXRefOffset() in PDFium.
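For context on where this value comes from, the sketch below locates the last startxref keyword in a byte buffer and reads the decimal offset that follows it. It is an assumption-laden illustration, not the crate's parser: last_startxref is a hypothetical helper, and it searches the whole buffer rather than only the file tail.

```rust
/// Finds the last `startxref` keyword and parses the decimal offset after it.
/// Returns None if the keyword is absent or no digits follow it.
fn last_startxref(data: &[u8]) -> Option<u64> {
    const KEYWORD: &[u8] = b"startxref";
    // Search from the end so the newest (last) section wins.
    let pos = data
        .windows(KEYWORD.len())
        .rposition(|w| w == KEYWORD)?;
    // Skip whitespace after the keyword, then collect the digits.
    let digits: Vec<u8> = data[pos + KEYWORD.len()..]
        .iter()
        .copied()
        .skip_while(|b| b.is_ascii_whitespace())
        .take_while(|b| b.is_ascii_digit())
        .collect();
    std::str::from_utf8(&digits).ok()?.parse().ok()
}
```

When writing an incremental update, a value obtained this way is what the new trailer's /Prev key would point back to, so readers can walk from the newest cross-reference section to the older ones.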
pub fn xref_start_offset(&self) -> u64
Deprecated since 0.0.0: use last_xref_offset() or get_last_xref_offset()
Rust-idiomatic alias for last_xref_offset().
pub fn is_valid_object_number(&self, number: u32) -> bool
Returns true if the given object number exists in the cross-reference
table (i.e. is a valid, non-free indirect object).
Corresponds to CPDF_Parser::IsValidObjectNumber() in PDFium.
pub fn get_last_xref_offset(&self) -> u64
ADR-019 alias for last_xref_offset().
Corresponds to CPDF_Parser::GetLastXRefOffset() in PDFium.
pub fn is_object_free(&self, number: u32) -> bool
Returns true if the given object number is marked free or null (i.e. does not
exist as an in-use object in the store).
Corresponds to CPDF_Parser::IsObjectFreeOrNull() in PDFium.
pub fn is_object_free_or_null(&self, number: u32) -> bool
ADR-019 alias for is_object_free().
Corresponds to CPDF_Parser::IsObjectFreeOrNull() in PDFium.
pub fn document_size(&self) -> usize
Returns the total size of the source document in bytes.
Corresponds to CPDF_Parser::GetDocumentSize() in PDFium.
pub fn get_document_size(&self) -> usize
ADR-019 alias for document_size().
Corresponds to CPDF_Parser::GetDocumentSize() in PDFium.
pub fn decode_stream_for_object(
    &self,
    stream: &Object,
    obj_id: ObjectId,
) -> Result<Vec<u8>, PdfError>
Decode stream data for a specific object, applying decryption if needed.
Like Self::decode_stream, but also decrypts the raw stream data before
applying the filter chain when the document is encrypted.
pub fn raw_stream_bytes_for_object(
    &self,
    stream: &Object,
    obj_id: ObjectId,
) -> Result<Vec<u8>, PdfError>
Return the raw (optionally decrypted) stream bytes without applying any filter chain.
This is useful when the caller needs to handle a specific filter (such as JPXDecode)
specially, for example to extract metadata.