Struct PdfString 

pub struct PdfString { /* private fields */ }

A PDF string with encoding-aware conversion.

Stores the raw bytes exactly as they appear in the PDF file. The encoding is detected from the content: bytes that start with the byte-order mark (BOM) 0xFE 0xFF are treated as UTF-16BE; anything else is treated as PDFDocEncoding.
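As a rough illustration of the detection rule, the sketch below classifies a byte slice by its first two bytes. `SketchEncoding` and `detect_encoding` are hypothetical names introduced for this example and are not part of the crate; the real `PdfStringEncoding` may have more variants than the two named on this page.

```rust
/// Hypothetical stand-in for the crate's `PdfStringEncoding`; only the two
/// variants mentioned on this page are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchEncoding {
    PdfDocEncoding,
    Utf16Be,
}

/// Classify raw PDF string bytes by their leading byte-order mark.
pub fn detect_encoding(bytes: &[u8]) -> SketchEncoding {
    if bytes.starts_with(&[0xFE, 0xFF]) {
        SketchEncoding::Utf16Be
    } else {
        // No UTF-16BE BOM (including empty or one-byte input).
        SketchEncoding::PdfDocEncoding
    }
}
```

Under this rule an empty slice, or a slice holding only 0xFE, falls through to PDFDocEncoding.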

Implementations

impl PdfString

pub fn from_bytes(bytes: Vec<u8>) -> Self

Create a PdfString from raw bytes (as parsed from the PDF).

pub fn from_unicode(s: &str) -> Self

Encode a UTF-8 string as a PDF string.

Uses PDFDocEncoding if every character is representable; otherwise uses UTF-16BE with a 0xFE 0xFF byte-order mark. This matches the logic of PDF_EncodeText() in PDFium upstream.

Examples
let ascii = PdfString::from_unicode("hello");
assert_eq!(ascii.encoding(), PdfStringEncoding::PdfDocEncoding);

let unicode = PdfString::from_unicode("日本語");
assert_eq!(unicode.encoding(), PdfStringEncoding::Utf16Be);
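The encoding choice described above can be sketched without the crate. This is a simplified illustration, not the crate's implementation: `encode_text_sketch` is a hypothetical name, and it assumes that only printable ASCII plus tab, LF and CR are representable in PDFDocEncoding. The real encoding covers more characters, so this sketch falls back to UTF-16BE more often than a full implementation would.

```rust
/// Hypothetical helper, not the crate's implementation.
pub fn encode_text_sketch(s: &str) -> Vec<u8> {
    // Assumption: only printable ASCII plus tab, LF and CR count as
    // representable. PDFDocEncoding itself covers more characters.
    let representable = s
        .chars()
        .all(|c| matches!(c, ' '..='~' | '\t' | '\n' | '\r'));
    if representable {
        // For these characters the PDFDocEncoding byte equals the code point.
        return s.bytes().collect();
    }
    // Otherwise emit the 0xFE 0xFF byte-order mark, then big-endian UTF-16.
    let mut out = vec![0xFE, 0xFF];
    for unit in s.encode_utf16() {
        out.extend_from_slice(&unit.to_be_bytes());
    }
    out
}
```

Characters outside the Basic Multilingual Plane come out of `encode_utf16` as surrogate pairs, so they occupy four bytes after the BOM.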

pub fn as_bytes(&self) -> &[u8]

Raw bytes (for binary operations, stream /Length, etc.).

pub fn encoding(&self) -> PdfStringEncoding

Detect encoding from the byte-order mark.

pub fn to_string_lossy(&self) -> String

Decode to a Rust String (UTF-8), handling all PDF string encodings.

  • UTF-16BE: decoded with surrogate-pair support; unpaired or invalid surrogates are replaced with U+FFFD.
  • UTF-8 BOM (0xEF 0xBB 0xBF): decoded as UTF-8 after stripping the BOM.
  • PDFDocEncoding: each byte is mapped to Unicode per ISO 32000-2 Annex D.

Language escape sequences (U+001B, an ISO 639 language code, an optional ISO 3166 country code, then a closing U+001B) present in UTF-16BE and UTF-8 BOM strings are stripped, matching the behaviour of StripLanguageCodes() / PDF_DecodeText() in PDFium upstream.
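The UTF-16BE branch can be approximated with the standard library alone. The sketch below is an illustration under stated assumptions, not the crate's code: `decode_utf16be_sketch` is a hypothetical name, the input is assumed to have had its BOM removed already, a trailing odd byte is ignored, and an escape sequence that is never closed swallows the rest of the string. The crate and PDFium may handle those edge cases differently.

```rust
/// Hypothetical helper: lossy UTF-16BE decoding with U+001B ... U+001B
/// spans removed. `payload` is assumed to start after the BOM.
pub fn decode_utf16be_sketch(payload: &[u8]) -> String {
    // Pair bytes into big-endian code units; a trailing odd byte is dropped
    // (an assumption made for this sketch).
    let units = payload
        .chunks_exact(2)
        .map(|pair| u16::from_be_bytes([pair[0], pair[1]]));

    let mut out = String::new();
    let mut in_escape = false;
    for decoded in char::decode_utf16(units) {
        // Unpaired surrogates surface as errors; map them to U+FFFD.
        let ch = decoded.unwrap_or(char::REPLACEMENT_CHARACTER);
        if ch == '\u{1B}' {
            // U+001B toggles between normal text and a language escape.
            in_escape = !in_escape;
        } else if !in_escape {
            out.push(ch);
        }
    }
    out
}
```

`char::decode_utf16` does the surrogate pairing, so the loop only has to substitute the replacement character and skip the escaped spans.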

pub fn is_empty(&self) -> bool

Returns true if the string has no bytes.

pub fn len(&self) -> usize

Returns the length of the raw byte representation.
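Because the length is measured on the raw bytes, it generally differs from the number of characters in the decoded text. The sketch below estimates the raw length of a UTF-16BE string, assuming the byte-order mark is kept in the stored bytes (as the struct description suggests). `utf16be_raw_len` is a hypothetical helper, not a crate function.

```rust
/// Hypothetical helper: byte length of `s` once stored as UTF-16BE with a BOM.
pub fn utf16be_raw_len(s: &str) -> usize {
    // Two bytes for the 0xFE 0xFF byte-order mark.
    const BOM_LEN: usize = 2;
    // Each UTF-16 code unit occupies two bytes; characters outside the
    // Basic Multilingual Plane contribute two code units.
    BOM_LEN + 2 * s.encode_utf16().count()
}
```

For the three-character string "日本語" this gives 8 bytes rather than 3.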

pub fn unicode_data(&self) -> String

Deprecated: use to_string_lossy() instead.

Decode to a Rust String (UTF-8), handling both PDF encodings.

pub fn get_unicode_data(&self) -> String

Upstream-aligned alias for to_string_lossy.

Corresponds to ByteString::GetUnicodeData() in PDFium upstream.

pub fn get_raw_string(&self) -> &[u8]

Upstream-aligned alias for as_bytes.

Corresponds to ByteString::GetRawString() in PDFium upstream.

Trait Implementations

impl Clone for PdfString

fn clone(&self) -> PdfString

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for PdfString

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Display for PdfString

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Hash for PdfString

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.
1.3.0 · Source§

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.

impl PartialEq for PdfString

fn eq(&self, other: &PdfString) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Eq for PdfString

impl StructuralPartialEq for PdfString
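The StructuralPartialEq marker indicates that PartialEq is derived, so equality and hashing presumably operate on the stored fields. Assuming the only field is the raw byte buffer (the fields are private, so this is a guess), two values holding the same text in different encodings would compare unequal. The sketch below models that with a hypothetical stand-in type, `RawPdfString`, which is not the crate's `PdfString`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `PdfString`: one raw byte buffer, derived traits.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct RawPdfString(pub Vec<u8>);

/// Count distinct values the way a `HashSet` keyed on raw bytes would.
pub fn distinct_count(items: &[RawPdfString]) -> usize {
    items.iter().collect::<HashSet<_>>().len()
}
```

If text-level comparison is needed, comparing the decoded strings (for example the results of to_string_lossy) avoids this encoding sensitivity.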

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.