Decodes a text string.
The encoding is chosen based on the byte order mark (BOM) at the start of the string.
All encodings specified in PDF 2.0 are supported (PDFDocEncoding, UTF-16BE,
and UTF-8).
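
A minimal sketch of BOM-based selection, using only the standard library. The names `TextEncoding`, `detect_encoding`, and `decode_sketch` are hypothetical and are not this crate's API. The sketch assumes the usual marks (FE FF for UTF-16BE, EF BB BF for UTF-8) and, as a simplification, handles only the printable ASCII subset of PDFDocEncoding rather than the full table.

```rust
/// The three text string encodings named in the description above.
#[derive(Debug, PartialEq)]
enum TextEncoding {
    PdfDoc,
    Utf16Be,
    Utf8,
}

/// Picks an encoding from the leading byte order mark and returns it
/// together with the payload that follows the mark.
fn detect_encoding(bytes: &[u8]) -> (TextEncoding, &[u8]) {
    if bytes.starts_with(&[0xFE, 0xFF]) {
        (TextEncoding::Utf16Be, &bytes[2..])
    } else if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        (TextEncoding::Utf8, &bytes[3..])
    } else {
        // No BOM: fall back to PDFDocEncoding.
        (TextEncoding::PdfDoc, bytes)
    }
}

/// Decodes a text string, returning `None` for malformed input or for
/// PDFDocEncoding bytes outside the subset this sketch handles.
fn decode_sketch(bytes: &[u8]) -> Option<String> {
    match detect_encoding(bytes) {
        (TextEncoding::Utf16Be, rest) => {
            if rest.len() % 2 != 0 {
                return None; // a trailing odd byte is not valid UTF-16
            }
            let units: Vec<u16> = rest
                .chunks_exact(2)
                .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
                .collect();
            String::from_utf16(&units).ok()
        }
        (TextEncoding::Utf8, rest) => String::from_utf8(rest.to_vec()).ok(),
        (TextEncoding::PdfDoc, rest) => {
            // Assumption: only printable ASCII (0x20..=0x7E) is mapped here,
            // where PDFDocEncoding and ASCII agree. A real decoder needs the
            // full PDFDocEncoding table.
            if rest.iter().all(|b| (0x20..=0x7E).contains(b)) {
                Some(rest.iter().map(|&b| b as char).collect())
            } else {
                None
            }
        }
    }
}
```

For example, `decode_sketch(&[0xFE, 0xFF, 0x00, 0x48, 0x00, 0x69])` yields `Some("Hi")`, and so does `decode_sketch(b"Hi")` via the PDFDocEncoding fallback.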
Encodes the given str to UTF-8. This method of encoding text strings
was first specified in PDF 2.0, and reader support is still lacking
(notably, Adobe Acrobat Reader doesn’t support it at the time of writing).
Thus, using it is NOT RECOMMENDED.
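
As a rough illustration of what a UTF-8 text string looks like on the byte level, the sketch below prepends the UTF-8 byte order mark (EF BB BF) to the string's bytes. The function name `encode_utf8_sketch` is hypothetical, and prepending the BOM is an assumption about the output format, not a confirmed description of this method's implementation.

```rust
/// Encodes `text` as a UTF-8 text string.
/// Assumption: the result starts with the UTF-8 BOM so that a decoder
/// can tell it apart from PDFDocEncoding.
fn encode_utf8_sketch(text: &str) -> Vec<u8> {
    let mut out = vec![0xEF, 0xBB, 0xBF];
    // A Rust `str` is already valid UTF-8, so its bytes are copied as-is.
    out.extend_from_slice(text.as_bytes());
    out
}
```

With this sketch, `encode_utf8_sketch("é")` produces the bytes `EF BB BF C3 A9`.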
Encodes the given str to UTF-16BE.
This is the recommended way to encode text strings, as it covers all of
Unicode and is supported by all major PDF readers.
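
A minimal sketch of UTF-16BE encoding with the standard library, assuming the output begins with the big-endian byte order mark (FE FF). The name `encode_utf16_be_sketch` is hypothetical and not this crate's API.

```rust
/// Encodes `text` as a UTF-16BE text string.
/// Assumption: the result starts with the BOM FE FF.
fn encode_utf16_be_sketch(text: &str) -> Vec<u8> {
    let mut out = vec![0xFE, 0xFF];
    // `encode_utf16` yields 16-bit code units, including surrogate pairs
    // for characters outside the Basic Multilingual Plane.
    for unit in text.encode_utf16() {
        out.extend_from_slice(&unit.to_be_bytes());
    }
    out
}
```

Here `encode_utf16_be_sketch("Hi")` produces `FE FF 00 48 00 69`, and a character such as U+1F600 becomes the surrogate pair `D8 3D DE 00` after the mark.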
Extracts the text from the PDF at path and returns a String with the results.
Extracts the text from the PDF at path and returns a Vec<String> with the results split by page.
Parses the given document and writes it to output.
Creates a text string.
If the input only contains ASCII characters, the string is encoded
in PDFDocEncoding, otherwise in UTF-16BE.
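
The selection rule above can be sketched as follows. The name `text_string_sketch` is hypothetical, and the sketch returns raw bytes rather than whatever string object the real function creates. It relies on the assumption that ASCII input can be written unchanged as PDFDocEncoding, which holds for printable ASCII.

```rust
/// Chooses an encoding for `text` and returns the encoded bytes.
fn text_string_sketch(text: &str) -> Vec<u8> {
    if text.is_ascii() {
        // ASCII-only input: written byte for byte, with no BOM.
        text.as_bytes().to_vec()
    } else {
        // Anything else: UTF-16BE, prefixed with the BOM FE FF.
        let mut out = vec![0xFE, 0xFF];
        for unit in text.encode_utf16() {
            out.extend_from_slice(&unit.to_be_bytes());
        }
        out
    }
}
```

For instance, `text_string_sketch("Hi")` gives `48 69`, while `text_string_sketch("é")` gives `FE FF 00 E9`.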