pub struct Extracted {
    pub text: String,
    pub metadata: BTreeMap<String, MetaValue>,
}
The result of extracting one document: the plain text plus a small, format-tagged metadata map.
This is the --json shape the CLI emits verbatim ({text, metadata}); in
plain mode the CLI prints Extracted::text and discards the metadata.
Metadata is intentionally minimal and best-effort — extraction never fails
for want of a title; it just omits the key.
Fields§
text: String
The extracted plain text (UTF-8), normalized to \n line endings with
trailing whitespace trimmed per line and a single trailing newline. For
a document with no recoverable text layer (e.g. a scanned, image-only
PDF) this is the empty string — the contract is “empty in, empty out.”
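The normalization described for text can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual implementation: normalize_text is a hypothetical name, and collapsing trailing blank lines into a single final newline is an assumption about how "a single trailing newline" is enforced.

```rust
/// Hypothetical helper, not part of the documented API: one way to produce
/// text in the shape the `text` field describes.
pub fn normalize_text(raw: &str) -> String {
    // Unify CRLF and lone CR line endings to LF.
    let unified = raw.replace("\r\n", "\n").replace('\r', "\n");

    // Trim trailing whitespace on every line.
    let mut out = unified
        .lines()
        .map(str::trim_end)
        .collect::<Vec<_>>()
        .join("\n");

    // Assumption: trailing blank lines are dropped, then exactly one newline
    // is appended. Input with no text at all stays empty.
    let kept = out.trim_end().len();
    out.truncate(kept);
    if !out.is_empty() {
        out.push('\n');
    }
    out
}
```

Under these assumptions, normalize_text("a  \r\nb\t\r\n\r\n") yields "a\nb\n", and both an empty input and a whitespace-only input yield the empty string.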
metadata: BTreeMap<String, MetaValue>
Best-effort key/value metadata. Always carries format (the adapter
that ran, e.g. "pdf"). Adapters add what they cheaply know:
pages/sheets/sheet_names (counts), title (when the container
declares one). A BTreeMap so --json output is key-ordered and stable.
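The sketch below illustrates two properties described above: an unknown title is simply left out, and a BTreeMap iterates its keys in sorted order regardless of insertion order. The variants of MetaValue are not shown on this page, so DemoValue is a stand-in with assumed variants; build_metadata and ordered_keys are likewise hypothetical names used only for illustration.

```rust
use std::collections::BTreeMap;

/// Stand-in for the crate's `MetaValue`; its real variants are not shown
/// on this page, so these two are assumptions made for the sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DemoValue {
    Int(u64),
    Str(String),
}

/// Hypothetical adapter-side helper: always sets `format`, and adds the
/// optional keys only when the value is actually known.
pub fn build_metadata(
    format: &str,
    pages: Option<u64>,
    title: Option<&str>,
) -> BTreeMap<String, DemoValue> {
    let mut metadata = BTreeMap::new();
    // Inserted out of alphabetical order on purpose.
    if let Some(t) = title {
        metadata.insert("title".to_string(), DemoValue::Str(t.to_string()));
    }
    if let Some(n) = pages {
        metadata.insert("pages".to_string(), DemoValue::Int(n));
    }
    metadata.insert("format".to_string(), DemoValue::Str(format.to_string()));
    metadata
}

/// Keys in the order a BTreeMap iterates them: sorted, independent of
/// insertion order.
pub fn ordered_keys(metadata: &BTreeMap<String, DemoValue>) -> Vec<&str> {
    metadata.keys().map(String::as_str).collect()
}
```

Calling build_metadata("pdf", Some(3), None) produces a map whose keys iterate as format, pages, with no title entry at all. A serializer that walks the map in iteration order therefore emits the same key order on every run.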
Trait Implementations§
impl Eq for Extracted
impl StructuralPartialEq for Extracted
Auto Trait Implementations§
impl Freeze for Extracted
impl RefUnwindSafe for Extracted
impl Send for Extracted
impl Sync for Extracted
impl Unpin for Extracted
impl UnsafeUnpin for Extracted
impl UnwindSafe for Extracted
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<Q, K> Equivalent<K> for Q
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.