Struct StructureResult 

pub struct StructureResult {
    pub input_path: Arc<str>,
    pub index: usize,
    pub layout_elements: Vec<LayoutElement>,
    pub tables: Vec<TableResult>,
    pub formulas: Vec<FormulaResult>,
    pub text_regions: Option<Vec<TextRegion>>,
    pub orientation_angle: Option<f32>,
    pub region_blocks: Option<Vec<RegionBlock>>,
    pub rectified_img: Option<Arc<ImageBuffer<Rgb<u8>, Vec<u8>>>>,
    pub page_continuation_flags: Option<PageContinuationFlags>,
}
Result of document structure analysis.

This struct contains all the results from analyzing a document’s structure, including layout elements, tables, formulas, and OCR results.

§Coordinate System

The coordinate system of bounding boxes depends on which preprocessing was applied:

  • No preprocessing: Boxes are in the original input image’s coordinate system.

  • Orientation correction only (orientation_angle set, rectified_img is None): Boxes are transformed back to the original input image’s coordinate system.

  • Rectification applied (rectified_img is Some): Boxes remain in the rectified image’s coordinate system. Neural network-based rectification (UVDoc) warps cannot be precisely inverted, so use rectified_img for visualization instead of the original image.

  • Both orientation and rectification: Boxes are in the rectified coordinate system (rectification takes precedence since it’s applied after orientation correction).
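
The four cases above reduce to one check: whether a rectified image is present. A minimal sketch of that decision follows. `BoxSpace` and `box_space` are hypothetical names introduced for illustration and are not part of the crate; the function takes the two relevant field values rather than a real `StructureResult`.

```rust
/// Which image the bounding boxes of a result refer to.
/// (Hypothetical type, not part of the crate.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum BoxSpace {
    /// Boxes match the original input image.
    Original,
    /// Boxes match the rectified image stored in `rectified_img`.
    Rectified,
}

/// Hypothetical helper mirroring the rules listed above.
/// `has_rectified_img` stands in for `rectified_img.is_some()`.
fn box_space(_orientation_angle: Option<f32>, has_rectified_img: bool) -> BoxSpace {
    // Orientation correction alone never changes the answer, because
    // those boxes are mapped back to the original image.
    if has_rectified_img {
        BoxSpace::Rectified
    } else {
        BoxSpace::Original
    }
}

fn main() {
    // No preprocessing and orientation-only both use the original image.
    println!("{:?}", box_space(None, false));
    println!("{:?}", box_space(Some(90.0), false));
    // Any rectification means the boxes must be drawn on the rectified image.
    println!("{:?}", box_space(Some(90.0), true));
}
```

With a real result, the same check would decide whether to draw boxes on `rectified_img` or on the image loaded from `input_path`.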

Fields§

§input_path: Arc<str>

Path to the input image file

§index: usize

Index of the image in a batch (0 for single image processing)

§layout_elements: Vec<LayoutElement>

Detected layout elements (text regions, tables, figures, etc.)

§tables: Vec<TableResult>

Recognized tables with their structure and content

§formulas: Vec<FormulaResult>

Recognized mathematical formulas

§text_regions: Option<Vec<TextRegion>>

OCR text regions (if OCR was integrated)

§orientation_angle: Option<f32>

Document orientation angle (if orientation correction was used)

§region_blocks: Option<Vec<RegionBlock>>

Detected region blocks for hierarchical ordering (PP-DocBlockLayout). When present, layout_elements are already sorted by region hierarchy.

§rectified_img: Option<Arc<ImageBuffer<Rgb<u8>, Vec<u8>>>>

Rectified image (if document rectification was used). Note: when only orientation correction was applied, bounding boxes are transformed back to the original image’s coordinates; when rectification (UVDoc) was applied, boxes stay in the rectified image’s coordinate system. Use this image for visualization when rectification was applied.

§page_continuation_flags: Option<PageContinuationFlags>

Page continuation flags for multi-page document processing. This indicates whether this page continues a paragraph from the previous page or continues to the next page, which is crucial for proper markdown concatenation.

Implementations§

impl StructureResult

pub fn new(input_path: impl Into<Arc<str>>, index: usize) -> StructureResult

Creates a new structure result.

pub fn with_layout_elements(self, elements: Vec<LayoutElement>) -> StructureResult

Adds layout elements to the result.

pub fn with_tables(self, tables: Vec<TableResult>) -> StructureResult

Adds tables to the result.

pub fn with_formulas(self, formulas: Vec<FormulaResult>) -> StructureResult

Adds formulas to the result.

pub fn with_text_regions(self, regions: Vec<TextRegion>) -> StructureResult

Adds OCR text regions to the result.

pub fn with_region_blocks(self, blocks: Vec<RegionBlock>) -> StructureResult

Adds region blocks to the result (PP-DocBlockLayout).

Region blocks represent hierarchical groupings of layout elements. When set, layout_elements should already be sorted by region hierarchy.

pub fn with_page_continuation_flags(self, flags: PageContinuationFlags) -> StructureResult

Sets page continuation flags for multi-page document processing.
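
The constructor and the `with_*` methods above take `self` by value and return the result, so calls can be chained. The sketch below reproduces that consuming-builder shape on a hypothetical stand-in type, `PageResult`, with `String` payloads in place of the crate's `TableResult` and `TextRegion`. Wrapping the regions in `Some` is an assumption inferred from the `Option<Vec<TextRegion>>` field type.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `StructureResult` with simplified payloads.
#[derive(Debug)]
struct PageResult {
    input_path: Arc<str>,
    index: usize,
    tables: Vec<String>,
    text_regions: Option<Vec<String>>,
}

impl PageResult {
    /// Same shape as `StructureResult::new`: optional parts start empty.
    fn new(input_path: impl Into<Arc<str>>, index: usize) -> Self {
        Self {
            input_path: input_path.into(),
            index,
            tables: Vec::new(),
            text_regions: None,
        }
    }

    /// Consumes `self`, replaces the tables, and returns the value.
    fn with_tables(mut self, tables: Vec<String>) -> Self {
        self.tables = tables;
        self
    }

    /// Assumption: the optional field becomes `Some(regions)`.
    fn with_text_regions(mut self, regions: Vec<String>) -> Self {
        self.text_regions = Some(regions);
        self
    }
}

fn main() {
    let page = PageResult::new("scan.png", 0)
        .with_tables(vec!["<table>...</table>".to_string()])
        .with_text_regions(vec!["Hello".to_string()]);
    println!("{} (index {})", page.input_path, page.index);
    println!("{} table(s), regions: {:?}", page.tables.len(), page.text_regions);
}
```

Because each method moves `self`, an intermediate value cannot be reused after a call unless it is cloned first.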

pub fn to_markdown(&self) -> String

Converts the result to a Markdown string.

Follows PP-StructureV3’s formatting rules:

  • DocTitle: # title
  • ParagraphTitle: Auto-detect numbering (1.2.3 -> ###)
  • Formula: $$latex$$
  • Table: HTML with border
  • Images: ![Figure](caption)

Note: Low-confidence text elements that overlap with table regions are filtered out to avoid duplicate content from table OCR.
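
The `1.2.3 -> ###` rule for ParagraphTitle can be read as "heading depth equals the number of numeric components in the leading numbering". The sketch below implements that reading only. `numbering_depth` and `paragraph_title_to_markdown` are hypothetical helpers, and the fallback to `##` for unnumbered titles and the cap at six levels are assumptions; the crate's actual detection may differ.

```rust
/// Counts the dot-separated numeric components at the start of a title.
/// Returns 0 when the title does not begin with a numbering such as "1.2.3".
/// (Hypothetical helper, not part of the crate.)
fn numbering_depth(title: &str) -> usize {
    let first = title.split_whitespace().next().unwrap_or("");
    // Accept a trailing dot, as in "2. Methods".
    let first = first.trim_end_matches('.');
    if first.is_empty() {
        return 0;
    }
    let parts: Vec<&str> = first.split('.').collect();
    let all_numeric = parts
        .iter()
        .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()));
    if all_numeric {
        parts.len()
    } else {
        0
    }
}

/// Renders a paragraph title as a Markdown heading.
fn paragraph_title_to_markdown(title: &str) -> String {
    let level = match numbering_depth(title) {
        0 => 2,       // Assumption: unnumbered titles become "##".
        d => d.min(6), // Markdown supports at most six heading levels.
    };
    format!("{} {}", "#".repeat(level), title.trim())
}

fn main() {
    println!("{}", paragraph_title_to_markdown("1.2.3 Results"));
    println!("{}", paragraph_title_to_markdown("Introduction"));
}
```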

pub fn calculate_continuation_flags(&self) -> PageContinuationFlags

Calculates the page continuation flags for this result.

This follows PaddleX’s get_seg_flag logic to determine whether the page starts/ends in the middle of a semantic paragraph.

Returns (paragraph_start, paragraph_end) where:

  • paragraph_start: false means this page continues a paragraph from the previous page
  • paragraph_end: false means the content continues onto the next page
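
One way to see why these flags matter for concatenation: two pages should be merged without a paragraph break only when the earlier page did not finish its paragraph and the later page does not start a new one. The sketch below uses a hypothetical `Flags` struct whose two field names are taken from the description above; the real `PageContinuationFlags` layout is not shown on this page. Joining with a single space is also an assumption and would not suit scripts written without spaces.

```rust
/// Hypothetical stand-in for `PageContinuationFlags`.
#[derive(Debug, Clone, Copy)]
struct Flags {
    /// false: the page continues a paragraph from the previous page.
    paragraph_start: bool,
    /// false: the content continues onto the next page.
    paragraph_end: bool,
}

/// Joins per-page text, merging paragraphs that span a page break.
fn join_pages(pages: &[(Flags, &str)]) -> String {
    let mut out = String::new();
    let mut prev_end: Option<bool> = None;
    for (flags, text) in pages {
        if let Some(prev_ended) = prev_end {
            if !prev_ended && !flags.paragraph_start {
                // The paragraph spans the page break: continue the same line.
                out.push(' ');
            } else {
                // Otherwise start a new Markdown paragraph.
                out.push_str("\n\n");
            }
        }
        out.push_str(text.trim());
        prev_end = Some(flags.paragraph_end);
    }
    out
}

fn main() {
    let pages = [
        (Flags { paragraph_start: true, paragraph_end: false }, "The quick brown"),
        (Flags { paragraph_start: false, paragraph_end: true }, "fox jumps."),
        (Flags { paragraph_start: true, paragraph_end: true }, "New paragraph."),
    ];
    println!("{}", join_pages(&pages));
}
```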

pub fn to_html(&self) -> String

Converts the result to an HTML string.

Follows PP-StructureV3’s formatting rules with semantic HTML tags.

pub fn to_json_value(&self) -> Result<Value, Error>

Converts the result to a JSON Value.

pub fn save_results(&self, output_dir: impl AsRef<Path>, to_json: bool, to_html: bool) -> Result<(), Error>

Saves the analysis results to the specified directory.

This generates:

  • *_res.json: The full structured result
  • *_res.html: An HTML representation

Note: Markdown export with image extraction should use the example utilities (examples/utils/markdown.rs) instead, as that requires I/O operations that belong in the application layer. Use StructureResult::to_markdown() for pure markdown generation without side effects.

§Arguments
  • output_dir - Directory to save the output files
  • to_json - If true, save a JSON representation
  • to_html - If true, save an HTML representation
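
To make the output naming concrete, the sketch below computes which files a call would be expected to produce. It assumes that the `*` in `*_res.json` and `*_res.html` is the file stem of `input_path`, which this page does not state. `result_paths` is a hypothetical helper that only builds paths and performs no I/O.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists the files implied by the two flags.
fn result_paths(input_path: &str, output_dir: &Path, to_json: bool, to_html: bool) -> Vec<PathBuf> {
    // Assumption: the `*` prefix is the input file's stem, e.g. "page_01".
    let stem = Path::new(input_path)
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("result");
    let mut paths = Vec::new();
    if to_json {
        paths.push(output_dir.join(format!("{stem}_res.json")));
    }
    if to_html {
        paths.push(output_dir.join(format!("{stem}_res.html")));
    }
    paths
}

fn main() {
    for path in result_paths("scans/page_01.png", Path::new("out"), true, true) {
        println!("{}", path.display());
    }
}
```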

Trait Implementations§

impl Clone for StructureResult

fn clone(&self) -> StructureResult

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for StructureResult

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for StructureResult

fn deserialize<__D>(__deserializer: __D) -> Result<StructureResult, <__D as Deserializer<'de>>::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for StructureResult

fn serialize<__S>(&self, __serializer: __S) -> Result<<__S as Serializer>::Ok, <__S as Serializer>::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

impl StructureResultExt for StructureResult

fn to_concatenated_markdown(results: &[StructureResult]) -> String

Converts multiple results to a single concatenated Markdown string.

fn save_multi_page_results(results: &[StructureResult], output_dir: impl AsRef<Path>, base_name: &str, to_json: bool, to_markdown: bool, to_html: bool) -> Result<(), Error>

Saves multiple results with concatenated markdown.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,