pub struct StructuredDataDetector { /* private fields */ }
Main detector for structured data patterns in PDF text.
This detector analyzes text fragments to identify:
- Tables (using spatial clustering)
- Key-value pairs (using pattern matching)
- Multi-column layouts (using gap analysis)
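As a rough illustration of the gap-analysis idea behind multi-column detection, the sketch below sorts fragments by their left edge and reports a column boundary wherever the horizontal whitespace between covered regions exceeds a threshold. It is not this crate's implementation: `Span`, `column_boundaries`, and `min_gap` are hypothetical names introduced only for this example, and the detector's actual algorithm and thresholds may differ.

```rust
/// Horizontal extent of one text fragment.
/// Hypothetical stand-in type; not the crate's `TextFragment`.
#[derive(Debug, Clone, Copy)]
struct Span {
    x_min: f64,
    x_max: f64,
}

/// Returns the midpoint of every horizontal gap wider than `min_gap`.
/// Each midpoint is a candidate boundary between two columns.
fn column_boundaries(spans: &[Span], min_gap: f64) -> Vec<f64> {
    let mut sorted: Vec<Span> = spans.to_vec();
    sorted.sort_by(|a, b| a.x_min.total_cmp(&b.x_min));

    let mut boundaries = Vec::new();
    // Right edge of the region covered by the spans seen so far.
    let mut covered_to = match sorted.first() {
        Some(first) => first.x_max,
        None => return boundaries,
    };

    for span in &sorted[1..] {
        let gap = span.x_min - covered_to;
        if gap > min_gap {
            boundaries.push(covered_to + gap / 2.0);
        }
        covered_to = covered_to.max(span.x_max);
    }
    boundaries
}

fn main() {
    // Two groups of fragments separated by a wide gap (assumed sample data).
    let spans = [
        Span { x_min: 50.0, x_max: 250.0 },
        Span { x_min: 60.0, x_max: 240.0 },
        Span { x_min: 320.0, x_max: 520.0 },
        Span { x_min: 330.0, x_max: 500.0 },
    ];
    let boundaries = column_boundaries(&spans, 20.0);
    println!("column boundaries: {:?}", boundaries);
}
```

Tracking the running right edge (`covered_to`) rather than comparing neighbouring spans keeps overlapping fragments in the same column from producing spurious gaps.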
§Examples
use oxidize_pdf::text::extraction::TextFragment;
use oxidize_pdf::text::structured::{StructuredDataConfig, StructuredDataDetector};
fn main() -> Result<(), String> {
    let config = StructuredDataConfig::default();
    let detector = StructuredDataDetector::new(config);
    let fragments: Vec<TextFragment> = vec![]; // from PDF extraction
    let result = detector.detect(&fragments)?;
    for table in &result.tables {
        println!(
            "Table: {} rows x {} columns (confidence: {:.2})",
            table.row_count(),
            table.column_count(),
            table.confidence
        );
    }
    Ok(())
}
Implementations§
impl StructuredDataDetector
pub fn new(config: StructuredDataConfig) -> Self
Creates a new detector with the given configuration.
pub fn detect(
    &self,
    fragments: &[TextFragment],
) -> Result<StructuredDataResult, String>
Detects structured data patterns in the given text fragments.
This is the main entry point for structured data extraction. It analyzes the text fragments and returns all detected patterns.
§Arguments
fragments - Text fragments extracted from a PDF page
§Returns
A StructuredDataResult containing all detected patterns.
§Errors
Returns an error if a detection algorithm fails. The current implementation is infallible, so this method always returns Ok.
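Because the error type is a plain `String`, a caller that returns `Box<dyn Error>` can propagate it with `?`, relying on the standard library's `From<String>` conversion. The sketch below shows that pattern with `detect_stub`, a hypothetical stand-in that only mirrors the `Result<_, String>` shape of `detect`. Its failure condition is invented for illustration and does not reflect how the real detector behaves.

```rust
use std::error::Error;

/// Hypothetical stand-in with the same error type as `detect`.
/// The size limit is an assumption made up for this example.
fn detect_stub(fragment_count: usize) -> Result<usize, String> {
    if fragment_count > 1_000_000 {
        return Err(format!("too many fragments: {fragment_count}"));
    }
    Ok(fragment_count)
}

/// `?` converts the `String` error into `Box<dyn Error>`
/// through the standard `From<String>` implementation.
fn run(fragment_count: usize) -> Result<usize, Box<dyn Error>> {
    let processed = detect_stub(fragment_count)?;
    Ok(processed)
}

fn main() {
    match run(3) {
        Ok(n) => println!("processed {n} fragments"),
        Err(e) => eprintln!("detection failed: {e}"),
    }
}
```

Handling the `Err` branch even while detection is infallible keeps calling code unchanged if a later version starts returning errors.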
pub fn config(&self) -> &StructuredDataConfig
Gets the current configuration.
pub fn set_config(&mut self, config: StructuredDataConfig)
Updates the configuration.
Trait Implementations§
impl Clone for StructuredDataDetector
fn clone(&self) -> StructuredDataDetector
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for StructuredDataDetector
Auto Trait Implementations§
impl Freeze for StructuredDataDetector
impl RefUnwindSafe for StructuredDataDetector
impl Send for StructuredDataDetector
impl Sync for StructuredDataDetector
impl Unpin for StructuredDataDetector
impl UnwindSafe for StructuredDataDetector
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<R, P> ReadPrimitive<R> for P
fn read_from_little_endian(read: &mut R) -> Result<Self, Error>
Read this value from the supplied reader. Same as ReadEndian::read_from_little_endian().