pub trait Extractor: Send + Sync {
// Required method
fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError>;
}
Trait implemented by each extraction format (JSON-LD, Microdata, RDFa).
Provides a unified interface for extracting structured data from raw HTML.
Each implementation parses the HTML internally using scraper, so calling
extract() on several extractors parses the same document once per call.
To avoid the repeated parsing, use the format-specific
extract_from_document() methods, which accept a pre-parsed scraper::Html
document.
§Examples
use schemaorg_rs::extraction::{Extractor, MicrodataExtractor};
let html = r#"<html><body>
<div itemscope itemtype="https://schema.org/Product">
<span itemprop="name">Widget</span>
</div>
</body></html>"#;
let output = MicrodataExtractor.extract(html).unwrap();
assert_eq!(output.nodes[0].types, vec!["Product"]);
Required Methods§
fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError>
Extracts structured data nodes from an HTML document.
§Errors
Returns ExtractionError if a fatal error prevents extraction.
Most issues are captured as warnings in the returned
ExtractionOutput instead.
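The split between fatal errors and warnings can be handled roughly as in the sketch below. It is self-contained and does not use the real crate: the types are local stand-ins that mirror the documented signature, the `warnings` field name and the string-based node and warning types are assumptions, and `ToyExtractor` and `summarize` are hypothetical helpers invented for illustration.

```rust
// Local stand-ins mirroring the documented signature. The real types live in
// schemaorg_rs::extraction; the `warnings` field name and the use of plain
// strings for nodes and warnings are assumptions made for this sketch.
#[derive(Debug, Default)]
pub struct ExtractionOutput {
    pub nodes: Vec<String>,
    pub warnings: Vec<String>,
}

#[derive(Debug)]
pub struct ExtractionError(pub String);

pub trait Extractor: Send + Sync {
    fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError>;
}

/// Hypothetical extractor: empty input is fatal, while an `itemscope`
/// without an `itemtype` only produces a warning.
pub struct ToyExtractor;

impl Extractor for ToyExtractor {
    fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError> {
        if html.trim().is_empty() {
            return Err(ExtractionError("empty document".to_string()));
        }
        let mut out = ExtractionOutput::default();
        if html.contains("itemscope") {
            out.nodes.push("item".to_string());
            if !html.contains("itemtype") {
                out.warnings.push("itemscope without itemtype".to_string());
            }
        }
        Ok(out)
    }
}

/// Hypothetical caller: reports fatal errors separately from warnings.
pub fn summarize(extractor: &dyn Extractor, html: &str) -> String {
    match extractor.extract(html) {
        Ok(out) if out.warnings.is_empty() => format!("{} node(s)", out.nodes.len()),
        Ok(out) => format!(
            "{} node(s), {} warning(s)",
            out.nodes.len(),
            out.warnings.len()
        ),
        Err(err) => format!("fatal: {}", err.0),
    }
}
```

The point of the pattern is that an `Ok` result is not necessarily a clean one, so callers that care about data quality should inspect the warnings as well as the `Result`.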
Dyn Compatibility§
This trait is dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".
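Being dyn compatible means extractors of different concrete types can be stored and called behind `dyn Extractor`. The following is a minimal sketch using local stand-in types rather than the real crate: `ItemscopeCounter`, `JsonLdCounter`, and `run_all` are hypothetical names, and the simplified `ExtractionOutput` with string nodes is an assumption.

```rust
// Local stand-ins mirroring the documented signature; the real types in
// schemaorg_rs::extraction are richer than this simplified version.
#[derive(Debug, Default)]
pub struct ExtractionOutput {
    pub nodes: Vec<String>, // assumption: nodes simplified to strings
}

#[derive(Debug)]
pub struct ExtractionError(pub String);

pub trait Extractor: Send + Sync {
    fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError>;
}

/// Hypothetical extractor: emits one node per `itemscope` occurrence.
pub struct ItemscopeCounter;

impl Extractor for ItemscopeCounter {
    fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError> {
        let n = html.matches("itemscope").count();
        Ok(ExtractionOutput { nodes: vec!["microdata".to_string(); n] })
    }
}

/// Hypothetical extractor: emits one node per JSON-LD script type marker.
pub struct JsonLdCounter;

impl Extractor for JsonLdCounter {
    fn extract(&self, html: &str) -> Result<ExtractionOutput, ExtractionError> {
        let n = html.matches("application/ld+json").count();
        Ok(ExtractionOutput { nodes: vec!["json-ld".to_string(); n] })
    }
}

/// Runs every extractor in order and merges the nodes, stopping at the
/// first fatal error.
pub fn run_all(
    extractors: &[Box<dyn Extractor>],
    html: &str,
) -> Result<ExtractionOutput, ExtractionError> {
    let mut merged = ExtractionOutput::default();
    for extractor in extractors {
        merged.nodes.extend(extractor.extract(html)?.nodes);
    }
    Ok(merged)
}
```

A caller would build a `Vec<Box<dyn Extractor>>` and pass it to `run_all`. Because `Send + Sync` are supertraits, `Box<dyn Extractor>` is itself `Send + Sync`, so such a collection can also be shared across threads.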