Function extract_all 

pub fn extract_all(html: &str) -> Result<StructuredDataGraph, ExtractionError>

Extracts all structured data from an HTML document.

Runs JSON-LD, Microdata, and RDFa Lite extractors and merges the results into a single StructuredDataGraph.

Individual extractor failures are captured as warnings; only truly fatal errors (e.g. inability to parse HTML) propagate as errors.
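The "failures become warnings" behaviour described above can be pictured with a minimal, self-contained sketch. It does not use schemaorg_rs and is not the crate's implementation: `Graph`, `FatalError`, `extract_all_sketch`, the three stub extractors, and the NUL-byte check standing in for "HTML could not be parsed" are all hypothetical names and rules invented for illustration.

```rust
// Hypothetical model of the pattern; none of these types come from schemaorg_rs.

/// Stand-in for the merged graph: extracted node types plus collected warnings.
#[derive(Debug, Default)]
struct Graph {
    nodes: Vec<String>,
    warnings: Vec<String>,
}

/// Stand-in for a fatal error that aborts the whole extraction.
#[derive(Debug)]
enum FatalError {
    Internal(String),
}

/// Each extractor either yields nodes or reports why it failed.
type Extractor = fn(&str) -> Result<Vec<String>, String>;

// Stub extractors (assumed behaviour, for demonstration only).
fn json_ld(html: &str) -> Result<Vec<String>, String> {
    if html.contains("application/ld+json") {
        Ok(vec!["Product".to_string()])
    } else {
        Ok(Vec::new())
    }
}

fn microdata(_html: &str) -> Result<Vec<String>, String> {
    // Always fails, to show how a failure is downgraded to a warning.
    Err("malformed itemscope".to_string())
}

fn rdfa_lite(_html: &str) -> Result<Vec<String>, String> {
    Ok(Vec::new())
}

/// Runs every extractor, merges successful results, and records failures
/// as warnings. Only the (placeholder) parse check returns an error.
fn extract_all_sketch(html: &str) -> Result<Graph, FatalError> {
    // Placeholder for "the HTML could not be parsed at all".
    if html.contains('\0') {
        return Err(FatalError::Internal("unparseable input".to_string()));
    }

    let extractors: [(&str, Extractor); 3] = [
        ("json-ld", json_ld as Extractor),
        ("microdata", microdata as Extractor),
        ("rdfa-lite", rdfa_lite as Extractor),
    ];

    let mut graph = Graph::default();
    for (name, run) in extractors {
        match run(html) {
            Ok(nodes) => graph.nodes.extend(nodes),
            // A single extractor failing does not abort the others.
            Err(reason) => graph.warnings.push(format!("{}: {}", name, reason)),
        }
    }
    Ok(graph)
}
```

In this sketch the caller still gets the JSON-LD node even though the Microdata stub failed; the failure shows up only as one entry in `warnings`. How the real crate exposes its warnings is not shown on this page.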

§Errors

Returns ExtractionError::Internal if a fatal, unrecoverable error occurs during HTML parsing. In practice this function is effectively infallible, because a failure in an individual format extractor is recorded as a WarningCode::ExtractorFailed warning rather than returned as an error.

§Examples

use schemaorg_rs::extract_all;

let html = r#"<html><head>
<script type="application/ld+json">{
"@context": "https://schema.org",
"@type": "Product",
"name": "Widget"
}</script>
</head></html>"#;

let graph = extract_all(html).unwrap();
assert_eq!(graph.nodes.len(), 1);
assert_eq!(graph.nodes[0].types, vec!["Product"]);