Unified structured data graph combining all extraction formats.
This module provides the primary entry point, extract_all, which runs
all three extractors (JSON-LD, Microdata, RDFa Lite) against an HTML
document and merges the results into a single StructuredDataGraph.
§Pipeline
- Parse the HTML once using scraper::Html
- Run each extractor against the parsed DOM
- Merge all nodes and warnings into a single graph
- Individual extractor failures are captured as warnings (not errors)
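The merge step with failures downgraded to warnings can be sketched roughly as follows. This is a minimal, std-only illustration, not the crate's actual implementation: `Graph`, `Extractor`, `merge_extractors`, and the two stub extractors are hypothetical names, nodes are reduced to plain strings, and the document is passed as `&str` instead of a parsed `scraper::Html`.

```rust
// Hypothetical stand-in for the merged result; the real StructuredDataGraph
// is not shown here and likely carries richer node and warning types.
#[derive(Debug, Default)]
pub struct Graph {
    pub nodes: Vec<String>,
    pub warnings: Vec<String>,
}

// Assumed extractor shape: read the document, return nodes or an error message.
pub type Extractor = fn(&str) -> Result<Vec<String>, String>;

// Run every extractor against the same input and merge the results.
// A failing extractor adds a warning; it does not abort the others.
pub fn merge_extractors(doc: &str, extractors: &[(&str, Extractor)]) -> Graph {
    let mut graph = Graph::default();
    for (name, extract) in extractors {
        match extract(doc) {
            Ok(nodes) => graph.nodes.extend(nodes),
            Err(e) => graph.warnings.push(format!("{name}: {e}")),
        }
    }
    graph
}

// Stub extractor that "finds" one node when a JSON-LD script tag is present.
pub fn json_ld_stub(doc: &str) -> Result<Vec<String>, String> {
    if doc.contains("application/ld+json") {
        Ok(vec!["Product".to_string()])
    } else {
        Ok(Vec::new())
    }
}

// Stub extractor that always fails, to show the warning path.
pub fn microdata_stub(_doc: &str) -> Result<Vec<String>, String> {
    Err("malformed itemscope".to_string())
}
```

With both stubs registered, a document containing a JSON-LD script tag would yield one node from the first extractor and one warning from the second, and the call itself still succeeds.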
§Examples
use schemaorg_rs::extract_all;
let html = r#"<html><head>
<script type="application/ld+json">{
"@context": "https://schema.org",
"@type": "Product",
"name": "Widget"
}</script>
</head></html>"#;
let graph = extract_all(html).unwrap();
assert_eq!(graph.nodes.len(), 1);
assert_eq!(graph.nodes[0].types, vec!["Product"]);
assert!(graph.warnings.is_empty());
Structs§
- StructuredDataGraph - A unified graph of all structured data extracted from an HTML document.
Functions§
- extract_all - Extracts all structured data from an HTML document.