Crate html2json

html2json - HTML to JSON extractor using html5ever

A Rust port of cheerio-json-mapper using html5ever for HTML parsing.

§Overview

This library extracts structured JSON from HTML. A JSON spec maps each output key to a CSS selector, and the text of the matched element becomes the value for that key.

§Basic Example

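use html2json::{extract, Spec};

// `?` needs a fallible context; Box<dyn Error> assumes both error types convert into it.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let html = r#"<html><body><h1>Hello</h1><p class="desc">World</p></body></html>"#;
    // Each key names an output field; each value is the CSS selector to read it from.
    let spec_json = r#"{"title": "h1", "description": "p.desc"}"#;
    let spec: Spec = serde_json::from_str(spec_json)?;
    // The result is indexed by the same keys the spec defines.
    let result = extract(html, &spec)?;
    assert_eq!(result["title"], "Hello");
    assert_eq!(result["description"], "World");
    Ok(())
}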
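The key-to-selector mapping can also be illustrated without the crate. The sketch below is a std-only toy, not html2json's implementation: it swaps html5ever and real CSS selectors for a naive string search that understands bare tag names only. `first_text` and `extract_toy` are hypothetical names introduced for illustration.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for selector matching (NOT how html2json works):
/// returns the text of the first `<tag ...>text</tag>` element.
/// Assumes no nested elements and no `>` inside attribute values.
fn first_text<'a>(html: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}");
    let close = format!("</{tag}>");
    let mut from = 0;
    while let Some(pos) = html[from..].find(&open) {
        let after = from + pos + open.len();
        // Require `>` or a space after the name so `<p` does not match `<pre>`.
        match html[after..].chars().next() {
            Some('>') | Some(' ') => {
                let text_start = after + html[after..].find('>')? + 1;
                let text_end = text_start + html[text_start..].find(&close)?;
                return Some(&html[text_start..text_end]);
            }
            _ => from = after,
        }
    }
    None
}

/// Hypothetical helper: apply a `key -> tag name` spec to an HTML string.
/// Keys with no matching element map to `None`.
fn extract_toy(html: &str, spec: &BTreeMap<&str, &str>) -> BTreeMap<String, Option<String>> {
    spec.iter()
        .map(|(key, tag)| (key.to_string(), first_text(html, tag).map(str::to_string)))
        .collect()
}

fn main() {
    let html = r#"<html><body><h1>Hello</h1><p class="desc">World</p></body></html>"#;
    let spec = BTreeMap::from([("title", "h1"), ("description", "p")]);
    let out = extract_toy(html, &spec);
    println!("{out:?}");
}
```

The output has exactly the keys of the spec, which is the same shape the `extract` example relies on when it indexes `result["title"]`. Real selectors such as `p.desc` need a proper HTML parser and selector engine, which is the job the `dom` module delegates to scraper.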

Re-exports§

pub use dom::Dom;
pub use spec::Spec;

Modules§

dom
DOM module wrapping scraper (html5ever) for HTML parsing and CSS selector matching
pipe
Pipe transformation module
spec
Spec parsing module

Functions§

extract
Extract JSON from HTML using a spec