Crate fhp_selector

CSS selector and XPath engine for the SIMD-optimized HTML parser.

Provides CSS selector parsing, XPath evaluation, and a convenience API for querying a parsed fhp_tree::Document.

§Quick Start — CSS

use fhp_tree::parse;
use fhp_selector::Selectable;

let doc = parse("<div><p class=\"intro\">Hello</p></div>").unwrap();
let sel = doc.select("p.intro").unwrap();
assert_eq!(sel.len(), 1);
assert_eq!(sel.text(), "Hello");

§Quick Start — XPath

use fhp_tree::parse;
use fhp_selector::Selectable;
use fhp_selector::xpath::ast::XPathResult;

let doc = parse("<div><p>Hello</p></div>").unwrap();
let result = doc.xpath("//p/text()").unwrap();
match result {
    XPathResult::Strings(texts) => assert_eq!(texts[0], "Hello"),
    _ => panic!("expected strings"),
}

§Supported CSS Selectors

  • Type: div, p, span
  • Class: .class
  • ID: #id
  • Universal: *
  • Attribute: [attr], [attr=val], [attr~=val], [attr^=val], [attr$=val], [attr*=val]
  • Pseudo: :first-child, :last-child, :nth-child(an+b), :not(sel)
  • Compound: div.class#id[attr]
  • Combinator: A B, A > B, A + B, A ~ B
  • Comma list: div, span
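The :nth-child(an+b) form matches an element whose 1-based position among its siblings equals a*n + b for some integer n >= 0. The following is a minimal, self-contained sketch of that arithmetic. The function name `matches_an_plus_b` is hypothetical and is not part of the fhp_selector API; how the crate implements this internally is not shown on this page.

```rust
/// Returns true if the 1-based sibling `index` can be written as
/// `a * n + b` for some integer `n >= 0`, which is how CSS defines
/// `:nth-child(an+b)`.
///
/// Hypothetical helper for illustration only; not an fhp_selector function.
fn matches_an_plus_b(a: i32, b: i32, index: i32) -> bool {
    // Sibling positions start at 1, so anything lower can never match.
    if index < 1 {
        return false;
    }
    let diff = index - b;
    if a == 0 {
        // With a == 0 the formula collapses to "position equals b".
        return diff == 0;
    }
    // Otherwise n = diff / a must be a whole, non-negative number.
    diff % a == 0 && diff / a >= 0
}
```

Under this definition, `2n+1` (a = 2, b = 1) matches positions 1, 3, 5 and so on, while `-n+3` (a = -1, b = 3) matches only the first three children.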

§Supported XPath

  • //tag — descendant search
  • //tag[@attr='value'] — attribute predicate
  • /path/to/tag — absolute path
  • //tag[contains(@attr, 'substr')] — contains predicate
  • //tag[position()=N] — position predicate
  • //tag/text() — text extraction
  • .. — parent axis

Modules§

ast
CSS selector abstract syntax tree (AST) types.
bloom
256-bit bloom filter for ancestor pre-filtering.
matcher
Right-to-left CSS selector matching engine.
parser
CSS selector parser.
xpath
XPath expression support (subset for web scraping).
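
To illustrate the idea behind the bloom module, here is a rough sketch of a 256-bit bloom filter that summarizes the tag, class, and id names seen on an element's ancestors. Everything in it is an assumption made for illustration: the type name `AncestorBloom`, the use of two bit positions per key, and the choice of the standard library's `DefaultHasher`. The crate's actual layout and hash function are not described on this page.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical 256-bit bloom filter (4 x 64 bits); not the crate's type.
#[derive(Clone, Copy, Default)]
struct AncestorBloom {
    bits: [u64; 4],
}

impl AncestorBloom {
    /// Derives two bit positions in 0..256 from one 64-bit hash.
    /// Assumption: two probes per key; the real filter may differ.
    fn bit_positions(key: &str) -> [usize; 2] {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let h = hasher.finish();
        [(h & 0xff) as usize, ((h >> 8) & 0xff) as usize]
    }

    /// Records a key such as "div", ".intro", or "#main".
    fn insert(&mut self, key: &str) {
        for &pos in Self::bit_positions(key).iter() {
            self.bits[pos / 64] |= 1u64 << (pos % 64);
        }
    }

    /// Returns false only if the key was definitely never inserted.
    /// A true result may be a false positive and still needs a real check.
    fn may_contain(&self, key: &str) -> bool {
        Self::bit_positions(key)
            .iter()
            .all(|&pos| self.bits[pos / 64] & (1u64 << (pos % 64)) != 0)
    }
}
```

A matcher could consult such a filter before walking the ancestor chain for a selector like `A B`: if `may_contain` returns false for a name that `A` requires, no ancestor can match and the walk can be skipped. A true result only means the walk is still necessary.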

Structs§

CompiledSelector
A pre-compiled CSS selector for reuse across documents and threads.
DocumentIndex
Pre-built index for O(1) id, class, and tag lookups.
Selection
A collection of matched nodes from a selector query.
SelectionIter
Iterator over Selection results.
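
As a sketch of what an index with O(1) id, class, and tag lookups can look like, the example below keeps one hash map per key kind. The names `SketchIndex`, `NodeId`, `add`, `with_id`, `with_class`, and `with_tag` are hypothetical and chosen for this illustration; they are not the DocumentIndex API, whose fields and methods are not listed on this page.

```rust
use std::collections::HashMap;

/// Hypothetical node handle; the crate's real node type is not shown here.
type NodeId = usize;

/// Illustrative index with average O(1) lookups by id, class, and tag.
#[derive(Default)]
struct SketchIndex {
    by_id: HashMap<String, NodeId>,
    by_class: HashMap<String, Vec<NodeId>>,
    by_tag: HashMap<String, Vec<NodeId>>,
}

impl SketchIndex {
    /// Registers one element. Nodes are expected in document order,
    /// so each bucket stays sorted by position.
    fn add(&mut self, node: NodeId, tag: &str, id: Option<&str>, classes: &[&str]) {
        self.by_tag.entry(tag.to_string()).or_default().push(node);
        if let Some(id) = id {
            // If an id is duplicated, keep the first element that used it.
            self.by_id.entry(id.to_string()).or_insert(node);
        }
        for class in classes {
            self.by_class
                .entry((*class).to_string())
                .or_default()
                .push(node);
        }
    }

    fn with_id(&self, id: &str) -> Option<NodeId> {
        self.by_id.get(id).copied()
    }

    fn with_class(&self, class: &str) -> &[NodeId] {
        self.by_class.get(class).map(Vec::as_slice).unwrap_or(&[])
    }

    fn with_tag(&self, tag: &str) -> &[NodeId] {
        self.by_tag.get(tag).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

With an index of this shape, a simple selector such as `#main`, `.intro`, or `p` resolves with a single hash lookup instead of a full tree traversal, at the cost of building the maps once per document.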

Traits§

Selectable
Extension trait that adds CSS selector and XPath query methods to Document.