CSS-to-XPath translation with ::text and ::attr() pseudo-element support.
This module provides the translation layer between CSS selectors and XPath
expressions, including support for the Scrapy/parsel-compatible
pseudo-elements ::text and ::attr(ATTR_NAME).
§Background
Python scrapling (via cssselect) translates every CSS selector to an
XPath expression, which lxml then evaluates against the DOM. In Rust,
the scraper crate matches CSS selectors natively — no XPath needed.
This module exists for three reasons:
- Pseudo-element parsing: ::text and ::attr(name) must be stripped before CSS matching and applied as post-processing. The CssQuery type handles this.
- CSS→XPath conversion: the css_to_xpath() function provides feature parity with the Python API for users who need XPath strings (e.g. for generate_xpath_selector()).
- Caching: translations are LRU-cached (capacity 256) since the same selectors are typically used repeatedly.
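To make the caching point concrete, here is a minimal sketch of an LRU cache keyed by selector string, using only the standard library. The names TranslationCache and get_or_translate are hypothetical and are not part of this module's API; the crate's actual cache may be built differently (for example on a dedicated LRU crate), and this version does a linear scan to refresh recency, which is acceptable only for small capacities.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical LRU cache for selector translations (not the crate's type).
struct TranslationCache {
    capacity: usize,
    map: HashMap<String, String>,
    /// Keys ordered from least to most recently used.
    order: VecDeque<String>,
}

impl TranslationCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            map: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Return the cached translation, or compute and store it on a miss.
    fn get_or_translate(
        &mut self,
        css: &str,
        translate: impl FnOnce(&str) -> String,
    ) -> String {
        if let Some(hit) = self.map.get(css).cloned() {
            // Cache hit: move the key to the most-recently-used end.
            if let Some(pos) = self.order.iter().position(|k| k.as_str() == css) {
                if let Some(key) = self.order.remove(pos) {
                    self.order.push_back(key);
                }
            }
            return hit;
        }
        // Cache miss: evict the least recently used entry if the cache is full.
        if self.map.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        let xpath = translate(css);
        self.map.insert(css.to_string(), xpath.clone());
        self.order.push_back(css.to_string());
        xpath
    }
}
```

With a capacity of 256, as described above, a caller would construct the cache once and pass the real translation routine as the closure; repeated selectors then skip translation entirely.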
§Pseudo-elements
| Pseudo-element | XPath suffix | Meaning |
|---|---|---|
| ::text | /text() | Select text node children |
| ::attr(href) | /@href | Select the href attribute value |
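The mapping in the table can be illustrated with a small standalone sketch: split a trailing pseudo-element off the selector, then append the matching suffix to an already-translated base XPath. The Pseudo enum and the split_pseudo and append_suffix functions are hypothetical names introduced for illustration only; they are not the crate's CssQuery or PseudoElement, and the real parser may handle quoting and whitespace differently.

```rust
/// Hypothetical stand-in for a parsed pseudo-element (not the crate's type).
#[derive(Debug, PartialEq)]
enum Pseudo {
    Nothing,
    Text,
    Attr(String),
}

/// Split a selector into its plain CSS part and a trailing pseudo-element.
fn split_pseudo(selector: &str) -> (&str, Pseudo) {
    let s = selector.trim();
    if let Some(css) = s.strip_suffix("::text") {
        return (css, Pseudo::Text);
    }
    if let Some(idx) = s.rfind("::attr(") {
        let rest = &s[idx + "::attr(".len()..];
        if let Some(name) = rest.strip_suffix(')') {
            let name = name.trim();
            if !name.is_empty() {
                return (&s[..idx], Pseudo::Attr(name.to_string()));
            }
        }
    }
    // No recognised pseudo-element: return the selector unchanged.
    (s, Pseudo::Nothing)
}

/// Append the XPath suffix from the table to a base XPath expression.
fn append_suffix(base_xpath: &str, pseudo: &Pseudo) -> String {
    match pseudo {
        Pseudo::Nothing => base_xpath.to_string(),
        Pseudo::Text => format!("{}/text()", base_xpath),
        Pseudo::Attr(name) => format!("{}/@{}", base_xpath, name),
    }
}
```

Keeping the split separate from the suffix step mirrors the two uses described in the Background section: native CSS matching needs only the stripped selector, while XPath output needs the suffix.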
§Examples
use scrapling::translator::{CssQuery, css_to_xpath};
// Parse pseudo-elements from a CSS selector
let q = CssQuery::parse("div.content::text").unwrap();
assert_eq!(q.css(), "div.content");
assert!(q.is_text());
let q2 = CssQuery::parse("a.link::attr(href)").unwrap();
assert_eq!(q2.css(), "a.link");
assert_eq!(q2.attribute(), Some("href"));
// Translate CSS to XPath
let xpath = css_to_xpath("div.content::text").unwrap();
assert!(xpath.ends_with("/text()"));
Structs§
- CssQuery: A parsed CSS selector with any pseudo-element extracted.
Enums§
- PseudoElement: The pseudo-element extracted from a CSS selector.
Functions§
- css_to_xpath: Translate a CSS selector to an XPath expression.