Module translator

CSS-to-XPath translation with ::text and ::attr() pseudo-element support.

This module provides the translation layer between CSS selectors and XPath expressions, including support for the Scrapy/parsel-compatible pseudo-elements ::text and ::attr(ATTR_NAME).

§Background

Python scrapling (via cssselect) translates every CSS selector to an XPath expression, which lxml then evaluates against the DOM. In Rust, the scraper crate matches CSS selectors natively — no XPath needed.

This module exists for three reasons:

  1. Pseudo-element parsing: ::text and ::attr(name) must be stripped before CSS matching and then applied as a post-processing step. The CssQuery type handles this.

  2. CSS→XPath conversion: the css_to_xpath() function provides feature parity with the Python API for users who need XPath strings (e.g. for generate_xpath_selector()).

  3. Caching: translations are LRU-cached (capacity 256), since the same selectors are typically used repeatedly.
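The caching idea in item 3 can be illustrated with a small, standard-library-only sketch. This is not the module's actual cache: the TranslationCache type, its methods, and the translate closure are hypothetical names chosen for illustration, and only the capacity of 256 comes from the text above.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical fixed-capacity LRU cache for selector translations.
/// A sketch only; the real module's cache type is not shown in these docs.
pub struct TranslationCache {
    capacity: usize,
    map: HashMap<String, String>,
    /// Keys ordered from least recently used (front) to most recent (back).
    order: VecDeque<String>,
}

impl TranslationCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Return the cached translation, computing and storing it on a miss.
    pub fn get_or_insert_with<F>(&mut self, selector: &str, translate: F) -> String
    where
        F: FnOnce(&str) -> String,
    {
        if let Some(hit) = self.map.get(selector).cloned() {
            self.touch(selector);
            return hit;
        }
        // Cache is full: evict the least recently used entry first.
        if self.map.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        let value = translate(selector);
        self.map.insert(selector.to_owned(), value.clone());
        self.order.push_back(selector.to_owned());
        value
    }

    /// Mark a key as most recently used. This is O(n), which is acceptable
    /// for a sketch with a few hundred entries.
    fn touch(&mut self, selector: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == selector) {
            if let Some(key) = self.order.remove(pos) {
                self.order.push_back(key);
            }
        }
    }

    pub fn contains(&self, selector: &str) -> bool {
        self.map.contains_key(selector)
    }

    pub fn len(&self) -> usize {
        self.map.len()
    }
}
```

Under these assumptions, a cache created with TranslationCache::new(256) would invoke the translation closure only the first time a given selector is seen, until that selector is evicted.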

§Pseudo-elements

| Pseudo-element | XPath suffix | Meaning |
|---|---|---|
| ::text | /text() | Select text node children |
| ::attr(href) | /@href | Select the href attribute value |
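To make the mapping concrete, here is a minimal standalone sketch of how a trailing pseudo-element might be split off a selector and turned into its XPath suffix. The Pseudo enum and the split_pseudo and xpath_suffix functions are hypothetical stand-ins, not the module's CssQuery or PseudoElement types, and the sketch deliberately ignores selector groups (commas) and quoted attribute values.

```rust
/// Hypothetical stand-in for the module's pseudo-element type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Pseudo {
    None,
    Text,
    Attr(String),
}

/// Split a selector into its plain CSS part and a trailing pseudo-element.
/// Assumes a single selector whose pseudo-element, if any, is at the very end.
pub fn split_pseudo(selector: &str) -> (&str, Pseudo) {
    let selector = selector.trim();
    if let Some(css) = selector.strip_suffix("::text") {
        return (css, Pseudo::Text);
    }
    if let Some(idx) = selector.rfind("::attr(") {
        let arg = &selector[idx + "::attr(".len()..];
        if let Some(name) = arg.strip_suffix(')') {
            let name = name.trim();
            // Reject empty names and stray parentheses.
            if !name.is_empty() && !name.contains(|c: char| c == '(' || c == ')') {
                return (&selector[..idx], Pseudo::Attr(name.to_owned()));
            }
        }
    }
    (selector, Pseudo::None)
}

/// XPath suffix for a pseudo-element, following the pseudo-element table.
pub fn xpath_suffix(pseudo: &Pseudo) -> String {
    match pseudo {
        Pseudo::None => String::new(),
        Pseudo::Text => "/text()".to_owned(),
        Pseudo::Attr(name) => format!("/@{}", name),
    }
}
```

With this sketch, split_pseudo("a.link::attr(href)") yields the CSS part "a.link" plus Pseudo::Attr("href"), and xpath_suffix maps that to "/@href". The CSS part can then be handed to a native matcher while the pseudo-element is applied to the matched nodes afterwards.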

§Examples

use scrapling::translator::{CssQuery, css_to_xpath};

// Parse pseudo-elements from a CSS selector
let q = CssQuery::parse("div.content::text").unwrap();
assert_eq!(q.css(), "div.content");
assert!(q.is_text());

let q2 = CssQuery::parse("a.link::attr(href)").unwrap();
assert_eq!(q2.css(), "a.link");
assert_eq!(q2.attribute(), Some("href"));

// Translate CSS to XPath
let xpath = css_to_xpath("div.content::text").unwrap();
assert!(xpath.ends_with("/text()"));

Structs§

CssQuery
A parsed CSS selector with any pseudo-element extracted.

Enums§

PseudoElement
The pseudo-element extracted from a CSS selector.

Functions§

css_to_xpath
Translate a CSS selector to an XPath expression.