Struct Node

Source
pub struct Node(/* private fields */);

A type representing an HTML node: a group of elements, a single element, or the entire HTML document.

Implementations§

Source§

impl Node

Source

pub fn new<T: AsRef<[u8]>>(buf: T) -> Result<Self>

Parse HTML into a Node. As there is no base URI specified, absolute URL resolution requires the HTML to have a <base href> tag.

Source

pub fn new_with_uri<A: AsRef<[u8]>, B: AsRef<str>>( buf: A, base_uri: B, ) -> Result<Self>

Parse HTML into a Node. The given base_uri is used for any URLs that occur before a <base href> tag is defined.

Source

pub fn new_fragment<T: AsRef<[u8]>>(buf: T) -> Result<Self>

Parse an HTML fragment, assuming that it forms the body of the HTML. As with Node::new, relative URLs will not be resolved unless there is a <base href> tag.

Source

pub fn new_fragment_with_uri<A: AsRef<[u8]>, B: AsRef<str>>( buf: A, base_uri: B, ) -> Result<Self>

Parse an HTML fragment, assuming that it forms the body of the HTML. As with Node::new_with_uri, the given base_uri is used for any URLs that occur before a <base href> tag is defined.

Source

pub unsafe fn from(ptr: i32) -> Self

Get an instance from a PtrRef

§Safety

Ensure that this Ptr is of Kind::Node before converting.

Source

pub fn close(self)

Source

pub fn select<T: AsRef<str>>(&self, selector: T) -> Self

Find elements that match the given CSS (or jQuery) selector.

§Supported selectors

| Pattern | Matches | Example |
|---|---|---|
| * | any element | * |
| tag | elements with the given tag name | div |
| *\|E | elements of type E in any namespace (including non-namespaced) | *\|name finds <fb:name> and <name> elements |
| ns\|E | elements of type E in the namespace ns | fb\|name finds <fb:name> elements |
| #id | elements with attribute ID of “id” | div#wrap, #logo |
| .class | elements with a class name of “class” | div.left, .result |
| [attr] | elements with an attribute named “attr” (with any value) | a[href], [title] |
| [^attrPrefix] | elements with an attribute name starting with “attrPrefix”. Use to find elements with HTML5 datasets | [^data-], div[^data-] |
| [attr=val] | elements with an attribute named “attr”, and value equal to “val” | img[width=500], a[rel=nofollow] |
| [attr="val"] | elements with an attribute named “attr”, and value equal to “val” | span[hello="Cleveland"][goodbye="Columbus"], a[rel="nofollow"] |
| [attr^=valPrefix] | elements with an attribute named “attr”, and value starting with “valPrefix” | a[href^=http:] |
| [attr$=valSuffix] | elements with an attribute named “attr”, and value ending with “valSuffix” | img[src$=.png] |
| [attr*=valContaining] | elements with an attribute named “attr”, and value containing “valContaining” | a[href*=/search/] |
| [attr~=regex] | elements with an attribute named “attr”, and value matching the regular expression | img[src~=(?i)\\.(png\|jpe?g)] |
| | The above may be combined in any order | div.header[title] |
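
The attribute-value rows above can be read as plain string predicates. The following is a minimal sketch of that reading, not the selector engine itself: `AttrOp` and `attr_matches` are hypothetical names, and the comparison is exact because the table does not state how case is handled.

```rust
/// Operators from the attribute-value rows of the selector table.
/// Hypothetical type, used only for this illustration.
#[derive(Clone, Copy)]
enum AttrOp {
    Equals,   // [attr=val]
    Prefix,   // [attr^=valPrefix]
    Suffix,   // [attr$=valSuffix]
    Contains, // [attr*=valContaining]
}

/// Returns true when the attribute `value` satisfies `op` against `expected`.
/// Assumption: byte-exact comparison; the real matcher may normalize case.
fn attr_matches(value: &str, op: AttrOp, expected: &str) -> bool {
    match op {
        AttrOp::Equals => value == expected,
        AttrOp::Prefix => value.starts_with(expected),
        AttrOp::Suffix => value.ends_with(expected),
        AttrOp::Contains => value.contains(expected),
    }
}
```

Under this model, an href of `http://example.com` satisfies `a[href^=http:]`, and a src of `logo.png` satisfies `img[src$=.png]`.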
§Combinators

| Pattern | Matches | Example |
|---|---|---|
| E F | an F element descended from an E element | div a, .logo h1 |
| E > F | an F direct child of E | ol > li |
| E + F | an F element immediately preceded by sibling E | li + li, div.head + div |
| E ~ F | an F element preceded by sibling E | h1 ~ p |
| E, F, G | all matching elements E, F, or G | a[href], div, h3 |
§Pseudo selectors

| Pattern | Matches | Example |
|---|---|---|
| :lt(n) | elements whose sibling index is less than n | td:lt(3) finds the first 3 cells of each row |
| :gt(n) | elements whose sibling index is greater than n | td:gt(1) finds cells after skipping the first two |
| :eq(n) | elements whose sibling index is equal to n | td:eq(0) finds the first cell of each row |
| :has(selector) | elements that contain at least one element matching the selector | div:has(p) finds divs that contain p elements; div:has(> a) selects div elements that have at least one direct child a element |
| :not(selector) | elements that do not match the selector | div:not(.logo) finds all divs that do not have the “logo” class; div:not(:has(div)) finds divs that do not contain divs |
| :contains(text) | elements that contain the specified text. The search is case insensitive. The text may appear in the found element, or any of its descendants. | p:contains(SwiftSoup) finds p elements containing the text “SwiftSoup”; p:contains(hello \(there\)) finds p elements containing the text “Hello (There)” |
| :matches(regex) | elements whose text matches the specified regular expression. The text may appear in the found element, or any of its descendants. | td:matches(\\d+) finds table cells containing digits; div:matches((?i)login) finds divs containing the text, case insensitively |
| :containsOwn(text) | elements that directly contain the specified text. The search is case insensitive. The text must appear in the found element, not any of its descendants. | p:containsOwn(SwiftSoup) finds p elements with own text “SwiftSoup” |
| :matchesOwn(regex) | elements whose own text matches the specified regular expression. The text must appear in the found element, not any of its descendants. | td:matchesOwn(\\d+) finds table cells directly containing digits; div:matchesOwn((?i)login) finds divs containing the text, case insensitively |
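
The examples for :lt(n), :gt(n) and :eq(n) imply a zero-based sibling index. A small sketch of that indexing over a slice of cell labels is given below; `IndexFilter` and `filter_by_index` are hypothetical names and the slice stands in for the cells of one table row.

```rust
/// The three index pseudo-selectors, modeled as a filter on a zero-based index.
/// Hypothetical type, used only for this illustration.
#[derive(Clone, Copy)]
enum IndexFilter {
    Lt(usize), // :lt(n)
    Gt(usize), // :gt(n)
    Eq(usize), // :eq(n)
}

/// Keeps the siblings whose zero-based position satisfies `filter`.
fn filter_by_index<'a>(siblings: &[&'a str], filter: IndexFilter) -> Vec<&'a str> {
    siblings
        .iter()
        .enumerate()
        .filter(|(i, _)| match filter {
            IndexFilter::Lt(n) => *i < n,
            IndexFilter::Gt(n) => *i > n,
            IndexFilter::Eq(n) => *i == n,
        })
        .map(|(_, cell)| *cell)
        .collect()
}
```

For a row with cells a, b, c, d, `Lt(3)` keeps a, b, c; `Gt(1)` keeps c, d; `Eq(0)` keeps a. This agrees with the td:lt(3), td:gt(1) and td:eq(0) examples in the table.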
§Structural pseudo-selectors

| Pattern | Matches | Example |
|---|---|---|
| :root | The element that is the root of the document. In HTML, this is the html element | |
| :nth-child(an+b) | elements that have an+b-1 siblings before them in the document tree, for any positive integer or zero value of n, and that have a parent element. For values of a and b greater than zero, this effectively divides the element’s children into groups of a elements (the last group taking the remainder), and selects the bth element of each group. For example, this allows the selectors to address every other row in a table, and could be used to alternate the color of paragraph text in a cycle of four. The a and b values must be integers (positive, negative, or zero). The index of the first child of an element is 1. | |
| :nth-last-child(an+b) | elements that have an+b-1 siblings after them in the document tree. Otherwise like :nth-child() | tr:nth-last-child(-n+2) the last two rows of a table |
| :nth-of-type(an+b) | pseudo-class notation represents an element that has an+b-1 siblings with the same expanded element name before it in the document tree, for any zero or positive integer value of n, and has a parent element | img:nth-of-type(2n+1) |
| :nth-last-of-type(an+b) | pseudo-class notation represents an element that has an+b-1 siblings with the same expanded element name after it in the document tree, for any zero or positive integer value of n, and has a parent element | img:nth-last-of-type(2n+1) |
| :first-child | elements that are the first child of some other element | div > p:first-child |
| :last-child | elements that are the last child of some other element | ol > li:last-child |
| :first-of-type | elements that are the first sibling of their type in the list of children of their parent element | dl dt:first-of-type |
| :last-of-type | elements that are the last sibling of their type in the list of children of their parent element | tr > td:last-of-type |
| :only-child | elements that have a parent element and whose parent element has no other element children | |
| :only-of-type | an element that has a parent element and whose parent element has no other element children with the same expanded element name | |
| :empty | elements that have no children at all | |
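
The an+b rule used by the :nth-* selectors comes down to a little integer arithmetic. As a sketch of the rule as described above (the function name `matches_nth` is hypothetical), a 1-based position matches when it equals a·n + b for some integer n ≥ 0:

```rust
/// Returns true when the 1-based sibling position `pos` equals `a * n + b`
/// for some integer n >= 0, as described for :nth-child(an+b).
fn matches_nth(a: i64, b: i64, pos: i64) -> bool {
    if pos < 1 {
        return false; // positions start at 1
    }
    if a == 0 {
        return pos == b; // plain :nth-child(b)
    }
    let diff = pos - b;
    // `diff` must be a whole multiple of `a`, and the multiplier n non-negative.
    diff % a == 0 && diff / a >= 0
}
```

With a = 2 and b = 1, positions 1, 3 and 5 match (every other row). With a = -1 and b = 2, only positions 1 and 2 match, which is why tr:nth-last-child(-n+2) selects the last two rows when positions are counted from the end.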
Source

pub fn attr<T: AsRef<str>>(&self, attr: T) -> String

Get an attribute value by its key. To get an absolute URL from an attribute that may be a relative URL, prefix the key with abs:.

§Example
// Assumes that `el` is a Node
let url = el.attr("abs:src");
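
To make the role of the base URI concrete, here is a deliberately simplified model of relative URL resolution. It is an assumption-laden sketch, not the resolver this crate uses: the function name `resolve` is hypothetical, and it ignores `.`/`..` segments, query strings, fragments, and schemes without `://` (such as `mailto:`).

```rust
/// Simplified model of resolving `href` against `base`.
/// Returns None when `base` is not an absolute URL, mirroring the note that
/// relative URLs cannot be resolved without a base URI.
fn resolve(base: &str, href: &str) -> Option<String> {
    // Already absolute: nothing to resolve.
    if href.contains("://") {
        return Some(href.to_string());
    }
    // Without a scheme in the base there is nothing to resolve against.
    let scheme_end = base.find("://")?;
    let authority_start = scheme_end + 3;
    let path_start = base[authority_start..]
        .find('/')
        .map_or(base.len(), |i| i + authority_start);
    let origin = &base[..path_start];

    if href.starts_with("//") {
        // Protocol-relative: reuse only the scheme of the base.
        return Some(format!("{}:{}", &base[..scheme_end], href));
    }
    if href.starts_with('/') {
        // Root-relative: keep scheme and host, replace the whole path.
        return Some(format!("{}{}", origin, href));
    }
    // Plain relative path: replace the last path segment of the base.
    match base[path_start..].rfind('/') {
        Some(i) => Some(format!("{}{}", &base[..path_start + i + 1], href)),
        None => Some(format!("{}/{}", origin, href)),
    }
}
```

Under these assumptions, resolving `img/c.png` against `https://example.com/a/b.html` gives `https://example.com/a/img/c.png`, while resolving against an empty base gives `None`.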
Source

pub fn set_html<T: AsRef<str>>(&mut self, html: T) -> Result<()>

Set the element’s inner HTML, clearing the existing HTML.

§Notice

Internally, this operates on SwiftSoup.Element, but not on SwiftSoup.Elements, which is the type you usually get when using methods like Node::select. Either use Node::array to iterate through each element, or use Node::first/Node::last to select an element before calling this function.

Source

pub fn set_text<T: AsRef<str>>(&mut self, text: T) -> Result<()>

Set the element’s text content, clearing any existing content.

§Notice

Internally, this operates on SwiftSoup.Element, but not on SwiftSoup.Elements, which is the type you usually get when using methods like Node::select. Either use Node::array to iterate through each element, or use Node::first/Node::last to select an element before calling this function.

Source

pub fn prepend<T: AsRef<str>>(&mut self, html: T) -> Result<()>

Add inner HTML into this element. The given HTML will be parsed, and each node prepended to the start of the element’s children.

§Notice

Internally, this operates on SwiftSoup.Element, but not on SwiftSoup.Elements, which is the type you usually get when using methods like Node::select. Either use Node::array to iterate through each element, or use Node::first/Node::last to select an element before calling this function.

Source

pub fn append<T: AsRef<str>>(&mut self, html: T) -> Result<()>

Add inner HTML into this element. The given HTML will be parsed, and each node appended to the end of the element’s children.

§Notice

Internally, this operates on SwiftSoup.Element, but not on SwiftSoup.Elements, which is the type you usually get when using methods like Node::select. Either use Node::array to iterate through each element, or use Node::first/Node::last to select an element before calling this function.

Source

pub fn first(&self) -> Self

Get the first sibling of this element, which may be this element itself.

Source

pub fn last(&self) -> Self

Get the last sibling of this element, which may be this element itself.

Source

pub fn next(&self) -> Option<Node>

Get the next sibling of the element, returning None if there isn’t one.

Source

pub fn previous(&self) -> Option<Node>

Get the previous sibling of the element, returning None if there isn’t one.
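
The sibling accessors above behave like bounds-checked indexing into the parent’s child list. The sketch below models that with a slice of labels; all four function names are hypothetical, and the model says nothing about how the crate stores nodes.

```rust
/// First sibling in the list; this may be the element itself.
fn first_sibling<'a>(siblings: &[&'a str]) -> Option<&'a str> {
    siblings.first().copied()
}

/// Last sibling in the list; this may be the element itself.
fn last_sibling<'a>(siblings: &[&'a str]) -> Option<&'a str> {
    siblings.last().copied()
}

/// Sibling after position `index`, or None at the end of the list.
fn next_sibling<'a>(siblings: &[&'a str], index: usize) -> Option<&'a str> {
    siblings.get(index + 1).copied()
}

/// Sibling before position `index`, or None at the start of the list.
fn previous_sibling<'a>(siblings: &[&'a str], index: usize) -> Option<&'a str> {
    index
        .checked_sub(1)
        .and_then(|i| siblings.get(i))
        .copied()
}
```

For siblings a, b, c, the element at index 0 has no previous sibling and the element at index 2 has no next sibling, which corresponds to the `None` results described for Node::previous and Node::next.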

Source

pub fn base_uri(&self) -> String

Get the base URI of this Node

Source

pub fn body(&self) -> String

Get the document’s body element.

Source

pub fn text(&self) -> String

Get the normalized, combined text of this element and its children. Whitespace is normalized and trimmed.

For example, given HTML <p>Hello <b>there</b> now! </p>, p.text() returns “Hello there now!”

Note that this method returns text that would be presented to a reader. The contents of data nodes (e.g. <script> tags) are not considered text. Use Node::html or Node::data to retrieve that content.

Source

pub fn untrimmed_text(&self) -> String

Get the text of this element and its children. Whitespace is neither normalized nor trimmed.

The notes on Node::text also apply here.
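
The difference between Node::text and Node::untrimmed_text can be sketched with plain strings. This is an illustration of the described behavior only: the names `untrimmed` and `normalized` are hypothetical, and the input slice stands in for the text fragments of `<p>Hello <b>there</b> now! </p>`.

```rust
/// Concatenates text fragments as-is, keeping all whitespace.
fn untrimmed(fragments: &[&str]) -> String {
    fragments.concat()
}

/// Concatenates text fragments, collapses whitespace runs to one space,
/// and trims both ends.
fn normalized(fragments: &[&str]) -> String {
    let joined = fragments.concat();
    joined.split_whitespace().collect::<Vec<_>>().join(" ")
}
```

Given the fragments `"Hello "`, `"there"` and `" now! "`, `normalized` yields `Hello there now!`, while `untrimmed` keeps the trailing space.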

Source

pub fn own_text(&self) -> String

Gets the (normalized) text owned by this element only; does not get the combined text of all children.

Node::own_text only operates on a singular element, so calling it after Node::select will not work. You need to get a specific element first, through Node::array and ArrayRef::get, Node::first, or Node::last.

Source

pub fn data(&self) -> String

Get the combined data of this element. Data is e.g. the inside of a <script> tag.

Note that data is NOT the text of the element. Use Node::text to get the text that would be visible to a user, and Node::data for the contents of scripts, comments, CSS styles, etc.
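
One way to picture the text/data split is to tag each child as either reader-visible text or data. The sketch below uses a hypothetical `Child` enum and two hypothetical functions; it only models the distinction described above.

```rust
/// A child of an element: reader-visible text, or data such as the
/// contents of a <script> tag. Hypothetical type for illustration.
enum Child<'a> {
    Text(&'a str),
    Data(&'a str),
}

/// Collects only the reader-visible text.
fn text_of(children: &[Child]) -> String {
    children
        .iter()
        .filter_map(|c| match c {
            Child::Text(t) => Some(*t),
            Child::Data(_) => None,
        })
        .collect()
}

/// Collects only the data content.
fn data_of(children: &[Child]) -> String {
    children
        .iter()
        .filter_map(|c| match c {
            Child::Data(d) => Some(*d),
            Child::Text(_) => None,
        })
        .collect()
}
```

For children `Text("Hi")` and `Data("let x = 1;")`, `text_of` returns `Hi` and `data_of` returns `let x = 1;`, so neither accessor sees the other’s content.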

Source

pub fn array(&self) -> ArrayRef

Get an array of Node. This is most commonly used with Node::select to iterate through elements that match a selector.

Source

pub fn html(&self) -> String

Get the node’s inner HTML.

For example, on <div><p></p></div>, div.html() would return <p></p>.

Source

pub fn outer_html(&self) -> String

Get the node’s outer HTML.

For example, on <div><p></p></div>, div.outer_html() would return <div><p></p></div>.
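
The relationship between inner and outer HTML can be sketched with a toy element tree. The `Element` struct below is hypothetical and ignores attributes, text nodes and void elements; it only shows that outer HTML is the element’s own tags wrapped around its inner HTML.

```rust
/// Toy element: a tag name and child elements only (assumption: no
/// attributes, text, or void elements).
struct Element {
    tag: &'static str,
    children: Vec<Element>,
}

impl Element {
    /// Serialization of the children only.
    fn inner_html(&self) -> String {
        self.children.iter().map(Element::outer_html).collect()
    }

    /// Serialization of the element itself, including its children.
    fn outer_html(&self) -> String {
        format!("<{0}>{1}</{0}>", self.tag, self.inner_html())
    }
}
```

Building a `div` whose only child is an empty `p` reproduces the two examples above: `inner_html` gives `<p></p>` and `outer_html` gives `<div><p></p></div>`.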

Source

pub fn escape(&self) -> String

Get the node’s text and escape any HTML-reserved characters to HTML entities.

For example, for a node with text Hello &<> Å å π 新 there ¾ © », this would return Hello &amp;&lt;&gt; Å å π 新 there ¾ © ».

Source

pub fn unescape(&self) -> String

Get the node’s text and unescape any HTML entities to their original characters.

For example, for a node with text Hello &amp;&lt;&gt; Å å π 新 there ¾ © », this would return Hello &<> Å å π 新 there ¾ © ».
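
Judging by the two examples, only the HTML-reserved characters are rewritten and other non-ASCII characters pass through unchanged. A minimal sketch of that mapping follows; `escape_text` and `unescape_text` are hypothetical free functions covering only `&`, `<` and `>`, whereas the real methods may handle more entities.

```rust
/// Replaces the HTML-reserved characters &, < and > with entities.
/// Assumption: all other characters are left untouched.
fn escape_text(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            other => out.push(other),
        }
    }
    out
}

/// Reverses `escape_text`. `&amp;` is handled last so that an escaped
/// ampersand is not mistaken for the start of another entity.
fn unescape_text(text: &str) -> String {
    text.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&amp;", "&")
}
```

Applied to the sample text above, `escape_text` produces the escaped form shown for Node::escape, and `unescape_text` maps it back.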

Source

pub fn id(&self) -> String

Get the id attribute of this element.

Source

pub fn tag_name(&self) -> String

Get the name of the tag for this element. This will always be the lowercased version. For example, <DIV> and <div> would both return div.

Source

pub fn class_name(&self) -> String

Get the literal value of this node’s class attribute. For example, on <div class="header gray"> this would return header gray.

Source

pub fn has_class<T: AsRef<str>>(&self, class_name: T) -> bool

Test if this element has a class. Case insensitive.
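
A case-insensitive class test can be sketched as a comparison against the whitespace-separated tokens of the class attribute. The free function `class_attr_has` is hypothetical and only compares ASCII case; it is an illustration of the described behavior, not the crate’s implementation.

```rust
/// Returns true when `name` equals one of the whitespace-separated tokens
/// in `class_attr`, ignoring ASCII case.
fn class_attr_has(class_attr: &str, name: &str) -> bool {
    class_attr
        .split_whitespace()
        .any(|token| token.eq_ignore_ascii_case(name))
}
```

For a class attribute of `header gray`, both `HEADER` and `gray` match, while the partial token `head` does not.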

Source

pub fn has_attr<T: AsRef<str>>(&self, attr_name: T) -> bool

Test if this element has an attribute. Case insensitive.

Trait Implementations§

Source§

impl Clone for Node

Source§

fn clone(&self) -> Self

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for Node

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl Display for Node

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Returns the outer HTML of the node.

Source§

impl Drop for Node

Source§

fn drop(&mut self)

Executes the destructor for this type.

Auto Trait Implementations§

§

impl Freeze for Node

§

impl RefUnwindSafe for Node

§

impl Send for Node

§

impl Sync for Node

§

impl Unpin for Node

§

impl UnwindSafe for Node

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T> ToString for T
where T: Display + ?Sized,

Source§

fn to_string(&self) -> String

Converts the given value to a String.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.