Struct Document

pub struct Document {
    pub tree: Tree,
    pub errors: RefCell<Vec<Cow<'static, str>>>,
    pub quirks_mode: Cell<QuirksMode>,
}
Document represents an HTML document to be manipulated.

Fields§

§tree: Tree

The document’s DOM tree.

§errors: RefCell<Vec<Cow<'static, str>>>

Errors that occurred during parsing.

§quirks_mode: Cell<QuirksMode>

The document’s quirks mode.
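
Both errors and quirks_mode sit behind interior-mutability wrappers, which fits the TreeSink methods further down (parse_error, set_quirks_mode) that take &self yet still record state. A minimal std-only sketch of that pattern follows. It assumes a stand-in QuirksMode enum and a hypothetical ParseState struct; neither is the crate's own type.

```rust
use std::borrow::Cow;
use std::cell::{Cell, RefCell};

/// Stand-in for the parser's quirks-mode enum (assumed for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum QuirksMode {
    Quirks,
    LimitedQuirks,
    NoQuirks,
}

/// Hypothetical state holder mirroring the two wrapped fields of `Document`.
pub struct ParseState {
    pub errors: RefCell<Vec<Cow<'static, str>>>,
    pub quirks_mode: Cell<QuirksMode>,
}

impl ParseState {
    pub fn new() -> Self {
        ParseState {
            errors: RefCell::new(Vec::new()),
            quirks_mode: Cell::new(QuirksMode::NoQuirks),
        }
    }

    /// Records an error through a shared reference.
    pub fn parse_error(&self, msg: Cow<'static, str>) {
        self.errors.borrow_mut().push(msg);
    }

    /// Replaces the quirks mode through a shared reference.
    pub fn set_quirks_mode(&self, mode: QuirksMode) {
        self.quirks_mode.set(mode);
    }
}
```

Cell suits the small Copy value, while RefCell is needed for the Vec because pushing requires a mutable borrow.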

Implementations§

impl Document

pub fn fragment<T: Into<StrTendril>>(html: T) -> Self

Creates a new HTML document fragment.

pub fn fragment_sink() -> Self

Creates a new sink for an HTML document fragment.

impl Document

pub fn root(&self) -> NodeRef<'_>

Returns the underlying root document node.

pub fn html_root(&self) -> NodeRef<'_>

Returns the root element node (<html>) of the document.

pub fn html(&self) -> StrTendril

Gets the HTML contents of the document, including text and comment nodes.

pub fn inner_html(&self) -> StrTendril

Gets the HTML contents of the document. It includes only the child nodes.

pub fn try_html(&self) -> Option<StrTendril>

Gets the HTML contents of the document, including its child nodes.

pub fn try_inner_html(&self) -> Option<StrTendril>

Gets the HTML contents of the document. It includes only the child nodes.

pub fn text(&self) -> StrTendril

Gets the text content of the document.

pub fn formatted_text(&self) -> StrTendril

Returns the formatted text of the document and its descendants. This is the same as the text() method, but with a few differences:

  • Whitespace is normalized so that there is only one space between words.
  • All whitespace is removed from the beginning and end of the text.
  • After block elements, a double newline is added.
  • For elements such as br, hr, li, and tr, a single newline is added.
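
The whitespace rules in this list can be approximated with the standard library alone. The sketch below is not the crate's implementation: collapse_whitespace and join_blocks are hypothetical helpers, and treating each input string as the text of one block element (and skipping blocks that end up empty) is an assumption made for illustration.

```rust
/// Collapses every run of whitespace into a single space and trims both ends.
pub fn collapse_whitespace(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Joins the text of consecutive block elements with a double newline,
/// skipping blocks that are empty after whitespace normalization.
pub fn join_blocks(blocks: &[&str]) -> String {
    blocks
        .iter()
        .map(|b| collapse_whitespace(b))
        .filter(|b| !b.is_empty())
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

For instance, join_blocks(&["  First   paragraph ", "Second\nparagraph"]) gives two single-spaced lines separated by one blank line.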

pub fn base_uri(&self) -> Option<StrTendril>

Finds the base URI of the tree by looking for <base> tags in the document’s head.

The base URI is the value of the href attribute of the first <base> tag in the document’s head. If no such tag is found, the method returns None.

pub fn normalize(&self)

Merges adjacent text nodes and removes empty text nodes.

Without normalization, text nodes appended one after another remain separate siblings, and empty text nodes stay in the tree.

§Example
use dom_query::Document;

let doc = Document::from("<div>Hello</div>");
let sel = doc.select("div");
let div = sel.nodes().first().unwrap();
// Create three detached text nodes, one of them empty.
let text1 = doc.tree.new_text(" ");
let text2 = doc.tree.new_text("World");
let text3 = doc.tree.new_text("");
div.append_child(&text1);
div.append_child(&text2);
div.append_child(&text3);
// The original "Hello" plus the three appended nodes.
assert_eq!(div.children().len(), 4);
doc.normalize();
// Adjacent text nodes are merged and the empty one is removed.
assert_eq!(div.children().len(), 1);
assert_eq!(div.text(), "Hello World".into());
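
The merge-and-drop behaviour described for normalize can be modelled on a flat child list without the crate. This is a toy sketch, not the library's code: the Node enum and the free function normalize are hypothetical, and a real tree would also recurse into element children.

```rust
/// Toy child node: either text or an element identified by its tag name.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Text(String),
    Element(String),
}

/// Merges adjacent text nodes and drops empty ones in a flat child list.
pub fn normalize(children: Vec<Node>) -> Vec<Node> {
    let mut out: Vec<Node> = Vec::new();
    for child in children {
        match child {
            // Empty text contributes nothing, so it is dropped.
            Node::Text(t) if t.is_empty() => {}
            Node::Text(t) => {
                if let Some(Node::Text(prev)) = out.last_mut() {
                    // The previous kept node is text: concatenate.
                    prev.push_str(&t);
                } else {
                    out.push(Node::Text(t));
                }
            }
            other => out.push(other),
        }
    }
    out
}
```

Feeding it the four text nodes from the example above ("Hello", " ", "World", "") leaves a single Text("Hello World") entry.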

impl Document

pub fn select(&self, sel: &str) -> Selection<'_>

Gets the descendants of the root document node, filtered by a selector. It returns a new Selection containing the matched elements.

§Panics

Panics if the given CSS selector fails to parse.

pub fn nip(&self, sel: &str) -> Selection<'_>

Alias for select. It gets the descendants of the root document node, filtered by a selector, and returns a new Selection containing the matched elements.

§Panics

Panics if the given CSS selector fails to parse.

pub fn try_select(&self, sel: &str) -> Option<Selection<'_>>

Gets the descendants of the root document node, filtered by a selector. It returns a new Selection containing the matched elements.

pub fn select_matcher(&self, matcher: &Matcher) -> Selection<'_>

Gets the descendants of the root document node, filtered by a matcher. It returns a new Selection containing the matched elements.

pub fn select_single_matcher(&self, matcher: &Matcher) -> Selection<'_>

Gets the descendants of the root document node, filtered by a matcher. It returns a new Selection containing only the first match.

pub fn select_single(&self, sel: &str) -> Selection<'_>

Gets the descendants of the root document node, filtered by a selector. It returns a new Selection containing only the first match.

§Panics

Panics if the given CSS selector fails to parse.

Trait Implementations§

impl Clone for Document

fn clone(&self) -> Document

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Default for Document

fn default() -> Self

Returns the “default value” for a type.

impl<T: Into<StrTendril>> From<T> for Document

fn from(html: T) -> Self

Converts to this type from the input type.

impl TreeSink for Document

type Output = Document

The overall result of parsing.

type Handle = NodeId

Handle is a reference to a DOM node. The tree builder requires that a Handle implements Clone to get another reference to the same node.

fn finish(self) -> Self

Consume this sink and return the overall result of parsing.

fn parse_error(&self, msg: Cow<'static, str>)

Signal a parse error.

fn get_document(&self) -> Self::Handle

Get a handle to the Document node.

fn get_template_contents(&self, target: &Self::Handle) -> Self::Handle

Get a handle to a template’s template contents. The tree builder promises this will never be called with anything other than a template element.

fn set_quirks_mode(&self, mode: QuirksMode)

Set the document’s quirks mode.

fn same_node(&self, x: &Self::Handle, y: &Self::Handle) -> bool

Do two handles refer to the same node?

fn elem_name(&self, target: &Self::Handle) -> Self::ElemName<'_>

What is the name of the element? Should never be called on a non-element node; feel free to panic!

fn create_element(&self, name: QualName, attrs: Vec<Attribute>, flags: ElementFlags) -> Self::Handle

Create an element. When creating a template element (name.ns.expanded() == expanded_name!(html "template")), an associated document fragment called the “template contents” should also be created. Later calls to self.get_template_contents() with that given element return it. See the template element in the WHATWG spec.

fn create_comment(&self, text: StrTendril) -> Self::Handle

Create a comment node.

fn create_pi(&self, target: StrTendril, data: StrTendril) -> Self::Handle

Create a Processing Instruction node.

fn append(&self, parent: &Self::Handle, child: NodeOrText<Self::Handle>)

Append a node as the last child of the given node. If this would produce adjacent sibling text nodes, it should concatenate the text instead. The child node will not already have a parent.
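
The concatenation rule for append can be illustrated on a plain child list. This is a toy model under stated assumptions: the NodeOrText and Child enums below are simplified stand-ins defined here (node handles are reduced to usize), not the html5ever or dom_query types.

```rust
/// Toy counterpart of a node-or-text argument: an existing node id or raw text.
pub enum NodeOrText {
    Node(usize),
    Text(String),
}

/// Toy child entry stored under a parent.
#[derive(Debug, PartialEq)]
pub enum Child {
    Node(usize),
    Text(String),
}

/// Appends `child` as the last child, concatenating instead of pushing
/// when the current last child is already text.
pub fn append(children: &mut Vec<Child>, child: NodeOrText) {
    match child {
        NodeOrText::Node(id) => children.push(Child::Node(id)),
        NodeOrText::Text(text) => match children.last_mut() {
            Some(Child::Text(existing)) => existing.push_str(&text),
            _ => children.push(Child::Text(text)),
        },
    }
}
```

Because merging happens at insertion time, a tree built only through this path never holds two adjacent text siblings.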

fn append_before_sibling(&self, sibling: &Self::Handle, child: NodeOrText<Self::Handle>)

Append a node as the sibling immediately before the given node. The tree builder promises that sibling is not a text node. However, its old previous sibling, which would become the new node’s previous sibling, could be a text node. If the new node is also a text node, the two should be merged, as in the behavior of append.

fn append_based_on_parent_node(&self, element: &Self::Handle, prev_element: &Self::Handle, child: NodeOrText<Self::Handle>)

When the insertion point is decided by the existence of a parent node of the element, we consider both possibilities and send the element which will be used if a parent node exists, along with the element to be used if there isn’t one.

fn append_doctype_to_document(&self, name: StrTendril, public_id: StrTendril, system_id: StrTendril)

Append a DOCTYPE element to the Document node.

fn add_attrs_if_missing(&self, target: &Self::Handle, attrs: Vec<Attribute>)

Add each attribute to the given element, if no attribute with that name already exists. The tree builder promises this will never be called with anything other than an element.
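
The "only if missing" rule amounts to a name lookup before each push. A minimal sketch, assuming attributes are plain (name, value) string pairs rather than the parser's Attribute type, and using a free function instead of the trait method:

```rust
/// Adds each `(name, value)` pair unless an attribute with that name exists.
pub fn add_attrs_if_missing(
    existing: &mut Vec<(String, String)>,
    attrs: Vec<(String, String)>,
) {
    for (name, value) in attrs {
        // Existing values win; incoming duplicates are ignored.
        if !existing.iter().any(|(n, _)| *n == name) {
            existing.push((name, value));
        }
    }
}
```

With an element that already has class="a", passing class="b" and id="main" keeps class="a" and adds only id="main".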

fn remove_from_parent(&self, target: &Self::Handle)

Detach the given node from its parent.

fn reparent_children(&self, node: &Self::Handle, new_parent: &Self::Handle)

Remove all the children from node and append them to new_parent.

type ElemName<'a> = Ref<'a, QualName>

fn mark_script_already_started(&self, _node: &Self::Handle)

Mark an HTML <script> as “already started”.

fn pop(&self, _node: &Self::Handle)

Indicate that a node was popped off the stack of open elements.

fn associate_with_form(&self, _target: &Self::Handle, _form: &Self::Handle, _nodes: (&Self::Handle, Option<&Self::Handle>))

Associate the given form-associatable element with the form element.

fn is_mathml_annotation_xml_integration_point(&self, _handle: &Self::Handle) -> bool

Returns true if the adjusted current node is an HTML integration point and the token is a start tag.

fn set_current_line(&self, _line_number: u64)

Called whenever the line number changes.

fn allow_declarative_shadow_roots(&self, _intended_parent: &Self::Handle) -> bool

fn attach_declarative_shadow(&self, _location: &Self::Handle, _template: &Self::Handle, _attrs: Vec<Attribute>) -> Result<(), String>

Attach a declarative shadow root.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.