pub struct Document {
    pub tree: Tree,
    pub errors: RefCell<Vec<Cow<'static, str>>>,
    pub quirks_mode: Cell<QuirksMode>,
}
Document represents an HTML document to be manipulated.
Fields§
tree: Tree
The document’s DOM tree.
errors: RefCell<Vec<Cow<'static, str>>>
Errors that occurred during parsing.
quirks_mode: Cell<QuirksMode>
The document’s quirks mode.
Implementations§
impl Document
pub fn html(&self) -> Tendril<UTF8>
Gets the HTML contents of the document. It includes the text and comment nodes.
pub fn inner_html(&self) -> Tendril<UTF8>
Gets the HTML contents of the document. It includes only child nodes.
pub fn try_html(&self) -> Option<Tendril<UTF8>>
Gets the HTML contents of the document. It includes its child nodes.
pub fn try_inner_html(&self) -> Option<Tendril<UTF8>>
Gets the HTML contents of the document. It includes only child nodes.
pub fn formatted_text(&self) -> Tendril<UTF8>
Returns the formatted text of the document and its descendants. This is the same as
the text() method, but with a few differences:
- Whitespace is normalized so that there is only one space between words.
- All whitespace is removed from the beginning and end of the text.
- After block elements, a double newline is added.
- For elements like br, hr, li, and tr, a single newline is added.
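The first two rules can be illustrated in plain Rust. This is only a sketch of the whitespace behavior described above, not dom_query's implementation; normalize_whitespace is a hypothetical helper, and the newline rules for block elements are not modeled.

```rust
/// Collapse every run of whitespace into a single space and trim both ends.
fn normalize_whitespace(input: &str) -> String {
    // `split_whitespace` skips leading and trailing whitespace and treats
    // any run of spaces, tabs, or newlines as one separator.
    input.split_whitespace().collect::<Vec<_>>().join(" ")
}

fn main() {
    let raw = "  Hello \n\t  world  ";
    println!("{:?}", normalize_whitespace(raw)); // prints "Hello world"
}
```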
pub fn base_uri(&self) -> Option<Tendril<UTF8>>
Finds the base URI of the tree by looking for <base> tags in the document’s head.
The base URI is the value of the href attribute of the first
<base> tag in the document’s head. If no such tag is found,
the method returns None.
pub fn body(&self) -> Option<NodeRef<'_>>
Returns the document’s <body> element, or None if absent.
For fragments (crate::NodeData::Fragment), this typically returns None.
pub fn head(&self) -> Option<NodeRef<'_>>
Returns the document’s <head> element, or None if absent.
For fragments (crate::NodeData::Fragment), this typically returns None.
pub fn normalize(&self)
Merges adjacent text nodes and removes empty text nodes.
Normalization is necessary to ensure that adjacent text nodes are merged into one text node.
§Example
use dom_query::Document;
let doc = Document::from("<div>Hello</div>");
let sel = doc.select("div");
let div = sel.nodes().first().unwrap();
let text1 = doc.tree.new_text(" ");
let text2 = doc.tree.new_text("World");
let text3 = doc.tree.new_text("");
div.append_child(&text1);
div.append_child(&text2);
div.append_child(&text3);
assert_eq!(div.children().len(), 4);
doc.normalize();
assert_eq!(div.children().len(), 1);
assert_eq!(div.text(), "Hello World".into());
impl Document
pub fn select(&self, sel: &str) -> Selection<'_>
Gets the descendants of the root document node, filtered by a selector. It returns a new selection object containing the matched elements.
§Panics
Panics if the given CSS selector fails to parse.
pub fn nip(&self, sel: &str) -> Selection<'_>
Alias for select. It gets the descendants of the root document node, filtered by a selector.
It returns a new selection object containing the matched elements.
§Panics
Panics if the given CSS selector fails to parse.
pub fn try_select(&self, sel: &str) -> Option<Selection<'_>>
Gets the descendants of the root document node, filtered by a selector. It returns a new selection object containing the matched elements.
pub fn select_matcher(&self, matcher: &Matcher) -> Selection<'_>
Gets the descendants of the root document node, filtered by a matcher. It returns a new selection object containing the matched elements.
pub fn select_single_matcher(&self, matcher: &Matcher) -> Selection<'_>
Gets the descendants of the root document node, filtered by a matcher. It returns a new selection object containing only the first match.
pub fn select_single(&self, sel: &str) -> Selection<'_>
Gets the descendants of the root document node, filtered by a selector. It returns a new selection object containing only the first match.
§Panics
Panics if the given CSS selector fails to parse.
impl Document
pub fn md(&self, skip_tags: Option<&[&str]>) -> Tendril<UTF8>
Produces a Markdown representation of the Document,
skipping elements matching the specified skip_tags list along with their descendants.
- If skip_tags is None, the default list is used: ["script", "style", "meta", "head"].
- To process all elements without exclusions, pass Some(&[]).
Trait Implementations§
impl TreeSink for Document
type Handle = NodeId
Handle is a reference to a DOM node. The tree builder requires that a Handle implements Clone to get
another reference to the same node.
fn parse_error(&self, msg: Cow<'static, str>)
Signal a parse error.
fn get_template_contents(&self, target: &<Document as TreeSink>::Handle) -> <Document as TreeSink>::Handle
Get a handle to a template’s template contents. The tree builder promises this will never be called with anything other than a template element.
fn set_quirks_mode(&self, mode: QuirksMode)
Set the document’s quirks mode.
fn same_node(&self, x: &<Document as TreeSink>::Handle, y: &<Document as TreeSink>::Handle) -> bool
Do two handles refer to the same node?
fn elem_name(&self, target: &<Document as TreeSink>::Handle) -> <Document as TreeSink>::ElemName<'_>
What is the name of the element?
Should never be called on a non-element node; feel free to panic!
fn create_element(&self, name: QualName, attrs: Vec<Attribute>, flags: ElementFlags) -> <Document as TreeSink>::Handle
Create an element.
When creating a template element (name.ns.expanded() == expanded_name!(html"template")), an
associated document fragment called the “template contents” should also be created. Later calls to
self.get_template_contents() with that given element return it. See the template element in the WHATWG spec.
fn create_comment(&self, text: Tendril<UTF8>) -> <Document as TreeSink>::Handle
Create a comment node.
fn create_pi(&self, target: Tendril<UTF8>, data: Tendril<UTF8>) -> <Document as TreeSink>::Handle
Create a Processing Instruction node.
fn append(&self, parent: &<Document as TreeSink>::Handle, child: NodeOrText<<Document as TreeSink>::Handle>)
Append a node as the last child of the given node. If this would produce adjacent sibling text nodes, it should concatenate the text instead. The child node will not already have a parent.
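The merging rule for adjacent text nodes can be illustrated with a toy model in plain Rust. This is a sketch of the contract described above, not the crate's tree code: the Node enum and the free function append are hypothetical and stand in for real DOM nodes and the TreeSink method.

```rust
/// Toy stand-in for a DOM node: either an element (by tag name) or text.
#[derive(Debug, PartialEq)]
enum Node {
    Element(String),
    Text(String),
}

/// Append `child` to `children`. If both the new child and the current last
/// child are text, concatenate them instead of creating adjacent text nodes.
fn append(children: &mut Vec<Node>, child: Node) {
    if let Node::Text(new_text) = &child {
        if let Some(Node::Text(existing)) = children.last_mut() {
            existing.push_str(new_text);
            return;
        }
    }
    children.push(child);
}

fn main() {
    let mut children = vec![Node::Element("b".to_string())];
    append(&mut children, Node::Text("Hello".to_string()));
    append(&mut children, Node::Text(" World".to_string()));

    // The two text appends end up as a single text node after the element.
    println!("{:?}", children);
}
```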
fn append_before_sibling(&self, sibling: &<Document as TreeSink>::Handle, child: NodeOrText<<Document as TreeSink>::Handle>)
Append a node as the sibling immediately before the given node.
The tree builder promises that sibling is not a text node. However its old previous sibling, which would
become the new node’s previous sibling, could be a text node. If the new node is also a text node, the two
should be merged, as in the behavior of append.
fn append_based_on_parent_node(&self, element: &<Document as TreeSink>::Handle, prev_element: &<Document as TreeSink>::Handle, child: NodeOrText<<Document as TreeSink>::Handle>)
When the insertion point is decided by the existence of a parent node of the element, we consider both possibilities and send the element which will be used if a parent node exists, along with the element to be used if there isn’t one.
fn append_doctype_to_document(&self, name: Tendril<UTF8>, public_id: Tendril<UTF8>, system_id: Tendril<UTF8>)
Append a DOCTYPE element to the Document node.
fn add_attrs_if_missing(&self, target: &<Document as TreeSink>::Handle, attrs: Vec<Attribute>)
Add each attribute to the given element, if no attribute with that name already exists. The tree builder promises this will never be called with anything other than an element.
fn remove_from_parent(&self, target: &<Document as TreeSink>::Handle)
Detach the given node from its parent.
fn reparent_children(&self, node: &<Document as TreeSink>::Handle, new_parent: &<Document as TreeSink>::Handle)
Remove all the children from node and append them to new_parent.
type ElemName<'a> = Ref<'a, QualName>
fn is_mathml_annotation_xml_integration_point(&self, handle: &<Document as TreeSink>::Handle) -> bool
fn mark_script_already_started(&self, _node: &Self::Handle)
Mark an HTML <script> as “already started”.