§KISS-XML: Keep It Super Simple XML
This Rust library provides an easy-to-use Document Object Model (DOM) for reading and writing XML files. Unlike many other XML parsers, KISS-XML simply parses the given XML to a full in-memory DOM, which you can then modify and serialize back to XML. No schemas or looping required.
This library does not aim to support all XML specifications, only the most commonly used subset of features.
§What’s included:
KISS-XML provides the basics for XML documents, including:
- Parse XML files and strings to a DOM
- XML elements, text, and comments
- DOM is mutable and can be saved as a string and to files
- XML namespaces (with and without prefixes)
- CDATA
- Easy to use
§What’s NOT included:
- Schema handling
- Document type declarations (DTDs will be preserved but not interpreted)
- Parsing character encodings other than UTF-8
- Typed XML data (e.g. integer attribute values)
- Performance optimizations (prioritizing easy-to-use over fast)
If you need any of the above XML features, then this library is too simple for your needs. Try another XML parsing crate instead.
§Examples
§Parse an XML file and print it to the terminal
To parse an XML file, all you need to do is call the kiss_xml::parse_filepath(...) function. You can then convert the document to a string with the to_string() method or write it to a file with .write_to_filepath(...).
fn main() -> Result<(), kiss_xml::errors::KissXmlError> {
    use kiss_xml;
    let doc = kiss_xml::parse_filepath("tests/some-file.xml")?;
    println!("{}", doc.to_string());
    Ok(())
}
§Parse XML and then search the DOM for specific elements
Parsed XML content will be converted into a Document Object Model (DOM) with a single root element. A DOM is a tree-like data structure made up of XML Element, Text, and Comment nodes. You can explore the DOM element-by-element with the .elements_by_name(&str) and .first_element_by_name(&str) methods, scan the children of an element with the .child_*() methods, or do a recursive search using the .search(...) and .search_*(...) methods.
For example:
fn main() -> Result<(), kiss_xml::errors::KissXmlError> {
    use kiss_xml;
    use kiss_xml::dom::*;
    use kiss_xml::errors::*;
    let xml = r#"<?xml version="1.0" encoding="UTF-8"?>
<config>
<name>My Settings</name>
<sound>
<property name="volume" value="11" />
<property name="mixer" value="standard" />
</sound>
</config>
"#;
    // parse XML to a document object model (DOM)
    let dom = kiss_xml::parse_str(xml)?;
    // print all sound properties
    let properties = dom.root_element()
        .first_element_by_name("sound")?
        .elements_by_name("property");
    for prop in properties {
        println!(
            "{} = {}",
            prop.get_attr("name").ok_or(DoesNotExistError::default())?,
            prop.get_attr("value").ok_or(DoesNotExistError::default())?
        );
    }
    // print children of the root element
    for e in dom.root_element().child_elements() {
        println!("child element <{}>", e.name())
    }
    // print all elements
    for e in dom.root_element().search_elements(|_| true) {
        println!("found element <{}>", e.name())
    }
    Ok(())
}
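To make the difference between a child scan and a recursive search concrete, here is a minimal standalone sketch that uses only the standard library. The `Elem` type, `child_names`, `search`, and `sample_config` are hypothetical names invented for this illustration and are not part of the kiss_xml API. The sketch assumes a depth-first traversal that excludes the starting element; the library's actual traversal order may differ.

```rust
/// Toy element type for illustration only (not kiss_xml's `Element`).
struct Elem {
    name: String,
    children: Vec<Elem>,
}

impl Elem {
    fn new(name: &str, children: Vec<Elem>) -> Self {
        Elem { name: name.to_string(), children }
    }

    /// Names of the direct children only (a non-recursive scan).
    fn child_names(&self) -> Vec<&str> {
        self.children.iter().map(|c| c.name.as_str()).collect()
    }

    /// Depth-first search that collects every descendant matching `predicate`.
    /// Assumption: the starting element itself is not included.
    fn search<'a>(&'a self, predicate: &dyn Fn(&Elem) -> bool, found: &mut Vec<&'a Elem>) {
        for child in &self.children {
            if predicate(child) {
                found.push(child);
            }
            child.search(predicate, found);
        }
    }
}

/// Builds the same shape as the <config> document in the example above.
fn sample_config() -> Elem {
    Elem::new("config", vec![
        Elem::new("name", vec![]),
        Elem::new("sound", vec![
            Elem::new("property", vec![]),
            Elem::new("property", vec![]),
        ]),
    ])
}
```

Calling `child_names()` on the value returned by `sample_config()` yields only `name` and `sound`, while `search` with an always-true predicate also reaches the two nested `property` elements.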
§Create and edit DOM from scratch
To modify the DOM, use the .*_mut(...) methods to get mutable references to the elements. You can insert, append, and remove elements (and other kinds of nodes) from the DOM.
For example:
fn main() -> Result<(), kiss_xml::errors::KissXmlError> {
    use kiss_xml;
    use kiss_xml::dom::*;
    use kiss_xml::errors::*;
    // make a DOM from scratch
    let mut doc = Document::new(Element::new_from_name("politicians")?);
    doc.root_element_mut().insert(0, Element::new_with_text("person", "John Adams")?);
    doc.root_element_mut().append(Element::new_with_text("person", "Hillary Clinton")?);
    doc.root_element_mut().append(Element::new_with_text("person", "Jimmy John")?);
    doc.root_element_mut().append(Element::new_with_text("person", "Nanny No-Name")?);
    // remove element by index
    let _removed_element = doc.root_element_mut().remove_element(3)?;
    // remove element(s) by use of a predicate function
    let _num_removed = doc.root_element_mut().remove_elements(|e| e.text() == "Jimmy John");
    // print first element content
    println!("First politician: {}", doc.root_element().first_element_by_name("person")?.text());
    // write to file
    doc.write_to_filepath("tests/politics.xml");
    Ok(())
}
§Get and modify text and comments
The XML DOM is made up of Node objects (trait objects implementing trait kiss_xml::dom::Node). The following example shows how to add and remove text and comment nodes in addition to element nodes.
fn main() -> Result<(), kiss_xml::errors::KissXmlError> {
    use kiss_xml;
    use kiss_xml::dom::*;
    use kiss_xml::errors::*;
    use std::collections::HashMap;
    let mut doc = kiss_xml::parse_str(
        r#"<html>
<!-- this is a comment -->
<body>
Content goes here
</body>
</html>"#
    )?;
    // read and remove the first comment
    let comments = doc.root_element().children()
        .filter(|n| n.is_comment())
        .collect::<Vec<_>>();
    let first_comment = comments.first()
        .ok_or(DoesNotExistError::new("no comments in DOM"))?;
    println!("Comment: {}", first_comment.text());
    doc.root_element_mut().remove_all(&|n| n.is_comment());
    // replace content of <body> with some HTML
    doc.root_element_mut().first_element_by_name_mut("body")?.remove_all(&|_| true);
    doc.root_element_mut().first_element_by_name_mut("body")?.append_all(
        vec![
            Element::new_with_text("h1", "Chapter 1")?.boxed(),
            Comment::new("Note: there is only one chapter")?.boxed(),
            Element::new_with_children("p", vec![
                Text::new("Once upon a time, there was a little ").boxed(),
                Element::new_with_attributes_and_text::<&str,&str>(
                    "a",
                    HashMap::from([("href","https://en.wikipedia.org/wiki/Gnome")]),
                    "gnome"
                )?.boxed(),
                Text::new(" who lived in a walnut tree...").boxed()
            ])?.boxed()
        ]
    );
    // print the results
    println!("{}", doc.to_string());
    // prints:
    // <html>
    //   <body>
    //     <h1>Chapter 1</h1>
    //     <!--Note: there is only one chapter-->
    //     <p>Once upon a time, there was a little <a href="https://en.wikipedia.org/wiki/Gnome">gnome</a> who lived in a walnut tree...</p>
    //   </body>
    // </html>
    Ok(())
}
§Implementation Details
§Indentation and Whitespace Handling
KISS-XML normally produces indented XML output and disregards the whitespace characters between tags. There is one exception to this rule: if an XML element contains text, then all whitespace inside that element is preserved on parse, and indentation is disabled for that element when it is serialized to an XML string.
For example, consider this code snippet:
fn ws_example_1() -> Result<(), Box<dyn std::error::Error>> {
    use kiss_xml;
    use kiss_xml::dom::*;
    let mut tree = Element::new_with_children(
        "tree", vec![Element::new_with_text("speak", "bark!")?.boxed()]
    )?;
    tree.append(Element::new_from_name("branch")?);
    println!("{tree}");
    Ok(())
}
The above code will print the following:
<tree>
  <speak>bark!</speak>
  <branch/>
</tree>
However, if you then add a text node to the “tree” element, then the output formatting will change significantly:
fn ws_example_2() -> Result<(), Box<dyn std::error::Error>> {
    use kiss_xml;
    use kiss_xml::dom::*;
    let mut tree = Element::new_with_children(
        "tree", vec![Element::new_with_text("speak", "bark!")?.boxed()]
    )?;
    tree.append(Element::new_from_name("branch")?);
    tree.append(Text::new("I'm a tree!"));
    println!("{tree}");
    Ok(())
}
The above code will print the following:
<tree><speak>bark!</speak><branch/>I'm a tree!</tree>
Likewise, if you were to parse the following XML with KISS-XML:
<tree>
  <speak>bark!</speak>
  <branch/>
  I'm a tree!
</tree>
you will find that the final Text node contains \n··I'm·a·tree!\n (where \n and · represent newline and space characters for clarity). Unlike HTML, KISS-XML does not collapse whitespace.
This behavior is based on a common (but not universal) interpretation of the official XML specification.
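The serialization rule described in this section can be sketched in a few lines of standalone Rust. This is an illustrative model only, not the library's implementation: the `Node` enum, the `elem`, `text`, `serialize`, and `to_xml` helpers, and the two-space indent are all assumptions made for the sketch, and escaping is omitted. The rule is applied per element: an element that directly contains a text node is written inline, and any other element has its children indented.

```rust
/// Toy node type for illustration only; kiss_xml's real DOM types differ.
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

fn elem(name: &str, children: Vec<Node>) -> Node {
    Node::Element { name: name.to_string(), children }
}

fn text(content: &str) -> Node {
    Node::Text(content.to_string())
}

/// Writes `node` to `out`. Children are placed on their own indented lines
/// unless the element directly contains a text node, in which case the
/// element's content is written inline with no added whitespace.
fn serialize(node: &Node, indent: &str, depth: usize, out: &mut String) {
    match node {
        Node::Text(t) => out.push_str(t),
        Node::Element { name, children } => {
            if children.is_empty() {
                out.push_str(&format!("<{name}/>"));
                return;
            }
            let has_text = children.iter().any(|c| matches!(c, Node::Text(_)));
            out.push_str(&format!("<{name}>"));
            for child in children {
                if !has_text {
                    out.push('\n');
                    out.push_str(&indent.repeat(depth + 1));
                }
                serialize(child, indent, depth + 1, out);
            }
            if !has_text {
                out.push('\n');
                out.push_str(&indent.repeat(depth));
            }
            out.push_str(&format!("</{name}>"));
        }
    }
}

/// Serializes with a two-space indent (an assumption of this sketch).
fn to_xml(node: &Node) -> String {
    let mut out = String::new();
    serialize(node, "  ", 0, &mut out);
    out
}
```

Building the two trees from the examples above with `elem` and `text` and passing them to `to_xml` gives an indented result for the first tree and a single-line result once the text node is appended.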
§License
This library is open source, licensed under the MIT License. You may use it as-is or with modification, without any limitations.
Modules§
- dom
- A document object model (DOM) is a tree data structure with three different kinds of nodes: Element, Text, and Comment nodes. Element nodes can have children (a list of child nodes), while Text and Comment nodes cannot. As per the XML specification, a DOM can only have one root element.
- errors
- The kiss_xml::errors module holds an enum of possible error types, each of which has a corresponding implementation struct.
Functions§
- attribute_escape
- Escapes a subset of XML reserved characters (&, ', and ") in an attribute into XML-compatible text, e.g. replacing “&” with “&amp;” and “'” with “&apos;”
- escape
- Escapes all special characters (&, <, >, ', and ") in a string into an XML-compatible string, e.g. replacing “&” with “&amp;” and “<” with “&lt;”
- parse_filepath
- Reads the file from the given filepath and parses it as an XML document
- parse_str
- Reads the XML content from the UTF-8 encoded text string and parses it as an XML document
- parse_stream
- Reads the XML content from the given stream reader and parses it as an XML document. Note that this function will read to EOF before returning.
- text_escape
- Escapes a subset of XML reserved characters (&, <, and >) in a text string into XML-compatible text, e.g. replacing “&” with “&amp;” and “<” with “&lt;”
- unescape
- Reverses any escaped characters (&, <, >, ', and ") in XML-compatible text to regenerate the original text, e.g. replacing “&amp;” with “&” and “&lt;” with “<”
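As a rough illustration of what the escape and unescape functions listed above do, the following standalone sketch maps the five predefined XML entities in both directions using only the standard library. The names `escape_sketch` and `unescape_sketch` are hypothetical and this is not the kiss_xml source; the library functions may handle additional cases that this sketch does not.

```rust
/// Illustrative stand-in for an XML escape routine (not the kiss_xml source).
/// Replaces the five XML special characters with their predefined entities.
fn escape_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '\'' => out.push_str("&apos;"),
            '"' => out.push_str("&quot;"),
            other => out.push(other),
        }
    }
    out
}

/// Reverses `escape_sketch`. `&amp;` is replaced last so that text such as
/// `&amp;lt;` decodes to the literal `&lt;` rather than to `<`.
fn unescape_sketch(input: &str) -> String {
    input
        .replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&apos;", "'")
        .replace("&quot;", "\"")
        .replace("&amp;", "&")
}
```

Processing the input one character at a time when escaping avoids double-escaping the ampersands that the replacements themselves introduce, and decoding `&amp;` last keeps a round trip through both functions lossless.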