XML is a flexible markup language that is still used for sharing data between applications or for writing configuration files.
Serde XML provides a way to convert between text and strongly-typed Rust data structures.
§Caveats
The Serde framework was designed mainly with formats such as JSON or YAML in mind. Unlike XML, these formats have a stricter syntax: the type of a field can be determined without an accompanying schema, and the same key cannot be repeated within one object.
For example, encoding the following document in YAML is not trivial.
<document>
  <header>A header</header>
  <section>First section</section>
  <section>Second section</section>
  <sidenote>A sidenote</sidenote>
  <section>Third section</section>
  <sidenote>Another sidenote</sidenote>
  <section>Fourth section</section>
  <footer>The footer</footer>
</document>
One possibility is the following YAML document.
- header: A header
- section: First section
- section: Second section
- sidenote: A sidenote
- section: Third section
- sidenote: Another sidenote
- section: Fourth section
- footer: The footer
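A rough Rust counterpart of that list is a vector of enum values, which keeps both the repeated tags and their order. The sketch below uses only the standard library; Node, example_document and sections are hypothetical names chosen for illustration and are not part of serde_xml_rs.

```rust
/// One child of <document>. Each variant carries the text of the element.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Header(String),
    Section(String),
    Sidenote(String),
    Footer(String),
}

/// Builds the example document as an ordered list of nodes.
fn example_document() -> Vec<Node> {
    vec![
        Node::Header("A header".to_string()),
        Node::Section("First section".to_string()),
        Node::Section("Second section".to_string()),
        Node::Sidenote("A sidenote".to_string()),
        Node::Section("Third section".to_string()),
        Node::Sidenote("Another sidenote".to_string()),
        Node::Section("Fourth section".to_string()),
        Node::Footer("The footer".to_string()),
    ]
}

/// Returns the text of every section, in document order.
fn sections(nodes: &[Node]) -> Vec<&str> {
    nodes
        .iter()
        .filter_map(|node| match node {
            Node::Section(text) => Some(text.as_str()),
            _ => None,
        })
        .collect()
}
```

Because the vector preserves position, a sidenote can still be related to the sections around it, which a struct with one field per tag name would not allow.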
Other notable differences:
- XML requires a named root node.
- XML has a namespace system.
- XML distinguishes between attributes, child tags and contents.
- In XML, the order of nodes is sometimes important.
§Basic example
use serde::{Deserialize, Serialize};
use serde_xml_rs::{from_str, to_string};

#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Item {
    name: String,
    source: String,
}

let src = r#"<?xml version="1.0" encoding="UTF-8"?><Item><name>Banana</name><source>Store</source></Item>"#;
let should_be = Item {
    name: "Banana".to_string(),
    source: "Store".to_string(),
};

// Deserialize the XML text into an Item.
let item: Item = from_str(src).unwrap();
assert_eq!(item, should_be);

// Serializing the value again yields the original text.
let reserialized_item = to_string(&item).unwrap();
assert_eq!(src, reserialized_item);
§Correspondence between XML and Rust
§Document root
As stated above, XML documents must have one and only one root element. This puts a constraint on the range of types that can be supported at the root, especially during serialization when a name has to be given to the root element.
In order to support both serialization and deserialization, the root Rust type, that is, the type of the value passed to to_string or returned by from_str, must be one of:
- a struct
- a newtype struct
- a unit struct (not very interesting)
- an enum
XML | Rust |
---|---|
|
|
Other types must be encapsulated in order to be serialized, because the name of the struct or enum provides the name of the root element for the XML document.
The deserializer supports more Rust types directly:
- primitives (bool, char, integers, floats)
- options
- unit (())
Sequences, tuples and maps are not supported as root types for the moment, but could be in the future.
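The rules of this section can be summarised as a small lookup. This is only an illustrative sketch with hypothetical names (RootKind and the two functions are not part of the crate), and it encodes nothing beyond the lists above.

```rust
/// Kinds of Rust types discussed in this section.
/// (Hypothetical enum, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum RootKind {
    Struct,
    NewtypeStruct,
    UnitStruct,
    Enum,
    Primitive,
    Optional,
    Unit,
    Sequence,
    Tuple,
    Map,
}

/// Serialization needs a type name to use as the root element name.
fn can_serialize_as_root(kind: RootKind) -> bool {
    matches!(
        kind,
        RootKind::Struct | RootKind::NewtypeStruct | RootKind::UnitStruct | RootKind::Enum
    )
}

/// Deserialization also accepts primitives, options and unit,
/// but not sequences, tuples or maps.
fn can_deserialize_as_root(kind: RootKind) -> bool {
    !matches!(kind, RootKind::Sequence | RootKind::Tuple | RootKind::Map)
}
```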
§Strings and byte arrays
XML | Rust |
---|---|
|
|
Borrowed strings are not supported.
§Primitive types
XML | Rust |
---|---|
unit | |
|
|
boolean | |
|
|
char | |
|
|
|
|
integers | |
|
|
floats | |
|
|
§Child elements
Rust structs can be used to (de)serialize the contents of XML elements: child elements, attributes, and text. The name of the struct field must match the name of the corresponding child element.
XML | Rust |
---|---|
|
|
§Attributes
Fields that deserialize to and serialize from attributes must have a name starting with @.
XML | Rust |
---|---|
|
|
For serialization to work, all attributes must be declared before any child elements.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Document {
    #[serde(rename = "@a")]
    a: String,
    #[serde(rename = "b")] // This child element appears before an attribute
    b: i32,
    #[serde(rename = "@c")]
    c: (),
}

let value = Document {
    a: "abc".to_string(),
    b: 123,
    c: (),
};

// Serialization fails because attribute @c is declared after child element b.
assert!(serde_xml_rs::to_string(&value).is_err()); // ERROR !
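The ordering rule can be stated as a check over the serialized field names. The helper below is a hypothetical, standard-library-only sketch (attributes_come_first is not provided by serde_xml_rs); it assumes that a name is an attribute exactly when it starts with @.

```rust
/// Returns true when every attribute name (starting with '@') comes
/// before the first name that is not an attribute.
/// (Hypothetical helper, for illustration only.)
fn attributes_come_first(field_names: &[&str]) -> bool {
    let mut seen_child = false;
    for name in field_names {
        if name.starts_with('@') {
            // An attribute after a child element breaks the rule.
            if seen_child {
                return false;
            }
        } else {
            seen_child = true;
        }
    }
    true
}
```

For the struct above, the names are @a, b, @c, so the check fails; moving the c field before b gives @a, @c, b, which passes.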
§Elements with attributes and text content
When an element (root or child) contains both attributes and text content, the struct type must have a field named #text.
Currently, mixed content with child elements and text is not supported.
XML | Rust |
---|---|
|
|
§Repeated tags and sequences
Repeated tags are handled by fields with a type of Vec<...>.
The name of the field must correspond to the name of the tag that is repeated.
All of the repeated tags must be consecutive, unless the overlapping sequences option is activated.
XML | Rust |
---|---|
|
|
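To make the "consecutive" requirement concrete, the following standard-library-only sketch checks whether each tag name in a list of child elements forms a single unbroken run. The function name is hypothetical and the code is not part of serde_xml_rs; it only illustrates the condition stated above.

```rust
use std::collections::HashSet;

/// Returns true when every tag name appears in one unbroken run.
/// (Hypothetical helper, for illustration only.)
fn repeats_are_consecutive(tags: &[&str]) -> bool {
    // Tag names whose run has already ended.
    let mut finished: HashSet<&str> = HashSet::new();
    let mut previous: Option<&str> = None;
    for &tag in tags {
        if previous != Some(tag) {
            // A new run starts here; the tag must not have had an earlier run.
            if finished.contains(tag) {
                return false;
            }
            if let Some(ended) = previous {
                finished.insert(ended);
            }
            previous = Some(tag);
        }
    }
    true
}
```

The child tags of the document in the Caveats section (header, section, section, sidenote, section, sidenote, section, footer) do not pass this check, so mapping section and sidenote to Vec fields there would presumably require the overlapping sequences option.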
§Choices and enums
Enums can be used to represent different options. The name of the variant must match the name of the child element. Variants are handled much like their struct counterparts (unit, newtype, struct).
XML | Rust |
---|---|
|
|
§Enums in attribute values
Only unit variants can be used for attribute values.
XML | Rust |
---|---|
|
|
§Sequences of choices and #content
Sequences of choices can be handled in various ways:
- Without any configuration: a field named item with an enum type is mapped to repeated <item> elements, each containing a child element that designates an enum variant and any parameters (<item><variant-name>content</variant-name></item>).
- Using a field named #content: any child elements are treated as enum variants and are collected into the vector.
- Container tag using an intermediate struct: <items><item>...</item><item>...</item>...</items>. The ergonomics of this option may be improved in the future. In the meantime, look at serde-query.
XML | Rust |
---|---|
|
|
Using | |
|
|
Container element | |
|
|
§XML Namespaces
Any XML namespaces declared in a document are mapped to a prefix. That prefix can then appear in the names of attributes and elements. The prefix must also appear in the names of the corresponding Rust fields.
- Deserialization: Only prefixes matter. Any xmlns... attributes are ignored.
- Serialization: The mapping between prefixes and namespace URIs must be provided (see SerdeXml::namespace). All namespaces are declared in the root element.
XML | Rust |
---|---|
|
|
§Custom EventReader
use serde::{Deserialize, Serialize};
use serde_xml_rs::{from_str, to_string, de::Deserializer};
use xml::reader::{EventReader, ParserConfig};

#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Item {
    name: String,
    source: String,
}

let src = r#"<Item><name> Banana </name><source>Store</source></Item>"#;
let should_be = Item {
    name: " Banana ".to_string(),
    source: "Store".to_string(),
};

// Configure the parser so that whitespace around text is preserved.
let config = ParserConfig::new()
    .trim_whitespace(false)
    .whitespace_to_characters(true);

// Build the deserializer from a custom EventReader instead of from_str.
let event_reader = EventReader::new_with_config(src.as_bytes(), config);
let item = Item::deserialize(&mut Deserializer::new(event_reader)).unwrap();
assert_eq!(item, should_be);
§Re-exports
pub use crate::config::SerdeXml;
pub use crate::de::from_reader;
pub use crate::de::from_str;
pub use crate::de::Deserializer;
pub use crate::ser::to_string;
pub use crate::ser::to_writer;
pub use crate::ser::Serializer;