fastxml
A fast, memory-efficient XML library for Rust with XPath and streaming schema validation support. Designed for processing large XML documents like CityGML files used in PLATEAU.
Features
- 🦀 Pure Rust — No C dependencies, no unsafe code
- ✅ libxml Compatible — Consistent parsing/XPath results
- ⚡ Streaming — Parse and validate gigabyte-scale XML with ~1 MB memory footprint
- 🔄 Zero-Copy Transform — Stream-based XPath transformation with minimal allocations
- 📋 Full XPath & XSD — Complete XPath 1.0, schema parsing with import resolution, built-in GML types
Performance
Comparison with libxml
fastxml is designed as a drop-in replacement for libxml in Rust projects:
| Feature | libxml | fastxml |
|---|---|---|
| DOM parsing | ✅ | ✅ |
| XPath | ✅ | ✅ |
| Schema validation | ✅ (DOM only) | ✅ (DOM + Streaming) |
| Streaming | ❌ | ✅ |
| Memory efficiency | Low | High |
| Pure Rust | ❌ | ✅ |
Benchmark (PLATEAU DEM GML, 907 MB, 31M nodes) — benchmark code:
Parse only:
| Mode | Time | Throughput | Memory |
|---|---|---|---|
| libxml DOM | 3.29s | 276 MB/s | 4.19 GB |
| fastxml DOM | 3.67s | 247 MB/s | 666 MB |
| fastxml Streaming | 3.13s | 290 MB/s | ~1 MB |
Parse + Schema Validation (via xsi:schemaLocation):
| Mode | Time | Throughput | Memory |
|---|---|---|---|
| fastxml Streaming | 22.96s | 40 MB/s | ~1 MB |
- DOM: fastxml uses 6.3x less memory than libxml
- Streaming: Constant memory regardless of file size (only parser buffers)
- Schema validation auto-fetches XSD from xsi:schemaLocation
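The throughput and memory figures above follow from simple division. The sketch below recomputes them from the table values; the helper names are illustrative only, and it assumes 1 GB = 1000 MB when converting the libxml memory figure.

```rust
/// Throughput in MB/s for a file of `size_mb` megabytes processed in `seconds`.
fn throughput_mb_per_s(size_mb: f64, seconds: f64) -> f64 {
    size_mb / seconds
}

/// How many times larger `baseline_mb` is than `candidate_mb`.
fn memory_ratio(baseline_mb: f64, candidate_mb: f64) -> f64 {
    baseline_mb / candidate_mb
}

fn main() {
    // File size and timings copied from the benchmark tables.
    let size_mb = 907.0;
    for (mode, seconds) in [
        ("libxml DOM", 3.29),
        ("fastxml DOM", 3.67),
        ("fastxml Streaming", 3.13),
        ("fastxml Streaming + validation", 22.96),
    ] {
        println!("{}: {:.0} MB/s", mode, throughput_mb_per_s(size_mb, seconds));
    }
    // Assumption: 1 GB = 1000 MB, so 4.19 GB is 4190 MB.
    println!("DOM memory ratio: {:.1}x", memory_ratio(4190.0, 666.0));
}
```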
Compatibility Testing: Parsing, XPath, and validation results are verified against libxml2. Run with cargo test --features compare-libxml (requires libxml2-dev).
Installation
Add to your Cargo.toml:
[dependencies]
fastxml = "0.1"
Features
By default, no HTTP client is included. Choose the features you need:
| Feature | Description |
|---|---|
| ureq | Sync HTTP client (UreqFetcher) for schema fetching |
| reqwest | Async HTTP client (ReqwestFetcher) for schema fetching |
| async-trait | Async trait support for custom AsyncSchemaStore implementations |
| profile | Memory profiling utilities |
| compare-libxml | Enable libxml2 comparison tests (requires libxml2-dev) |
# For sync schema fetching
fastxml = { version = "0.1", features = ["ureq"] }
# For async schema fetching
fastxml = { version = "0.1", features = ["reqwest"] }
# For custom async implementations (without built-in HTTP client)
fastxml = { version = "0.1", features = ["async-trait"] }
Quick Start
DOM Parsing
use ;
let xml = r#"
<root>
<item id="1">Hello</item>
<item id="2">World</item>
</root>
"#;
// Parse XML
let doc = parse?;
println!;
// XPath query
let result = evaluate?;
for node in result.into_nodes
Streaming Parser
Process large files with minimal memory:
use ;
use std::io::BufReader;
use std::fs::File;
let file = open?;
let reader = new;
let mut parser = new;
parser.add_handler;
parser.parse?;
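To see why a streaming pass can keep memory flat regardless of file size, consider the following standard-library-only sketch. It is not fastxml's parser: `count_start_tags` is a hypothetical helper that scans the input through one fixed 64 KiB buffer and counts bytes that look like the start of an element tag. The scan is deliberately naive (a `<` inside a comment or CDATA section would be miscounted); the point is only that the buffer, not the document, determines memory use.

```rust
use std::io::{self, Read};

/// Counts occurrences of '<' followed by a byte other than '/', '!' or '?',
/// reading the input through a single fixed-size buffer.
fn count_start_tags<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = vec![0u8; 64 * 1024]; // the only allocation, reused for every chunk
    let mut count = 0u64;
    let mut prev_was_lt = false; // carries state across chunk boundaries
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        for &b in &buf[..n] {
            if prev_was_lt && !matches!(b, b'/' | b'!' | b'?') {
                count += 1;
            }
            prev_was_lt = b == b'<';
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    let xml = r#"<root><item id="1">Hello</item><item id="2">World</item></root>"#;
    // A byte slice implements `Read`; a `BufReader<File>` would work the same way.
    let tags = count_start_tags(xml.as_bytes())?;
    println!("start tags: {}", tags);
    Ok(())
}
```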
Streaming Transform
Transform XML documents efficiently with XPath-based element selection. Only matched elements are converted to DOM, providing significant memory savings for large files.
use StreamTransformer;
let xml = r#"<root><item id="1">A</item><item id="2">B</item></root>"#;
// Modify specific elements
let result = new
.xpath
.transform
.to_string
.unwrap;
// Result: <root><item id="1">A</item><item id="2" modified="true">B</item></root>
// Remove elements
let result = new
.xpath
.transform
.to_string
.unwrap;
// Result: <root><item id="2">B</item></root>
// Extract data without transformation
let ids: = new
.xpath
.collect
.unwrap;
// ids: ["1", "2"]
// Iterate over matched elements
let mut count = 0;
new
.xpath
.for_each
.unwrap;
With namespace support:
use ;
let xml = r#"<root xmlns:gml="http://www.opengis.net/gml">
<gml:Point><gml:pos>1 2</gml:pos></gml:Point>
</root>"#;
// Option 1: Register namespaces manually
let result = new
.namespaces
.xpath
.transform
.to_string
.unwrap;
// Option 2: Import namespaces from parsed document
let doc = parse.unwrap;
let result = new
.with_document_namespaces
.xpath
.transform
.to_string
.unwrap;
Performance (100K elements, 11 MB XML):
| Approach | Time | Memory |
|---|---|---|
| Streaming Transform | 47ms | ~11 MB |
| DOM Parse + XPath | 141ms | ~135 MB |
Streaming is 3x faster and uses 12x less memory.
Schema Validation
Validate XML documents against XSD schemas:
use ;
// Parse the XML document
let xml = read?;
let doc = parse?;
// Validate against XSD schema (fetches imports automatically)
let errors = validate_document_by_schema?;
if errors.is_empty else
Auto-detect Schema from xsi:schemaLocation
Automatically fetch and validate against schemas referenced in the XML document:
use ;
let xml = r#"<?xml version="1.0"?>
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://example.com/ns http://example.com/schema.xsd">
<element>content</element>
</root>"#;
let doc = parse?;
// Reads xsi:schemaLocation, fetches schemas, and validates
let errors = validate_with_schema_location?;
This requires the ureq feature:
fastxml = { version = "0.1", features = ["ureq"] }
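For reference, the value of xsi:schemaLocation is a whitespace-separated list of pairs, each a namespace URI followed by the location of a schema for that namespace. The sketch below shows one way such a value could be split into pairs. `schema_location_pairs` is a hypothetical helper written for illustration and is not part of fastxml's API; how fastxml itself handles malformed values is not covered here.

```rust
/// Splits an xsi:schemaLocation value into (namespace URI, schema URL) pairs.
/// A trailing token without a partner is ignored.
fn schema_location_pairs(value: &str) -> Vec<(&str, &str)> {
    let tokens: Vec<&str> = value.split_whitespace().collect();
    tokens
        .chunks_exact(2)
        .map(|pair| (pair[0], pair[1]))
        .collect()
}

fn main() {
    let value = "http://example.com/ns http://example.com/schema.xsd";
    for (namespace, location) in schema_location_pairs(value) {
        println!("namespace {} -> schema {}", namespace, location);
    }
}
```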
Streaming Validation
For large files, validate while parsing in a single pass:
use StreamingParser;
use StreamingSchemaValidator;
use parse_xsd;
use std::sync::Arc;
use std::io::BufReader;
use std::fs::File;
// Load and compile the schema
let xsd_content = read?;
let schema = new;
// Create streaming parser with validation
let file = open?;
let mut parser = new;
let validator = new;
parser.add_handler;
// Parse and validate in single pass
parser.parse?;
Streaming Validation with xsi:schemaLocation
For files with xsi:schemaLocation, fetch schemas automatically and validate in streaming mode with a single pass:
use streaming_validate_with_schema_location;
use std::fs::File;
use std::io::BufReader;
let file = open?;
// Single-pass: reads schemaLocation from first element, fetches schema, validates
let errors = streaming_validate_with_schema_location?;
Or with more control using LazySchemaValidator:
use StreamingParser;
use ;
use std::fs::File;
use std::io::BufReader;
let file = open?;
let mut parser = new;
// LazySchemaValidator fetches schema on first StartElement
let validator = new;
parser.add_handler;
parser.parse?;
This requires the ureq feature.
Error Handling
Validation errors include detailed location and context information:
use ;
let doc = parse?;
let errors = validate_document_by_schema?;
for error in &errors
// Filter by severity
let fatal_errors: = errors.iter
.filter
.collect;
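The filtering pattern above can be sketched with stand-in types. Everything in this block is hypothetical: `Severity`, `ValidationIssue`, and `with_severity` are illustrative names and do not reflect fastxml's actual error types or fields. The sketch only shows the general shape of selecting errors by severity with an iterator filter.

```rust
// Hypothetical stand-ins; fastxml's real error types may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Warning,
    Error,
    Fatal,
}

#[derive(Debug)]
struct ValidationIssue {
    severity: Severity,
    line: usize,
    message: String,
}

/// Returns references to the issues whose severity equals `wanted`.
fn with_severity(issues: &[ValidationIssue], wanted: Severity) -> Vec<&ValidationIssue> {
    issues.iter().filter(|issue| issue.severity == wanted).collect()
}

fn main() {
    let issues = vec![
        ValidationIssue { severity: Severity::Warning, line: 3, message: "unexpected attribute".into() },
        ValidationIssue { severity: Severity::Error, line: 7, message: "missing required element".into() },
        ValidationIssue { severity: Severity::Fatal, line: 9, message: "schema could not be loaded".into() },
    ];
    for issue in with_severity(&issues, Severity::Fatal) {
        println!("line {}: {}", issue.line, issue.message);
    }
}
```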
XPath with Namespaces
use ;
let xml = r#"
<core:CityModel xmlns:core="http://www.opengis.net/citygml/2.0"
xmlns:bldg="http://www.opengis.net/citygml/building/2.0"
xmlns:gml="http://www.opengis.net/gml">
<bldg:Building gml:id="bldg_001">
<bldg:measuredHeight>25.5</bldg:measuredHeight>
</bldg:Building>
</core:CityModel>
"#;
let doc = parse?;
// Query with namespace prefix
let buildings = evaluate?;
println!;
// Query with name() function
let heights = evaluate?;
Limitations
XPath
Supported expressions:
| Expression | Example | Description |
|---|---|---|
| Absolute path | /root/child | Direct path from root |
| Descendant | //element | Any descendant |
| Wildcard | //* | All elements |
| Name predicate | //*[name()='Building'] | Match by name |
| Logical operators | //*[name()='A' or name()='B'] | and, or, not |
| Text | //element/text() | Text content |
| Namespace | //bldg:Building | Namespaced elements |
| Axes | ancestor::div, following-sibling::* | All standard axes |
| Arithmetic | @value + 10 | +, -, *, div, mod |
| Comparison | @count > 5 | =, !=, <, >, <=, >= |
| Functions | count(//item), contains(@name, 'test') | Position, string, math functions |
| Union | //a \| //b | Combine multiple paths |
| Variables | //item[@id=$target] | Variable references |
| Namespace axis | namespace::* | In-scope namespaces |
XSD Schema
Supported: Element/attribute definitions, complex types (sequence/choice/all), simple types (restriction/list/union), type inheritance, facets, attribute/model groups, import/include/redefine, built-in XSD and GML types, identity constraints (unique/key/keyref), streaming validation with error location info.
Partial: Substitution groups (parsing only).
Not Supported
- XQuery, DTD validation, XSLT, XInclude, XML Signature/Encryption
- Catalog support
- Entity expansion (basic only)
Development
Load Test CLI
# Synthetic data
# Real files
# Real files with schema validation (auto-fetches from xsi:schemaLocation)
# Compare with libxml
| Option | Description |
|---|---|
| --pattern <PATTERN> | many-elements, deep-nesting, large-content, citygml |
| --size <SIZE> | Size for pattern |
| --mode <MODE> | dom, streaming, or both (default) |
| --validate | Enable schema validation (reads xsi:schemaLocation and fetches schemas; requires the ureq feature) |
License
MIT OR Apache-2.0