fastxml
A fast, memory-efficient XML library for Rust with XPath and schema validation support. Designed for processing large XML documents like CityGML files used in PLATEAU.
Features
- 🦀 Pure Rust — No C dependencies, no unsafe code
- 🔄 libxml Compatible — Consistent parsing/XPath results
- 💾 Memory Efficient — Parse and validate gigabyte-scale XML with ~1 MB memory footprint
- 🔍 Full XPath 1.0 — Complete XPath 1.0 support with namespace handling
- 📋 XSD Support — Schema parsing with import resolution, built-in GML types
- ⚡ Async Support — Async schema fetching and resolution with tokio
⚠️ Early Development (v0.x): API may change. Limited production experience. Not recommended for business-critical systems. Use at your own risk.
Performance
Benchmark on PLATEAU DEM GML (907 MB, 31M nodes) — benchmark code:
Parse only:
| Mode | Time | Throughput | Memory |
|---|---|---|---|
| libxml DOM | 7.11s | 128 MB/s | 4.19 GB |
| fastxml DOM | 11.50s | 79 MB/s | 951 MB |
| fastxml Streaming | 9.86s | 92 MB/s | ~1 MB |
Parse + Schema Validation:
| Mode | Time | Throughput | Memory |
|---|---|---|---|
| libxml DOM + validate | 11.10s | 82 MB/s | 3.64 GB |
| fastxml DOM + validate | 57.20s | 16 MB/s | 1.96 GB |
| fastxml Streaming | 22.33s | 41 MB/s | ~1 MB |
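- DOM (parse only): about 4.4x less memory than libxml (951 MB vs 4.19 GB); with validation, about 1.9x less (1.96 GB vs 3.64 GB)
- Streaming: 92 MB/s parse-only and ~41 MB/s with validation, with minimal memory (~1 MB regardless of file size)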
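The throughput and memory figures follow directly from the file size, wall-clock time, and peak memory in the tables. A minimal sketch of that arithmetic, assuming 1 GB = 1000 MB (the helper names are illustrative and not part of fastxml):

```rust
/// Throughput in MB/s for a file of `size_mb` megabytes processed in `secs` seconds.
fn throughput_mb_per_s(size_mb: f64, secs: f64) -> f64 {
    size_mb / secs
}

/// How many times less memory `ours_mb` is compared with `baseline_mb`.
fn memory_ratio(baseline_mb: f64, ours_mb: f64) -> f64 {
    baseline_mb / ours_mb
}
```

For example, `throughput_mb_per_s(907.0, 9.86)` reproduces the streaming parse-only row, and `memory_ratio(4190.0, 951.0)` reproduces the parse-only DOM memory comparison.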
Installation
[dependencies]
fastxml = "0.6"
Cargo Features
| Feature | Description |
|---|---|
| `ureq` | Sync HTTP client for schema fetching (recommended) |
| `tokio` | Async HTTP client for schema fetching (reqwest + tokio) |
| `async-trait` | Async trait support for custom implementations |
| `compare-libxml` | Enable libxml2 comparison tests |
# Recommended: sync schema fetching
fastxml = { version = "0.6", features = ["ureq"] }
# Async schema fetching
fastxml = { version = "0.6", features = ["tokio"] }
Schema Fetchers
| Fetcher | Description |
|---|---|
| `FileFetcher` | Local filesystem |
| `UreqFetcher` | Sync HTTP (requires `ureq`) |
| `ReqwestFetcher` | Async HTTP (requires `tokio`) |
| `DefaultFetcher` | File + sync HTTP combined (requires `ureq` for HTTP) |
| `AsyncDefaultFetcher` | File + async HTTP combined (requires `tokio`) |
Traits:
| Trait | Description |
|---|---|
| `SchemaFetcher` | Sync fetcher trait |
| `AsyncSchemaFetcher` | Async fetcher trait (requires `tokio`) |
use ;
let fetcher = with_base_dir;
let result = fetcher.fetch?;
Quick Start
DOM Parsing
use ;
let xml = r#"<root><item id="1">Hello</item><item id="2">World</item></root>"#;
let doc = parse?;
let result = evaluate?;
for node in result.into_nodes
Streaming Parser
Process large files with minimal memory:
use ;
use BufReader;
use File;
let file = open?;
let mut parser = new;
parser.add_handler;
parser.parse?;
Stream Transform
Transform XML with XPath-based element selection:
use StreamTransformer;
let xml = r#"<root><item id="1">A</item><item id="2">B</item></root>"#;
// Modify elements (supports multiple handlers)
let result = new
.on
.run?
.to_string?;
// Extract data (single XPath)
let ids: = new
.collect?;
// Extract data from multiple XPaths in a single pass
let : = new
.collect_multi?;
// Iterate for side effects (no output transformation)
let mut ids = Vecnew;
new
.on
.for_each?;
Auto-detect Namespaces
Extract namespace declarations from the root element without DOM parsing:
let xml = r#"<root xmlns:gml="http://www.opengis.net/gml"><gml:point/></root>"#;
new
.with_root_namespaces? // Auto-registers namespaces from root element
.on
.run?;
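To illustrate why root namespaces can be found without building a DOM, here is a minimal standard-library sketch that scans only the first start tag for `xmlns` declarations. It is not how fastxml implements `with_root_namespaces`; the function name `root_namespaces` is hypothetical, and the sketch assumes no markup inside leading comments or a DOCTYPE and does not decode entity references in attribute values.

```rust
/// Collect (prefix, uri) pairs declared on the first start tag of `xml`.
/// The default namespace (`xmlns="..."`) is reported with an empty prefix.
fn root_namespaces(xml: &str) -> Vec<(String, String)> {
    // Find the root start tag: the first '<' not followed by '?' or '!'.
    let mut rest = xml;
    let tag = loop {
        let Some(lt) = rest.find('<') else { return Vec::new() };
        rest = &rest[lt + 1..];
        if !rest.starts_with('?') && !rest.starts_with('!') {
            break rest;
        }
    };

    let mut out = Vec::new();
    // Skip the element name itself.
    let mut s = tag.trim_start_matches(|c: char| !c.is_whitespace() && c != '>' && c != '/');
    loop {
        s = s.trim_start();
        if s.is_empty() || s.starts_with('>') || s.starts_with('/') {
            return out; // end of the start tag
        }
        // Each attribute has the shape: name = quote value quote
        let Some(eq) = s.find('=') else { return out };
        let name = s[..eq].trim();
        let after = s[eq + 1..].trim_start();
        let Some(quote) = after.chars().next().filter(|c| *c == '"' || *c == '\'') else {
            return out;
        };
        let value_start = &after[1..];
        let Some(end) = value_start.find(quote) else { return out };
        let value = &value_start[..end];
        if name == "xmlns" {
            out.push((String::new(), value.to_string()));
        } else if let Some(prefix) = name.strip_prefix("xmlns:") {
            out.push((prefix.to_string(), value.to_string()));
        }
        s = &value_start[end + 1..];
    }
}
```

Because the scan stops at the end of the root start tag, its cost does not depend on the size of the rest of the document.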
Namespace URI Matching
Match elements by namespace URI instead of prefix (useful when different prefixes map to the same URI):
// Matches both gml:feature and g:feature if they have the same namespace URI
new
.namespace
.on
.run?;
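The idea behind URI-based matching is that two qualified names refer to the same element when their namespace URI and local name agree, whatever prefix each document uses. A minimal sketch of that comparison, independent of fastxml (the `expand` and `same_element` helpers are hypothetical names used only for illustration):

```rust
use std::collections::HashMap;

/// Resolve a qualified name such as "gml:feature" to (namespace URI, local name)
/// using the prefix bindings in scope. Returns None for an unbound prefix.
fn expand(qname: &str, bindings: &HashMap<&str, &str>) -> Option<(String, String)> {
    let (prefix, local) = match qname.split_once(':') {
        Some((p, l)) => (p, l),
        None => ("", qname), // unprefixed name: uses the default namespace, if any
    };
    let uri = match bindings.get(prefix) {
        Some(uri) => *uri,
        None if prefix.is_empty() => "", // unprefixed and no default namespace
        None => return None,             // unbound prefix
    };
    Some((uri.to_string(), local.to_string()))
}

/// Two names match when both resolve and their expanded names are equal.
fn same_element(
    a: &str,
    a_bindings: &HashMap<&str, &str>,
    b: &str,
    b_bindings: &HashMap<&str, &str>,
) -> bool {
    match (expand(a, a_bindings), expand(b, b_bindings)) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}
```

With `gml` and `g` both bound to `http://www.opengis.net/gml`, `gml:feature` and `g:feature` compare equal, which is the behavior described above.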
Parent Context Access
Access ancestor elements' information during streaming transformation:
new
.on_with_context
.run?;
XPath Streamability Check
Check if an XPath can be processed in a single streaming pass:
use ;
// Quick check
if is_streamable
// Detailed analysis
match analyze_xpath_str?
Fallback Control
By default, non-streamable XPath expressions return an error. Enable fallback for two-pass processing:
// Default: error on non-streamable XPath
let result = new
.on
.run;
// => Err(NotStreamable { ... })
// Enable fallback (loads entire document into memory)
let result = new
.allow_fallback
.on
.run?;
Async Schema Resolution
Parse XSD schemas with async import/include resolution (requires tokio feature):
use ;
async
The async resolver:
- Fetches imported schemas asynchronously via HTTP
- Caches fetched schemas in the provided store
- Resolves nested imports (A → B → C)
- Detects circular dependencies
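The caching, nested-import, and cycle-detection behavior listed above can be sketched as a depth-first walk over the import graph. This is a synchronous, in-memory illustration and not fastxml's resolver: `ImportGraph` is a hypothetical stand-in for fetched schemas, and reporting a cycle as an error is an assumption made for the sketch.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for fetched schemas: each URL maps to the URLs it imports.
type ImportGraph = HashMap<&'static str, Vec<&'static str>>;

/// Resolve `url` and everything it imports, depth-first.
/// `done` acts as the cache of resolved schemas; `in_progress` holds the current
/// import chain, so meeting a URL that is still in progress means a cycle.
fn resolve(
    url: &'static str,
    graph: &ImportGraph,
    done: &mut Vec<&'static str>,
    in_progress: &mut HashSet<&'static str>,
) -> Result<(), String> {
    if done.contains(&url) {
        return Ok(()); // cache hit: already resolved
    }
    if !in_progress.insert(url) {
        return Err(format!("circular import at {url}"));
    }
    let imports = graph
        .get(url)
        .ok_or_else(|| format!("cannot fetch {url}"))?;
    for &dep in imports {
        resolve(dep, graph, done, in_progress)?;
    }
    in_progress.remove(url);
    done.push(url); // dependencies are stored before the schema that needs them
    Ok(())
}
```

For a chain A → B → C, the schemas end up in `done` in the order C, B, A, so each schema is available before anything that imports it.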
See examples/async_schema_resolution.rs for more examples.
Schema Validation
DOM Validation
use ;
let doc = parse?;
let errors = validate_document_by_schema?;
if errors.is_empty
Streaming Validation
Validate during parsing with minimal memory:
use StreamValidator;
use Arc;
let schema = new;
let reader = new;
let errors = new
.with_max_errors
.validate?;
Auto-detect Schema
Fetch schemas from xsi:schemaLocation automatically (requires ureq feature):
use ;
let doc = parse?;
let errors = validate_with_schema_location?;
For streaming:
use streaming_validate_with_schema_location;
let errors = streaming_validate_with_schema_location?;
Async Validation
Validate with async schema fetching (requires tokio feature):
use ;
async
Or get the compiled schema for reuse:
use get_schema_from_schema_location_async;
let schema = get_schema_from_schema_location_async.await?;
Validation Errors
use ErrorLevel;
for error in &errors
XPath
Basic Usage
use ;
let doc = parse?;
let result = evaluate?;
With Namespaces
let xml = r#"
<core:CityModel xmlns:core="http://www.opengis.net/citygml/2.0"
xmlns:bldg="http://www.opengis.net/citygml/building/2.0"
xmlns:gml="http://www.opengis.net/gml">
<bldg:Building gml:id="bldg_001">
<bldg:measuredHeight>25.5</bldg:measuredHeight>
</bldg:Building>
</core:CityModel>"#;
let doc = parse?;
let buildings = evaluate?;
Supported Specifications
XPath 1.0
| Feature | Examples |
|---|---|
| Paths | /root/child, //element, //* |
| Predicates | [@id='1'], [position()=1], [name()='foo'] |
| Axes | ancestor::, following-sibling::, namespace:: |
| Operators | and, or, not(), =, !=, <, >, +, -, *, div, mod |
| Functions | count(), contains(), string(), number(), sum(), etc. |
| Namespaces | //ns:element, namespace::* |
| Variables | $var |
| Union | `//a \| //b` |
XSD Schema
| Feature | Support |
|---|---|
| Element/attribute definitions | ✅ |
| Complex types (sequence/choice/all) | ✅ |
| Simple types (restriction/list/union) | ✅ |
| Type inheritance | ✅ |
| Facets | ✅ |
| Attribute/model groups | ✅ |
| import/include/redefine | ✅ |
| Built-in XSD and GML types | ✅ |
| Identity constraints (unique/key/keyref) | ✅ |
| Substitution groups | ✅ |
Not Supported
- XQuery, XSLT, XInclude
- DTD validation
- XML Signature/Encryption
- Catalog support
- Full entity expansion
Development
Examples
# Async schema resolution
# Schema validation
# Benchmark CLI
License
MIT OR Apache-2.0