Obj2XML-rs
High-performance, memory-efficient XML generator for Python, written in Rust.
A fast, deterministic, streaming-capable JSON→XML DSL with Python ergonomics. obj2xml-rs is a drop-in replacement for functions such as xmltodict.unparse, designed for speed, scalability, and correctness. It uses Rust's zero-copy optimizations and streaming writes to handle massive datasets without exhausting system memory.
Features
Blazing Fast: Built on quick-xml with Zero-Copy (Cow<str>) optimizations. 5-15x faster than pure Python.
True Streaming: Supports Python Generators and Iterators. Writes huge XML files item-by-item directly to disk.
Robust Error Context: Exceptions include the full XML path (e.g., Error at root/users/[3]/@id).
Safe: Includes cycle detection to prevent infinite recursion crashes.
Professional Spec: Supports Namespaces, CDATA, Comments, Processing Instructions, and deterministic attribute sorting.
Pythonic: Supports default handlers for custom types (like datetime), similar to json.dump.
Installation
pip install obj2xml-rs
Quick Start
=
Output:
Rust
Fast
Safe
Streaming (Low Memory)
Generate XML from a generator. Writes to file incrementally.
yield
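The idea behind item-by-item writing can be sketched with the standard library alone. This is not obj2xml-rs's API or implementation: `records`, `write_stream`, and the `users`/`record` element names are hypothetical, and the sketch only illustrates why consuming a generator keeps memory use flat.

```python
import io
from xml.sax.saxutils import escape


def records(n):
    """Hypothetical generator: yields one small dict at a time."""
    for i in range(n):
        yield {"id": i, "name": f"user<{i}>"}


def write_stream(items, out, root="users", item_name="record"):
    """Write each item as soon as it is produced; nothing is accumulated."""
    out.write(f"<{root}>")
    for item in items:
        out.write(f"<{item_name}>")
        for key, value in item.items():
            # Escape text so characters like < and & stay well-formed.
            out.write(f"<{key}>{escape(str(value))}</{key}>")
        out.write(f"</{item_name}>")
    out.write(f"</{root}>")


buf = io.StringIO()  # stands in for a file opened for writing
write_stream(records(2), buf)
xml_text = buf.getvalue()
```

Because `write_stream` never builds a list of items, peak memory depends on the size of one record rather than on the size of the whole dataset.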
Specification & Behavior
This section defines how Python structures map to XML.
1. Reserved Keys
The following keys have special meaning in a dictionary:
| Key | Description | Example |
|---|---|---|
| @key | XML attribute (prefix configurable) | `{"@id": 1}` → `id="1"` |
| #text | Element text content | `{"tag": {"#text": "Hello"}}` → `<tag>Hello</tag>` |
| #comment | XML comment | `{"#comment": "Note"}` → `<!--Note-->` |
| ?key | Processing instruction | `{"?xml-stylesheet": "href..."}` → `<?xml-stylesheet href...?>` |
| #tail | Text appearing immediately after the element's closing tag | `{"b": {"#text": "Bold", "#tail": " text"}}` → `<b>Bold</b> text` |
| cdata | CDATA wrapper | `{"#text": {"cdata": "x<y"}}` → `<![CDATA[x<y]]>` |
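To make the reserved keys concrete, here is a small pure-Python sketch of how such a mapping could be interpreted. It is an illustration only, not the library's Rust implementation: the `render` function is hypothetical, and it covers only the attribute, #text, #comment, and #tail keys (processing instructions and CDATA are left out).

```python
from xml.sax.saxutils import escape, quoteattr


def render(tag, node, attr_prefix="@"):
    """Render one element from a dict, honoring a subset of the reserved keys."""
    if not isinstance(node, dict):
        return f"<{tag}>{escape(str(node))}</{tag}>"
    attrs, body, tail = "", "", ""
    for key, value in node.items():
        if key.startswith(attr_prefix):
            # "@id" -> id="..." on the opening tag
            attrs += f" {key[len(attr_prefix):]}={quoteattr(str(value))}"
        elif key == "#text":
            body += escape(str(value))
        elif key == "#comment":
            body += f"<!--{value}-->"
        elif key == "#tail":
            # Emitted after the closing tag, not inside the element
            tail = escape(str(value))
        else:
            body += render(key, value, attr_prefix)
    return f"<{tag}{attrs}>{body}</{tag}>{tail}"


result = render("b", {"@id": 1, "#text": "Bold", "#tail": " text"})
```

Here `result` holds the bold element followed by its tail text, which is the same shape as the #tail row in the table.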
2. Element Mapping & Lists
Dict Keys: Map directly to XML element names.
Lists: A key whose value is a list generates repeated elements with that key's name.
12
Root Primitives: If the input is a list of primitives, each one is wrapped in an element named by item_name.
12
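The two list rules can be sketched in a few lines of plain Python. This is a hypothetical illustration, not the library's code: `to_xml` and `unparse_sketch` are invented names, and the default `item_name="item"` is an assumption rather than a documented default.

```python
from xml.sax.saxutils import escape


def to_xml(obj, tag):
    """Lists repeat the same tag; dicts nest; everything else becomes text."""
    if isinstance(obj, list):
        return "".join(to_xml(value, tag) for value in obj)
    if isinstance(obj, dict):
        inner = "".join(to_xml(value, key) for key, value in obj.items())
        return f"<{tag}>{inner}</{tag}>"
    return f"<{tag}>{escape(str(obj))}</{tag}>"


def unparse_sketch(data, item_name="item"):
    """A root-level list is wrapped item by item in item_name (assumed default)."""
    if isinstance(data, list):
        return to_xml(data, item_name)
    return "".join(to_xml(value, key) for key, value in data.items())


repeated = unparse_sketch({"root": {"n": [1, 2]}})
wrapped = unparse_sketch([1, 2])
```

`repeated` contains two sibling `n` elements inside `root`, while `wrapped` contains two `item` elements with no enclosing parent.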
3. Attributes & Sorting
Keys starting with attr_prefix (default "@") become attributes.
Values: Any serializable value is accepted. Dicts and lists used as attribute values are stringified.
Sorting: Attributes follow Python insertion order by default. Use sort_attributes=True for deterministic output (attributes sorted lexicographically).
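The difference between insertion order and sorted output is easy to see in a standalone sketch. The `open_tag` helper is hypothetical, and using Python's `str()` to stringify a list is an assumption; the exact string form obj2xml-rs produces for dicts and lists is not stated here.

```python
from xml.sax.saxutils import quoteattr


def open_tag(tag, attrs, sort_attributes=False):
    """Build an opening tag, optionally sorting attributes by name."""
    items = sorted(attrs.items()) if sort_attributes else attrs.items()
    rendered = "".join(f" {name}={quoteattr(str(value))}" for name, value in items)
    return f"<{tag}{rendered}>"


attrs = {"z": 1, "a": [1, 2]}  # insertion order: z, then a
insertion = open_tag("e", attrs)
ordered = open_tag("e", attrs, sort_attributes=True)
```

Sorted output is useful when generated XML is diffed or hashed, since the same data then serializes to the same bytes regardless of how the dict was built.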
4. Namespaces
Namespaces can be declared in three ways:
Static (Root Scope): Best practice for clean XML.
...
Inline Declarations:
Dynamic Assignment:
Automatically generates prefixes (ns0, ns1...)
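As a rough illustration of the static and dynamic styles, the sketch below renders a {prefix: uri} dict as root-level declarations and generates ns0, ns1, ... prefixes for a list of URIs. Both helpers are hypothetical, and assigning prefixes in first-seen order with duplicates collapsed is an assumption about how dynamic assignment might behave.

```python
from xml.sax.saxutils import quoteattr


def root_open_tag(tag, namespaces):
    """Declare every {prefix: uri} pair once, on the root element."""
    decls = "".join(
        f" xmlns:{prefix}={quoteattr(uri)}" for prefix, uri in namespaces.items()
    )
    return f"<{tag}{decls}>"


def auto_prefixes(uris):
    """Assign ns0, ns1, ... to distinct URIs in first-seen order (assumed policy)."""
    return {f"ns{i}": uri for i, uri in enumerate(dict.fromkeys(uris))}


static = root_open_tag("root", {"x": "urn:x", "y": "urn:y"})
generated = auto_prefixes(["urn:a", "urn:b", "urn:a"])
```

Declaring namespaces once at the root keeps child elements free of repeated xmlns attributes, which is presumably why the static form is recommended.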
5. Advanced Nodes
CDATA: Use the cdata key inside a text node.
Comments: Use #comment.
Processing Instructions: Keys starting with ?.
6. Constraints & Validation Policies
XML Names: No validation of XML name syntax is performed. If you pass {"": 1}, invalid XML will be generated.
Mixed Content: Mixed #text and child elements are allowed.
**Valid:** HelloWorld
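For readers familiar with the standard library, mixed content works much like the `text` and `tail` attributes in `xml.etree.ElementTree`. The sketch below uses only ElementTree; treating `text` as the counterpart of #text and `tail` as the counterpart of #tail is an analogy, not a statement about how obj2xml-rs is implemented.

```python
import xml.etree.ElementTree as ET

p = ET.Element("p")
p.text = "Hello"           # text before the first child, like #text
b = ET.SubElement(p, "b")
b.text = "World"           # text inside the child element
b.tail = "!"               # text right after </b>, like #tail

mixed = ET.tostring(p, encoding="unicode")
```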
Root Rules:
full_document=True (default): Requires exactly one root element.
full_document=False: Allows multiple roots (XML Fragment).
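The root rule amounts to a count check on the top-level keys. The following is a minimal sketch under stated assumptions: `check_roots` is a hypothetical helper, `ValueError` is a stand-in for whatever exception the library actually raises, and every top-level key is counted as a root.

```python
def check_roots(data, full_document=True):
    """Return the root names, enforcing the single-root rule for full documents."""
    roots = list(data)
    if full_document and len(roots) != 1:
        raise ValueError(
            f"full_document=True requires exactly one root, got {len(roots)}"
        )
    return roots


# A fragment may carry several roots.
fragment_roots = check_roots({"a": 1, "b": 2}, full_document=False)

# The same input is rejected as a full document.
try:
    check_roots({"a": 1, "b": 2})
    error = None
except ValueError as exc:
    error = str(exc)
```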
Error Handling
Errors are actionable and include the full path to the problematic node.
=
Output:
Custom serialization failed: Bad data (at users/[0]/meta/@date)
Circular References: A RecursionError is raised if an object references itself.
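Path-aware errors and cycle detection can both be built on one recursive walk that tracks the current path and the containers currently being visited. The sketch below is a pure-Python illustration, not the library's implementation: `walk` and the message wording are hypothetical, and only the path style (users/[0]/...) follows the example above.

```python
def walk(obj, path=(), seen=None):
    """Traverse dicts and lists, raising RecursionError on a circular reference."""
    seen = set() if seen is None else seen
    if not isinstance(obj, (dict, list)):
        return
    if id(obj) in seen:
        raise RecursionError(f"Circular reference (at {'/'.join(path)})")
    seen.add(id(obj))
    if isinstance(obj, dict):
        children = list(obj.items())
    else:
        children = [(f"[{i}]", value) for i, value in enumerate(obj)]
    for key, value in children:
        walk(value, path + (str(key),), seen)
    # Leaving this container: the same object may legally appear again elsewhere.
    seen.discard(id(obj))


loop = {"users": [{}]}
loop["users"][0]["self"] = loop  # the inner dict points back at the root

try:
    walk(loop)
    message = None
except RecursionError as exc:
    message = str(exc)
```

Tracking only the containers on the current path, rather than every container ever seen, means a dict shared by two siblings is not mistaken for a cycle.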
API Reference
| Argument | Description |
|---|---|
| compat | "native" (default) or "dicttoxml" (legacy); selects how None is rendered. |
| default | Callback to serialize unknown types. Errors propagate with path context. |
| streaming | If True, writes incrementally. output must be provided. |
| namespaces | Dict of {prefix: uri} declared at the root element. |
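Since the default callback is described as similar to json.dump, the standard library's own hook shows the expected shape of such a function. The example uses only `json.dumps`; whether obj2xml-rs calls the callback with exactly the same contract is an assumption, and the `default` function below is just an illustration.

```python
import json
from datetime import date, datetime


def default(obj):
    """Convert otherwise unserializable objects; raise TypeError for the rest."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


encoded = json.dumps({"created": datetime(2024, 1, 2, 3, 4, 5)}, default=default)
```

Raising from the callback for unknown types, instead of returning a placeholder, lets the serializer report the failure together with its path context.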
CLI Usage
# Basic
python -m obj2xml_rs input.json -o output.xml --pretty
# Streaming from Pipe
cat huge.json | python -m obj2xml_rs --stream --item-name "record" > out.xml
License
This project is licensed under the Apache License 2.0.