# Obj2XML-rs
[PyPI](https://pypi.org/project/obj2xml-rs/)
**High-performance, memory-efficient XML serializer and parser for Python, written in Rust.**
A fast, deterministic, streaming-capable JSON↔XML tool with Python ergonomics. `obj2xml-rs` is a drop-in replacement for
libraries like `xmltodict`, designed for speed, scalability, and correctness.
It leverages Rust's zero-copy optimizations and streaming capabilities to handle massive datasets without exhausting system memory.
### Features
* **Blazing Fast**: Built on `quick-xml` with zero-copy (`Cow<str>`) optimizations. 5-15x faster than pure-Python implementations.
* **True Streaming**: Supports Python Generators and Iterators. Writes huge XML files item-by-item directly to disk.
* **Robust Error Context**: Exceptions include the full XML path (e.g., `Error at root/users/[3]/@id`).
* **Safe**: Includes cycle detection to prevent infinite recursion crashes.
* **Professional Spec**: Supports Namespaces, CDATA, Comments, Processing Instructions, and deterministic attribute sorting.
* **Pythonic**: Supports `default` handlers for custom types (like `datetime`), similar to `json.dump`.
---
### Installation
```bash
pip install obj2xml-rs
```
---
### Quick Start
#### 1. Unparse (Dict → XML)
```python
import obj2xml_rs
data = {
    "root": {
        "@id": "123",
        "name": "Rust",
        "features": ["Fast", "Safe"]
    }
}
print(obj2xml_rs.unparse(data, pretty=True))
```
**Output:**
```xml
<?xml version="1.0" encoding="utf-8"?>
<root id="123">
  <name>Rust</name>
  <features>Fast</features>
  <features>Safe</features>
</root>
```
#### 2. Parse (XML → Dict)
```python
xml = '<root id="1"><item>A</item><item>B</item></root>'
data = obj2xml_rs.parse(xml)
print(data)
# {'root': {'@id': '1', 'item': ['A', 'B']}}
```
#### 3. Streaming (Low Memory Write)
Generate XML from a generator. Writes to file incrementally.
```python
def huge_data():
    # Yield one row at a time so the full dataset is never held in memory.
    for i in range(1_000_000):
        yield {"row": {"id": i, "val": f"data_{i}"}}

obj2xml_rs.unparse(
    huge_data(),
    output="large.xml",
    streaming=True,
    item_name="row"
)
```
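
To make the idea of item-by-item writing concrete, here is a minimal pure-Python sketch. It is not the library's Rust implementation: `stream_rows` and `make_rows` are hypothetical helpers that handle flat dicts only, and the `root` wrapper name is an assumption. The point is that each yielded item is written and then discarded, so memory use does not grow with the number of rows.

```python
import io
from xml.sax.saxutils import escape


def stream_rows(rows, out, root="root", item_name="row"):
    """Write each flat dict from `rows` as one <item_name> element (hypothetical helper)."""
    out.write(f"<{root}>")
    for row in rows:  # `rows` may be a generator; items are consumed lazily
        out.write(f"<{item_name}>")
        for key, value in row.items():
            out.write(f"<{key}>{escape(str(value))}</{key}>")
        out.write(f"</{item_name}>")
    out.write(f"</{root}>")


def make_rows(n):
    # Generator: only the current row exists in memory.
    for i in range(n):
        yield {"id": i, "val": f"data_{i}"}


buffer = io.StringIO()  # stands in for an open file
stream_rows(make_rows(2), buffer)
print(buffer.getvalue())
```

Passing an open file object instead of the `StringIO` buffer would write straight to disk.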
---
### Specification & Behavior
This section defines how Python structures map to XML.
#### 1. Reserved Keys
The following keys have special meaning in a dictionary:
| Key | Description | Example |
|:---------:|:-------------------------------------------------------------------:|:----------------------------------------------------------------|
| `@key` | XML Attribute (prefix configurable) | `{"@id": 1}` → `<tag id="1">` |
| `#text` | Element text content | `{"tag": {"#text": "Hello"}}` → `<tag>Hello</tag>` |
| `#comment` | XML Comment | `{"#comment": "Note"}` → `<!--Note-->` |
| `?key` | Processing Instruction | `{"?xml-stylesheet": "href..."}` → `<?xml-stylesheet href...?>` |
| `#tail` | Text content appearing immediately after the element's closing tag. | `{"b": {"#text": "Bold", "#tail": " text"}}` → `<b>Bold</b> text` |
| `__cdata__` | CDATA Wrapper | `{"#text": {"__cdata__": "x<y"}}` → `<![CDATA[x<y]]>` |
#### 2. Element Mapping & Lists
* **Dict Keys**: Map directly to XML Element names.
* **Lists**: A key whose value is a list generates repeated elements with the same name.
```python
{"items": {"item": [1, 2]}}
# <items><item>1</item><item>2</item></items>
```
* **Root Primitives**: If the input is a list of primitives, each one is wrapped in an `item_name` element.
```python
unparse([1, 2], item_name="n", full_document=False)
# <n>1</n><n>2</n>
```
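
The two rules above can be sketched in a few lines of pure Python. This is an illustration of the mapping only, not the library's implementation; `to_xml` is a hypothetical helper that ignores attributes and the other reserved keys.

```python
from xml.sax.saxutils import escape


def to_xml(name, value):
    """Render `value` under element `name` (elements and lists only)."""
    if isinstance(value, list):
        # A list repeats the parent key once per item.
        return "".join(to_xml(name, item) for item in value)
    if isinstance(value, dict):
        # Each dict key becomes a child element.
        inner = "".join(to_xml(k, v) for k, v in value.items())
        return f"<{name}>{inner}</{name}>"
    return f"<{name}>{escape(str(value))}</{name}>"


print(to_xml("items", {"item": [1, 2]}))
# <items><item>1</item><item>2</item></items>
print(to_xml("n", [1, 2]))
# <n>1</n><n>2</n>
```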
#### 3. Attributes & Sorting
* Keys starting with `attr_prefix` (default `"@"`) become attributes.
* **Values**: Any serializable value is accepted. Dicts/Lists in attributes are stringified.
* **Sorting**: Attributes follow Python insertion order by default. Use `sort_attributes=True` for deterministic output (attributes sorted lexicographically).
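
A minimal pure-Python sketch of the two ordering modes, for illustration only. `open_tag` is a hypothetical helper, not part of `obj2xml-rs`; it shows why sorted attributes give the same output regardless of how the input dict was built.

```python
from xml.sax.saxutils import quoteattr


def open_tag(name, node, attr_prefix="@", sort_attributes=False):
    """Build the opening tag for `node`, a dict that may hold attribute keys."""
    attrs = [
        (key[len(attr_prefix):], value)
        for key, value in node.items()
        if key.startswith(attr_prefix)
    ]
    if sort_attributes:
        # Lexicographic order by attribute name.
        attrs.sort(key=lambda pair: pair[0])
    rendered = "".join(f" {k}={quoteattr(str(v))}" for k, v in attrs)
    return f"<{name}{rendered}>"


node = {"@z": 1, "@a": 2, "child": "x"}
print(open_tag("tag", node))                        # <tag z="1" a="2">
print(open_tag("tag", node, sort_attributes=True))  # <tag a="2" z="1">
```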
#### 4. Namespaces
Namespaces can be declared in three ways:
1. **Static (Root Scope)**: Best practice for clean XML.
```python
unparse(data, namespaces={"soap": "http://example.com/soap"})
# <root xmlns:soap="http://example.com/soap"> ...
```
2. **Inline Declarations**:
```python
{"root": {"@xmlns:x": "urn:x", "x:child": 1}}
```
3. **Dynamic Assignment**:
```python
{"tag": {"@ns": "urn:auto"}}
# Automatically generates prefixes (ns0, ns1...)
```
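
As an illustration of the `ns0, ns1...` naming in the dynamic case, the sketch below hands out one prefix per distinct URI. How the library actually allocates and reuses prefixes is an assumption here, and `PrefixAllocator` is a hypothetical class, not part of its API.

```python
import itertools


class PrefixAllocator:
    """Assign ns0, ns1, ... to namespace URIs, reusing the prefix for a known URI."""

    def __init__(self):
        self._by_uri = {}
        self._counter = itertools.count()

    def prefix_for(self, uri):
        if uri not in self._by_uri:
            self._by_uri[uri] = f"ns{next(self._counter)}"
        return self._by_uri[uri]


alloc = PrefixAllocator()
print(alloc.prefix_for("urn:auto"))   # ns0
print(alloc.prefix_for("urn:other"))  # ns1
print(alloc.prefix_for("urn:auto"))   # ns0 (same URI, same prefix)
```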
#### 5. Advanced Nodes
* **CDATA**: Use the `__cdata__` key inside a text node.
* **Comments**: Use `#comment`.
* **Processing Instructions**: Keys starting with `?`.
```python
{"root": {"?xml-stylesheet": 'type="text/xsl" href="style.xsl"'}}
```
#### 6. Constraints & Validation Policies
* **XML Names**: No validation of XML name syntax is performed. If you pass `{"<invalid>": 1}`, invalid XML will be generated.
* **Mixed Content**: Mixed `#text` and child elements are allowed.
```python
{"p": {"#text": "Hello", "b": "World"}}
# Valid: <p>Hello<b>World</b></p>
```
* **Root Rules**:
    * `full_document=True` (default): Requires exactly one root element.
    * `full_document=False`: Allows multiple roots (XML Fragment).
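
Because names are not validated, callers who accept untrusted keys may want to check them before serializing. The sketch below is one possible pre-check, not a library feature: `invalid_names` is a hypothetical helper, and its pattern is a simplified ASCII-only approximation of the XML `Name` production (the real grammar also permits many Unicode characters).

```python
import re

# Simplified: ASCII letters, digits, "_", ".", "-", with one optional "prefix:".
XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*(?::[A-Za-z_][A-Za-z0-9_.-]*)?")
RESERVED_KEYS = ("#text", "#comment", "#tail", "__cdata__")


def invalid_names(data, attr_prefix="@"):
    """Return every dict key in `data` that is not a plausible XML name."""
    bad = []
    if isinstance(data, dict):
        for key, value in data.items():
            if key not in RESERVED_KEYS:
                name = key
                # Attribute and processing-instruction markers are not part of the name.
                for prefix in (attr_prefix, "?"):
                    if key.startswith(prefix):
                        name = key[len(prefix):]
                        break
                if not XML_NAME.fullmatch(name):
                    bad.append(key)
            bad.extend(invalid_names(value, attr_prefix))
    elif isinstance(data, list):
        for item in data:
            bad.extend(invalid_names(item, attr_prefix))
    return bad


print(invalid_names({"<invalid>": 1}))                       # ['<invalid>']
print(invalid_names({"root": {"@id": "1", "x:child": 1}}))   # []
```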
### Error Handling
Errors are actionable and include the full path to the problematic node.
```python
from obj2xml_rs import unparse

def fail_serializer(obj):
    # Called for values the serializer cannot handle natively.
    raise ValueError("Bad data")

data = {"users": [{"name": "Alice", "meta": {"@date": object()}}]}
try:
    unparse(data, default=fail_serializer)
except ValueError as e:
    print(e)
```
**Output:**
```text
Custom serialization failed: Bad data (at users/[0]/meta/@date)
```
* **Circular References**: A `RecursionError` is raised if an object references itself.
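
For readers curious how a cycle can be reported together with a path, here is a minimal pure-Python sketch. It is an assumption about the general technique, not the library's Rust code: `find_cycle` is a hypothetical helper that tracks the `id()` of every container on the current branch and reports the path where one of them reappears.

```python
def find_cycle(obj, path="", _active=None):
    """Return the path where `obj` re-enters one of its own ancestors, else None."""
    active = set() if _active is None else _active
    if not isinstance(obj, (dict, list)):
        return None
    if id(obj) in active:
        return path  # this container is already open further up the branch
    active.add(id(obj))
    try:
        if isinstance(obj, dict):
            children = [(f"{path}/{k}" if path else str(k), v) for k, v in obj.items()]
        else:
            children = [(f"{path}/[{i}]" if path else f"[{i}]", v) for i, v in enumerate(obj)]
        for child_path, child in children:
            found = find_cycle(child, child_path, active)
            if found is not None:
                return found
        return None
    finally:
        # Leaving the branch: a shared (but acyclic) object is not a cycle.
        active.discard(id(obj))


data = {"users": [{"name": "Alice"}]}
data["users"][0]["self"] = data  # introduce a cycle
print(find_cycle(data))  # users/[0]/self
```

Removing each container from the active set on the way back up is what distinguishes a true cycle from the same object appearing twice in sibling positions.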
### API Reference
#### Unparse (Write)
```python
def unparse(
    input: Union[Dict, Iterable, Any],
    *,
    output: Optional[Union[str, IO]] = None,
    encoding: str = "utf-8",
    full_document: bool = True,
    attr_prefix: str = "@",
    cdata_key: str = "#text",
    pretty: bool = False,
    indent: str = " ",
    compat: str = "native",
    streaming: bool = False,
    default: Optional[Callable[[Any], str]] = None,
    item_name: str = "item",
    sort_attributes: bool = False,
    namespaces: Optional[Dict[str, str]] = None
) -> str:
```
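
The `default` parameter takes a callable of type `Callable[[Any], str]`. A handler along the following lines is one way to cover `datetime`-like values; `xml_default` is a hypothetical name, and raising `TypeError` for unsupported types mirrors the `json.dumps` convention rather than a documented requirement of this library.

```python
from datetime import date, datetime
from decimal import Decimal


def xml_default(obj):
    """Convert values the serializer does not understand into strings."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)
    # Same convention as json.dumps: reject anything we do not recognise.
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


print(xml_default(datetime(2024, 1, 2, 3, 4, 5)))  # 2024-01-02T03:04:05

# Intended use, following the signature above:
# obj2xml_rs.unparse({"event": {"@at": datetime.now()}}, default=xml_default)
```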
#### Parse (Read)
```python
def parse(
    xml_input: Union[str, bytes, IO],
    *,
    encoding: Optional[str] = None,
    attr_prefix: str = "@",
    cdata_key: str = "#text",
    force_cdata: bool = False,
    process_namespaces: bool = False,
    namespace_separator: str = ":",
    strip_whitespace: bool = True,
    force_list: Optional[Iterable[str]] = None,
    process_comments: bool = False
) -> Dict[str, Any]:
```
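
The `force_list` parameter suggests that, as in `xmltodict`, an element occurring once parses to a scalar or dict rather than a one-item list; that behavior is an assumption here, not something this README states. When `force_list` cannot be set at parse time, a small helper such as the hypothetical `as_list` below normalizes either shape after the fact.

```python
def as_list(value):
    """Return `value` unchanged if it is a list, else wrap it (None becomes [])."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


one = {"root": {"item": "A"}}          # shape assumed for a single <item>
many = {"root": {"item": ["A", "B"]}}  # shape shown in the Quick Start
for doc in (one, many):
    print(as_list(doc["root"].get("item")))
# ['A']
# ['A', 'B']
```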
### CLI Usage
**JSON to XML (Unparse)**
```bash
# Basic
python -m obj2xml_rs unparse input.json -o output.xml --pretty
# Streaming from Pipe
cat huge.json | python -m obj2xml_rs unparse --stream --item-name "record" > out.xml
```
**XML to JSON (Parse)**
```bash
# Convert XML file to JSON
python -m obj2xml_rs parse data.xml -o data.json --pretty
# Force specific tags to be lists
python -m obj2xml_rs parse data.xml --force-list item user
```
### Python XML Library Comparison Matrix
| Feature | obj2xml-rs | xmltodict | xmltodict-rs | dicttoxml | quick-xmltodict |
|:----------------:|:---------------------------------------------------------------------:|:----------------------------:|:-----------------------------------------------------:|:---------------------------------:|:----------------|
| Language | Rust (PyO3) | Python | Rust (PyO3) | Python | Rust (PyO3) |
| Capabilities | Read & Write | Read & Write | Read & Write | Write Only | Read Only |
| Write Speed | High | Low | High | Low | N/A |
|Write Memory Model| Streaming / Zero-Copy | In-Memory Object Graph | In-Memory String | In-Memory String | N/A |
| Stream Writing | Yes (Generators) | No | No | No | N/A |
| Async Support | Yes (asyncio) | No | No | No | N/A |
| Cycle Detection | Yes, detects cycles early and<br/>raises path-aware Python exceptions |No — fails with RecursionError|No — causes interpreter crash (SIGSEGV) on cyclic input| No — fails with RecursionError | N/A |
| Error Context | Path-Aware | Generic | Generic | Generic | N/A |
| Attributes | Deterministic | Insertion Order | Insertion Order |Non-deterministic unless pre-sorted| N/A |
| Namespaces | Yes | Yes | Yes | Limited | N/A |
### 📄 License
This project is licensed under the Apache License 2.0.