# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
fastxml is a fast, memory-efficient XML library for Rust with XPath 1.0 and XSD schema validation support. It is designed for processing large XML documents such as CityGML files (PLATEAU).
## Build Commands
```bash
# Run tests
cargo test # Unit, integration, and doc tests with default features
cargo test --lib --all-features # All lib tests with all features
cargo test --features tokio # With async tests
cargo test --features compare-libxml # With libxml comparison (requires libxml2-dev)
# Run a single test
cargo test test_name # By test name
cargo test xpath_test # By substring filter (matches module and test names)
# Linting
cargo fmt --all -- --check # Check formatting
cargo clippy --all-targets --all-features -- -D warnings
# Build documentation
cargo doc --no-deps --all-features
# Run benchmarks
cargo bench
# Run examples
cargo run --example async_schema_resolution --features tokio
cargo run --example schema_validation --features ureq
cargo run --release --example bench -- ./file.xml
```
## Pre-commit Checklist
Before committing, always run all CI checks:
```bash
cargo fmt --all
cargo clippy --all-targets --all-features -- -D warnings
cargo test --lib --all-features
cargo doc --no-deps --all-features
```
Note: running `clippy` and `doc` with `--all-features` enables `compare-libxml`, which requires the `libxml2-dev` system package.
## Version Bump Checklist
When bumping the version, update these files:
- `Cargo.toml` - `version = "x.y.z"`
- `Cargo.lock` - Run `cargo update -p fastxml` to update the lock file
- `README.md` - All `fastxml = "x.y"` references in installation examples
## Manual Testing Tools
```bash
# Validate arbitrary XML files against XSD schema
cargo run --release --features ureq --bin fastxml-validate -- <xml-file>
# Run benchmarks on XML files
cargo run --release --example bench -- <xml-file>
cargo run --release --features ureq --example bench -- <xml-file> --validate
# Benchmark modes: --mode dom | streaming | both (default: both)
cargo run --release --example bench -- <xml-file> --mode streaming
```
## Architecture
### Core Modules (src/)
- **parser.rs** - XML parsing using quick-xml, produces DOM or events
- **document.rs** - DOM document structure (XmlDocument)
- **node.rs** - Node types (XmlNode for mutable, XmlRoNode for read-only)
- **event.rs** - Streaming parser with event handlers (XmlEvent, StreamingParser)
- **transform/** - Stream transformation with XPath-based element selection (StreamTransformer)
### XPath (src/xpath/)
- **parser.rs** - XPath 1.0 expression parser
- **evaluator.rs** - XPath evaluation engine
- **context.rs** - Evaluation context with namespace bindings (XmlContext, XmlSafeContext)
- **functions/** - Built-in XPath functions (string, number, nodeset, boolean)
- **axes.rs** - XPath axes implementation
### Schema/Validation (src/schema/)
- **xsd/** - XSD schema parsing and compilation
  - **parser/** - SAX-style XSD parser with stack-based state machine
  - **compiler.rs** - Compiles parsed XSD into CompiledSchema
  - **builtin.rs** - Built-in XSD and GML types
  - **resolver/** - Import/include resolution (sync and async)
- **validator/** - Validation implementations
  - **streaming.rs** - Single-pass streaming validator (OnePassSchemaValidator)
  - **lazy.rs** - Lazy schema loading validator
  - **dom.rs** - DOM-based validation
- **fetcher/** - Schema fetching (file, HTTP sync/async)
- **store.rs** - Schema storage abstraction (memory, temp files)
### Event Flow for Streaming Validation
```
XML Data → StreamingParser → [DocumentBuilder, OnePassSchemaValidator]
(same events shared)
```
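The fan-out in the diagram above can be pictured as one event source driving several handlers in a single pass. The sketch below is a simplified illustration only: `Event`, `EventHandler`, `NameCollector`, `BalanceChecker`, and `dispatch` are hypothetical names invented for this example and are not fastxml's actual API.

```rust
// Hypothetical, simplified model of sharing one event stream between
// a builder-like handler and a validator-like handler.
// None of these names are taken from fastxml.

/// Minimal stand-in for a streaming XML event.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Start(String),
    Text(String),
    End(String),
}

/// Anything that wants to observe the event stream.
pub trait EventHandler {
    fn handle(&mut self, event: &Event);
}

/// Stand-in for a document builder: records the element names it sees.
#[derive(Default)]
pub struct NameCollector {
    pub names: Vec<String>,
}

impl EventHandler for NameCollector {
    fn handle(&mut self, event: &Event) {
        if let Event::Start(name) = event {
            self.names.push(name.clone());
        }
    }
}

/// Stand-in for a validator: checks that start and end tags are balanced.
#[derive(Default)]
pub struct BalanceChecker {
    stack: Vec<String>,
    pub errors: Vec<String>,
}

impl BalanceChecker {
    /// True when no mismatch was seen and every opened element was closed.
    pub fn is_valid(&self) -> bool {
        self.errors.is_empty() && self.stack.is_empty()
    }
}

impl EventHandler for BalanceChecker {
    fn handle(&mut self, event: &Event) {
        match event {
            Event::Start(name) => self.stack.push(name.clone()),
            Event::End(name) => match self.stack.pop() {
                Some(ref open) if open == name => {}
                Some(open) => self
                    .errors
                    .push(format!("expected </{}>, found </{}>", open, name)),
                None => self.errors.push(format!("unexpected </{}>", name)),
            },
            Event::Text(_) => {}
        }
    }
}

/// Feed every event to every handler, so the input is read only once.
pub fn dispatch<I>(events: I, handlers: &mut [&mut dyn EventHandler])
where
    I: IntoIterator<Item = Event>,
{
    for event in events {
        for handler in handlers.iter_mut() {
            handler.handle(&event);
        }
    }
}
```

Passing both handlers to one `dispatch` call means neither needs its own pass over the input, which is the point of sharing events between building and validation.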
## Key Types
- `XmlDocument` - Parsed XML document (DOM)
- `XmlNode` / `XmlRoNode` - Mutable / read-only node handles
- `CompiledSchema` - Compiled XSD schema for validation
- `StreamingParser` - Event-based parser for large files
- `StreamTransformer` - XPath-based stream transformation
## Feature Flags
- `ureq` - Sync HTTP client for schema fetching (enables CLI binary)
- `tokio` - Async HTTP client with reqwest
- `async-trait` - Async trait support for custom fetchers
- `compare-libxml` - Enable libxml2 comparison tests (requires system libxml2)
- `profile` - Memory profiling utilities
## Testing Notes
- Integration tests in `tests/` cover XSD validation, XPath, transforms
- `tests/common/mod.rs` contains shared test utilities
- libxml comparison tests (`compare-libxml` feature) require the `libxml2-dev` system package
## Conformance Testing
The `conformance/` workspace member provides W3C/OASIS standard compliance testing:
### Test Suites
| Test Suite | Tests | Coverage |
|------------|-------|----------|
| W3C XML Conformance | 2,000+ | XML Parsing (DOM & Streaming) |
| W3C XML Schema | ~40,000 | XSD Validation (DOM & Streaming) |
| OASIS XPath 1.0 | Hundreds | XPath Evaluation |
### Running Conformance Tests
```bash
# Run tests (skips if data not available)
cargo test -p fastxml-conformance
# Download test data and run tests
FASTXML_DOWNLOAD_TESTS=1 cargo test -p fastxml-conformance
# Run specific test suite
cargo test -p fastxml-conformance w3c_xml_conformance_dom # DOM parsing
cargo test -p fastxml-conformance w3c_xml_conformance_streaming # Streaming parsing
cargo test -p fastxml-conformance w3c_xsd_conformance_dom # DOM validation
cargo test -p fastxml-conformance w3c_xsd_conformance_streaming # Streaming validation
cargo test -p fastxml-conformance oasis_xpath # XPath evaluation
# Download test data manually
cargo run -p fastxml-conformance --bin download
```
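The skip-or-download behaviour mentioned in the comments above can be pictured as a small decision function. This is a sketch under stated assumptions, not the conformance crate's code: `DataStatus` and `data_status` are hypothetical names, and treating only the value `1` as opt-in and an existing directory as "data available" are assumptions.

```rust
use std::path::Path;

/// Hypothetical outcome of checking for conformance test data.
#[derive(Debug, PartialEq)]
pub enum DataStatus {
    /// The data directory exists, so the suite can run.
    Available,
    /// Data is missing but the user opted in to downloading it.
    Download,
    /// Data is missing and no opt-in was given, so the suite is skipped.
    Skip,
}

/// Decide what a suite should do. `download_flag` is the value of the
/// opt-in variable, e.g. `std::env::var("FASTXML_DOWNLOAD_TESTS").ok()`.
/// Assumption: only the exact value "1" counts as opting in.
pub fn data_status(data_dir: &Path, download_flag: Option<&str>) -> DataStatus {
    if data_dir.is_dir() {
        DataStatus::Available
    } else if download_flag == Some("1") {
        DataStatus::Download
    } else {
        DataStatus::Skip
    }
}
```

Taking the flag as a parameter rather than reading the environment inside the function keeps the decision easy to test.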
### Conformance Test Structure
```
conformance/
├── src/
│   ├── lib.rs           # Common utilities and macros
│   ├── downloader.rs    # Test data download/extraction
│   ├── reporter.rs      # Conformance report generation
│   └── catalog/         # Test catalog parsers
└── tests/
    ├── w3c_xml.rs       # W3C XML tests (DOM & Streaming)
    ├── w3c_xsd.rs       # W3C XSD tests (DomValidator & OnePassSchemaValidator)
    └── oasis_xpath.rs   # OASIS XPath tests
```
Test data is downloaded to `conformance/data/` (gitignored).
## Development Guidelines
### TDD (Test-Driven Development)
When fixing bugs or adding features, follow the TDD approach:
1. **Write a failing test first** - Create a test that reproduces the bug or specifies the new behavior
2. **Verify the test fails** - Run the test to confirm it fails as expected
3. **Implement the fix/feature** - Write the minimum code to make the test pass
4. **Verify the test passes** - Run the test to confirm the fix works
5. **Refactor if needed** - Clean up the code while keeping tests green
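As an illustration of the steps above, the sketch below pairs a small function with the tests that would be written before it. `local_name` is a hypothetical helper invented for this example, not a fastxml function.

```rust
/// Hypothetical helper: returns the part of a qualified name after the prefix.
/// Step 3: this body is written only after the tests below were seen to fail.
pub fn local_name(qname: &str) -> &str {
    match qname.split_once(':') {
        Some((_prefix, local)) => local,
        None => qname,
    }
}

#[cfg(test)]
mod local_name_tests {
    use super::*;

    // Step 1: written first. Against a stub that returns `qname`
    // unchanged, this test fails (step 2) and passes once the
    // implementation above is in place (step 4).
    #[test]
    fn strips_namespace_prefix() {
        assert_eq!(local_name("gml:Point"), "Point");
    }

    // Guards the unprefixed case so a later refactor (step 5) cannot break it.
    #[test]
    fn keeps_unprefixed_name() {
        assert_eq!(local_name("Point"), "Point");
    }
}
```

Running `cargo test strips_namespace_prefix` before and after the implementation confirms the test actually exercises the change.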
### File Size Guidelines
Keep individual `.rs` files under **1000 lines**. When a file grows beyond this threshold:
- Split it into submodules (e.g., `foo.rs` → `foo/mod.rs` + `foo/bar.rs`)
- Group related functionality together
- Use `pub use` in `mod.rs` to maintain the public API
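A minimal sketch of such a split, written with inline modules so it fits in one block; in the repository each `mod` would live in its own file (`foo/mod.rs` and `foo/bar.rs`). The names `foo`, `bar`, `Widget`, and `make_widget` are placeholders.

```rust
// Equivalent of `foo/mod.rs`.
pub mod foo {
    // Equivalent of `foo/bar.rs`: code moved out of the oversized file.
    mod bar {
        #[derive(Debug, PartialEq)]
        pub struct Widget {
            pub id: u32,
        }

        pub fn make_widget(id: u32) -> Widget {
            Widget { id }
        }
    }

    // Re-export so callers keep writing `foo::Widget` and
    // `foo::make_widget`, exactly as before the split.
    pub use self::bar::{make_widget, Widget};
}
```

Because `bar` itself stays private, the submodule layout can change again later without affecting callers.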