edifact-rs
EDIFACT for Rust: zero-copy parsing, streaming deserialization, typed derive macros, composable validation, and rich diagnostics.
Why edifact-rs?
| Feature | Description |
|---|---|
| Zero-copy parsing | Borrows directly from the input &[u8], with no intermediate allocations |
| Streaming I/O | Reader-based APIs process gigabyte interchanges in constant memory |
| Typed mapping | #[derive(EdifactDeserialize, EdifactSerialize)] for segments and messages |
| Composable validation | ProfileRulePack with multi-layer, rule-ID-filtered reporting |
| Rich diagnostics | Optional miette integration for human-friendly error output |
| DoS hardening | Configurable max_segment_bytes guard enforced on all read paths |
| Allocation-free hot paths | SmallVec, eager WriterEmitter, and edifact_deserialize_owned |
Installation
[dependencies]
edifact-rs = "0.1"
# Optional: derive macros (included by default)
# edifact-rs = { version = "0.1", features = ["derive"] }
# Optional: rich miette diagnostics
# edifact-rs = { version = "0.1", features = ["diagnostics"] }
Feature flags
| Feature | Default | Description |
|---|---|---|
| derive | yes | Re-exports EdifactDeserialize / EdifactSerialize derive macros |
| diagnostics | no | Adds miette::Diagnostic to EdifactError for human-readable output |
Quick start
Parse bytes (zero-copy)
use from_bytes;
let input = b"UNA:+.? 'UNH+1+ORDERS:D:11A:UN'BGM+220+PO-4711+9'UNT+3+1'";
let segments: = from_bytes.?;
assert_eq!;
let bgm = &segments;
assert_eq!;
assert_eq!; // document code
assert_eq!; // document number
# Ok::
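To make "zero-copy" concrete, here is a minimal std-only sketch of splitting raw bytes into segment slices that borrow from the input. It is not the edifact-rs tokenizer: `split_segments` is a hypothetical helper, and it assumes the default service characters (`'` as terminator, `?` as release character) and ignores any UNA header.

```rust
/// Hypothetical helper: split `input` into segment slices that borrow from it.
/// Assumes the default terminator `'` and release character `?`.
fn split_segments(input: &[u8]) -> Vec<&[u8]> {
    let mut segments = Vec::new();
    let mut start = 0;
    let mut i = 0;
    while i < input.len() {
        match input[i] {
            // A release character escapes the next byte, so skip both.
            b'?' => i += 2,
            b'\'' => {
                // The slice points into `input`; no segment data is copied.
                segments.push(&input[start..i]);
                start = i + 1;
                i += 1;
            }
            _ => i += 1,
        }
    }
    // Trailing bytes without a terminator are ignored in this sketch.
    segments
}
```

Each returned slice shares the lifetime of the input buffer, which is why the buffer has to outlive the parsed segments.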
Typed deserialization with derive
use ;
let input = b"BGM+220+PO-4711+9'";
let segments: = from_bytes.?;
let bgm = edifact_deserialize?;
assert_eq!;
assert_eq!;
assert_eq!;
# Ok::
Map a full message with qualifier-based fields
use ;
let input = b"UNH+1+ORDERS:D:11A:UN'BGM+220+PO-4711+9'\
NAD+BY+4000001000002::9'NAD+SU+4000001000001::9'UNT+5+1'";
let segments: = from_bytes.?;
let msg = edifact_deserialize?;
println!;
println!;
# Ok::
Serialize to wire format
use ser;
# use ;
#
#
#
let bgm = Bgm ;
let wire = to_string?;
assert_eq!;
# Ok::
Streaming APIs
Low-memory typed extraction
Scan a large interchange and extract matching segments without buffering everything:
use ;
#
#
#
let input = new;
// Stop after the first match (O(1) memory):
let first: Bgm = deserialize_first_from_reader?;
assert_eq!;
// Collect all matches; only matching segments are kept:
let all: = deserialize_all_from_reader?;
assert_eq!;
# Ok::
Message-window streaming (UNH..UNT)
Process multi-message interchanges one window at a time, with O(1) memory per message:
use ;
#
#
#
#
#
let interchange = new;
// Iterate raw windows:
for window in message_windows_from_reader
// Or deserialize directly, with no Vec<Segment> allocation per window:
let messages: =
.?;
assert_eq!;
# Ok::
Performance note:
deserialize_messages_from_reader calls edifact_deserialize_owned, a method generated by the derive macro that works directly on &[OwnedSegment], so no intermediate Vec<Segment<'_>> is ever materialized.
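The windowing idea can be sketched without the library. The adapter below is a hypothetical illustration, not the edifact-rs implementation: it assumes segments arrive as plain `String`s and groups them into UNH..UNT windows, keeping only the current message in memory.

```rust
/// Hypothetical adapter: yields one UNH..UNT window at a time from a
/// stream of owned segments (plain `String`s in this sketch).
struct MessageWindows<I: Iterator<Item = String>> {
    segments: I,
}

impl<I: Iterator<Item = String>> Iterator for MessageWindows<I> {
    type Item = Vec<String>;

    fn next(&mut self) -> Option<Vec<String>> {
        let mut window = Vec::new();
        for seg in self.segments.by_ref() {
            let in_message = !window.is_empty();
            if in_message || seg.starts_with("UNH") {
                let closes = seg.starts_with("UNT");
                window.push(seg);
                if closes {
                    // Hand the finished message to the caller before reading on.
                    return Some(window);
                }
            }
            // Segments outside a message (UNB, UNZ, ...) are skipped.
        }
        // Input ended; an unterminated window is dropped in this sketch.
        None
    }
}
```

Because the adapter pulls from the underlying iterator lazily, a caller that processes and drops each window never holds more than one message at once.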
Validation
Profile rule packs
Compose business-level validation rules with stable rule IDs that can be filtered and reported independently:
use ;
let segments: =
from_bytes
.?;
let document_pack = builder
.for_message_type
.with_rule_fn;
let report = builder
.with_profile_pack
.build
.validate_lenient;
// Filter by rule namespace:
let doc_issues = report.filter_by_rule_prefix;
println!;
# Ok::
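The value of stable rule IDs is easiest to see in a small standalone model. Everything in this sketch is hypothetical (the `Issue` and `Rule` types, the rule functions, and the `DOC-` / `PARTY-` ID scheme); it only illustrates how prefix filtering over rule IDs might work, not how ProfileRulePack is implemented.

```rust
/// Hypothetical issue record; the real report type in edifact-rs may differ.
#[derive(Debug, Clone, PartialEq)]
struct Issue {
    rule_id: &'static str,
    message: String,
}

/// A rule inspects the raw segments and reports at most one issue.
type Rule = fn(&[&str]) -> Option<Issue>;

/// Run every rule and collect whatever issues they report.
fn run_rules(segments: &[&str], rules: &[Rule]) -> Vec<Issue> {
    rules.iter().filter_map(|rule| rule(segments)).collect()
}

/// Keep only issues whose rule ID starts with `prefix`.
fn filter_by_rule_prefix<'a>(issues: &'a [Issue], prefix: &str) -> Vec<&'a Issue> {
    issues
        .iter()
        .filter(|issue| issue.rule_id.starts_with(prefix))
        .collect()
}

/// Document-level rule: a BGM segment must be present.
fn bgm_present(segments: &[&str]) -> Option<Issue> {
    if segments.iter().any(|s| s.starts_with("BGM")) {
        None
    } else {
        Some(Issue {
            rule_id: "DOC-001",
            message: "BGM segment is missing".to_string(),
        })
    }
}

/// Party-level rule: a buyer (NAD+BY) must be present.
fn buyer_present(segments: &[&str]) -> Option<Issue> {
    if segments.iter().any(|s| s.starts_with("NAD+BY")) {
        None
    } else {
        Some(Issue {
            rule_id: "PARTY-001",
            message: "buyer party (NAD+BY) is missing".to_string(),
        })
    }
}
```

Namespacing IDs by prefix lets one report feed several consumers, for example one view for document rules and another for party rules.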
Multi-layer validation
Separate structure, code-list, and profile checks into distinct layers:
use ;
;
let context = builder
.with_message_type
.with_validator
.build;
# Ok::
Diagnostics (optional feature)
Enable the diagnostics feature for human-readable, span-annotated error output powered by miette:
edifact-rs = { version = "0.1", features = ["diagnostics"] }
Error: invalid code value "999" at offset 42
   ╭─ input.edi:2:5
   │
 2 │ BGM+999+PO-4711+9'
   │     ^^^ code "999" is not in code list 1001
   │
Error Code: E007
Help: Use a valid document name code from UNTDID 1001
use ;
// With `diagnostics` feature, errors implement miette::Diagnostic.
// Use miette's Report for pretty-printing to the terminal.
See cookbook_diagnostics.rs for a complete example.
Architecture
edifact-rs workspace
│
├── edifact-rs                  core library
│   ├── tokenizer               zero-copy byte scanning, UNA handling
│   ├── parser                  segment assembly, release-char resolution
│   ├── model                   Segment / Element / OwnedSegment types
│   ├── writer                  streaming wire-format serialization
│   ├── event                   EdifactEvent / WriterEmitter (allocation-free)
│   ├── de                      EdifactDeserialize trait + free helpers
│   ├── ser                     EdifactSerialize trait
│   ├── envelope                UNB/UNH/UNT/UNZ validation
│   ├── validator               Validator / ValidationContext / ProfileRulePack
│   └── directory_validator     SegmentDefinition / DirectoryValidator
│
└── edifact-rs-derive           proc-macro crate
    └── #[derive(EdifactDeserialize, EdifactSerialize)]
Two parsing modes:
| Mode | API | Allocation model |
|---|---|---|
| Zero-copy | from_bytes(input: &[u8]) | Borrows from input; no heap allocation for segment data |
| Owned streaming | from_reader_iter(reader) | One OwnedSegment per segment; reader not buffered |
Key types:
| Type | Description |
|---|---|
| Segment<'a> | Zero-copy view with tag: &'a str and borrowed elements |
| OwnedSegment | Heap-owned copy; .borrow() returns O(1) BorrowedSegment |
| BorrowedSegment<'a> | Zero-allocation view of OwnedSegment |
| EdifactError | Stable error codes (E001–E020) with byte offsets |
| ValidationReport | Collected issues with lenient/strict modes |
| ProfileRulePack | Composable, filterable business-rule bundles |
Low-level API
Segment and element access
use ;
let input = b"NAD+BY+4000001000002::9'NAD+SU+4000001000001::9'";
let segs: = from_bytes.?;
let buyer = find_qualified_segment.unwrap;
assert_eq!;
assert_eq!;
# Ok::
Reader with DoS guard
use ;
use BufReader;
let config = ReaderConfig ;
let reader = new;
let segments = from_bufread_stream_with_config?;
# Ok::
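As a rough illustration of what a max_segment_bytes guard protects against, the std-only sketch below reads one segment at a time and fails as soon as too many bytes arrive without a terminator. `read_segment_guarded` is a hypothetical function, not the edifact-rs reader; it assumes the default `'` terminator and ignores release characters to stay short.

```rust
use std::io::{self, BufRead};

/// Hypothetical guard: read one `'`-terminated segment, failing once more
/// than `max_segment_bytes` bytes arrive without a terminator.
/// Release characters are ignored to keep the sketch short.
fn read_segment_guarded<R: BufRead>(
    reader: &mut R,
    max_segment_bytes: usize,
) -> io::Result<Option<Vec<u8>>> {
    let mut segment = Vec::new();
    loop {
        let available = reader.fill_buf()?;
        if available.is_empty() {
            // End of input: return any unterminated tail, or None when drained.
            return Ok(if segment.is_empty() { None } else { Some(segment) });
        }
        let terminator = available.iter().position(|&b| b == b'\'');
        let take = terminator.unwrap_or(available.len());
        // Check the limit before copying, so memory use stays bounded.
        if segment.len() + take > max_segment_bytes {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "segment exceeds max_segment_bytes",
            ));
        }
        segment.extend_from_slice(&available[..take]);
        // Consume the copied bytes, plus the terminator when one was found.
        let consumed = take + usize::from(terminator.is_some());
        reader.consume(consumed);
        if terminator.is_some() {
            return Ok(Some(segment));
        }
    }
}
```

Without such a limit, an input that never contains a terminator would make the reader buffer the whole stream as one segment.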
Write segments
use ;
let mut buf: = Vecnew;
let mut writer = new;
writer.write_segment?;
writer.finish?;
assert_eq!;
# Ok::
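When writing, data values that contain service characters must be escaped with the release character. The sketch below shows that step in isolation; `escape_value` and this `write_segment` are hypothetical std-only helpers, not the edifact-rs writer API, and they assume the default service characters and simple (non-composite) elements.

```rust
/// Hypothetical helper: escape one data value for the wire, assuming the
/// default service characters (`+`, `:`, `'`) and release character `?`.
fn escape_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        // Prefix every service character (and `?` itself) with `?`.
        if matches!(ch, '+' | ':' | '\'' | '?') {
            out.push('?');
        }
        out.push(ch);
    }
    out
}

/// Join a tag and simple elements into one terminated segment.
/// Composite elements are out of scope for this sketch.
fn write_segment(tag: &str, elements: &[&str]) -> String {
    let mut out = String::from(tag);
    for element in elements {
        out.push('+');
        out.push_str(&escape_value(element));
    }
    out.push('\'');
    out
}
```

For example, `write_segment("BGM", &["220", "PO-4711", "9"])` yields `BGM+220+PO-4711+9'`, the same segment that appears in the quick-start input.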
Async / tokio integration
edifact-rs is intentionally synchronous: EDIFACT parsing is CPU-bound, and imposing an async runtime on all users would be wrong. Two patterns bridge to async code:
Pattern A: read into memory, then parse synchronously (recommended for inputs under 1 MB):
let bytes = read.await?;
let windows: = message_windows_bytes
.?;
Pattern B: spawn_blocking for large files or blocking sources:
let messages = spawn_blocking.await??;
Examples
Run any example with cargo run -p edifact-rs --example <name>:
| Example | What it shows |
|---|---|
| cookbook_parse_map_validate_write | Parse → extract fields → validate → round-trip write |
| cookbook_typed_derive | Full derive workflow with qualifier-based NAD mapping |
| cookbook_typed_streaming | All four streaming extraction APIs |
| cookbook_profile_packs | Composing and filtering profile rule packs |
| cookbook_streamed_progressive_validation | Per-window validation over reader-based interchange |
| cookbook_fixture_validation | Custom Validator implementation with fixture data |
| cookbook_diagnostics | Rich miette diagnostics (--features diagnostics) |
Testing
# Run all tests (unit + integration + doc-tests + derive UI tests):
# Clippy (zero warnings policy):
# Benchmarks (criterion + custom):
# Fuzz (requires cargo-bolero):
Workspace layout
edifact-rs/
├── crates/
│   ├── edifact-rs/             core library crate
│   │   ├── src/
│   │   ├── examples/           runnable cookbooks
│   │   ├── tests/              integration + conformance tests
│   │   └── benches/            criterion benchmarks
│   └── edifact-rs-derive/      proc-macro crate
│       ├── src/
│       └── tests/ui/           trybuild compile-fail test suite
├── scripts/                    UNECE source download helpers
├── CHANGELOG.md
├── CONCEPT.md
├── FINDINGS.md                 independent code-review findings (24/24 fixed)
└── RELEASE_POLICY.md
MSRV and edition
- Minimum Supported Rust Version: 1.85
- Edition: 2024
- MSRV is enforced by CI on every push.
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.