lex-core 0.11.0

Parser library for the lex format
//! Wire-AST codec: lex-core internal AST ↔ `lex_extension::WireNode`.
//!
//! This module bridges lex-core's typed AST (`Document`, `ContentItem` and
//! friends) to the wire-format types defined in the public `lex-extension`
//! crate. The codec lets the registry-driven resolve pass convert
//! handler-returned wire ASTs back into typed lex-core nodes for splicing.
//!
//! # Direction
//!
//! - [`to_wire_node`] — forward: total over the AST shapes a parsed lex
//!   document can produce. Output is a [`lex_extension::WireNode`] tree.
//! - [`from_wire_node`] — reverse: fallible. Recognised `WireNode`
//!   variants become lex-core [`crate::lex::ast::ContentItem`]s; unknown
//!   shapes return [`FromWireError::UnsupportedKind`].
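/*!
The asymmetry between the two directions can be sketched with toy types.
This is a minimal illustration, not the real API: `ToyItem`, `ToyWire`,
`ToyFromWireError`, `toy_to_wire` and `toy_from_wire` are hypothetical
stand-ins for the lex-core and `lex-extension` types, and the variant
shapes are assumptions made for the example.

```rust
// Toy stand-ins; NOT the real lex-core or lex-extension types.
#[derive(Debug, Clone, PartialEq)]
enum ToyItem {
    Paragraph(String),
    Annotation(String),
}

#[derive(Debug, Clone, PartialEq)]
enum ToyWire {
    Paragraph { text: String },
    Annotation { label: String },
    // A wire kind that the reverse direction does not recognise.
    Custom { kind: String },
}

#[derive(Debug, PartialEq)]
enum ToyFromWireError {
    UnsupportedKind(String),
}

// Forward: total. Every ToyItem variant has a wire form, so no Result.
fn toy_to_wire(item: &ToyItem) -> ToyWire {
    match item {
        ToyItem::Paragraph(text) => ToyWire::Paragraph { text: text.clone() },
        ToyItem::Annotation(label) => ToyWire::Annotation { label: label.clone() },
    }
}

// Reverse: fallible. Unknown wire shapes surface as an error value.
fn toy_from_wire(node: &ToyWire) -> Result<ToyItem, ToyFromWireError> {
    match node {
        ToyWire::Paragraph { text } => Ok(ToyItem::Paragraph(text.clone())),
        ToyWire::Annotation { label } => Ok(ToyItem::Annotation(label.clone())),
        ToyWire::Custom { kind } => Err(ToyFromWireError::UnsupportedKind(kind.clone())),
    }
}

fn main() {
    // Anything the forward direction emits is accepted on the way back.
    let item = ToyItem::Paragraph("hello".to_string());
    let wire = toy_to_wire(&item);
    assert_eq!(toy_from_wire(&wire), Ok(item));

    // A shape the reverse direction does not know is rejected.
    let unknown = ToyWire::Custom { kind: "acme.widget".to_string() };
    assert!(toy_from_wire(&unknown).is_err());
}
```

Returning a plain value from the forward function and a `Result` from the
reverse one makes the total/fallible split visible in the signatures.
*/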
//!
//! # Lossy in places, by design
//!
//! The forward codec preserves *block structure* but drops several
//! representation-only details that the wire format does not have
//! slots for:
//!
//! - `Range::span` (byte offsets): the wire format encodes only
//!   `(line, column)`. The reverse codec reconstructs `span = 0..0`,
//!   since the byte offsets of spliced content are advisory.
//! - **Inline-attached annotations** on inline nodes: `WireInline`
//!   doesn't carry annotation slots.
//! - **Block-level annotations** on `Paragraph`, `Session`, `List`,
//!   `Table`, etc. — none of the wire `WireNode` variants carry an
//!   `annotations` field, so attached annotations are dropped in the
//!   forward direction. (Standalone `ContentItem::Annotation` nodes
//!   *are* round-tripped fully via `WireNode::Annotation`.) The
//!   `LexIncludeHandler` mitigates this for include splicing by
//!   promoting `Document.annotations` to leading root children
//!   *before* the codec runs, matching the legacy
//!   `prepare_splice_list` behaviour.
//! - **Document-level metadata** — `Document.title` and
//!   `Document.annotations` are dropped by the codec itself. The
//!   `LexIncludeHandler` re-applies the legacy `prepare_splice_list`
//!   transformation in front of the codec so these surface as
//!   leading children of the wire `Document`.
//! - **Marker structure** on sessions and lists — the wire format
//!   stringifies the marker (`"1.1."`, `"(a)"`); the parser
//!   reconstructs the typed marker on the next parse.
//! - **List-item per-item markers** — list-item markers are derived
//!   from the parent list's `marker_style` plus item index in the
//!   reverse direction; the original raw marker text on each item is
//!   not preserved.
//! - **Verbatim multi-group bodies** — a multi-group verbatim block
//!   collapses to its first group in the forward direction; the
//!   additional groups are dropped. `lex.include` never returns
//!   multi-group verbatims, so this loss is invisible to the current
//!   codec consumer.
//! - **Table per-cell alignment** — wire tables carry one alignment
//!   string for the whole table; lex-core tracks alignment per cell.
//!   Forward picks the first non-`None` *body* cell's alignment
//!   (header-row alignment is skipped because it is often a styling
//!   artefact); reverse applies that alignment to every cell.
//! - **Tables with block-content cells** — `WireTableCell` only
//!   carries inline content; lex-core's `TableCell.children` (block
//!   content inside a cell, e.g. nested lists) has no slot in the
//!   wire form. Rather than silently drop that content, the forward
//!   codec emits a `lex.internal.unsupported.table_block_cells`
//!   placeholder; the reverse codec rejects it with
//!   `FromWireError::UnsupportedKind`. Future codec work that
//!   introduces an escape-hatch encoding for these tables (e.g.,
//!   `body_text` carrying the raw source) can lift this restriction.
//! - **`TextContent`** uses the parsed-inline path
//!   ([`TextContent::inline_nodes`]) when available, producing
//!   matching `WireInline` variants; otherwise it emits the raw source
//!   as a single `WireInline::Text`. The reverse codec re-serialises
//!   through a `.lex` source-form string that the parser
//!   re-interprets identically.
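/*!
The table-alignment rule from the list above (forward picks the first
non-`None` body cell, reverse applies that value to every cell) can be
sketched as follows. `ToyAlign`, `ToyRow`, `collapse_alignment` and
`expand_alignment` are hypothetical names for illustration only, and the
fallback to `ToyAlign::None` when no body cell is aligned is an
assumption that the text above does not state.

```rust
// Toy stand-ins; NOT the real lex-core table types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyAlign {
    None,
    Left,
    Center,
    Right,
}

struct ToyRow {
    is_header: bool,
    cells: Vec<ToyAlign>,
}

// Forward: collapse per-cell alignment to one value for the whole table.
// Header rows are skipped; the first aligned body cell wins.
fn collapse_alignment(rows: &[ToyRow]) -> ToyAlign {
    rows.iter()
        .filter(|row| !row.is_header)
        .flat_map(|row| row.cells.iter().copied())
        .find(|align| *align != ToyAlign::None)
        // Assumption: with no aligned body cell, report no alignment.
        .unwrap_or(ToyAlign::None)
}

// Reverse: apply the single table-wide alignment to every cell.
fn expand_alignment(rows: &mut [ToyRow], align: ToyAlign) {
    for row in rows.iter_mut() {
        for cell in row.cells.iter_mut() {
            *cell = align;
        }
    }
}

fn main() {
    let mut rows = vec![
        ToyRow { is_header: true, cells: vec![ToyAlign::Center, ToyAlign::Center] },
        ToyRow { is_header: false, cells: vec![ToyAlign::None, ToyAlign::Left] },
        ToyRow { is_header: false, cells: vec![ToyAlign::Right, ToyAlign::None] },
    ];

    // The header's Center is ignored; the first aligned body cell is Left.
    let table_align = collapse_alignment(&rows);
    assert_eq!(table_align, ToyAlign::Left);

    // After the reverse step every cell, header included, carries Left.
    expand_alignment(&mut rows, table_align);
    assert!(rows.iter().all(|row| row.cells.iter().all(|c| *c == ToyAlign::Left)));
}
```

The per-cell differences (here the `Right` cell and the header's `Center`)
do not survive the trip, which is the loss the bullet describes.
*/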
//!
//! Verbatim `subject` and `mode` (`Inflow` / `Fullwidth`) **are**
//! preserved end-to-end: `WireNode::Verbatim` carries dedicated
//! `subject` and `mode` fields that the forward codec populates and
//! the reverse codec applies during reconstruction.
//!
//! For the consumer that matters today (`LexIncludeHandler`), the
//! remaining losses do not change observable include output: the
//! handler's `prepare_splice_list`-equivalent normalisation covers
//! the document-level losses, a table with block-content cells surfaces
//! as a visible `UnsupportedKind` rather than a silent drop, and the
//! representation-only losses (spans, marker structure, per-cell
//! alignment) re-derive identically when the spliced content is
//! re-formatted to `.lex` source.
//!
//! # Versioning
//!
//! This codec speaks `lex_extension::WIRE_VERSION = 1`. Wire-format
//! changes that bump that constant require codec updates here.
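/*!
One way a consumer could guard against a version bump is sketched below.
This is an assumption-laden illustration: `TOY_WIRE_VERSION`,
`ToyVersionError` and `check_wire_version` are hypothetical, the `u32`
type is a guess, and nothing here claims that the codec performs such a
check itself.

```rust
// Hypothetical mirror of the wire version; the value 1 comes from the
// text above, the u32 type is an assumption.
const TOY_WIRE_VERSION: u32 = 1;

#[derive(Debug, PartialEq)]
enum ToyVersionError {
    Mismatch { expected: u32, found: u32 },
}

// Accept only payloads that declare the version this sketch understands.
fn check_wire_version(found: u32) -> Result<(), ToyVersionError> {
    if found == TOY_WIRE_VERSION {
        Ok(())
    } else {
        Err(ToyVersionError::Mismatch {
            expected: TOY_WIRE_VERSION,
            found,
        })
    }
}

fn main() {
    assert_eq!(check_wire_version(1), Ok(()));
    assert_eq!(
        check_wire_version(2),
        Err(ToyVersionError::Mismatch { expected: 1, found: 2 })
    );
}
```
*/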

mod error;
pub mod from_wire;
mod inline;
mod range;
pub mod to_wire;

#[cfg(test)]
mod tests;

pub use error::FromWireError;
pub use from_wire::{from_wire_node, from_wire_subtree};
pub use to_wire::{to_wire_document, to_wire_node};