//! High-performance, zero-copy, streaming XML syntax reader.
//!
//! This crate tokenizes well-formed XML into fine-grained events (start tags,
//! attributes, text, comments, etc.) delivered through a [`Visitor`] trait.
//! It does not check that XML names (such as element or attribute names) are
//! legal, build a tree, resolve namespaces, or expand entity references.
//!
//! # Quick start
//!
//! Implement [`Visitor`] to receive events, then feed input to a [`Reader`]:
//!
//! ```
//! use xml_syntax_reader::{Reader, Visitor, Span};
//!
//! struct Print;
//! impl Visitor for Print {
//!     type Error = std::convert::Infallible;
//!     fn start_tag_open(&mut self, name: &[u8], _: Span) -> Result<(), Self::Error> {
//!         println!("element: {}", String::from_utf8_lossy(name));
//!         Ok(())
//!     }
//! }
//!
//! let mut reader = Reader::new();
//! reader.parse_slice(b"<hello/>", &mut Print).unwrap();
//! ```
//!
//! For streaming use, call [`Reader::parse`] in a loop: it returns the
//! number of bytes consumed so the caller can shift the buffer and append
//! more data. [`parse_read`] wraps this loop for [`std::io::Read`] sources.
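//!
//! The shift-and-refill pattern described above can be sketched with the
//! standard library alone. This is a minimal illustration, not this crate's
//! implementation: `consume_complete` is a hypothetical stand-in for a parse
//! step (it is not [`Reader::parse`], whose exact signature is not assumed
//! here), and `drive` is an illustrative name for the surrounding loop.
//!
//! ```rust
//! use std::io::{self, Read};
//!
//! /// Hypothetical stand-in for one parse step (NOT this crate's API):
//! /// consumes everything up to and including the last `>` in `buf`,
//! /// copies it to `out`, and returns the number of bytes consumed.
//! fn consume_complete(buf: &[u8], out: &mut Vec<u8>) -> usize {
//!     match buf.iter().rposition(|&b| b == b'>') {
//!         Some(i) => {
//!             out.extend_from_slice(&buf[..=i]);
//!             i + 1
//!         }
//!         None => 0, // nothing complete yet; wait for more data
//!     }
//! }
//!
//! /// Shift-and-refill loop: unconsumed bytes stay at the front of the
//! /// buffer, fresh data is appended behind them, and the step is retried.
//! /// Returns the consumed bytes and whatever was left unconsumed at EOF.
//! fn drive<R: Read>(mut src: R, chunk_size: usize) -> io::Result<(Vec<u8>, Vec<u8>)> {
//!     let mut buf: Vec<u8> = Vec::new();
//!     let mut out = Vec::new();
//!     let mut chunk = vec![0u8; chunk_size];
//!     loop {
//!         let n = src.read(&mut chunk)?;
//!         if n == 0 {
//!             break; // end of input
//!         }
//!         buf.extend_from_slice(&chunk[..n]); // append more data
//!         let consumed = consume_complete(&buf, &mut out);
//!         buf.drain(..consumed); // shift the unconsumed tail to the front
//!     }
//!     Ok((out, buf))
//! }
//! ```
//!
//! The sketch uses a growable `Vec<u8>` for simplicity; a fixed-size buffer
//! with `copy_within` would avoid reallocation at the cost of more
//! bookkeeping.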
//!
//! # Encoding
//!
//! The parser operates on bytes and assumes UTF-8 input. Use
//! [`probe_encoding`] to detect the transport encoding (BOM / XML
//! declaration) and transcode if necessary before parsing.
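//!
//! As a rough illustration of the detect-then-transcode step, the sketch
//! below recognizes a byte-order mark and converts UTF-16LE input to UTF-8
//! using only the standard library. `BomEncoding`, `sniff_bom`, and
//! `utf16le_to_utf8` are hypothetical names introduced for this example; they
//! are not [`probe_encoding`], whose signature is not assumed here. The sketch
//! ignores UTF-32 and the XML declaration's `encoding` attribute.
//!
//! ```rust
//! /// Encodings identifiable from a BOM alone (illustrative subset).
//! #[derive(Debug, PartialEq, Eq)]
//! enum BomEncoding {
//!     Utf8,
//!     Utf16Le,
//!     Utf16Be,
//! }
//!
//! /// Hypothetical helper: returns the detected encoding and the BOM length
//! /// in bytes, or `None` when the input does not start with a known BOM.
//! fn sniff_bom(input: &[u8]) -> Option<(BomEncoding, usize)> {
//!     if input.starts_with(&[0xEF, 0xBB, 0xBF]) {
//!         Some((BomEncoding::Utf8, 3))
//!     } else if input.starts_with(&[0xFF, 0xFE]) {
//!         Some((BomEncoding::Utf16Le, 2))
//!     } else if input.starts_with(&[0xFE, 0xFF]) {
//!         Some((BomEncoding::Utf16Be, 2))
//!     } else {
//!         None
//!     }
//! }
//!
//! /// Hypothetical helper: transcodes UTF-16LE bytes (BOM already removed)
//! /// to a UTF-8 `String`. Returns `None` for odd lengths or invalid UTF-16.
//! fn utf16le_to_utf8(bytes: &[u8]) -> Option<String> {
//!     if bytes.len() % 2 != 0 {
//!         return None;
//!     }
//!     let units: Vec<u16> = bytes
//!         .chunks_exact(2)
//!         .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
//!         .collect();
//!     String::from_utf16(&units).ok()
//! }
//! ```
//!
//! After transcoding, the resulting UTF-8 bytes can be handed to the parser
//! as usual.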
//!
//! # Input limits
//!
//! The parser enforces hardcoded limits to prevent resource exhaustion:
//!
//! - **Names** (element, attribute, PI target, DOCTYPE, entity references):
//! maximum **1,000 bytes**. Exceeding this produces [`ErrorKind::NameTooLong`].
//!
//! - **Character references**: maximum **7 bytes** for the value between
//! `&#` and `;` (the longest valid reference is `&#1114111;` or
//! `&#x10FFFF;`). Exceeding this produces [`ErrorKind::CharRefTooLong`].
//!
//! - **Text content, attribute values, and content bodies** (comments, CDATA
//! sections, processing instructions, and DOCTYPE declarations) are all
//! **streamed in chunks** at buffer boundaries. The visitor receives zero or
//! more content calls with contiguous spans: zero for empty constructs
//! (e.g. `<!---->`, `<?target?>`), and more than one when the body spans
//! buffer boundaries. Text content (`characters`) is additionally
//! interleaved with `entity_ref` / `char_ref` callbacks at reference
//! boundaries. Attribute values are chunked at both buffer boundaries and
//! entity/character reference boundaries, which produce separate
//! `attribute_entity_ref` and `attribute_char_ref` callbacks. There is no
//! size limit on any of these. See the [`Visitor`] trait documentation for
//! the full callback sequences.
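//!
//! Because this crate does not expand references, resolving a character
//! reference is left to the caller. The sketch below shows one way to do
//! that while applying the same 7-byte bound described above. It assumes the
//! caller has the raw bytes between `&#` and `;`; `resolve_char_ref` is a
//! hypothetical helper, not part of this crate, and it only checks that the
//! value is a Unicode scalar value, not the stricter XML `Char` production.
//!
//! ```rust
//! /// Hypothetical helper: turns the body of a character reference
//! /// (for example `65` or `x41`) into a `char`. Returns `None` when the
//! /// body is empty, longer than 7 bytes, malformed, or not a scalar value.
//! fn resolve_char_ref(body: &[u8]) -> Option<char> {
//!     // `1114111` and `x10FFFF` are both 7 bytes, so 7 is the upper bound.
//!     if body.is_empty() || body.len() > 7 {
//!         return None;
//!     }
//!     let s = std::str::from_utf8(body).ok()?;
//!     let code = match s.strip_prefix('x') {
//!         // Hexadecimal form: only hex digits may follow the `x`.
//!         Some(hex) if hex.bytes().all(|b| b.is_ascii_hexdigit()) => {
//!             u32::from_str_radix(hex, 16).ok()?
//!         }
//!         Some(_) => return None,
//!         // Decimal form: only ASCII digits are allowed (no sign).
//!         None if s.bytes().all(|b| b.is_ascii_digit()) => s.parse::<u32>().ok()?,
//!         None => return None,
//!     };
//!     // Rejects surrogates and values above U+10FFFF.
//!     char::from_u32(code)
//! }
//! ```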
extern crate std;
pub use ;
pub use Visitor;
pub use Reader;
pub use ;
pub use ;