//! This is a parser for Unilang instructions.
//!
//! It provides functionality to parse single or multiple instructions from a string,
//! handling command paths, arguments, and various syntax rules.
//!
//! The parser is designed to be robust against various input formats and provides
//! detailed error reporting for invalid instructions.
extern crate alloc;
/// `unilang_parser` is a Rust crate designed to parse `unilang` CLI-like instruction strings.
/// It leverages `strs_tools` for initial itemization (splitting the input string into lexical tokens)
/// and then performs syntactic analysis to produce structured `GenericInstruction` objects.
///
/// ## Features
///
/// - Parses command paths (single or multi-segment).
/// - Handles positional arguments.
/// - Handles named arguments in the format `name ::value`.
/// - Supports quoted arguments (e.g., `"value with spaces"`, `'another value'`) with basic escape sequence handling
///   (`\\`, `\"`, `\'`, `\n`, `\t`).
/// - Parses the help operator `?` (if it's the last token after a command path).
/// - Splits multiple instructions separated by `;;`.
/// - Provides detailed, location-aware error reporting using `ParseError` and `SourceLocation`
///   to pinpoint issues in the input string or slice segments.
/// - Configurable parsing behavior via `UnilangParserOptions` (e.g., error on duplicate named arguments,
///   error on positional arguments after named ones).
/// - `no_std` support (optional, via feature flag).
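///
/// As a rough illustration of the escape sequences listed above, the sketch below resolves them
/// in a string whose surrounding quotes have already been removed. `unescape_sketch` is a
/// hypothetical helper written for this example, not part of this crate's API, and rejecting
/// unknown escapes is an assumption of the sketch; the actual parser may treat them differently.
///
/// ```rust
/// // Hypothetical helper (not part of `unilang_parser`): resolves `\\`, `\"`, `\'`, `\n`, `\t`.
/// fn unescape_sketch(raw : &str) -> Result< String, String > {
///     let mut out = String ::with_capacity(raw.len());
///     let mut chars = raw.chars();
///     while let Some(c) = chars.next() {
///         if c != '\\' {
///             // Ordinary character: copy it through unchanged.
///             out.push(c);
///             continue;
///         }
///         // A backslash was seen: the next character selects the escape.
///         match chars.next() {
///             Some('\\') => out.push('\\'),
///             Some('"') => out.push('"'),
///             Some('\'') => out.push('\''),
///             Some('n') => out.push('\n'),
///             Some('t') => out.push('\t'),
///             // Assumption: anything else is treated as an error in this sketch.
///             Some(other) => return Err(format!("unknown escape sequence: \\{}", other)),
///             None => return Err("dangling backslash at end of input".to_string()),
///         }
///     }
///     Ok(out)
/// }
/// ```
///
/// Under these assumptions, `unescape_sketch("a\\tb")` yields `a`, a tab character, then `b`,
/// while an input ending in a lone backslash is reported as an error.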
///
/// ## Core Components
///
/// - [`Parser`] : The main entry point for parsing instructions.
/// - [`UnilangParserOptions`] : Allows customization of parsing behavior.
/// - [`GenericInstruction`] : The primary output structure, representing a single parsed instruction with its
///   command path, positional arguments, and named arguments.
/// - [`Argument`] : Represents a parsed argument (either positional or named).
/// - [`ParseError`] : Encapsulates parsing errors, including an `ErrorKind` and `SourceLocation`.
/// - [`SourceLocation`] : Specifies the location of a token or error within the input (either a string span or a slice segment).
///
/// ## Basic Usage Example
///
/// ```rust
/// use unilang_parser :: { Parser, UnilangParserOptions };
///
/// fn main() -> Result< (), Box<dyn std ::error ::Error >> {
///     let options = UnilangParserOptions ::default();
///     let parser = Parser ::new(options);
///     let input = "my.command arg1 name ::value";
///
///     let instruction = parser.parse_repl_input(input)?;
///
///     println!("Command Path: {:?}", instruction.command_path_slices);
///     Ok(())
/// }
/// ```
///
/// ## ⚠️ CLI Integration: Using Shell Arguments Correctly
///
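/// When integrating `unilang_parser` into a CLI application that receives arguments from a shell,
/// **pass the argument vector to the `parse_from_argv()` method**. Do NOT join it into a single
/// string and tokenize it again (for example with `split_whitespace()` or `parse_repl_input()`).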
///
/// ### ✅ Correct Usage (CLI Application)
///
/// ```rust,no_run
/// use unilang_parser :: { Parser, UnilangParserOptions };
///
/// fn main() -> Result< (), Box<dyn std ::error ::Error >> {
///     let options = UnilangParserOptions ::default();
///     let parser = Parser ::new(options);
///
///     // Collect shell arguments (already tokenized by the shell)
///     let argv : Vec<String> = std ::env ::args().collect();
///
///     // ✅ CORRECT: Use parse_from_argv for shell arguments
///     let instruction = parser.parse_from_argv(&argv)?;
///
///     println!("Command: {:?}", instruction.command_path_slices);
///     Ok(())
/// }
/// ```
///
/// ### ❌ Common Pitfall (WRONG)
///
/// ```rust,ignore
/// use unilang_parser :: { Parser, UnilangParserOptions };
///
/// fn main() -> Result< (), Box<dyn std ::error ::Error >> {
///     let options = UnilangParserOptions ::default();
///     let parser = Parser ::new(options);
///
///     let argv : Vec<String> = std ::env ::args().collect();
///     let joined = argv.join(" ");
///
///     // ❌ WRONG: Don't join shell argv and re-parse it as a single string!
///     // Re-tokenizing discards the quote handling that the shell already performed.
///     let instruction = parser.parse_repl_input(&joined)?;
///
///     Ok(())
/// }
/// ```
///
/// ### Why This Matters
///
/// The shell has **already tokenized** the arguments, handling quotes, escapes, and whitespace.
/// When you receive `argv` from the shell:
///
/// - `my-app "foo bar"` → shell produces `argv = ["my-app", "foo bar"]` (2 tokens)
/// - If you join and re-split: `"my-app foo bar".split_whitespace()` → produces `["my-app", "foo", "bar"]` (3 tokens) ❌
///
/// **Result:** Arguments containing spaces are incorrectly split, breaking user expectations.
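///
/// The token-count difference described above can be reproduced with the standard library alone.
/// The sketch below is only an illustration: `join_and_resplit` is a hypothetical helper that
/// mimics the pitfall and does not call into this crate.
///
/// ```rust
/// // Hypothetical helper: joins argv with spaces and splits the result on whitespace,
/// // which loses the token boundaries that the shell established.
/// fn join_and_resplit(argv : &[String]) -> Vec<String> {
///     argv.join(" ").split_whitespace().map(String ::from).collect()
/// }
/// ```
///
/// Given the two-element vector `["my-app", "foo bar"]`, `join_and_resplit` returns the three
/// tokens `["my-app", "foo", "bar"]`, so the quoted argument no longer arrives as one value.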
///
/// ### Rule of Thumb
///
/// - **From shell (CLI app):** Use `parse_from_argv(&argv)` - shell already tokenized
/// - **From string (embedded/scripting):** Use `parse_repl_input(input)` - string needs parsing
/// Defines error types for the parser.
/// Defines instruction and argument structures.
/// Adapts and classifies items from the splitter.
/// Contains the core parsing engine.
/// CLI parameter parsing convenience API.
/// Input marker newtypes for type-safe parser entry points.
/// Prelude for commonly used items.
pub use prelude :: *;