string_pipeline/
lib.rs

//! # string_pipeline
//!
//! A powerful string transformation CLI tool and Rust library that makes complex text processing simple.
//! Transform data using intuitive **template syntax**: chain operations like **split**, **join**, **replace**,
//! **filter**, and **20+ others** in a single readable expression.
//!
//! ## Features
//!
//! - **🔗 Chainable Operations**: Pipe operations together naturally
//! - **🎯 Precise Control**: Python-like ranges with Rust syntax (`-2..`, `1..=3`)
//! - **🗺️ Powerful Mapping**: Apply sub-pipelines to each list item
//! - **🔍 Regex Support**: sed-like patterns for complex transformations
//! - **🐛 Debug Mode**: Hierarchical operation visualization with detailed tracing
//! - **📥 Flexible I/O**: CLI tool + embeddable Rust library
//! - **🦀 Performance optimized**: Zero-copy operations where possible, efficient memory usage
//! - **🌍 Unicode support**: Full UTF-8 and Unicode character handling
//! - **🛡️ Error handling**: Comprehensive error reporting for invalid operations
//!
//! ## Quick Start
//!
//! ```rust
//! use string_pipeline::Template;
//!
//! // Split by comma, take first 2 items, join with " and "
//! let template = Template::parse("{split:,:0..2|join: and }").unwrap();
//! let result = template.format("apple,banana,cherry,date").unwrap();
//! assert_eq!(result, "apple and banana");
//! ```
//!
//! ## Template Syntax Overview
//!
//! Templates are enclosed in `{}` and consist of operations separated by `|`:
//!
//! ```text
//! {operation1|operation2|operation3}
//! ```
//!
//! ### Core Operations (20+ Available)
//!
//! **🔪 Text Splitting & Joining**
//! - **`split:sep:range`** - Split text and optionally select range
//! - **`join:sep`** - Join list items with separator
//! - **`slice:range`** - Select list elements by range
//!
//! **✨ Text Transformation**
//! - **`upper`**, **`lower`** - Case conversion
//! - **`trim[:chars][:direction]`** - Remove whitespace or custom characters
//! - **`append:text`**, **`prepend:text`** - Add text to ends
//! - **`pad:width[:char][:direction]`** - Pad string to width
//! - **`substring:range`** - Extract characters from string
//!
//! **🔍 Pattern Matching & Replacement**
//! - **`replace:s/pattern/replacement/flags`** - Regex find/replace (sed-like)
//! - **`regex_extract:pattern[:group]`** - Extract with regex pattern
//! - **`filter:pattern`** - Keep items matching regex
//! - **`filter_not:pattern`** - Remove items matching regex
//!
//! **🗂️ List Processing**
//! - **`sort[:asc|desc]`** - Sort items alphabetically
//! - **`reverse`** - Reverse string or list order
//! - **`unique`** - Remove duplicate list items
//! - **`map:{operations}`** - Apply sub-pipeline to each list item
//!
//! **🧹 Utility Operations**
//! - **`strip_ansi`** - Remove ANSI escape sequences
//!
//! ### Range Syntax
//!
//! Ranges use Rust-like syntax and support negative indexing:
//!
//! - **`N`** - Single index (`1` = second item)
//! - **`N..M`** - Range exclusive (`1..3` = items 1,2)
//! - **`N..=M`** - Range inclusive (`1..=3` = items 1,2,3)
//! - **`N..`** - From N to end
//! - **`..M`** - From start to M-1
//! - **`..`** - All items
//!
//! Negative indices count from end (`-1` = last item).
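//!
//! To make these semantics concrete, here is a minimal standalone sketch of how a
//! possibly negative, half-open range could be resolved against a list. The helpers
//! `resolve` and `select` are hypothetical names used only for illustration; they are
//! not part of this crate's API, and clamping out-of-range bounds is an assumption of
//! the sketch rather than documented behavior.
//!
//! ```rust
//! // Hypothetical helper (not part of string_pipeline): map a possibly negative
//! // index onto 0..=len. Assumption: out-of-range values are clamped.
//! fn resolve(index: isize, len: usize) -> usize {
//!     if index < 0 {
//!         // -1 refers to the last item, so count back from the end.
//!         len.saturating_sub(index.unsigned_abs())
//!     } else {
//!         (index as usize).min(len)
//!     }
//! }
//!
//! // Hypothetical helper: select `start..end` (end exclusive).
//! // `None` stands for an open bound, as in `N..`, `..M`, or `..`.
//! fn select<'a>(items: &[&'a str], start: Option<isize>, end: Option<isize>) -> Vec<&'a str> {
//!     let len = items.len();
//!     let lo = start.map_or(0, |s| resolve(s, len));
//!     let hi = end.map_or(len, |e| resolve(e, len));
//!     if lo >= hi {
//!         Vec::new()
//!     } else {
//!         items[lo..hi].to_vec()
//!     }
//! }
//! ```
//!
//! With `items = ["a", "b", "c", "d"]`, `select(&items, Some(1), Some(3))` yields
//! `["b", "c"]` (the `1..3` case) and `select(&items, Some(-2), None)` yields
//! `["c", "d"]` (the `-2..` case).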
//!
//! ### Debug Mode
//!
//! Add `!` after opening `{` to enable detailed operation tracing:
//!
//! ```rust
//! use string_pipeline::Template;
//!
//! let template = Template::parse("{!split:,:..}").unwrap();
//! // Outputs detailed debug information during processing
//! let result = template.format("a,b,c").unwrap();
//! assert_eq!(result, "a,b,c");
//! ```
//!
//! ## Multi-Template Support
//!
//! Beyond simple templates, the library supports **multi-templates** that combine literal text
//! with multiple template sections, featuring automatic caching for performance:
//!
//! ```rust
//! use string_pipeline::MultiTemplate;
//!
//! // Combine literal text with template operations
//! let template = MultiTemplate::parse("Name: {split: :0} Age: {split: :1}").unwrap();
//! let result = template.format("John 25").unwrap();
//! assert_eq!(result, "Name: John Age: 25");
//!
//! // Automatic caching: split operation performed only once
//! let template = MultiTemplate::parse("First: {split:,:0} Second: {split:,:1}").unwrap();
//! let result = template.format("apple,banana").unwrap();
//! assert_eq!(result, "First: apple Second: banana");
//! ```
//!
//! ## Common Use Cases
//!
//! ### Basic Text Processing
//! ```rust
//! use string_pipeline::Template;
//!
//! // Clean and normalize text
//! let cleaner = Template::parse("{trim|replace:s/\\s+/ /g|lower}").unwrap();
//! let result = cleaner.format("  Hello    WORLD  ").unwrap();
//! assert_eq!(result, "hello world");
//! ```
//!
//! ### Data Extraction
//! ```rust
//! use string_pipeline::Template;
//!
//! // Extract second field from space-separated data
//! let extractor = Template::parse("{split: :1}").unwrap();
//! let result = extractor.format("user 1234 active").unwrap();
//! assert_eq!(result, "1234");
//! ```
//!
//! ### List Processing with Map
//! ```rust
//! use string_pipeline::Template;
//!
//! // Process each item in a list
//! let processor = Template::parse("{split:,:..|map:{trim|upper}|join:\\|}").unwrap();
//! let result = processor.format(" apple, banana , cherry ").unwrap();
//! assert_eq!(result, "APPLE|BANANA|CHERRY");
//! ```
//!
//! ### Advanced Data Processing
//! ```rust
//! use string_pipeline::Template;
//!
//! // Extract domains from URLs
//! let domain_extractor = Template::parse("{split:,:..|map:{regex_extract://([^/]+):1|upper}}").unwrap();
//! let result = domain_extractor.format("https://github.com,https://google.com").unwrap();
//! assert_eq!(result, "GITHUB.COM,GOOGLE.COM");
//! ```
//!
//! ### Log Processing
//! ```rust
//! use string_pipeline::Template;
//!
//! // Extract timestamps from log entries
//! let log_parser = Template::parse(r"{split:\n:..|map:{regex_extract:\d\d\d\d-\d\d-\d\d}|filter_not:^$|join:\n}").unwrap();
//! let logs = "2023-12-01 ERROR: Failed\n2023-12-02 INFO: Success\nInvalid line";
//! let result = log_parser.format(logs).unwrap();
//! assert_eq!(result, "2023-12-01\n2023-12-02");
//! ```
//!
//! ### Filter Operations
//! ```rust
//! use string_pipeline::Template;
//!
//! // Filter files by extension
//! let py_filter = Template::parse("{split:,:..|filter:\\.py$|sort|join:\\n}").unwrap();
//! let files = "app.py,readme.md,test.py,data.json";
//! let result = py_filter.format(files).unwrap();
//! assert_eq!(result, "app.py\ntest.py");
//! ```
//!
//! ## Type System
//!
//! The pipeline system has a clear type system that distinguishes between:
//! - **String operations**: Work only on strings (e.g., `upper`, `lower`, `trim`, `replace`)
//! - **List operations**: Work only on lists (e.g., `sort`, `unique`, `slice`)
//! - **Type-preserving operations**: Accept both types (e.g., `filter`, `reverse`)
//! - **Type-converting operations**: Change between types (e.g., `split` converts string→list, `join` converts list→string)
//!
//! Use `map:{operation}` to apply string operations to each item in a list.
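//!
//! The distinction above can be pictured with a small standalone model. This is only
//! a sketch under stated assumptions: the `Value` enum and the free functions `sort`,
//! `upper`, and `map` are hypothetical and do not describe this crate's internals.
//! Only the sort error text is taken from the Error Handling example; the other
//! messages are made up for illustration.
//!
//! ```rust
//! // Hypothetical model of a pipeline value: either a string or a list of strings.
//! #[derive(Debug, PartialEq)]
//! enum Value {
//!     Str(String),
//!     List(Vec<String>),
//! }
//!
//! // List-only operation: rejects a plain string.
//! fn sort(value: Value) -> Result<Value, String> {
//!     match value {
//!         Value::List(mut items) => {
//!             items.sort();
//!             Ok(Value::List(items))
//!         }
//!         Value::Str(_) => Err("Sort operation can only be applied to lists".to_string()),
//!     }
//! }
//!
//! // String-only operation: rejects a list (message text is illustrative).
//! fn upper(value: Value) -> Result<Value, String> {
//!     match value {
//!         Value::Str(s) => Ok(Value::Str(s.to_uppercase())),
//!         Value::List(_) => Err("upper expects a string".to_string()),
//!     }
//! }
//!
//! // `map` bridges the two: it applies a string operation to every list item.
//! fn map(value: Value, op: fn(Value) -> Result<Value, String>) -> Result<Value, String> {
//!     match value {
//!         Value::List(items) => {
//!             let mut out = Vec::with_capacity(items.len());
//!             for item in items {
//!                 match op(Value::Str(item))? {
//!                     Value::Str(s) => out.push(s),
//!                     Value::List(_) => return Err("map step must yield a string".to_string()),
//!                 }
//!             }
//!             Ok(Value::List(out))
//!         }
//!         Value::Str(_) => Err("map expects a list".to_string()),
//!     }
//! }
//! ```
//!
//! In this model, `upper` applied directly to a list is an error, while
//! `map(list, upper)` succeeds and uppercases each item, which mirrors why
//! `map:{upper}` is needed after `split`.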
//!
//! ## Error Handling
//!
//! Both `Template::parse` and `format` return a `Result` (`format` yields `Result<String, String>`),
//! so invalid templates and type mismatches can be handled explicitly:
//!
//! ```rust
//! use string_pipeline::Template;
//!
//! // Invalid template syntax
//! let result = Template::parse("{split:}");
//! assert!(result.is_err());
//!
//! // Type mismatch errors are clear and helpful
//! let template = Template::parse("{sort}").unwrap();
//! let result = template.format("not_a_list");
//! assert!(result.is_err());
//! // Error: "Sort operation can only be applied to lists"
//! ```
//!
//! ## Performance Notes
//!
//! - Templates are compiled once and can be reused efficiently
//! - Operations use zero-copy techniques where possible
//! - Large datasets are processed with optimized algorithms
//! - Regex patterns are compiled and cached internally
//! - Memory allocation is minimized for common operations
//!
//! For high-throughput applications, compile templates once and reuse them:
//!
//! ```rust
//! use string_pipeline::Template;
//!
//! // Compile once
//! let template = Template::parse("{split:,:0}").unwrap();
//!
//! // Reuse many times
//! for input in &["a,b,c", "x,y,z", "1,2,3"] {
//!     let result = template.format(input).unwrap();
//!     println!("{}", result);
//! }
//! ```
//!
//! For complete documentation including all operations, advanced features, and debugging techniques,
//! see the [`Template`] and [`MultiTemplate`] documentation and the comprehensive guides in the `docs/` directory.

mod pipeline;

pub use pipeline::{MultiTemplate, Template};