§string_pipeline
A string transformation CLI tool and Rust library for complex text processing. Transform data using template syntax: chain operations like split, join, replace, filter, and 20+ others in a single readable expression.
§Features
- 🔗 Chainable Operations: Pipe operations together naturally
- 🎯 Precise Control: Python-like ranges with Rust syntax (-2.., 1..=3)
- 🗺️ Powerful Mapping: Apply sub-pipelines to each list item
- 🔍 Regex Support: sed-like patterns for complex transformations
- 🐛 Debug Mode: Hierarchical operation visualization with detailed tracing
- 📥 Flexible I/O: CLI tool + embeddable Rust library
- 🦀 Performance optimized: Zero-copy operations where possible, efficient memory usage
- 🌍 Unicode support: Full UTF-8 and Unicode character handling
- 🛡️ Error handling: Comprehensive error reporting for invalid operations
§Quick Start
use string_pipeline::Template;
// Split by comma, take first 2 items, join with " and "
let template = Template::parse("{split:,:0..2|join: and }").unwrap();
let result = template.format("apple,banana,cherry,date").unwrap();
assert_eq!(result, "apple and banana");
§Template Syntax Overview
Templates are enclosed in {} and consist of operations separated by |:
{operation1|operation2|operation3}
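To make the structure concrete, the following std-only sketch splits a template body into its top-level operations. It is not the crate's parser: the function name split_operations is hypothetical, and the handling of nested braces and backslash escapes is an assumption inferred from the map:{...} and join:\| examples elsewhere on this page.

```rust
/// Split the body of a `{...}` template into top-level operations.
/// Hypothetical helper for illustration; the crate's real parser may differ.
pub fn split_operations(template: &str) -> Result<Vec<String>, String> {
    let body = template
        .strip_prefix('{')
        .and_then(|t| t.strip_suffix('}'))
        .ok_or_else(|| "template must be enclosed in {}".to_string())?;

    let mut ops = Vec::new();
    let mut current = String::new();
    let mut depth = 0usize; // nesting level of inner `{...}` (e.g. map sub-pipelines)
    let mut chars = body.chars();

    while let Some(c) = chars.next() {
        match c {
            // Assumption: a backslash escapes the next character, so `\|` is literal.
            '\\' => {
                current.push(c);
                if let Some(next) = chars.next() {
                    current.push(next);
                }
            }
            '{' => {
                depth += 1;
                current.push(c);
            }
            '}' => {
                depth = depth.saturating_sub(1);
                current.push(c);
            }
            // Only a top-level `|` separates operations.
            '|' if depth == 0 => ops.push(std::mem::take(&mut current)),
            _ => current.push(c),
        }
    }
    ops.push(current);
    Ok(ops)
}
```

Under these assumptions, "{operation1|operation2|operation3}" yields three operations, and a | inside a nested map:{...} stays part of the map operation.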
§Core Operations (20+ Available)
🔪 Text Splitting & Joining
- split:sep:range - Split text and optionally select range
- join:sep - Join list items with separator
- slice:range - Select list elements by range
✨ Text Transformation
- upper, lower - Case conversion
- trim[:chars][:direction] - Remove whitespace or custom characters
- append:text, prepend:text - Add text to ends
- pad:width[:char][:direction] - Pad string to width
- substring:range - Extract characters from string
🔍 Pattern Matching & Replacement
- replace:s/pattern/replacement/flags - Regex find/replace (sed-like)
- regex_extract:pattern[:group] - Extract with regex pattern
- filter:pattern - Keep items matching regex
- filter_not:pattern - Remove items matching regex
🗂️ List Processing
- sort[:asc|desc] - Sort items alphabetically
- reverse - Reverse string or list order
- unique - Remove duplicate list items
- map:{operations} - Apply sub-pipeline to each list item
🧹 Utility Operations
- strip_ansi - Remove ANSI escape sequences
§Range Syntax
Supports Rust-like syntax with negative indexing:
- N - Single index (1 = second item)
- N..M - Range exclusive (1..3 = items 1,2)
- N..=M - Range inclusive (1..=3 = items 1,2,3)
- N.. - From N to end
- ..M - From start to M-1
- .. - All items
Negative indices count from end (-1 = last item).
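The range forms above can be modelled with a few lines of std-only Rust. This is a sketch of the semantics as described on this page, not the crate's implementation: RangeSpec, resolve, and select are hypothetical names, and clamping out-of-range bounds (rather than reporting an error) is an assumption.

```rust
/// Hypothetical model of the range forms listed above; not the crate's own type.
#[derive(Debug, Clone, Copy)]
pub enum RangeSpec {
    /// `N`: a single index.
    Index(isize),
    /// `N..M`, `N..=M`, `N..`, `..M`, `..`; the bool marks an inclusive end.
    Range(Option<isize>, Option<isize>, bool),
}

/// Map a possibly negative bound onto `0..=len`.
/// Clamping out-of-range values is an assumption of this sketch.
fn resolve(idx: isize, len: usize) -> usize {
    let len_i = len as isize;
    let pos = if idx < 0 { len_i + idx } else { idx };
    pos.clamp(0, len_i) as usize
}

/// Select the items of `items` described by `spec`.
pub fn select<'a>(items: &[&'a str], spec: RangeSpec) -> Vec<&'a str> {
    let len = items.len();
    match spec {
        RangeSpec::Index(i) => {
            // Negative indices count from the end: -1 is the last item.
            let pos = if i < 0 { len as isize + i } else { i };
            if pos >= 0 && (pos as usize) < len {
                vec![items[pos as usize]]
            } else {
                Vec::new()
            }
        }
        RangeSpec::Range(start, end, inclusive) => {
            let s = start.map_or(0, |v| resolve(v, len));
            let mut e = end.map_or(len, |v| resolve(v, len));
            // An inclusive end covers one more item, unless already at the end.
            if inclusive && end.is_some() && e < len {
                e += 1;
            }
            if s >= e { Vec::new() } else { items[s..e].to_vec() }
        }
    }
}
```

With the items a, b, c, d, the spec for 1..3 selects b and c, 1..=3 selects b, c, and d, and -2.. selects c and d.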
§Debug Mode
Add ! after opening { to enable detailed operation tracing:
use string_pipeline::Template;
let template = Template::parse("{!split:,:..}").unwrap();
// Outputs detailed debug information during processing
let result = template.format("a,b,c").unwrap();
assert_eq!(result, "a,b,c");
§Multi-Template Support
Beyond simple templates, the library supports multi-templates that combine literal text with multiple template sections, featuring automatic caching for performance:
use string_pipeline::MultiTemplate;
// Combine literal text with template operations
let template = MultiTemplate::parse("Name: {split: :0} Age: {split: :1}").unwrap();
let result = template.format("John 25").unwrap();
assert_eq!(result, "Name: John Age: 25");
// Automatic caching: split operation performed only once
let template = MultiTemplate::parse("First: {split:,:0} Second: {split:,:1}").unwrap();
let result = template.format("apple,banana").unwrap();
assert_eq!(result, "First: apple Second: banana");
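The idea behind the caching can be sketched without the crate: when several sections split the same input by the same separator, the split result is computed once and reused. The names SplitCache, field, and render below are hypothetical, and keying the cache by separator is an assumption; the library's internal cache may be organised differently.

```rust
use std::collections::HashMap;

/// Hypothetical per-call cache: remembers the pieces produced for each separator.
pub struct SplitCache<'a> {
    input: &'a str,
    pieces: HashMap<String, Vec<&'a str>>,
    /// How many times the input was actually split (kept for illustration).
    pub splits_performed: usize,
}

impl<'a> SplitCache<'a> {
    pub fn new(input: &'a str) -> Self {
        SplitCache { input, pieces: HashMap::new(), splits_performed: 0 }
    }

    /// Return item `index` after splitting on `sep`, splitting at most once per separator.
    pub fn field(&mut self, sep: &str, index: usize) -> Option<&'a str> {
        if !self.pieces.contains_key(sep) {
            self.splits_performed += 1;
            let input: &'a str = self.input;
            self.pieces.insert(sep.to_string(), input.split(sep).collect());
        }
        self.pieces
            .get(sep)
            .and_then(|items| items.get(index).copied())
    }
}

/// Render "First: {split:,:0} Second: {split:,:1}" by hand using the cache.
/// Returns the rendered text and the number of splits that were performed.
pub fn render(input: &str) -> (String, usize) {
    let mut cache = SplitCache::new(input);
    let first = cache.field(",", 0).unwrap_or("");
    let second = cache.field(",", 1).unwrap_or("");
    (format!("First: {} Second: {}", first, second), cache.splits_performed)
}
```

In this sketch, render("apple,banana") produces the same text as the example above while splitting the input only once.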
§Common Use Cases
§Basic Text Processing
use string_pipeline::Template;
// Clean and normalize text
let cleaner = Template::parse("{trim|replace:s/\\s+/ /g|lower}").unwrap();
let result = cleaner.format(" Hello WORLD ").unwrap();
assert_eq!(result, "hello world");
§Data Extraction
use string_pipeline::Template;
// Extract second field from space-separated data
let extractor = Template::parse("{split: :1}").unwrap();
let result = extractor.format("user 1234 active").unwrap();
assert_eq!(result, "1234");
§List Processing with Map
use string_pipeline::Template;
// Process each item in a list
let processor = Template::parse("{split:,:..|map:{trim|upper}|join:\\|}").unwrap();
let result = processor.format(" apple, banana , cherry ").unwrap();
assert_eq!(result, "APPLE|BANANA|CHERRY");
§Advanced Data Processing
use string_pipeline::Template;
// Extract domains from URLs
let domain_extractor = Template::parse("{split:,:..|map:{regex_extract://([^/]+):1|upper}}").unwrap();
let result = domain_extractor.format("https://github.com,https://google.com").unwrap();
assert_eq!(result, "GITHUB.COM,GOOGLE.COM");
§Log Processing
use string_pipeline::Template;
// Extract timestamps from log entries
let log_parser = Template::parse(r"{split:\n:..|map:{regex_extract:\d\d\d\d-\d\d-\d\d}|filter_not:^$|join:\n}").unwrap();
let logs = "2023-12-01 ERROR: Failed\n2023-12-02 INFO: Success\nInvalid line";
let result = log_parser.format(logs).unwrap();
assert_eq!(result, "2023-12-01\n2023-12-02");
§Filter Operations
use string_pipeline::Template;
// Filter files by extension
let py_filter = Template::parse("{split:,:..|filter:\\.py$|sort|join:\\n}").unwrap();
let files = "app.py,readme.md,test.py,data.json";
let result = py_filter.format(files).unwrap();
assert_eq!(result, "app.py\ntest.py");
§Type System
The pipeline system has a clear type system that distinguishes between:
- String operations: Work only on strings (e.g., upper, lower, trim, replace)
- List operations: Work only on lists (e.g., sort, unique, slice)
- Type-preserving operations: Accept both types (e.g., filter, reverse)
- Type-converting operations: Change between types (e.g., split converts string→list, join converts list→string)
Use map:{operation} to apply string operations to each item in a list.
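One way to picture these categories is a two-variant value type with operations that check which variant they receive. The sketch below is a std-only illustration, not the crate's internals: the Value enum and the free functions are hypothetical, and the error messages other than the sort message quoted on this page are invented for the example.

```rust
/// Hypothetical value type: a pipeline stage holds either a string or a list.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    List(Vec<String>),
}

/// String operation: accepts strings, rejects lists.
pub fn upper(v: Value) -> Result<Value, String> {
    match v {
        Value::Str(s) => Ok(Value::Str(s.to_uppercase())),
        Value::List(_) => Err("upper expects a string; use map for lists".to_string()),
    }
}

/// List operation: accepts lists, rejects strings.
pub fn sort(v: Value) -> Result<Value, String> {
    match v {
        Value::List(mut items) => {
            items.sort();
            Ok(Value::List(items))
        }
        Value::Str(_) => Err("Sort operation can only be applied to lists".to_string()),
    }
}

/// Type-converting operation: string -> list.
/// Rejecting list input is a simplification of this sketch.
pub fn split(v: Value, sep: &str) -> Result<Value, String> {
    match v {
        Value::Str(s) => Ok(Value::List(s.split(sep).map(str::to_string).collect())),
        Value::List(_) => Err("split expects a string in this sketch".to_string()),
    }
}

/// Apply a string operation to every item of a list.
pub fn map(v: Value, op: impl Fn(Value) -> Result<Value, String>) -> Result<Value, String> {
    match v {
        Value::List(items) => {
            let mut out = Vec::with_capacity(items.len());
            for item in items {
                match op(Value::Str(item))? {
                    Value::Str(s) => out.push(s),
                    Value::List(_) => return Err("sub-pipeline must yield a string".to_string()),
                }
            }
            Ok(Value::List(out))
        }
        Value::Str(_) => Err("map can only be applied to lists".to_string()),
    }
}
```

Here upper fails when handed the list produced by split, while map(list, upper) succeeds, mirroring the advice to wrap string operations in map:{...}.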
§Error Handling
All operations return Result<String, String> for comprehensive error handling:
use string_pipeline::Template;
// Invalid template syntax
let result = Template::parse("{split:}");
assert!(result.is_err());
// Type mismatch errors are clear and helpful
let template = Template::parse("{sort}").unwrap();
let result = template.format("not_a_list");
assert!(result.is_err());
// Error: "Sort operation can only be applied to lists"
§Performance Notes
- Templates are compiled once and can be reused efficiently
- Operations use zero-copy techniques where possible
- Large datasets are processed with optimized algorithms
- Regex patterns are compiled and cached internally
- Memory allocation is minimized for common operations
For high-throughput applications, compile templates once and reuse them:
use string_pipeline::Template;
// Compile once
let template = Template::parse("{split:,:0}").unwrap();
// Reuse many times
for input in &["a,b,c", "x,y,z", "1,2,3"] {
    let result = template.format(input).unwrap();
    println!("{}", result);
}
For complete documentation including all operations, advanced features, and debugging techniques, see the Template and MultiTemplate documentation and the comprehensive guides in the docs/ directory.
Structs§
- MultiTemplate - A template engine supporting mixed literal text and string transformation operations.
Type Aliases§
- Template - Type alias for backward compatibility.