# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.2.2] - 2025-07-18
### 🎉 New Features
#### Logical Operations
- **NOT operator support**: Added `not` operator for negating conditions in `select()` statements
- Syntax: `select(not (.condition))` - parentheses are required for clarity
- Works with all comparison operators and string operations
- Examples: `select(not (.age > 65))`, `select(not (.email | contains("spam")))`
- **OR operator support**: Added pipe-delimited pattern matching for OR conditions
- Syntax: `select(.field | contains("pattern1|pattern2|pattern3"))`
- Multi-pattern matching in a single operation
- Examples: `select(.level | contains("ERROR|FATAL"))`, `select(.role | contains("admin|manager|supervisor"))`
#### Array Slicing Operations
- **Python-style slicing**: Complete slice notation support for arrays and string split results
- Basic slicing: `.[start:end]`, `.[start:]`, `.[:end]`, `.[:]`
- Negative index support: `.[-5:]`, `.[:-3]`, `.[-10:-5]`
- Field-specific slicing: `.users[0:10]`, `.logs[-100:]`
- **String split slicing**: Advanced slicing support for string split operations
- Direct slicing after split: `split(",")[1:3]`, `split("/")[-1]`
- Complex path processing: `split("/")[1:-1]` for middle path components
- CSV column extraction: `split(",")[2:5]` for specific column ranges
- **Negative indexing**: Support for negative array indices
- Last element access: `.array[-1]`, `.users[-1]`
- Reverse indexing: `.array[-3]` for third from last
- Compatible with all array operations
#### Enhanced Filtering Capabilities
- **Complex logical combinations**: Combine NOT and OR operations for sophisticated filtering
- Example: `select(not (.status | contains("deleted|suspended|inactive")))`
- Multi-condition filtering: `select(not (.type | contains("debug|trace|verbose")))`
- **Pattern-based exclusion**: Exclude multiple patterns efficiently
- File filtering: `select(not (.filename | contains(".tmp|.bak|.swp")))`
- Content filtering: `select(not (. | contains("DEBUG|INFO|TRACE")))`
### 🔧 Improvements
#### Enhanced String Operations
- **Improved split operations**: Better integration with slicing for complex text processing
- **Optimized pattern matching**: More efficient OR pattern processing with pipe-delimited syntax
- **Better error handling**: Clearer error messages for logical operation syntax errors
#### Performance Optimizations
- **Slice operation efficiency**: Optimized memory usage for large array slicing operations
- **Pattern matching performance**: Improved performance for multi-pattern OR operations
- **Logical operation caching**: Better performance for complex NOT/OR combinations
#### Documentation and Examples
- **Comprehensive logical operation examples**: Real-world use cases for NOT and OR operations
- **Slicing operation guide**: Complete reference for all slicing capabilities
- **Advanced filtering patterns**: Examples combining multiple logical operations
### 📊 New Use Cases Enabled
#### Advanced Log Analysis
```bash
# Exclude multiple log levels efficiently
# Get recent critical errors only
# Extract specific time ranges with slicing
#### Sophisticated Data Filtering
```bash
# Filter active users excluding test accounts
# Find high-priority items excluding archived
# Process middle sections of arrays
#### Complex Text Processing
```bash
# Extract domain names efficiently
# Get file paths without extension
# Process CSV columns with exclusions
#### Advanced Data Analysis Workflows
```bash
# Multi-step filtering with slicing
# Recent data analysis with pattern exclusion
# Complex field extraction with logical operations
### 🛠️ Technical Improvements
#### Filter Module Enhancements
- **New function `parse_not_condition_with_parentheses`**: Robust NOT operator parsing with mandatory parentheses
- **Enhanced `apply_filter_with_string_operations`**: Support for NOT operator in string operation pipelines
- **Improved error handling**: Better error messages for missing parentheses and invalid syntax
#### Slicing Infrastructure
- **Universal slicing support**: `apply_universal_slice_operation` handles all data types
- **Negative index processing**: `parse_index_with_negative` for Python-style negative indexing
- **Data structure detection**: `detect_data_structure` for intelligent slicing behavior
#### Pattern Matching Optimization
- **OR pattern preprocessing**: Efficient pipe-delimited pattern parsing
- **Multi-pattern contains operations**: Optimized string matching for multiple patterns
- **Regex-free implementation**: Fast pattern matching without regex overhead
### 🔄 Breaking Changes
None. This release is fully backward compatible with v0.2.x.
### 🐛 Bug Fixes
- Fixed slice boundary checking for out-of-range indices
- Improved pattern matching edge cases with empty patterns
- Enhanced error reporting for malformed logical operations
- Fixed memory usage issues with large slice operations
### 📖 Documentation Updates
- **Complete logical operations reference**: Documentation for NOT and OR operators
- **Comprehensive slicing guide**: All slicing capabilities with examples
- **Advanced filtering patterns**: Real-world use case examples
- **Performance best practices**: Guidelines for efficient query construction
### 🚀 Migration Guide
#### For users upgrading from v0.2.1:
All existing queries continue to work without changes. New features are additive:
**New NOT operator usage:**
```bash
# Old approach (still works)
# New approach with NOT operator
**New OR operator usage:**
```bash
# Old approach with multiple queries
# New approach with OR operator
**New slicing capabilities:**
```bash
# Get last 10 users (new)
hawk '.users[-10:]' users.json
# Get middle section of data (new)
hawk '.data[100:200]' data.json
# Extract filename from path (new)
## [0.2.1] - 2024-07-16
### 🐛 Bug Fixes
- Fixed single object field access (e.g., `.Parameters` in CloudFormation templates)
- Corrected info display for single objects ("Single Object" vs "Object Array")
- Enhanced support for YAML/JSON single object files
### 🔧 Improvements
- Better error messages for field access
- Improved CloudFormation, Docker Compose, Kubernetes manifest support
## [0.2.0] - 2025-07-16
### 🎉 Major Features Added
#### Plain Text Support
- **Universal file format support**: Now processes plain text files, log files, configuration files, and any text-based data
- **Automatic format detection**: Intelligently detects JSON, YAML, CSV, and plain text files
- **Unified query syntax**: Same query language works across all supported formats
- **Text-as-array processing**: Each line becomes a string element in an array for consistent processing
#### String Operations
- **Complete string manipulation suite**: `upper`, `lower`, `trim`, `trim_start`, `trim_end`
- **String analysis functions**: `length`, `reverse`
- **Pattern matching**: `contains(pattern)`, `starts_with(pattern)`, `ends_with(pattern)`
- **Text transformation**: `replace(old, new)`, `substring(start, length)`
- **String parsing**: `split(delimiter)` to convert strings to arrays
- **Array joining**: `join(delimiter)` to convert arrays back to strings
#### Enhanced map() Function
- **Data transformation pipeline**: Transform data elements with chained string operations
- **Type-safe operations**: Proper error handling for incompatible data types
- **Complex pipelines**: Support for multi-step transformations like `map(. | trim | upper | replace("old", "new"))`
#### Statistical Functions
- **Descriptive statistics**: `median`, `stddev` (standard deviation)
- **Data manipulation**: `unique` (remove duplicates), `sort` (sort values)
- **Array operations**: `length` for counting elements
- **Field-specific operations**: All statistical functions support field specification (e.g., `median(.price)`)
#### Colored Output
- **Automatic TTY detection**: Colors in terminal, plain text when piped or redirected
- **Beautiful syntax highlighting**:
- Table headers in blue with bold formatting
- Numbers in green
- Boolean values in yellow
- Null values in gray
- JSON syntax highlighting with colored keys and values
- **Environment variable support**: Respects `NO_COLOR` environment variable
- **Multiple output formats**: Enhanced table, JSON, and list outputs with appropriate coloring
### 🔧 Improvements
#### Enhanced Error Handling
- **Detailed error messages**: Context-aware error reporting with specific field and operation information
- **Type-safe operations**: Better validation of operations on different data types
- **Pipeline debugging**: Improved error location reporting in complex query pipelines
#### Better File Format Detection
- **Robust detection algorithms**: Improved heuristics for distinguishing between formats
- **Edge case handling**: Better support for malformed or ambiguous files
- **Fallback mechanisms**: Graceful degradation to text processing when format detection fails
#### Performance Optimizations
- **Memory-efficient processing**: Optimized data structures for large datasets
- **Faster pipeline execution**: Improved query parsing and execution engine
- **Reduced startup time**: Optimized initialization and dependency loading
#### Pipeline Processing Improvements
- **Parentheses-aware parsing**: Proper handling of nested operations like `map(. | contains("text") | not)`
- **Complex query support**: Better support for multi-level operations and transformations
- **Operation chaining**: Improved reliability of long pipeline chains
### 📊 New Use Cases Enabled
#### Log File Analysis
```bash
# Extract error logs with timestamps
# Count log levels
# Find unique IP addresses
#### Text Data Processing
```bash
# Clean and normalize text
# Extract file extensions
# Statistical text analysis
#### Data Cleaning and Normalization
```bash
# Email normalization
# Complex string transformations
hawk '.records[] | select(.id | length == 8) | map(.status | upper)' data.csv
```
#### Advanced Analytics
```bash
# Statistical analysis
# Median calculations
# Unique value analysis
### 🛠️ Technical Improvements
#### New Dependencies
- `termcolor ^1.4`: For colored output support
- `is-terminal ^0.4`: For TTY detection
#### Code Architecture
- **New module `string_ops`**: Centralized string operation handling
- **New module `stats_ops`**: Statistical function implementations
- **Enhanced `filter.rs`**: Improved pipeline operation handling
- **Updated `output.rs`**: Comprehensive colored output support
- **Improved `setup.rs`**: Better file format detection and text processing
#### Testing
- **Comprehensive test suite**: Added tests for all new string operations
- **Statistical function testing**: Validation of median, stddev, and other statistical operations
- **Integration testing**: End-to-end testing of complex pipeline operations
- **Edge case coverage**: Testing of malformed inputs and error conditions
### 🔄 Breaking Changes
None. This release is fully backward compatible with v0.1.x.
### 📦 Migration Guide
No migration required. All existing queries and workflows continue to work unchanged.
### 🐛 Bug Fixes
- Fixed pipeline parsing issues with complex nested operations
- Improved CSV type inference accuracy
- Enhanced error reporting for malformed queries
- Fixed memory usage issues with large datasets
### 📖 Documentation Updates
- **Comprehensive README update**: Added extensive documentation for new features
- **String operations guide**: Complete reference for all string manipulation functions
- **Statistical functions documentation**: Usage examples and parameter descriptions
- **Text processing examples**: Real-world use cases for log analysis and text processing
- **Enhanced query syntax reference**: Updated with all new operations and examples
### 🙏 Acknowledgments
- Community feedback on string processing needs
- Performance suggestions from early adopters
- Documentation improvements from user contributions
## [0.1.0] - 2024-07-12
### 🎉 Initial Release
#### Core Features
- **Multi-format support**: JSON, YAML, CSV parsing and processing
- **Pandas-like query language**: Intuitive syntax for data analysis
- **Field access and navigation**: Deep nested field access with array expansion
- **Filtering operations**: `select()` with comparison operators
- **Aggregation functions**: `count`, `sum`, `avg`, `min`, `max`
- **Grouping operations**: `group_by()` with aggregation support
- **Multiple output formats**: Table, JSON, list with automatic format detection
#### Technical Foundation
- **Rust implementation**: Fast, memory-safe data processing
- **serde_json integration**: Robust JSON parsing and manipulation
- **Type-aware processing**: Intelligent handling of numbers, strings, booleans
- **Error handling**: Comprehensive error reporting with thiserror
- **CLI interface**: User-friendly command-line interface with clap
#### Supported Operations
- Field access: `.field`, `.array[0]`, `.array[]`, `.nested.field`
- Filtering: `select(.field > value)`, `select(.field == "value")`
- Aggregation: `sum(.field)`, `avg(.field)`, `min(.field)`, `max(.field)`, `count`
- Grouping: `group_by(.field)` with aggregation support
- Info: `. | info` for data structure exploration
#### Output Formats
- **Table format**: Structured table output for object arrays
- **JSON format**: Pretty-printed JSON output
- **List format**: Simple list output for array data
- **Auto format**: Intelligent format selection based on data structure
---
For more details about any release, please see the [GitHub releases page](https://github.com/kyotalab/hawk/releases).