# hawk
Modern data analysis tool for structured data (JSON, YAML, CSV)
**hawk** combines the simplicity of `awk` with the power of `pandas` for data exploration. Unlike traditional text tools that work line-by-line, hawk understands structured data natively. Unlike heavy data science tools that require complex setup, hawk brings analytics to your terminal with a single command.
**Perfect for**:
- **Data Scientists**: Quick CSV/JSON analysis without Python overhead
- **DevOps Engineers**: Kubernetes YAML, Docker Compose, Terraform analysis
- **API Developers**: REST response exploration and validation
- **Business Analysts**: Instant insights from structured datasets
## Features
- **Multi-format support**: JSON, YAML, CSV with automatic detection (vs jq's JSON-only)
- **Pandas-like operations**: Filtering, grouping, aggregation (vs awk's line-based processing)
- **Smart output formatting**: Tables, lists, JSON based on data structure
- **Fast and lightweight**: Built in Rust for performance (vs pandas' Python overhead)
- **Developer-friendly**: Perfect for DevOps, data analysis, and API exploration
- **Type-aware**: Understands numbers, strings, booleans (vs text tools' string-only approach)
- **Unified syntax**: Same query language across all formats (vs format-specific tools)
## Quick Start
### Installation
```bash
# Install via Homebrew (macOS/Linux)
brew install kyotalab/tools/hawk
# Verify installation
hawk --version
```
### Basic Usage
```bash
# Explore data structure
hawk '. | info' users.json

# Access fields
hawk '.users[0].name' users.json
hawk '.users.name' users.csv

# Filter and aggregate
hawk '.users[] | select(.age > 25)' users.json
hawk '.users[] | group_by(.department) | avg(.age)' users.json
```
## Query Syntax
### Field Access
```bash
.field # Access field
.array[0] # Access array element
.array[] # Access all array elements
.nested.field # Deep field access
.array[0].nested.field # Complex nested access
.array[].nested[] # Multi-level array expansion
```
### Filtering
```bash
. | select(.age > 30)                # Numeric comparison
. | select(.active == true)          # Boolean comparison
. | select(.status != "inactive")    # Not equal
. | select(.State.Name == "running") # Nested field filtering
```
### Field Selection
```bash
. | select_fields(name,age)               # Keep only the listed fields
.users[] | select_fields(name,department) # Works after array expansion
```
### Aggregation
```bash
. | sum(.amount)   # Sum values
. | avg(.score)    # Average values
. | min(.price)    # Minimum value
. | max(.price)    # Maximum value
. | count          # Count items
```
### Grouping
```bash
. | group_by(.region) | avg(.sales) # Average by group
. | group_by(.type) | sum(.amount) # Sum by group
```
### Complex Queries
```bash
# Multi-step analysis
.users[] | select(.age > 25) | group_by(.department) | avg(.age)

# Multi-level array processing
.Reservations[].Instances[] | select(.State.Name == "running")

# Field selection with filtering
.users[] | select(.active == true) | select_fields(name,age)

# Data exploration workflow
. | info
.data[] | group_by(.category) | count # Count by category
```
## Use Cases
### API Response Analysis
```bash
# Analyze a saved GitHub API response
hawk '.[] | select(.stargazers_count > 100)' repos.json

# Extract specific fields
hawk '.[] | select_fields(full_name,stargazers_count)' repos.json
```
### DevOps & Infrastructure
```bash
# Kubernetes resource analysis
hawk '.spec.template.spec.containers[].image' deployment.yaml

# AWS EC2 analysis
hawk '.Reservations[].Instances[] | select(.State.Name == "running") | count' instances.json

# Docker Compose services
hawk '.services | info' docker-compose.yml
```
### Data Analysis
```bash
# Sales data analysis
hawk '. | group_by(.region) | sum(.sales)' sales.csv

# Multi-field analysis
hawk '. | select(.sales > 1000) | select_fields(region,sales)' sales.csv

# Log analysis
hawk '.[] | select(.level == "ERROR") | count' logs.json
```
### Configuration Management
```bash
# Ansible inventory analysis
hawk '.all.children | info' inventory.yaml

# Terraform state analysis
hawk '.resources[] | group_by(.type) | count' terraform.tfstate
```
## Supported Formats
### JSON
```json
{
"users": [
{"name": "Alice", "age": 30, "department": "Engineering"},
{"name": "Bob", "age": 25, "department": "Marketing"}
]
}
```
### YAML
```yaml
users:
- name: Alice
age: 30
department: Engineering
- name: Bob
age: 25
department: Marketing
```
### CSV
```csv
name,age,department
Alice,30,Engineering
Bob,25,Marketing
```
All formats support the same query syntax!
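To try this yourself, the snippet below recreates the three equivalent sample files from this section; the hawk invocations mirroring the Quick Start examples are shown as comments since they require hawk on your PATH:

```shell
# Recreate the three equivalent sample files shown above
cat > users.json <<'EOF'
{
  "users": [
    {"name": "Alice", "age": 30, "department": "Engineering"},
    {"name": "Bob", "age": 25, "department": "Marketing"}
  ]
}
EOF

cat > users.yaml <<'EOF'
users:
  - name: Alice
    age: 30
    department: Engineering
  - name: Bob
    age: 25
    department: Marketing
EOF

cat > users.csv <<'EOF'
name,age,department
Alice,30,Engineering
Bob,25,Marketing
EOF

# The same query now works on every file:
#   hawk '.users[0].name' users.json
#   hawk '.users[0].name' users.yaml
#   hawk '.users.name' users.csv
```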
## Output Formats
### Smart Auto-Detection (default)
```bash
hawk '.users[0].name' data.json # → Alice (list)
hawk '.users[]' data.json       # → Table format
hawk '.config' data.json        # → JSON format
```
### Explicit Format Control
```bash
hawk '.users[]' --format table # Force table
hawk '.users[]' --format json # Force JSON
hawk '.users.name' --format list # Force list
```
## Advanced Examples
### Complex Data Analysis
```bash
# Multi-step pipeline analysis
hawk '.orders[] | select(.total > 100) | group_by(.customer) | sum(.total)' orders.json

# Nested data exploration
hawk '.data[].items[] | group_by(.category) | avg(.price)' catalog.json

# Cross-format analysis: the same query works on JSON and YAML
hawk '.users[] | select(.active == true)' users.yaml
```
### Real-world DevOps Scenarios
```bash
# Find all running containers
hawk '.containers[] | select(.state == "running")' containers.json

# Analyze Kubernetes deployments by namespace
hawk '.items[] | group_by(.metadata.namespace) | count' deployments.json

# AWS EC2 instance analysis
hawk '.Reservations[].Instances[] | group_by(.InstanceType) | count' instances.json

# Extract configuration errors from logs
hawk '.[] | select(.level == "error") | select_fields(timestamp,message)' app-logs.json
```
### Data Processing Workflows
```bash
# 1. Explore structure
hawk '. | info' data.json

# 2. Filter relevant data
hawk '.records[] | select(.status == "active")' data.json

# 3. Multi-level processing
hawk '.records[].events[] | group_by(.type) | count' data.json

# 4. Group and analyze
hawk '.records[] | group_by(.region) | sum(.amount)' data.json

# 5. Export results
hawk '.summary[]' data.json --format csv > results.csv
```
## Installation & Setup
### Homebrew (Recommended)
```bash
# Install via Homebrew
brew install kyotalab/tools/hawk
# Or install from the main repository
brew tap kyotalab/tools
brew install hawk
```
### Build from Source
```bash
# Prerequisites: Rust 1.70 or later
git clone https://github.com/kyotalab/hawk.git
cd hawk
cargo build --release
# Add to PATH
sudo cp target/release/hawk /usr/local/bin/
```
### Binary Releases
Download pre-built binaries from [GitHub Releases](https://github.com/kyotalab/hawk/releases):
- Linux (x86_64)
- macOS (Intel & Apple Silicon)
## Documentation
### Command Line Options
```bash
hawk --help # Show help
hawk --version # Show version
hawk '.query' file.json # Basic usage
hawk '.query' --format json # Specify output format
```
### Query Language Reference
| Operation | Syntax | Example |
|-----------|--------|---------|
| Field access | `.field` | `.name` |
| Array index | `.array[0]` | `.users[0]` |
| Array iteration | `.array[]` | `.users[]` |
| Multi-level arrays | `.array[].nested[]` | `.Reservations[].Instances[]` |
| Field selection | `\| select_fields(field1,field2)` | `\| select_fields(name,age)` |
| Filtering | `\| select(.field > value)` | `\| select(.age > 30)` |
| Nested filtering | `\| select(.nested.field == value)` | `\| select(.State.Name == "running")` |
| Grouping | `\| group_by(.field)` | `\| group_by(.department)` |
| Counting | `\| count` | `.users \| count` |
| Aggregation | `\| sum/avg/min/max(.field)` | `\| avg(.salary)` |
| Info | `\| info` | `. \| info` |
### Supported Operators
- **Comparison**: `>`, `<`, `==`, `!=`
- **Aggregation**: `count`, `sum`, `avg`, `min`, `max`
- **Grouping**: `group_by`
- **Filtering**: `select`
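These operators compose left to right in a pipeline. As a sketch, the snippet below creates a small hypothetical `staff.csv` to experiment with; the combined query is shown as a comment since it requires hawk on your PATH:

```shell
# Create a small CSV for experimenting with the operators above
cat > staff.csv <<'EOF'
name,age,department,salary
Alice,30,Engineering,100
Bob,25,Marketing,80
Carol,35,Engineering,120
EOF

# Comparison, grouping, and aggregation compose in one pipeline:
#   hawk '. | select(.age > 26) | group_by(.department) | avg(.salary)' staff.csv
```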
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
### Development Setup
```bash
git clone https://github.com/kyotalab/hawk.git
cd hawk
cargo build
cargo test
```
### Running Tests
```bash
cargo test # Run all tests
```
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Inspired by the simplicity of `awk` and the power of `pandas`
- Built with the amazing Rust ecosystem
- Special thanks to the `serde`, `clap`, and `csv` crate maintainers
## Related Tools & Comparison
| Tool | Best for | Limitations | hawk's advantage |
|------|----------|-------------|------------------|
| **awk** | Text processing, log parsing | Line-based, no JSON/YAML support | Structured data focus, type-aware operations |
| **jq** | JSON transformation | JSON-only, complex syntax for data analysis | Multi-format, pandas-like analytics |
| **pandas** | Heavy data science | Requires Python setup, overkill for CLI | Lightweight, terminal-native |
| **sed/grep** | Text manipulation | No structured data understanding | Schema-aware processing |
### Why Choose hawk?
**For structured data analysis**, hawk fills the gap between simple text tools and heavy data science frameworks:
```bash
# awk: Limited structured data support
awk -F',' '$3 > 30 {print $1}' data.csv
# jq: JSON-only, verbose for analytics
jq '.[] | select(.age > 30) | .name' data.json

# hawk: Unified, intuitive syntax across all formats
hawk '.[] | select(.age > 30) | .name' data.yaml # Same syntax for YAML
```
**pandas power, awk simplicity**:
```bash
# Complex analytics made simple
hawk '. | group_by(.department) | avg(.salary)' employees.csv
```
**DevOps & IaC optimized**:
```bash
# Kubernetes config analysis (YAML native)
hawk '.spec.template.spec.containers[] | select_fields(name,image)' deployment.yaml
```
---
**Happy data exploring with hawk!**
For questions, issues, or feature requests, please visit our [GitHub repository](https://github.com/kyotalab/hawk).