kazoe
Fast wc replacement. Counts words, lines, and bytes.
Command: kz
Installation
From crates.io
From source
From GitHub
Performance
Benchmarked on a 1GB text file:
Word counting: 21x faster than wc (48ms vs 1.0s)
All counts (lwc): 16x faster than wc (63ms vs 1.0s)
Pattern matching: 90x faster than grep (32ms vs 2.9s)
Line counting: 1.7x faster than wc (32ms vs 56ms)
Multiple files: 23x faster than wc (86ms vs 2.0s)
Performance scales with file size:
- Files < 512KB: sequential processing, similar speed to wc
- Files of 1GB and up: 15-90x faster for most operations (line counting alone is about 1.7x faster)
Basic Usage
# Default (lines, words, bytes)
# Count lines
# Count words
# Count bytes
# Count characters (UTF-8 aware)
# Show max line length
# Combine flags
# Multiple files with totals
# Read from stdin
# Count pattern occurrences
Advanced Features
JSON Output
Output as JSON (combine with -l, -w, -c etc. to select counts):
Example output:
Statistics Mode
Show file statistics (combine with -l, -w, -c etc. to include counts):
Example output:
Statistics:
Lines: 100
Words: 500
Bytes: 2048
Mean line length: 20.48
Median line length: 18
Std deviation: 12.34
Min line length: 0
Max line length: 80
Empty lines: 5
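The figures in this report can all be derived from a list of per-line lengths. The following is a minimal sketch in Rust, not kazoe's actual code: `LineStats` and `line_stats` are hypothetical names, line length is assumed to be the byte count without the trailing newline, and the standard deviation is assumed to be the population form. kz may make different choices on each point.

```rust
/// Summary of line lengths. Field names are illustrative only.
#[derive(Debug, PartialEq)]
struct LineStats {
    lines: usize,
    mean: f64,
    median: f64,
    std_dev: f64,
    min: usize,
    max: usize,
    empty: usize,
}

/// Computes line-length statistics over `text`.
/// Assumption: length = bytes in the line, excluding the trailing newline.
fn line_stats(text: &str) -> Option<LineStats> {
    let mut lens: Vec<usize> = text.lines().map(|l| l.len()).collect();
    if lens.is_empty() {
        return None;
    }
    lens.sort_unstable();
    let n = lens.len();
    let mean = lens.iter().sum::<usize>() as f64 / n as f64;
    // Median: middle element, or the average of the two middle elements.
    let median = if n % 2 == 1 {
        lens[n / 2] as f64
    } else {
        (lens[n / 2 - 1] + lens[n / 2]) as f64 / 2.0
    };
    // Population variance (divide by n, not n - 1).
    let variance = lens.iter().map(|&l| (l as f64 - mean).powi(2)).sum::<f64>() / n as f64;
    Some(LineStats {
        lines: n,
        mean,
        median,
        std_dev: variance.sqrt(),
        min: lens[0],
        max: lens[n - 1],
        empty: lens.iter().filter(|&&l| l == 0).count(),
    })
}

fn main() {
    let stats = line_stats("ab\n\nabcd\nabcdef\n").expect("non-empty input");
    println!("{stats:?}");
}
```

Sorting once gives the median, minimum, and maximum from the same vector, at the cost of holding every line length in memory.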
Histogram
Line length distribution:
Example output:
Line Length Histogram:
0- 9: 21 ███████████████
10- 19: 5 ███
20- 29: 24 ████████████████
30- 39: 18 ████████████
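A histogram like this comes from bucketing each line length and scaling the bars to the largest bucket. The sketch below assumes a bucket width of 10 (as in the example output) and a simple integer scaling rule. The function names and the scaling are illustrative assumptions, not kazoe's implementation.

```rust
/// Groups line lengths into fixed-width buckets.
/// Index i covers lengths i*width through i*width + width - 1.
fn histogram(text: &str, width: usize) -> Vec<usize> {
    assert!(width > 0, "bucket width must be positive");
    let mut buckets: Vec<usize> = Vec::new();
    for line in text.lines() {
        let idx = line.len() / width;
        if idx >= buckets.len() {
            buckets.resize(idx + 1, 0);
        }
        buckets[idx] += 1;
    }
    buckets
}

/// Renders one row per bucket, scaling the fullest bucket to `max_bar` blocks.
fn render(buckets: &[usize], width: usize, max_bar: usize) -> String {
    // Guard against division by zero when every bucket is empty.
    let peak = buckets.iter().copied().max().unwrap_or(0).max(1);
    let mut out = String::new();
    for (i, &count) in buckets.iter().enumerate() {
        let lo = i * width;
        let hi = lo + width - 1;
        let bar = "█".repeat(count * max_bar / peak);
        out.push_str(&format!("{lo:>3}-{hi:>3}: {count} {bar}\n"));
    }
    out
}

fn main() {
    let text = "short\na somewhat longer line\n\nmid length\n";
    let buckets = histogram(text, 10);
    print!("{}", render(&buckets, 10, 16));
}
```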
Unique Word Count
Count unique words:
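Conceptually, a unique word count is the size of the set of distinct tokens. A minimal sketch, assuming words are whitespace-separated, compared case-sensitively, and keep their punctuation (the README does not say how kz normalizes words); `unique_words` is a hypothetical name:

```rust
use std::collections::HashSet;

/// Counts distinct whitespace-separated words.
/// Assumption: "The" and "the" are different words; punctuation is kept.
fn unique_words(text: &str) -> usize {
    text.split_whitespace().collect::<HashSet<&str>>().len()
}

fn main() {
    println!("{}", unique_words("the cat and the hat"));
}
```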
Recursive Directory Processing
Process every file under a directory:
Exclude specific patterns:
Binary File Detection
Automatically detects and skips binary files:
# Output: kz: binary.exe: binary file detected, skipping
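The README does not say how kz decides that a file is binary. One common heuristic, sketched here purely as an assumption, is to look for a NUL byte near the start of the file. `looks_binary` and `SNIFF_LEN` are hypothetical names.

```rust
/// How many leading bytes to inspect (an arbitrary choice for this sketch).
const SNIFF_LEN: usize = 8192;

/// Heuristic: data is treated as binary if a NUL byte appears within the
/// first `SNIFF_LEN` bytes. Text files essentially never contain NUL.
fn looks_binary(data: &[u8]) -> bool {
    let head = &data[..data.len().min(SNIFF_LEN)];
    head.contains(&0)
}

fn main() {
    println!("{}", looks_binary(b"MZ\x00\x01"));
    println!("{}", looks_binary(b"plain text\n"));
}
```

Inspecting only a prefix keeps the check cheap on large files, at the cost of missing a NUL that first appears later in the data.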
Format-Aware Counting
Count only code (skip comments and blank lines):
Count markdown text (skip code blocks):
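To illustrate what format-aware filtering means, here is a rough sketch of both modes. It assumes whole-line comments with a caller-supplied prefix and markdown fences written as three backticks. kazoe's real rules (block comments, per-language detection, indented code) are not documented here, so treat the function names and behavior as hypothetical.

```rust
/// Counts lines that are neither blank nor whole-line comments.
/// `comment_prefix` (for example "//" or "#") is supplied by the caller.
fn count_code_lines(text: &str, comment_prefix: &str) -> usize {
    text.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with(comment_prefix))
        .count()
}

/// Counts words in markdown, skipping fenced code blocks.
fn count_markdown_words(text: &str) -> usize {
    let fence = "`".repeat(3); // the three-backtick fence marker
    let mut in_fence = false;
    let mut words = 0;
    for line in text.lines() {
        if line.trim_start().starts_with(fence.as_str()) {
            in_fence = !in_fence; // fence lines themselves are not counted
            continue;
        }
        if !in_fence {
            words += line.split_whitespace().count();
        }
    }
    words
}

fn main() {
    let src = "// header\nfn main() {\n\n    // inner\n    let x = 1; // trailing\n}\n";
    println!("{}", count_code_lines(src, "//"));
    println!("{}", count_markdown_words("Intro text here\n"));
}
```

A line with code followed by a trailing comment still counts as code in this sketch, since only the start of the trimmed line is checked.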
Fast Mode
Skip UTF-8 validation for faster processing:
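To show why skipping validation can be cheaper, the sketch below contrasts a validating character count with an unvalidated one that counts every byte that is not a UTF-8 continuation byte. Whether kz's fast mode works this way is an assumption; both function names are hypothetical.

```rust
/// Validating count: returns an error on malformed UTF-8.
fn count_chars_checked(data: &[u8]) -> Result<usize, std::str::Utf8Error> {
    Ok(std::str::from_utf8(data)?.chars().count())
}

/// Unvalidated count: every byte that is not a continuation byte
/// (bit pattern 10xxxxxx) starts a code point. Agrees with the checked
/// count on valid input; on malformed input the result is only a guess.
fn count_chars_fast(data: &[u8]) -> usize {
    data.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}

fn main() {
    let text = "naïve café 数える".as_bytes();
    println!("{} bytes", text.len());
    println!("{:?} chars (checked)", count_chars_checked(text));
    println!("{} chars (fast)", count_chars_fast(text));
}
```

The trade-off is that the fast path never reports invalid input, so it is only appropriate when the data is known to be well-formed or an approximate count is acceptable.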
Null-Terminated File Lists
Process files from null-terminated list:
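A null-terminated list separates file names with a NUL byte instead of a newline, so names containing spaces or newlines survive intact. A minimal parsing sketch, assuming empty entries are ignored and non-UTF-8 names are converted lossily (a simplification; `split_nul_list` is a hypothetical helper, not part of kz):

```rust
/// Splits a NUL-separated list of file names.
/// Empty entries (for example after a trailing NUL) are skipped.
fn split_nul_list(input: &[u8]) -> Vec<String> {
    input
        .split(|&b| b == 0)
        .filter(|name| !name.is_empty())
        .map(|name| String::from_utf8_lossy(name).into_owned())
        .collect()
}

fn main() {
    for name in split_nul_list(b"a.txt\0dir/b c.txt\0") {
        println!("{name}");
    }
}
```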
Shell Completions
Generate shell completions:
# Bash
# Zsh
# Fish
# PowerShell
Features Summary
Core Counting
- -l, --lines: Print line counts
- -w, --words: Print word counts
- -c, --bytes: Print byte counts
- -m, --chars: Print character counts (UTF-8 aware)
- -L, --max-line-length: Print length of longest line
- --unique: Count unique words
Advanced Analysis
- --stats: Show detailed statistics (mean, median, std dev)
- --histogram: Show line length distribution
- --pattern <PATTERN>: Count occurrences of a pattern
Output Formats
- --json: Output results as JSON
File Processing
- -r, --recursive: Process directories recursively
- --exclude <PATTERN>: Exclude files matching pattern
- --files0-from <FILE>: Read null-terminated file names
- Multiple files with automatic totals
Performance
- --fast: Skip UTF-8 validation for speed
- Automatic binary file detection
- Memory mapped I/O
Format-Aware
- --code: Count only code (skip comments)
- --markdown: Count markdown (skip code blocks)
Shell Integration
- --generate-completion <SHELL>: Generate completions
- Stdin support
- Compatible with wc output format
Building
Testing
# Create 1GB test file
# Benchmark
Implementation
- Parallel processing with Rayon (1MB chunks)
- Memory mapped I/O with fallback for special files
- memchr for SIMD pattern matching
- Files < 512KB processed sequentially to avoid thread overhead
- UTF-8 aware character counting
- Binary file detection
- Format-aware filtering for code and markdown
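The chunking strategy in the list above can be sketched as follows. This is an illustration under stated assumptions, not kazoe's code: it uses `std::thread::scope` in place of Rayon to stay dependency-free, spawns one thread per chunk rather than using a thread pool, and treats only ASCII whitespace as a word separator. All function names are hypothetical. The key detail is that a word straddling a chunk boundary must be counted once, which the sketch handles by counting word starts and peeking at the byte before each position.

```rust
use std::thread;

const CHUNK: usize = 1 << 20; // 1MB chunks, as in the list above
const SEQ_THRESHOLD: usize = 512 * 1024; // below this, stay on one thread

/// Counts word starts in data[start..end]. A word starts at a non-whitespace
/// byte whose predecessor (possibly in the previous chunk) is whitespace.
fn count_range(data: &[u8], start: usize, end: usize) -> usize {
    (start..end)
        .filter(|&i| {
            !data[i].is_ascii_whitespace() && (i == 0 || data[i - 1].is_ascii_whitespace())
        })
        .count()
}

/// Splits `data` into `chunk`-sized ranges and counts them on separate threads,
/// unless the input is smaller than `threshold`.
fn count_words_chunked(data: &[u8], chunk: usize, threshold: usize) -> usize {
    if data.len() < threshold {
        return count_range(data, 0, data.len());
    }
    thread::scope(|s| {
        let handles: Vec<_> = (0..data.len())
            .step_by(chunk)
            .map(|start| {
                let end = (start + chunk).min(data.len());
                s.spawn(move || count_range(data, start, end))
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<usize>()
    })
}

fn count_words(data: &[u8]) -> usize {
    count_words_chunked(data, CHUNK, SEQ_THRESHOLD)
}

fn main() {
    let text = "one two  three\nfour ".repeat(100_000);
    println!("{}", count_words(text.as_bytes()));
}
```

Because each range only reads one byte outside its own bounds, chunks need no coordination beyond summing their counts, and small inputs skip thread startup entirely.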
License
MIT