Kelora
Scriptable log processor for the command line. Treats logs as structured events and lets you filter, transform, and analyze them using embedded Rhai scripts with 40+ built-in functions.
[!WARNING]
Experimental tool. Vibe-coded. APIs may change without notice.
How It Works
Kelora parses log lines into structured events (e.level, e.timestamp, e.message), then processes them through a pipeline: filters decide which events to keep, exec scripts transform the data, and formatters produce output. It's a programmable Unix pipeline for log data.
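The flow above can be pictured with a small Python sketch. It is an illustration only, not Kelora's implementation: `run_pipeline`, `parse_json_line`, and `mark_critical` are hypothetical names, and the real tool runs Rhai scripts rather than Python callables.

```python
import json

def parse_json_line(line):
    """Parse stage: turn one raw log line into an event (a plain dict)."""
    return json.loads(line)

def run_pipeline(lines, keep, transform, fmt):
    """Run each line through parse -> filter -> exec -> format."""
    out = []
    for line in lines:
        event = parse_json_line(line)
        if not keep(event):        # filter: decide whether to keep the event
            continue
        transform(event)           # exec: modify the event in place
        out.append(fmt(event))     # formatter: produce the output line
    return out

def mark_critical(event):
    """Hypothetical exec step that adds a field to the event."""
    event["severity"] = "critical"

logs = [
    '{"level": "info", "message": "started"}',
    '{"level": "error", "message": "disk full"}',
]
result = run_pipeline(
    logs,
    keep=lambda e: e["level"] == "error",
    transform=mark_critical,
    fmt=lambda e: " ".join(f"{k}={v!r}" for k, v in e.items()),
)
print(result)
```

Only the error event survives the filter, gains a `severity` field, and is rendered as key=value text.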
Quick Start
# Find all errors in the last hour
# Filter and enrich JSON logs
# Count HTTP status codes with metrics
# Real-time monitoring with pattern detection
# Process gzipped logs (auto-detects compression for files and pipes)
# Monitor for brute force attacks using sliding windows
# Analyze entire JSON configuration as single document
Install
# Install from crates.io
# Or build from source
Pre-built binaries are available in the releases section of this GitHub repository.
Core Concepts
Events are structured objects created from log lines. Access fields like e.level or e["content-type"], and add new ones with e.severity = "critical". Events behave like JSON objects that your scripts can read and modify.
Pipeline processes data through independent stages: Input → Parse → Filter → Transform → Format → Output. Mix any parser with any script with any formatter.
Scripts provide programmable logic:
- Filters: Boolean expressions (e.status.to_int() >= 500) that decide which events to keep
- Execs: Transform statements (e.category = "error") that modify events
- Windows: Access recent events (window[1].user) for pattern detection
Common Tasks
Finding Problems: Filter by criteria with -l error, --filter 'e.status.to_int() >= 500', or time ranges like --since 1h --levels error,fatal.
Understanding Patterns: Count and measure with --exec 'track_count("by_status_" + e.status)', track averages with track_avg("response_time", e.duration). View results with --metrics or --stats.
Detecting Sequences: Use --window N to access recent events. Detect changes with window[0].status != window[1].status or patterns like window_values(window, "user").len() >= 2 for repeated users.
Transforming Data: Add/modify fields with --exec 'e.severity = if e.status.to_int() >= 500 { "critical" } else { "normal" }'. Chain transformations with multiple --exec statements.
Input Formats
Each parser creates events with different fields:
| Format | Fields Created | Example |
|---|---|---|
| line | line | Clean text processing |
| raw | raw | Preserves text artifacts for analysis |
| json | All JSON keys | {"level":"info","msg":"started"} |
| logfmt | Key-value pairs | level=info msg="started" user=alice |
| syslog | timestamp, host, facility, message | Jan 15 14:30:45 host app: message |
| cef | vendor, product, severity + extensions | ArcSight CEF logs |
| csv/tsv | Column headers as fields | Structured data files |
| combined | ip, status, request, method, path, request_time | Apache/NGINX web server logs |
All formats handle gzip compression automatically for both files and stdin: Kelora detects the gzip magic bytes (1F 8B 08), so no additional flags are needed for pipes, .gz files, or compressed files without an extension. Select the input format with -f FORMAT (-j is a shortcut for -f json).
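Magic-byte detection of this kind can be sketched in a few lines of Python. This shows the general technique under the assumption that only the first three bytes are inspected; it is not Kelora's actual code, and `read_maybe_gzip` is a hypothetical helper.

```python
import gzip

# gzip ID bytes (1F 8B) followed by the deflate compression method (08)
GZIP_MAGIC = b"\x1f\x8b\x08"

def read_maybe_gzip(raw: bytes) -> str:
    """Decompress the input only if it starts with the gzip magic bytes."""
    if raw.startswith(GZIP_MAGIC):
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")

plain = b"level=info msg=started\n"
packed = gzip.compress(plain)

# Both inputs yield the same text, regardless of file name or extension.
print(read_maybe_gzip(plain) == read_maybe_gzip(packed))
```

Because the check looks at content rather than file names, the same path works for pipes and for files without a .gz extension.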
Raw vs Line Format
Choose between text preservation and clean output:
Raw Format (-f raw): Preserves ALL text artifacts including newlines, backslashes, and carriage returns. Perfect for text analysis where you need to count lines or preserve exact formatting:
# Count lines in multiline events
# Preserve backslash continuations for analysis
Line Format (-f line): Provides clean, readable output by replacing newlines with spaces. Better for general processing and display:
# Clean multiline processing
Use Raw when you need the full text structure, Line when you want clean output.
Prefix Extraction
Extract prefixed text from logs before parsing with --extract-prefix FIELD. Useful for Docker Compose logs, service-prefixed logs, and any format with separators:
# Docker Compose logs: "web_1 | message"
# Custom separator: "auth-service :: message"
# Works with any format
Prefix extraction runs before parsing, so the extracted prefix becomes a field in the parsed event. Default separator is |, configurable with --prefix-sep.
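As a rough mental model, prefix extraction splits each line at the first separator before the parser sees it. The Python sketch below is an assumption about that behaviour, not Kelora's code: `extract_prefix` is a hypothetical function, and trimming the whitespace around the separator is a guess.

```python
def extract_prefix(line, field, sep="|"):
    """Split off the text before the first separator and store it under `field`.

    Returns (extra_fields, remaining_line). Lines without the separator
    are passed through unchanged.
    """
    prefix, found, rest = line.partition(sep)
    if not found:
        return {}, line
    # Assumption: whitespace around the separator is not significant.
    return {field: prefix.strip()}, rest.lstrip()

print(extract_prefix("web_1 | GET /health 200", "service"))
print(extract_prefix("auth-service :: login ok", "service", sep="::"))
```

The remaining line would then go to the selected parser, and the extracted field would be merged into the resulting event.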
Built-in Functions
Text Extraction: extract_re(pattern) finds regex matches, extract_ip() pulls IP addresses, parse_kv("=", ";") converts key-value pairs to fields.
Safe Conversion: to_number(), to_bool() safely convert types, mask_ip(octets) anonymizes IPs, upper(), lower(), trim() normalize text.
Time Operations: to_datetime(string, format, timezone) converts text into a timestamp, to_duration("5m") converts strings into durations, now_utc() gets current time.
Array Processing: emit_each(array) fans out arrays into individual events, emit_each(array, base) adds common fields to each. Transforms nested data like {"users": [{"name": "alice"}, {"name": "bob"}]} into separate events for each user. Original event is suppressed.
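The fan-out idea can be sketched in Python as follows. This is a simplified model rather than Kelora's implementation: the Python `emit_each` here is a stand-in that returns a list, and letting an item's own fields win over the base fields on a name clash is an assumption.

```python
def emit_each(items, base=None):
    """Turn each array element into its own event, adding shared base fields."""
    events = []
    for item in items:
        event = dict(item)
        for key, value in (base or {}).items():
            # Assumption: fields already present on the item are kept.
            event.setdefault(key, value)
        events.append(event)
    return events

original = {"batch_id": "b123", "users": [{"name": "alice"}, {"name": "bob"}]}
fanned_out = emit_each(original["users"], {"batch_id": original["batch_id"]})
print(fanned_out)
```

The original event does not appear in the result, which mirrors the note that it is suppressed.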
Column Mapping: line.parse_cols("ts(2) level *msg") declaratively assigns whitespace-delimited columns, line.parse_cols("ts level *msg", "|") honors literal separators, and ["field","values"].parse_cols("name value") works with pre-split arrays.
Specs use short tokens:
- name assigns a single column to the field (e.g. level).
- name(n) consumes n ≥ 2 columns, joining them with the current separator (ts(2) grabs the date and time).
- - skips one column, -(n) skips many.
- *name captures the remainder verbatim (strings) or joined with spaces/separator (arrays); it must be last and unique.
Most scripts simply replace the event with the parsed map: e = e.line.parse_cols("ts level *msg").
# Sample input file with mixed log levels:
# 2025-09-22 12:33:44 INFO IgnoreMe: hello world!
# 2025-09-23 08:15:32 ERROR SomeService: connection failed
# Output:
# ts='2025-09-22 12:33:44' level='INFO' msg='hello world!'
# ts='2025-09-23 08:15:32' level='ERROR' msg='connection failed'
# ts='2025-09-23 14:22:01' level='WARN' msg='timeout occurred'
# ts='2025-09-23 16:45:18' level='DEBUG' msg='user login successful'
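To make the spec tokens concrete, here is a minimal Python sketch of how such a spec could be interpreted for whitespace-delimited strings. It is not Kelora's parser: the Python `parse_cols` is a hypothetical re-implementation, it ignores custom separators and array input, and the spec `ts(2) level - *msg` is an assumption chosen because it reproduces the first output line shown above.

```python
import re

# One spec token: optional "*", then a field name or "-", then optional "(n)".
TOKEN = re.compile(r"^(\*?)([A-Za-z_]\w*|-)(?:\((\d+)\))?$")

def parse_cols(line, spec):
    """Assign whitespace-delimited columns of `line` to the fields in `spec`."""
    columns = list(re.finditer(r"\S+", line))  # keep positions for *name
    result, pos = {}, 0
    for token in spec.split():
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"bad spec token: {token!r}")
        star, name, count = match.groups()
        if star:
            # *name: everything from the current column to end of line, verbatim
            result[name] = line[columns[pos].start():] if pos < len(columns) else ""
            break
        n = int(count) if count else 1
        taken = columns[pos:pos + n]
        pos += n
        if name != "-":  # "-" and "-(n)" only advance past columns
            result[name] = " ".join(m.group() for m in taken)
    return result

row = parse_cols("2025-09-22 12:33:44 INFO IgnoreMe: hello world!", "ts(2) level - *msg")
print(row)
```

Here `ts(2)` joins the date and time, `-` drops the `IgnoreMe:` column, and `*msg` keeps the rest of the line unchanged.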
Metrics: track_count(key) increments counters, track_sum/avg/min/max(key, value) accumulate statistics, track_unique(key, value) counts distinct values. Access via metrics map in --end scripts or display with --metrics.
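Conceptually, these trackers are accumulators keyed by name. The Python sketch below models three of them under that assumption; the `Metrics` class and its attribute names are hypothetical and say nothing about how Kelora stores or merges metrics internally.

```python
from collections import defaultdict

class Metrics:
    """Toy stand-in for the metrics map built up by the track_* functions."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.sums = defaultdict(float)
        self.uniques = defaultdict(set)

    def track_count(self, key):
        self.counts[key] += 1

    def track_sum(self, key, value):
        self.sums[key] += value

    def track_unique(self, key, value):
        self.uniques[key].add(value)

metrics = Metrics()
events = [
    {"status": 200, "bytes": 512, "ip": "10.0.0.1"},
    {"status": 500, "bytes": 128, "ip": "10.0.0.2"},
    {"status": 200, "bytes": 256, "ip": "10.0.0.1"},
]
for e in events:
    metrics.track_count("by_status_" + str(e["status"]))
    metrics.track_sum("bytes", e["bytes"])
    metrics.track_unique("ips", e["ip"])

print(dict(metrics.counts), metrics.sums["bytes"], len(metrics.uniques["ips"]))
```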
Output: Use eprint() for alerts and diagnostics (writes to stderr), print() for data output (writes to stdout). Since Kelora's processed events go to stdout, eprint() prevents interference with the data pipeline.
Environment: get_env(var) returns environment variable or empty string, get_env(var, default) with fallback. Useful for CI/CD pipelines and configuration-driven processing.
Advanced Features
Window Analysis
Access recent events with --window N. Use window[0] (current), window[1] (previous), etc. Window helper: window_values(window, "field") extracts field values from all events.
# Detect status changes
# Brute force detection (3+ failures from same IP)
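The window mechanics can be modelled in Python with a bounded deque. This is a sketch under stated assumptions, not Kelora's implementation: `scan` is a hypothetical driver, the window is assumed to hold N events including the current one, and a failure is assumed to be a status of 401.

```python
from collections import deque

def window_values(window, field):
    """Collect one field from every event in the window that has it."""
    return [event[field] for event in window if field in event]

def scan(events, size):
    """Return (status changes, IPs with 3+ failures inside the window)."""
    window = deque(maxlen=size)
    changes, alerts = [], []
    for event in events:
        window.appendleft(event)  # window[0] is current, window[1] previous
        if len(window) > 1 and window[0]["status"] != window[1]["status"]:
            changes.append((window[1]["status"], window[0]["status"]))
        failed_ips = window_values([e for e in window if e["status"] == 401], "ip")
        if event["status"] == 401 and failed_ips.count(event["ip"]) >= 3:
            alerts.append(event["ip"])
    return changes, alerts

events = [
    {"status": 200, "ip": "10.0.0.1"},
    {"status": 401, "ip": "10.0.0.9"},
    {"status": 401, "ip": "10.0.0.9"},
    {"status": 401, "ip": "10.0.0.9"},
    {"status": 200, "ip": "10.0.0.1"},
]
changes, alerts = scan(events, size=3)
print(changes, alerts)
```

With a window of three events, the third consecutive failure from the same IP triggers the alert, and each flip between 200 and 401 is recorded as a status change.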
Multi-Stage Processing
Chain filters and execs in any order for complex pipelines:
# Error analysis: filter → extract → classify → count
Output Control
- Fields: -k field1,field2 (include only), -K field3 (exclude), -c (core fields only), -b (brief/values only)
- Levels: -l error,warn (include), -L debug,trace (exclude)
- Time: --since 1h, --until 5m, --since "2024-01-15 14:00"
- Formats: -F default|inspect|json|logfmt|levelmap|csv|tsv|csvnh|tsvnh|none (default is default; inspect shows typed field breakdown, prefixes each event with ---, and honors -v for longer values), -J (json shortcut)
Performance & Configuration
- Processing: --parallel for batch files (2-10x faster), --threads N, --batch-size N
- Timezones: --input-tz Europe/Berlin (parse), -z (display local), -Z (display UTC)
- Multiline: -M timestamp (Java stacks), -M indent (continuation lines), -M backslash (line continuation), -M whole (entire input as single event)
- Scripts: -E script.rhai (from file), --begin 'conf.config = ...' (initialization), --end 'print(metrics.total)' (final reporting)
- Error Handling: Default is resilient (skip errors), --strict for fail-fast, --verbose for details, --no-emoji to disable emoji prefixes
- Verbose Output: Uses standardized emoji prefixes - 🔹 (blue diamond) for general output like stats and processing messages, 🔸 (orange diamond) for errors and warnings
- Config: ~/.config/kelora/config.ini for defaults and aliases, --config-file path/to/config.ini for custom config, --show-config to view
Complete Examples
End-to-End Log Analysis Pipeline
# Real-time nginx monitoring: stdin → filter → transform → metrics → alert
Security Analysis
# Comprehensive authentication monitoring
Data Transformation
# Convert and enrich syslog to structured JSON
Array Fan-Out Processing
# Process nested JSON arrays: each user becomes a separate event
# Output: {"name": "alice", "role": "admin", "batch_id": "b123", "processed": true}
# {"name": "bob", "role": "user", "batch_id": "b123", "processed": true}
# Multi-level fan-out: batches → requests → errors (only for failed requests)
# Result: type='timeout' url='/api' status=500
# type='db_error' url='/api' status=500
Learning Kelora (Recommended Path)
Start Here: The Essentials
- Events - understand that logs become structured objects (e.field)
- Parsing - see how different formats create different fields (-f json, -f combined)
- Basic Scripts - learn to filter (--filter) and transform (--exec)
Next: Real-World Usage
- Metrics - track counts and calculations across events (track_count, --metrics)
- Pipelines - combine multiple processing steps (multiple --filter and --exec)
- Output Formats - control how results are displayed (-F json, -k field1,field2)
Advanced: Pattern Detection
- Windows - access sequences of events for pattern matching (--window N)
- Multi-stage Processing - complex analysis pipelines with initialization (--begin, --end)
Why This Order: Each concept builds naturally on the previous ones. You can't understand windows without understanding events, but you can use events productively without ever learning about windows.
Help & Documentation
Configuration File Example
Create ~/.config/kelora/config.ini:
# Set default arguments for all kelora commands
defaults = --format auto --stats --parallel --input-tz UTC
[aliases]
errors = -l error --since 1h --stats
warnings = --filter 'e.level == "warn" || e.level == "warning"'
slow-queries = --filter 'e.duration > 1000' --exec 'e.slow = true' --keys timestamp,query,duration
Usage:
Kelora vs Other Tools (When to Use What)
Kelora's Purpose: Transform and analyze structured log events with programmable logic
Choose Kelora When: You need to filter, transform, or analyze log data programmatically
Choose Other Tools When:
- Browsing/Exploring → lnav: Purpose is interactive log exploration with syntax highlighting
- Simple Text Search → ripgrep: Purpose is fast pattern matching across files
- Complex JSON → jq: Purpose is sophisticated JSON querying and transformation
jq: Purpose is sophisticated JSON querying and transformation - Visualization → Grafana: Purpose is creating dashboards and charts
- Log Collection → Fluentd: Purpose is shipping logs between systems
The Independence Principle: You can pipe Kelora's output to these tools - they complement rather than compete with each other.
Similar Tools in the Log Processing Space
Log Processing:
- angle-grinder - Rust-based log processor with query syntax
- lnav - Advanced log viewer with many formats
- pq - Log parser and query tool
Text Processing:
License
MIT - See LICENSE file for details.