kelora 0.5.0

A command-line log analysis tool with embedded Rhai scripting
kelora-0.5.0 is not a library.
Visit the last successful build: kelora-2.0.0

Kelora

Scriptable log processor for the command line. Treats logs as structured events and lets you filter, transform, and analyze them using embedded Rhai scripts with 40+ built-in functions.

[!WARNING] Experimental tool. Vibe-coded. APIs may change without notice.

Table of Contents

Overview

Kelora parses log streams into structured events and runs them through a programmable pipeline powered by Rhai scripting.

  • Turns lines into structured events you can access as e.field or e["field-name"].
  • Embeds 40+ built-in Rhai helpers for parsing, enrichment, metrics, and windowed analysis.
  • Speaks JSON, logfmt, syslog, CSV/TSV, column specs, and gzip-compressed inputs out of the box.
  • Handles streaming or batch workloads with sequential and --parallel execution modes.
  • Emits metrics and processing stats so you can observe pipelines while they run.

Quick Start

[!TIP] Use the fixtures in example_logs/ when experimenting—no need to point at production logs.

Basics

# Filter error-level events from the logfmt sample
kelora -f logfmt -l error example_logs/sample.logfmt

# Work with JSON logs and enrich the event before printing selected fields
kelora -j example_logs/sample.jsonl \
  --filter 'e.level == "ERROR"' \
  --exec 'e.retry_count = e.get_path("retry", 0)' \
  --keys timestamp,level,message,retry_count

# Parse Apache/Nginx access logs, keep key fields, and surface stats
kelora -f combined example_logs/sample.nginx \
  --keys ip,status,request_time,request \
  --stats

# Show errors with surrounding context (like grep -A/-B/-C)
kelora -j example_logs/sample.jsonl \
  --filter 'e.level == "ERROR"' \
  --after-context 2 --before-context 1

Advanced Moves

# Count slow responses and surface metrics alongside real-time output
kelora -f logfmt example_logs/sample.logfmt \
  --filter 'e.duration.to_int() >= 1000' \
  --exec 'track_count("slow_requests"); e.bucket = if e.duration.to_int() >= 2000 { "very_slow" } else { "slow" }' \
  --metrics

# Sliding window alerting for login failures (stream from any source)
kubectl logs -f deploy/auth | \
  kelora -j --window 5 \
    --filter 'e.event == "login_failed"' \
    --exec 'let users = window_values("user"); if users.len() >= 3 { eprint("🚨 brute force detected for " + e.user); }'

# Run a scripted pipeline from disk (save your Rhai to pipelines/critical_filter.rhai first)
kelora -j example_logs/sample.jsonl \
  --begin 'conf.critical_components = ["database", "auth"]' \
  --exec-file pipelines/critical_filter.rhai \
  --output-file filtered.json

Installation

Kelora targets stable Rust; keep your toolchain fresh (rustup update) for best results.

# Install from crates.io
cargo install kelora

# Build from source
git clone https://github.com/dloss/kelora
cd kelora
cargo install --path .

Pre-built binaries live in the GitHub releases. They're handy for CI or containers.

Core Concepts

  • Events - Every log line becomes a structured map. Fields are accessible as e.level, e["user-agent"], or via helper functions like e.get_path("payload.id").
  • Pipeline - Logs flow through Input -> Parse -> Filter -> Transform -> Format -> Output. Mix and match parsers, filters, exec scripts, and formatters freely.
  • Scripts - Rhai expressions drive filtering (--filter), transformations (--exec/--exec-file), initialization (--begin), and teardown (--end).
  • Windows - --window N exposes recent events for sequence detection via helpers like window_values("field").

CLI Feature Tour

Input & Parsing

  • -f, --format <FORMAT> selects a parser (json, logfmt, syslog, combined, cols:<spec>, ...); -j is shorthand for JSON.
  • --file-order {cli|name|mtime} controls multi-file processing order.
  • --skip-lines, --keep-lines, and --ignore-lines filter raw input before parsing.
  • -M, --multiline <strategy> enables multi-line reconstruction (see Multiline Strategies).
  • --extract-prefix FIELD + --prefix-sep SEP slices service prefixes before parsing.
  • --cols-sep SEP provides a custom delimiter when using column specs.

Filtering & Selection

  • --filter 'expression' runs boolean Rhai expressions; chain multiple occurrences.
  • -l/--levels and -L/--exclude-levels gate standard log levels.
  • --since, --until, and --take trim by time range or limit output volume.
  • -A/--after-context, -B/--before-context, and -C/--context show surrounding lines around matches (requires filtering).
  • --keep-lines/--ignore-lines pair well with --strict to enforce hygiene.
  • --allow-fs-writes enables Rhai file helpers (mkdir, truncate_file, append_file) so scripts can shard or persist results; without it these functions return false and perform no I/O.

Transformations & State

  • -e/--exec and -E/--exec-file mutate events (e), emit new ones (emit_each), or track metrics.
  • --begin seeds global read-only configuration through the conf map; --end performs final aggregation or reporting.
  • --window N exposes window helpers for sliding analyses; combine with window_values(...), window_numbers(...), or custom loops.
  • Default mode is resilient—errors skip the offending event. Use --strict for fail-fast pipelines.
  • Dial verbosity with -v / -vvv for debugging or -q / -qqq for quiet pipelines.
  • -I/--include prepends Rhai files to define reusable functions for --exec, --begin, and --end stages.

Output & Reporting

  • -F/--output-format controls formatting (default, json, logfmt, inspect, levelmap, CSV/TSV variants, or none). -J is shorthand for JSON.
  • -k/--keys and -K/--exclude-keys shape the output payload; -c keeps only core fields; -b switches to brief mode.
  • --pretty-ts, -z, and -Z format timestamps (local vs UTC).
  • --stats, --metrics, and -S/--stats-only expose processing telemetry; --metrics-file writes JSON metrics to disk.
  • --mark-gaps inserts visual separators when time jumps exceed a duration; --no-emoji disables emoji prefixes.
  • Context lines are marked with visual prefixes: * for matches, / for before-context, \ for after-context.

Performance & Reliability

  • --parallel and --threads control concurrency; pair with --batch-size (lines per worker batch, default 1000) and --batch-timeout (max ms to flush a partial batch, default 200) to balance parallel throughput and latency.
  • --unordered relaxes output ordering for faster parallel flushes.
  • Sequential mode (default) shines for streaming sources; --parallel excels at log archives.
  • Combine with --config-file or aliases for repeatable pipelines at scale.

Parsers & Formats

Kelora defaults to -f line, which trims trailing newline/CR characters and exposes the result as e.line. Reach for -f raw when you need a byte-for-byte copy (including trailing newlines or escape markers), and reserve -f 'cols:<spec>' for bespoke formats that the built-in parsers do not cover. All parsers expect UTF-8 text; binary or other encodings will raise input errors.

Format Fields Produced Typical Source
line (default) line Newline-delimited text where trimming the trailing newline is acceptable
raw raw Exact text preservation (newline-sensitive data, continuation markers, binary artifacts)
json Original JSON keys JSONL or JSON arrays
logfmt Key-value pairs Logfmt structured logs
syslog timestamp, host, facility, message, ... RFC3164/RFC5424 syslog
cef Header fields + extension map ArcSight/Common Event Format
csv / tsv Column headers as fields Delimited datasets
combined ip, status, method, path, request, request_time, ... Apache/Nginx access logs
cols:<spec> Named fields defined by your spec (ts, level, *rest, ...) Custom or proprietary log formats

All parsers auto-detect gzip compression (files and stdin) by magic bytes—no extra flags required.

Format Recipes

Raw vs Line

Choose the right baseline for text pre-processing. -f line is the default: it trims the trailing newline/CR and gives you a tidy e.line field for downstream filters. -f raw keeps every byte (including trailing delimiters and escape markers) in e.raw, which is invaluable when you need to preserve continuation characters, feed the data into another parser verbatim, or re-emit the original payload.

# Preserve every byte (newline-sensitive analyses)
kelora -f raw example_logs/sample.log \
  --exec 'e.byte_len = e.raw.len()'

# Treat each line as plain text for simple filtering
kelora -f line example_logs/sample.log \
  --filter 'e.line.contains("ERROR")'

Prefix Extraction

Strip infrastructure prefixes before parsing structured payloads.

docker compose logs | \
  kelora --extract-prefix container --prefix-sep " | " --filter 'e.container == "web_1"'

Column Specs (cols:<spec>)

Declarative column parsing with skips, joins, and tail captures. This mode shines when your data has a repeatable column layout but no dedicated parser—think bespoke appliance logs, legacy flat files, or regex capture groups you want to map into fields.

Spec tokens are space-separated:

  • field — consume one column into field.
  • field(n) — consume n ≥ 2 columns, joined together (whitespace joins in default mode, literal separator joins when you set one).
  • - / -(n) — skip one or n columns with no output field.
  • *field — capture the remaining text verbatim; must appear once and always at the end.

Whitespace is the default separator; add --cols-sep "|" (or another literal) when your columns are delimited. You can also feed pre-split arrays to parse_cols from Rhai (let caps = e.line.extract_all_re(...); caps.parse_cols("ip user ts *msg");). Missing data fills fields with () in resilient mode, while --strict turns shortages/extras into errors.

echo "2025-09-22 12:33:44 INFO alice login success" | \
  kelora -f 'cols:date time level user *message' \
    --exec 'e.timestamp = (e.date + " " + e.time).to_datetime()' \
    --keys timestamp,level,user,message

Rhai Building Blocks

Kelora exposes the full Rhai language plus domain-specific helpers.

  • Text & parsing - extract_re, parse_logfmt, parse_cols, mask_ip, encode_* / decode_*.
  • Arrays & maps - sorted, sorted_by, array.flatten, map.get_path, map.flatten, emit_each.
  • Hashing & anonymization - bucket (fast sampling), hash (multi-algo), anonymize (salted SHA-256), pseudonym (short IDs).
  • Metrics - track_count, track_avg, track_bucket, track_unique power --metrics and --end reports.
  • Datetime - to_datetime, to_duration, now_utc, formatting helpers, arithmetic.
  • Environment & control - get_env, read_file, read_lines, exit.

Example pipeline with shared configuration and sliding logic:

kelora -j example_logs/sample.jsonl \
  --begin 'conf.error_levels = ["ERROR", "FATAL"]; conf.retry_threshold = 2' \
  --filter 'conf.error_levels.contains(e.level)' \
  --exec 'if e.get_path("retry", 0) >= conf.retry_threshold { track_count("retries"); }' \
  --window 1 \
  --exec 'let comps = window_values("component"); if comps.len() > 1 && comps[0] == comps[1] { e.context = "repeat_component"; }' \
  --metrics

See kelora --help-rhai for syntax essentials and kelora --help-functions for the complete catalog.

Multiline Strategies

Preset Use When Equivalent
stacktrace ISO or syslog timestamps leading each entry -M timestamp
docker Docker JSON logs (RFC3339 timestamps) -M timestamp:pattern=^\\d{4}-\\d{2}-\\d{2}T
syslog RFC3164/5424 headers (Jan 2, <34>1 2024-01-01T...) `-M timestamp:pattern=^(<\d+>\d\s+\d{4}-\d{2}-\d{2}T
combined Apache/Nginx access logs with remote host prefix -M start:^\\S+\\s+\\S+\\s+\\S+\\s+\\[
nginx Nginx error logs prefixed with [dd/Mon/yyyy: -M timestamp:pattern=^\\[[0-9]{2}/[A-Za-z]{3}/[0-9]{4}:
continuation Lines ending with \\ continue the current event -M backslash
block BEGIN ... END sections form a single event -M boundary:start=^BEGIN:end=^END
whole Treat the entire input as one event (fixtures, preformatted payloads) -M whole

When you pick a timestamp-based strategy (the presets above or -M timestamp), you can also provide --ts-format=<chrono fmt> so Kelora matches your exact timestamp prefix instead of relying solely on the preset regex.

Build custom strategies with timestamp:pattern=..., indent, start:REGEX, end:REGEX, boundary, backslash[:char=...], or whole (single mega-event). Remember that buffering happens until a boundary is found; when running with --parallel, lower --batch-size or --batch-timeout to keep long multi-line frames flushable, and whole will buffer the entire stream in memory.

Configuration & Defaults

Define repeatable pipelines in ~/.config/kelora/config.ini:

# Defaults applied to every run
defaults = --stats --parallel --input-tz UTC

[aliases]
errors = -l error --since 1h --stats
warnings = --filter 'e.level == "WARN" || e.level == "WARNING"'
slow-queries = --filter 'e.duration > 1000' --exec 'e.slow = true' --keys timestamp,query,duration

Usage:

kelora example_logs/sample.jsonl            # Uses defaults
kelora --config-file custom.ini example_logs/sample.log  # Swap configuration files
kelora --no-stats example_logs/sample.log                # Override a default
kelora -a errors example_logs/sample.jsonl  # Run the alias
kelora --show-config                        # Inspect the merged configuration
kelora -l error --stats --save-alias errors # Save current command as alias

Pair configs with --ignore-config for hermetic runs or CI pipelines.

Example Pipelines

# Real-time nginx monitoring with enrichment, metrics, and alerting
tail -f /var/log/nginx/access.log | \
  kelora -f combined \
    --exec 'let status = e.status.to_int(); e.class = if status >= 500 { "server_error" } else if status >= 400 { "client_error" } else { "ok" };' \
    --filter 'e.class != "ok"' \
    --exec 'track_count("errors"); if e.class == "server_error" { eprint("🚨 " + e.status + " " + e.request); }' \
    --metrics

# Authentication watch with sliding windows and unique counters
kelora -f syslog example_logs/sample.syslog \
  --filter 'e.message.contains("Failed login")' \
  --window 4 \
  --exec 'let count = 0; for msg in window_values("message") { if msg.contains("Failed login") { count += 1; } } if count >= 3 { eprint("🚨 repeated failures: " + e.message.extract_ip()); track_unique("alert_ips", e.message.extract_ip()); }' \
  --metrics

# Convert syslog to structured JSON and redact sensitive fields
kelora -f syslog example_logs/sample.syslog \
  --exec 'e.severity_label = if e.severity <= 3 { "critical" } else if e.severity <= 4 { "error" } else { "info" }; e.host = e.host.mask_ip(1);' \
  -J

# Anonymize PII while maintaining linkability and sampling
kelora -j access.log --salt "$KELORA_SALT" \
  --exec 'e.user_pseudo = pseudonym(e.user_id, 8); e.ip_anon = anonymize(e.ip)' \
  --filter 'bucket(e.user_id) % 10 == 0' \
  --keys user_pseudo,ip_anon,action,status

Learning Path

  1. Events - Practice accessing and mutating e.field values on JSON or logfmt samples.
  2. Parsing - Try multiple formats (-f json, -f combined, -f 'cols:...') and experiment with --extract-prefix.
  3. Basic Scripts - Layer --filter, --exec, and --keys for simple transformations.
  4. Metrics & Stats - Introduce track_count, track_avg, --metrics, and --stats.
  5. Pipelines - Chain multiple --filter/--exec stages, add --begin configuration, and export results.
  6. Output Control - Switch between -F modes, apply -k/-K, and format timestamps.
  7. Windows - Explore --window, window_values, and window_numbers for sequence detection.
  8. Multi-stage Workloads - Combine --parallel, config aliases, and --end summarization for production-style jobs.

Each milestone builds on the previous one; you can be productive early, then layer in advanced concepts as needed.

Documentation Shortcuts

kelora --help            # CLI overview, option catalog, examples
kelora --help-rhai       # Rhai syntax and scripting patterns
kelora --help-functions  # Built-in functions grouped by domain
kelora --help-multiline  # Multiline strategy reference
kelora --help-time       # Timestamp parsing and formatting guide

When to Reach for Kelora

  • Use Kelora when you need programmable filtering, enrichment, aggregation, or custom metrics directly in the terminal.
  • Pair with other tools: pipe results into jq, ripgrep, or lnav—Kelora focuses on transformation, not visualization.
  • Prefer other tools when you need interactive browsing (lnav), raw text search (ripgrep), heavy-duty JSON querying (jq), dashboards (Grafana), or centralized log shipping (Fluentd).

See also: angle-grinder, pq, Miller.

License

MIT - see the license file for full details.