fail2ban-log-parser-core 0.1.1

Fast, zero-copy fail2ban log parser built with winnow

Fail2Ban log parser

[!IMPORTANT] This library is still WIP.

Installation

cargo add fail2ban-log-parser-core

With serde support:

cargo add fail2ban-log-parser-core --features serde

With parallel parsing (multi-threaded via Rayon):

cargo add fail2ban-log-parser-core --features parallel

Usage

Parse a single line

use fail2ban_log_parser_core::parse;

let line = "2024-01-15 14:32:01,847 fail2ban.filter [12345] INFO [sshd] Found 192.168.1.1";
let log = parse(line).next().unwrap().unwrap();

assert_eq!(log.jail(), Some("sshd"));
assert_eq!(log.pid(), Some(12345));
println!("{:?} {:?} {:?}", log.event(), log.ip(), log.timestamp());

Parse a batch and handle errors

use fail2ban_log_parser_core::parse;

let input = "\
2024-01-15 14:32:01,847 fail2ban.filter [12345] INFO [sshd] Found 192.168.1.100
this line is not a valid log entry
2024-01-15 14:32:03,456 fail2ban.actions [12345] NOTICE [sshd] Ban 192.168.1.100";

for result in parse(input) {
    match result {
        Ok(log) => println!("{:?} {:?}", log.event(), log.ip()),
        Err(e) => eprintln!("parse error: {e}"),
    }
}

Filter specific events

use fail2ban_log_parser_core::{Fail2BanEvent, parse};

let input = "..."; // multi-line log
let bans: Vec<_> = parse(input)
    .filter_map(|r| r.ok())
    .filter(|log| log.event() == Some(&Fail2BanEvent::Ban))
    .collect();

Log structure

2024-01-15 14:32:01,847  fail2ban.filter  [12345]  INFO  [sshd] Found 1.2.3.4
|___________________|    |______________|  |____|   |__|  |____| |__| |______|
    timestamp              header            pid    level  jail  event  IP address
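
As a rough illustration of the layout above, the following std-only sketch splits one line into the seven fields. `Fields` and `split_fields` are hypothetical names invented for this example. This is not the crate's winnow-based parser, and it only handles the default `YYYY-MM-DD HH:MM:SS,mmm` timestamp.

```rust
/// Hypothetical stand-in for a parsed line; every string field borrows from the input.
#[derive(Debug, PartialEq)]
struct Fields<'a> {
    timestamp: &'a str,
    header: &'a str,
    pid: u32,
    level: &'a str,
    jail: &'a str,
    event: &'a str,
    ip: &'a str,
}

/// Splits a line laid out as: date time header [pid] level [jail] event ip.
/// Returns None as soon as any field is missing or malformed.
fn split_fields(line: &str) -> Option<Fields<'_>> {
    // The timestamp covers the first two space-separated tokens (date and time),
    // so it ends at the second space in the line.
    let (ts_end, _) = line.match_indices(' ').nth(1)?;
    let timestamp = &line[..ts_end];

    // The remaining fields are whitespace-separated; repeated spaces are tolerated.
    let mut rest = line[ts_end..].split_whitespace();
    let header = rest.next()?;
    let pid = rest
        .next()?
        .strip_prefix('[')?
        .strip_suffix(']')?
        .parse::<u32>()
        .ok()?;
    let level = rest.next()?;
    let jail = rest.next()?.strip_prefix('[')?.strip_suffix(']')?;
    let event = rest.next()?;
    let ip = rest.next()?;

    Some(Fields { timestamp, header, pid, level, jail, event, ip })
}
```

Calling `split_fields` on the sample line from the diagram yields `pid == 12345`, `jail == "sshd"`, and `ip == "1.2.3.4"`; a line without a bracketed pid returns `None`.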

Supported formats

| Field | Formats |
|---|---|
| Timestamp | `2024-01-15 14:32:01,847`, `Jan 15 2024 14:32:01,847`, `2024-01-15T14:32:01,847Z`, `±HH:MM` offset |
| Header | `fail2ban.filter`, `fail2ban.actions`, `fail2ban.server` |
| Level | `INFO`, `NOTICE`, `WARNING`, `ERROR`, `DEBUG` (case-insensitive) |
| Event | `Found`, `Ban`, `Unban`, `Restore`, `Ignore`, `AlreadyBanned`, `Failed`, `Unknown` |
| IP | IPv4 (`192.168.1.1`) and IPv6 (`2001:db8::1`) |
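
To make two of the rules above concrete, here is a small std-only sketch of case-insensitive level matching and of accepting both IPv4 and IPv6 addresses. `Level`, `parse_level`, and `parse_ip` are hypothetical names for this example; the crate's own types and parsing code may differ.

```rust
use std::net::IpAddr;

/// Hypothetical mirror of the five supported levels.
#[derive(Debug, PartialEq)]
enum Level {
    Info,
    Notice,
    Warning,
    Error,
    Debug,
}

/// Matches a level name regardless of ASCII case, without allocating.
fn parse_level(s: &str) -> Option<Level> {
    if s.eq_ignore_ascii_case("INFO") {
        Some(Level::Info)
    } else if s.eq_ignore_ascii_case("NOTICE") {
        Some(Level::Notice)
    } else if s.eq_ignore_ascii_case("WARNING") {
        Some(Level::Warning)
    } else if s.eq_ignore_ascii_case("ERROR") {
        Some(Level::Error)
    } else if s.eq_ignore_ascii_case("DEBUG") {
        Some(Level::Debug)
    } else {
        None
    }
}

/// Accepts either address family by delegating to the standard library.
fn parse_ip(s: &str) -> Option<IpAddr> {
    s.parse().ok()
}
```

With these helpers, `parse_level("warning")` and `parse_level("WARNING")` both give `Some(Level::Warning)`, and `parse_ip` accepts `192.168.1.1` as well as `2001:db8::1`.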

API

| Type | Description |
|---|---|
| `parse(&str)` | Returns an `Iterator<Item = Result<Fail2BanStructuredLog, ParseError>>`. Sequential by default; with the `parallel` feature, lines are parsed concurrently via Rayon (same API). |
| `Fail2BanStructuredLog` | Parsed log line with accessor methods: `timestamp()`, `header()`, `pid()`, `level()`, `jail()`, `event()`, `ip()` |
| `Fail2BanEvent` | Enum: `Found`, `Ban`, `Unban`, `Restore`, `Ignore`, `AlreadyBanned`, `Failed`, `Unknown` |
| `Fail2BanHeaderType` | Enum: `Filter`, `Actions`, `Server` |
| `Fail2BanLevel` | Enum: `Info`, `Notice`, `Warning`, `Error`, `Debug` |
| `ParseError` | Contains `line_number: usize` and `line: String` |

Features

| Feature | Description |
|---|---|
| `serde` | Enables `Serialize`/`Deserialize` on all public types |
| `parallel` | Multi-threaded parsing via Rayon. Same `parse()` API, lines parsed concurrently. Not supported on wasm32 targets (compile-time error). |
| `debug_errors` | Extra error debugging information |
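
One common way to turn an unsupported feature/target combination into a compile-time error is a `cfg`-gated `compile_error!`. The sketch below shows that general pattern; it is an assumption about how such a guard could look, not a copy of the crate's source, and `parallel_enabled` is a hypothetical helper added only so the effect of the feature flag can be observed.

```rust
// Assumed guard: refuse to build when `parallel` is enabled for a wasm32 target.
#[cfg(all(feature = "parallel", target_arch = "wasm32"))]
compile_error!("the `parallel` feature is not supported on wasm32 targets");

/// Hypothetical helper: reports whether this build enabled the `parallel` feature.
pub fn parallel_enabled() -> bool {
    cfg!(feature = "parallel")
}
```

Because the check runs at compile time, an unsupported build fails before any code runs rather than at the first call to a threaded code path.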

Examples

cargo run -p fail2ban-log-parser-core --example parse_single
cargo run -p fail2ban-log-parser-core --example parse_batch
cargo run -p fail2ban-log-parser-core --example filter_bans

How it works

flowchart TD
    A["&str (multi-line log)"] --> B["parse()"]
    B --> C{parallel feature?}
    C -->|No| D["Split into lines (lazy)"]
    C -->|Yes| E["Collect lines + Rayon par_iter"]
    D --> F["parse_log_line() per line"]
    E --> F

    F --> T["timestamp"]
    F --> H["header"]
    F --> P["pid"]
    F --> L["level"]
    F --> J["jail"]
    F --> EV["event"]
    F --> I["ip"]

    T & H & P & L & J & EV & I --> R{Result}
    R -->|Ok| S["Fail2BanStructuredLog"]
    R -->|Err| ERR["ParseError"]

    S & ERR --> O["Iterator of Results"]
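
The two branches of the flowchart can be sketched with the standard library alone. This example swaps Rayon for `std::thread::scope` so that it has no dependencies, and it uses a deliberately trivial per-line parser. `parse_line`, `parse_sequential`, `parse_parallel`, and the `(line_number, line)` error tuple are hypothetical stand-ins, and the 1-based line numbering is an assumption.

```rust
use std::net::IpAddr;
use std::thread;

/// Stand-in result: the parsed IP, or (line_number, line) on failure.
type LineResult = Result<IpAddr, (usize, String)>;

/// Toy per-line parser: succeeds only if the last token is an IP address.
fn parse_line(line_number: usize, line: &str) -> LineResult {
    line.split_whitespace()
        .last()
        .and_then(|tok| tok.parse().ok())
        .ok_or_else(|| (line_number, line.to_string()))
}

/// Sequential branch: lines are split and parsed lazily, one per `next()` call.
fn parse_sequential(input: &str) -> impl Iterator<Item = LineResult> + '_ {
    input.lines().enumerate().map(|(i, l)| parse_line(i + 1, l))
}

/// Parallel branch: collect all lines first, parse contiguous chunks on
/// scoped threads, then concatenate the chunks so input order is preserved.
fn parse_parallel(input: &str, workers: usize) -> Vec<LineResult> {
    let lines: Vec<(usize, &str)> = input.lines().enumerate().collect();
    let workers = workers.max(1);
    // `chunks` panics on a size of zero, so clamp for empty input.
    let chunk = ((lines.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = lines
            .chunks(chunk)
            .map(|c| {
                s.spawn(move || {
                    c.iter()
                        .map(|&(i, l)| parse_line(i + 1, l))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps results in line order.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Both functions produce the same sequence of results for the same input. The parallel variant has to materialize every line up front, which is the trade-off the "Collect lines" node in the flowchart refers to.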

Benchmarks

Measured with Criterion.rs + dhat on Apple M4 Pro (12 cores) / 48 GB / rustc 1.94.0.

Performance evolution can be seen here.

Single-line parsing

Time to parse one log line in isolation.

| Variant | Time |
|---|---|
| ISO date + IPv4 | ~121 ns |
| Syslog date | ~127 ns |
| ISO 8601 (T-separator) | ~121 ns |
| IPv6 address | ~144 ns |

Batch parsing

Sequential (default)

Single-threaded, lazy iterator. Each line is parsed on demand.

| Lines | Total time | Per line |
|---|---|---|
| 10 | ~1.35 µs | ~135 ns |
| 100 | ~14.1 µs | ~141 ns |
| 1,000 | ~142 µs | ~142 ns |
| 10,000 | ~1.42 ms | ~142 ns |
| 100,000 | ~14.3 ms | ~143 ns |
| 1,000,000 | ~143 ms | ~143 ns |

Parallel (--features parallel)

Multi-threaded via Rayon. All lines are parsed concurrently, then yielded in order.

| Lines | Total time | Per line | Speedup |
|---|---|---|---|
| 1,000 | ~114 µs | ~114 ns | 1.2x |
| 10,000 | ~442 µs | ~44 ns | 3.2x |
| 100,000 | ~2.77 ms | ~28 ns | 5.2x |
| 1,000,000 | ~26.3 ms | ~26 ns | 5.4x |

Thread-coordination overhead makes the parallel path slower than the sequential one for small inputs (<1,000 lines). The speedup scales with core count, so expect different results on different hardware.
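
The Speedup column is the sequential total time divided by the parallel total time for the same line count. The sketch below recomputes it from the figures in the two batch tables; `speedup`, `ROWS`, and `speedup_column` are names made up for this example.

```rust
/// Speedup of the parallel run over the sequential run (both in the same unit).
fn speedup(sequential: f64, parallel: f64) -> f64 {
    sequential / parallel
}

/// (lines, sequential total in µs, parallel total in µs), copied from the tables.
const ROWS: [(u32, f64, f64); 4] = [
    (1_000, 142.0, 114.0),
    (10_000, 1_420.0, 442.0),
    (100_000, 14_300.0, 2_770.0),
    (1_000_000, 143_000.0, 26_300.0),
];

/// Formats each row's speedup to one decimal place, like the table does.
fn speedup_column() -> Vec<String> {
    ROWS.iter()
        .map(|&(_, seq, par)| format!("{:.1}x", speedup(seq, par)))
        .collect()
}
```

Running `speedup_column()` gives `1.2x`, `3.2x`, `5.2x`, and `5.4x`, matching the published column.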

Collection strategies

Time to process 1,000 lines (sequential) with different consumption patterns.

| Strategy | Time |
|---|---|
| Iterate + count | ~144 µs |
| Collect to Vec | ~150 µs |
| Partition ok/err | ~147 µs |
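
For reference, the three consumption patterns look roughly like this on any iterator of `Result`s. To keep the sketch dependency-free, `parse_stub` is a hypothetical stand-in for `parse`: it treats a line as valid when it starts with a digit, returning the line length on success and the (assumed 1-based) line number on failure.

```rust
/// Hypothetical stand-in for `parse`: Ok(line length) or Err(line number).
fn parse_stub(input: &str) -> impl Iterator<Item = Result<usize, usize>> + '_ {
    input.lines().enumerate().map(|(i, l)| {
        if l.starts_with(|c: char| c.is_ascii_digit()) {
            Ok(l.len())
        } else {
            Err(i + 1)
        }
    })
}

/// Iterate + count: drives the iterator without storing any result.
fn count_only(input: &str) -> usize {
    parse_stub(input).count()
}

/// Collect to Vec: keeps every result, successes and failures alike.
fn collect_all(input: &str) -> Vec<Result<usize, usize>> {
    parse_stub(input).collect()
}

/// Partition ok/err: splits results into two Vecs in a single pass,
/// then unwraps each side into its payload type.
fn partition_ok_err(input: &str) -> (Vec<usize>, Vec<usize>) {
    let (ok, err): (Vec<_>, Vec<_>) = parse_stub(input).partition(|r| r.is_ok());
    (
        ok.into_iter().filter_map(Result::ok).collect(),
        err.into_iter().filter_map(Result::err).collect(),
    )
}
```

The same three shapes apply unchanged to the iterator returned by `parse`, since they rely only on standard `Iterator` methods.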

Error handling

| Scenario | Time |
|---|---|
| 1,000 lines (50% invalid) | ~81 µs |
| All 8 event types (8 lines) | ~1.14 µs |

Memory usage (sequential, collect to Vec)

Heap allocations when collecting all parsed results into a Vec. Measured with dhat.

| Lines | Total allocated | Peak in-use | Alloc count | Per line |
|---|---|---|---|---|
| 1 | 224 B | 224 B | 1 | 224 B |
| 100 | 13.78 KB | 7.00 KB | 6 | 141 B |
| 1,000 | 111.78 KB | 56.00 KB | 9 | 114 B |
| 10,000 | 1.75 MB | 896.00 KB | 13 | 183 B |
| 100,000 | 14.00 MB | 7.00 MB | 16 | 146 B |
| 1,000,000 | 112.00 MB | 56.00 MB | 19 | 117 B |

The alloc count stays low (19 for 1M lines) because Fail2BanStructuredLog borrows from the input: only the Vec's backing buffer is heap-allocated, and it is reallocated only when the Vec grows.
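
The figures in the table are consistent with a simple growth model: a `Vec` that starts at capacity 4 and doubles whenever it fills, holding 56-byte elements. Both numbers are inferred from the table (224 B for one line divided by a first capacity of 4); neither is stated by the crate. `ELEM_SIZE` and `vec_growth` are hypothetical names for this sketch.

```rust
/// Assumed size of one parsed entry in bytes (inferred: 224 B / capacity 4).
const ELEM_SIZE: usize = 56;

/// Models pushing `n` items into an empty Vec that starts at capacity 4
/// and doubles on growth. Returns (alloc_count, total_bytes, final_buffer_bytes).
fn vec_growth(n: usize) -> (usize, usize, usize) {
    let mut cap = 4; // assumed first non-zero capacity
    let mut allocs = 1;
    let mut total = cap * ELEM_SIZE;
    while cap < n {
        cap *= 2; // each growth step allocates a buffer twice as large
        allocs += 1;
        total += cap * ELEM_SIZE;
    }
    (allocs, total, cap * ELEM_SIZE)
}
```

For 100 lines the model gives 6 allocations, 14,112 B in total (13.78 KB), and a 7,168 B final buffer (7.00 KB). For 1,000,000 lines it gives 19 allocations, 117,440,288 B in total (112.00 MB), and a 58,720,256 B final buffer (56.00 MB). Under this model, the uneven "Per line" column reflects how close each line count sits to the next power-of-two capacity.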

Reproduce

# Sequential benchmarks
cargo bench -p fail2ban-log-parser-core --bench parsing

# Parallel benchmarks (requires parallel feature)
cargo bench -p fail2ban-log-parser-core --bench parsing --features parallel