gene 0.7.0

Crate providing a log matching framework written in Rust

Gene - High-Performance Event Scanning and Filtering Engine


Project Overview

Gene is a Rust implementation of the original Gene project, designed for high-performance event scanning and filtering. Built primarily to power the Kunai security monitoring system, it provides a flexible, efficient rule-based engine for processing structured log events.

Purpose

  • Embeddable security event scanning engine
  • High-throughput log processing and filtering
  • Rule-based detection system for security monitoring

Key Technologies

  • Rule Format: YAML-based rule definitions for easy authoring
  • Pattern Matching: Advanced field matching with XPath-like syntax
  • Performance: Optimized for low-latency, high-volume event processing

Target Audience

  • Security engineers building detection systems
  • DevOps teams implementing log monitoring
  • Rust developers needing event processing capabilities

Installation

Add Gene to your project:

cargo add gene
cargo add gene_derive

Quickstart

use gene::{Compiler, Engine, Event, FieldGetter, FieldValue, FieldNameIterator};
use gene_derive::{Event, FieldGetter};

// 1. Define your event structure
#[derive(Event, FieldGetter)]
#[event(id = 1, source = "syslog".into())]
struct LogEvent {
    message: String,
    severity: u8,
}

// 2. Create compiler and load rules
let mut compiler = Compiler::new();
compiler.load_rules_from_str(
    r#"
name: high.severity
matches:
    $sev: .severity > '5'
condition: $sev"#
).unwrap();

// 3. Build the scanning engine
let mut engine = Engine::try_from(compiler).unwrap();

// 4. Scan events
let event = LogEvent {
    message: "Critical error".to_string(),
    severity: 8,
};

let scan_result = engine.scan(&event).unwrap();
if scan_result.includes_detection("high.severity") {
    println!("High severity event detected!");
}

Core Concepts

Concept      Description
Events       Structured data representing log entries or system events
Rules        Pattern matching and condition evaluation definitions
Matches      Field extraction and pattern matching expressions
Conditions   Boolean logic combining match results
Decisions    Include/exclude logic for scan results
Templates    Dynamic rule configuration through variable substitution

Rule Format

Gene uses YAML for rule definitions, providing a clean and structured format:

name: mimic.kthread
meta:
    tags: [ 'os:linux' ]
    attack: [ T1036 ]
    authors: [ 0xrawsec ]
    comments:
        - tries to catch binaries masquerading kernel threads
match-on:
    events:
        kunai: [1,2]  # Match specific event types
matches:
    $task_is_kthread: .info.task.flags &= '0x200000'
    $kthread_names: .info.task.name ~= '^(kworker)'
condition: not $task_is_kthread and $kthread_names
severity: 10

Rule Components

  • name: Unique rule identifier
  • meta: Metadata including tags, attack IDs, authors
  • match-on: Event type filtering
  • matches: Field extraction and pattern matching
  • condition: Boolean logic for detection
  • severity: Numerical severity level
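
To make the matches and condition fields concrete, the logic of the mimic.kthread rule above can be written out by hand. This is only a sketch using the standard library, not Gene's implementation: the TaskInfo struct and the function names are hypothetical, and the reading of &= as "all bits of the mask are set" is an assumption (for a single-bit mask such as 0x200000 it coincides with "any bit of the mask is set"). The regex ^(kworker) is replaced by an equivalent prefix test to avoid a regex dependency.

```rust
/// Hypothetical stand-in for the fields the rule reads
/// (.info.task.flags and .info.task.name); not a Kunai or Gene type.
struct TaskInfo {
    flags: u64,
    name: String,
}

/// Mask taken from the rule: `.info.task.flags &= '0x200000'`.
const KTHREAD_FLAG: u64 = 0x200000;

/// `$task_is_kthread`, assuming `&=` tests that the mask bits are set.
fn task_is_kthread(task: &TaskInfo) -> bool {
    (task.flags & KTHREAD_FLAG) == KTHREAD_FLAG
}

/// `$kthread_names`: the pattern `^(kworker)` is a plain prefix check.
fn kthread_names(task: &TaskInfo) -> bool {
    task.name.starts_with("kworker")
}

/// `condition: not $task_is_kthread and $kthread_names`
fn mimic_kthread(task: &TaskInfo) -> bool {
    !task_is_kthread(task) && kthread_names(task)
}

fn main() {
    // A genuine kernel thread: flag set, kworker name.
    let genuine = TaskInfo { flags: KTHREAD_FLAG | 0x40, name: "kworker/0:1".to_string() };
    // A userland binary borrowing the name: flag not set.
    let impostor = TaskInfo { flags: 0x40, name: "kworker/0:1".to_string() };

    println!("genuine kthread flagged:  {}", mimic_kthread(&genuine));
    println!("impostor flagged:         {}", mimic_kthread(&impostor));
}
```

Only the second task satisfies the condition: its name looks like a kernel worker, but the kernel-thread flag is missing.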

Features

High Performance

  • Optimized for low-latency event processing
  • Efficient pattern matching algorithms
  • Minimal memory overhead

Flexible Matching

  • XPath-like field access (.field.subfield)
  • Regular expression support (~= operator)
  • Bitwise operations (&=, |=, etc.)
  • Comparison operators (>, <, ==, etc.)
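
The dotted field access listed above can be pictured as a walk through nested maps, one path segment at a time. The sketch below illustrates that idea with the standard library only; the Value enum and get_field function are hypothetical and are not Gene's FieldValue or FieldGetter, whose actual behavior is not described in this document.

```rust
use std::collections::HashMap;

/// Hypothetical event value for this sketch; not Gene's `FieldValue`.
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Num(i64),
    Map(HashMap<String, Value>),
}

/// Resolve a path such as `.info.task.name` by descending into nested maps.
/// Returns `None` if a segment is missing or a non-map value is traversed.
fn get_field<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    path.split('.')
        .filter(|segment| !segment.is_empty()) // drops the empty piece before the leading dot
        .try_fold(root, |current, segment| match current {
            Value::Map(map) => map.get(segment),
            _ => None,
        })
}

/// Build a small nested event shaped like the paths used in the rule examples.
fn sample_event() -> Value {
    let task = Value::Map(HashMap::from([
        ("name".to_string(), Value::Str("kworker/0:1".to_string())),
        ("flags".to_string(), Value::Num(0x40)),
    ]));
    let info = Value::Map(HashMap::from([("task".to_string(), task)]));
    Value::Map(HashMap::from([("info".to_string(), info)]))
}

fn main() {
    let event = sample_event();
    println!("{:?}", get_field(&event, ".info.task.name"));
    println!("{:?}", get_field(&event, ".info.task.missing"));
}
```

A matcher would then apply an operator such as ~= or &= to the resolved value; a path that resolves to nothing simply cannot match.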

Advanced Capabilities

  • Rule Dependencies: Chain rules together for complex detection logic
  • Template System: Dynamic rule configuration with variable substitution
  • Metadata Support: Rich metadata including MITRE ATT&CK mappings
  • Decision System: Fine-grained control over event inclusion/exclusion

Performance Benchmarks

Benchmarks were run with real detection rules and security events:

Hundred-ish Rules (127 rules)

Number of scanned events: 1,001,600 (1,327.72 MB)
Scan duration: 1.28s
Throughput: 1,037.66 MB/s | 782,784.83 events/s
Detections: 550

Thousand-ish Rules (1,016 rules)

Number of scanned events: 1,001,600 (1,327.72 MB)
Scan duration: 9.54s
Throughput: 139.24 MB/s | 105,042.31 events/s
Detections: 550

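Note: Performance scales with rule count and complexity. On the same 1,001,600 events, an 8x larger rule set (127 to 1,016 rules) lengthens the scan by about 7.5x (1.28s to 9.54s), so cost grows roughly linearly with the number of rules. Even with the larger set, Gene still processes over 100,000 events/s, which keeps it from becoming a bottleneck when embedded in another application.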
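
The relationship between the two runs can be checked with a few lines of arithmetic. The sketch below only recomputes figures from the numbers reported above; the Run struct and function names are made up for illustration. Recomputed throughput differs slightly from the reported values because the published durations are rounded to two decimals.

```rust
/// One benchmark run, filled in from the figures reported above.
struct Run {
    rules: u32,
    events: f64,
    megabytes: f64,
    seconds: f64,
}

impl Run {
    fn events_per_second(&self) -> f64 {
        self.events / self.seconds
    }

    fn mb_per_second(&self) -> f64 {
        self.megabytes / self.seconds
    }
}

fn small_run() -> Run {
    Run { rules: 127, events: 1_001_600.0, megabytes: 1_327.72, seconds: 1.28 }
}

fn large_run() -> Run {
    Run { rules: 1_016, events: 1_001_600.0, megabytes: 1_327.72, seconds: 9.54 }
}

/// How many times more rules the second run uses.
fn rule_ratio(a: &Run, b: &Run) -> f64 {
    f64::from(b.rules) / f64::from(a.rules)
}

/// How many times longer the second run takes on the same events.
fn slowdown(a: &Run, b: &Run) -> f64 {
    b.seconds / a.seconds
}

fn main() {
    let (small, large) = (small_run(), large_run());
    for run in [&small, &large] {
        println!(
            "{} rules: {:.2} MB/s, {:.2} events/s",
            run.rules,
            run.mb_per_second(),
            run.events_per_second()
        );
    }
    println!("rule ratio: {:.2}x", rule_ratio(&small, &large));
    println!("slowdown:   {:.2}x", slowdown(&small, &large));
}
```

The rule ratio is exactly 8 and the slowdown is about 7.45, which is why the cost per event looks close to linear in the number of rules for these two data points.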

Contributing

  • Report issues on GitHub
  • Submit pull requests with clear descriptions
  • Follow Rust API guidelines and documentation standards
  • Keep cargo test passing and cargo clippy free of warnings

License

Gene is licensed under the GPL-3.0 - see the LICENSE file for details.