systemd-journal-sdk-core 0.6.3

Core pure Rust systemd journal file reader and writer primitives

Rust Journal SDK

This workspace contains pure-Rust systemd journal reader and writer components. It does not link to libsystemd or other system journal libraries for SDK behavior.

crates.io Usage

The public Rust SDK package is systemd-journal-sdk. Use a Cargo dependency alias if existing code should import it as journal:

[dependencies]
journal = { package = "systemd-journal-sdk", version = "0.6.1" }

The workspace also publishes project-prefixed lower-level packages for consumers that need direct access to the same internal layers used by the SDK:

  • systemd-journal-sdk-common
  • systemd-journal-sdk-core
  • systemd-journal-sdk-registry
  • systemd-journal-sdk-log-writer
  • systemd-journal-sdk-index
  • systemd-journal-sdk-engine

Current writer scope:

  • regular journal files by default and compact journal files with JournalFileOptions::with_compact(true) or Config::with_compact(true);
  • uncompressed DATA objects by default;
  • optional zstd, xz, and lz4-compressed DATA object writing through JournalFileOptions and journal::Config, using systemd's 512-byte default threshold and 8-byte minimum clamp;
  • keyed hash tables using the journal file ID;
  • byte-safe field values through &[u8] field payloads;
  • direct-file writing through journal_core;
  • high-level directory writing through journal::Log;
  • systemd-compatible 0640 journal file permissions by default, configurable for newly-created files through JournalFileOptions::with_file_mode() and Config::with_file_mode();
  • chain active naming by default, with Config::with_strict_systemd_naming(true) available for strict systemd <source>.journal active naming;
  • shared field-name policy layers for direct-file and directory writers: default FieldNamePolicy::Journald, app-facing FieldNamePolicy::JournalApp, and structure-level FieldNamePolicy::Raw;
  • entry-count, file-size, and duration rotation;
  • tracked journal-file-count, byte-size, and age retention;
  • optional pure cross-SDK cooperative lockfile with stale-owner detection when callers explicitly acquire journal_core::file::lock::WriterLock;
  • Forward Secure Sealing TAG writing through SealOptions, including stock journalctl --verify --verify-key coverage for sealed files generated by this writer;
  • FSS SealOptions::start_usec normalization to systemd's verification-key epoch boundary, so unaligned source timestamps still produce sealed files that stock journalctl --verify --verify-key can validate;
  • low-level EntryWriteOptions::seqnum(...) and EntryWriteOptions::boot_id(...) exact-regeneration support for preserving ENTRY sequence gaps and per-entry boot IDs when rewriting existing journal files. Leave them unset for normal auto-incrementing sequence numbers and the writer-wide boot ID;
  • native systemd writers do not participate in the SDK lock protocol and remain an operational exclusion;
  • live stock-reader validation for the current writer slice with journalctl --file, journalctl --file --follow --no-tail --boot=all, and libsystemd reader APIs, including live sequence-order checks;
  • configurable explicit live-reader publication cadence through JournalWriter::set_live_publish_every_entries() and Config::with_live_publish_every_entries(), defaulting to systemd-compatible publication after every entry.
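
The compression threshold rule from the writer scope above can be sketched in a few lines. This is a minimal illustration and not the SDK's implementation: the constant and function names are hypothetical, and the `>=` comparison follows systemd's own writer, which the SDK is assumed to match.

```rust
/// systemd's default DATA compression threshold in bytes (from the list above).
const DEFAULT_COMPRESS_THRESHOLD: u64 = 512;
/// Smallest threshold accepted; lower requests are clamped up to this value.
const MIN_COMPRESS_THRESHOLD: u64 = 8;

/// Hypothetical helper: resolve the effective threshold from an optional request.
fn effective_threshold(requested: Option<u64>) -> u64 {
    match requested {
        None => DEFAULT_COMPRESS_THRESHOLD,
        Some(bytes) => bytes.max(MIN_COMPRESS_THRESHOLD),
    }
}

/// Hypothetical helper: a DATA payload is a compression candidate only when
/// compression is enabled and the payload is at least `threshold` bytes long.
fn should_compress(enabled: bool, payload_len: u64, threshold: u64) -> bool {
    enabled && payload_len >= threshold
}
```

Under these rules a requested threshold of 1 byte is raised to 8, and a 511-byte payload stays uncompressed with the default threshold.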

Deferred scope:

  • appending to arbitrary historical or systemd-created journal variants. In particular, append-open on historical unkeyed-hash files is unsupported and returns a controlled error before entry mutation;
  • the imported legacy jf journal_file::JournalWriter remains available for compatibility with that crate's public surface, but it is not the supported production writer path. It also returns a controlled unsupported-file error for unkeyed append targets instead of panicking. New writer integrations should use journal_core::file::JournalWriter or the high-level journal::Log directory writer;
  • full systemd object-graph verification parity beyond the current repository verification API.

Current reader scope:

  • regular and compact journal files;
  • .journal, .journal~, .journal.zst, and .journal~.zst files;
  • zstd-compressed fixture files;
  • zstd, lz4, and xz-compressed DATA objects through pure-Rust dependencies;
  • directory reading across active and archived files with bounded recursive traversal, symlink-cycle protection, and interleaved multi-file ordering, including mixed regular/compact, compressed/uncompressed, sealed/unsealed, and whole-file .journal.zst files in one directory;
  • forward/backward iteration, cursors, realtime and monotonic timestamps, seqnum metadata, field enumeration, binary field values, repeated field values, stateful current-entry data enumeration, unique value enumeration, and export/json/text formatting;
  • byte-preserving RAW field-name access through Entry::raw_fields(), Entry::get_raw(), and Entry::get_raw_values(); Entry.fields and Entry.field_values are UTF-8 string-keyed convenience maps and do not synthesize lossy names for non-UTF-8 RAW field names;
  • export byte output preserves non-UTF-8 RAW field names; JSON output, field enumeration, unique queries, and get_data facade helpers remain UTF-8 field-name surfaces;
  • libsystemd-compatible facade functions for open file/directory/files, close, seek head/tail/realtime/cursor, next/previous/skip, match groups, current-entry data enumeration, field enumeration, unique value enumeration, realtime/monotonic/seqnum/cursor metadata, and boot listing;
  • facade cursor seeking follows libsystemd semantics: valid missing cursors are accepted as seek locations, while test_cursor checks exact current position;
  • current-entry facade data enumeration returns borrowed FIELD=value bytes for the current DATA object, matching libsystemd-style validity until the current row is reset or the reader advances; uncompressed DATA is returned directly from the mmap-backed journal payload, while compressed DATA is copied into row-owned stable storage so later compressed DATA reads cannot invalidate earlier pointers from the same row;
  • direct facade unique queries return language-native (field, value) pairs; stateful unique enumeration returns full binary-safe FIELD=value payloads;
  • FileReader::visit_unique_values() and DirectoryReader::visit_unique_values() stream indexed unique values without first materializing the full result set;
  • FileReader::explore() provides an optimized single-file query surface for log-explorer workloads: exact indexed filters, selected facet counters, optional histogram, optional FTS, optional returned rows, and query counters. It lazily classifies reusable DATA objects by DATA offset during candidate-row traversal, groups facets that share the same effective filter set into one traversal pass, and expands all fields only for returned rows. ExplorerAnchor::Auto is the default: forward queries start from the lower time bound or file head, while backward queries start from the upper time bound or file tail. ExplorerFieldMode::FirstValue is the default explorer accounting mode: one selected facet/histogram/source field contributes at most one value per row, so traversal may stop after all required fields are found. ExplorerFieldMode::AllValues is available when a caller needs exact duplicate-value accounting and accepts the slower full-row scan. explore() owns the reader position and replaces the reader match state while it runs; callers should explicitly seek and reapply any manual matches before continuing normal iteration after an explorer query;
  • FileReader::explore_with_strategy() exposes explicit strategy selection. ExplorerStrategy::Traversal is the default behavior used by explore(). ExplorerStrategy::Index derives all-values facet and histogram counts from FIELD/DATA indexes and DATA entry posting lists. It rejects default first-value semantics, FTS, and source-realtime-bounded queries instead of returning approximate results. ExplorerStrategy::Compare runs traversal and index, fails if their logical outputs differ, and returns timing/counter diagnostics in ExplorerResult::comparison. There is no automatic planner because index aggregation is faster only for some query shapes;
  • journal::netdata provides the Netdata-specific Rust function boundary over the explorer. It is the SDK API intended to replace Netdata's generic systemd-journal.plugin logs function. NetdataJournalFunction::systemd_journal() runs a systemd-journal request JSON against a journal directory and returns Netdata-shaped function JSON. This layer owns Netdata request parsing, default facets, default display columns, histogram defaults, field presentation transforms, row options, and zero-count vocabulary padding for filtered requests. The default profile keeps UID/GID values as raw journal data and does not resolve host user or group names. The separate NetdataJournalFunction::systemd_journal_plugin_compatible() constructor opts into host user/group name presentation to emulate Netdata's installed plugin, with per-query UID/GID display caching so repeated values do not repeatedly call host name-service lookups. This layer is intentionally separate from the core journal file-format reader. Consumers that need Netdata function control can use run_directory_request_json_with_options() or run_directory_request_bytes_with_options() with NetdataFunctionRunOptions to supply a timeout, progress callback, cancellation callback, and optional caller-owned NetdataFunctionState. Progress is reported against the files selected for the query after source and time-window preselection, including file-end progress for small or fast files. Cancellation is checked before each selected file, during active Explorer scans, and after file-end progress callbacks. The optional state hook lets Netdata pass registry-provided source type/name metadata and persist per-file learned journal-vs-source-realtime drift. Without state, the wrapper falls back to journal headers and plugin-compatible filename classification for built-in __logs_sources groups. Sampling uses plugin-compatible sampled, unsampled, and estimated counters for full-analysis sliced requests and is disabled for data-only requests. 
The query request member uses Netdata SIMPLE_PATTERN behavior: ordered | terms, leading ! negative terms, escaped separators, substring * parts, and case-insensitive matching. The SDK Netdata boundary always executes indexed slice semantics. The slice request member is retained in the normalized echo because it is part of the plugin request shape; it does not select a slower non-slice fallback path. Cancellation and no-change responses use Netdata's compact function error envelope; timeout returns a partial table response;
  • src/internal/testcmd/netdata_function_wrapper is a thin offline test adapter over the SDK Netdata boundary. It exposes the same CLI shape as Netdata's plugin test path: netdata_function_wrapper --test systemd-journal --dir <journal-dir> --timeout <seconds> < <request.json>. The request JSON is read from stdin to avoid privileged file reads in test binaries. The comparison tools under ../tests/netdata_function/ compare semantic function output against an external systemd-journal.plugin binary. The wrapper has diagnostic-only --progress-jsonl, --cancel-immediately, and --cancel-after-progress switches to validate the SDK run-control API; production consumers should call journal::netdata directly and wire callbacks to their own function framework;
  • default reader options use live/windowed mmap with a 32 MiB window. Smaller windows are available for constrained environments, but high-cardinality indexed queries can become remap-bound with very small windows;
  • --output export uses systemd's size-prefixed binary field encoding and blank-line entry separator;
  • JSON output includes realtime and monotonic timestamps, preserves valid UTF-8 strings, and encodes binary values as arrays of unsigned bytes;
  • libsystemd-style match behavior: AND between different fields, OR between values for the same field, SdJournalAddDisjunction() for +, and SdJournalAddConjunction() for explicit AND groups;
  • a file-backed journalctl command under src/cmd/journalctl with --since, --until, --boot, and --follow support for repository-backed files and directories;
  • verification APIs: journal::verify_file() for structural verification and journal::verify_file_with_key() for sealed TAG/HMAC verification;
  • a conformance adapter under src/adapter.
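
The export encoding mentioned in the reader scope above follows systemd's Journal Export Format. The sketch below is a standalone illustration using only std. `export_field` and `export_entry` are hypothetical names, not SDK functions, and the text-safety test (valid UTF-8 with no control characters) is a simplification of the check stock journalctl performs.

```rust
/// Append one field in journal export format.
/// Text-safe values are written as `FIELD=value\n`; all other values are written
/// as `FIELD\n`, a 64-bit little-endian length, the raw bytes, and `\n`.
fn export_field(out: &mut Vec<u8>, name: &[u8], value: &[u8]) {
    // Assumption: "text-safe" means valid UTF-8 without control characters.
    let text_safe = std::str::from_utf8(value)
        .map(|s| !s.chars().any(char::is_control))
        .unwrap_or(false);

    out.extend_from_slice(name);
    if text_safe {
        out.push(b'=');
        out.extend_from_slice(value);
    } else {
        out.push(b'\n');
        out.extend_from_slice(&(value.len() as u64).to_le_bytes());
        out.extend_from_slice(value);
    }
    out.push(b'\n');
}

/// Serialize one entry: every field in order, then a blank line as separator.
fn export_entry(fields: &[(&[u8], &[u8])]) -> Vec<u8> {
    let mut out = Vec::new();
    for (name, value) in fields {
        export_field(&mut out, name, value);
    }
    out.push(b'\n');
    out
}
```

The size prefix is what lets a consumer read values containing newlines or NUL bytes without scanning for a terminator.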

Platform behavior:

  • Linux is the validated reference runtime and keeps mmap-backed hot paths, monotonic timestamps, Unix directory sync, and SIGBUS handling.
  • FreeBSD and macOS builds use monotonic timestamps and the same pure file reader/writer paths. Optional identity and lock helpers are separate from the core file-format writer.
  • Windows builds use unbiased interrupt time for automatic writer timestamps and no-op directory fsync/SIGBUS hooks. Optional identity and lock helpers are separate from the core file-format writer.
  • Non-Linux build checks are compilation evidence only unless runtime evidence from that OS is recorded separately. Files written on non-Linux targets must still pass Linux stock journalctl --verify --file and repository interoperability checks before production compatibility is claimed.

Reader limitations:

  • list_boots uses file-level boot metadata in this slice;
  • full systemd object-graph verification parity is tracked separately;
  • daemon-only journalctl operations are not implemented.

Basic directory writer usage:

use journal::{Config, Log, Origin, RetentionPolicy, RotationPolicy, Source};

let origin = Origin {
    machine_id: None,
    namespace: None,
    source: Source::System,
};
let config = Config::new(
    origin,
    RotationPolicy::default()
        .with_number_of_entries(100000)
        .with_duration_of_journal_file(std::time::Duration::from_secs(3600)),
    RetentionPolicy::default()
        .with_number_of_journal_files(10)
        .with_duration_of_journal_files(std::time::Duration::from_secs(7 * 24 * 3600)),
);
let mut log = Log::new("/var/log/journal-sdk", config)?;

log.write_entry(
    &[
        b"MESSAGE=plugin started".as_slice(),
        b"PRIORITY=6".as_slice(),
        b"SYSLOG_IDENTIFIER=example-plugin".as_slice(),
    ],
    None,
)?;
log.sync()?;
log.close()?;
# Ok::<(), Box<dyn std::error::Error>>(())

Log stores files below <directory>/<machine-id>/. By default the active file uses the chain filename form <source>@<seqnum-id>-<head-seqnum>-<head-realtime>.journal; call Config::with_strict_systemd_naming(true) to use <source>.journal as the active file. If strict naming opens a directory with a stale chain-named ONLINE active file, it archives that file before creating <source>.journal, so the directory does not keep parallel active files.

If an existing active file is rejected by the low-level append-open path as unsupported, Log follows journald's reliable-open behavior: it uses readable header metadata to continue sequence identity where possible, moves the old active file to a collision-safe *.journal~ disposed name, and creates a fresh active file. Direct low-level append-open still returns an unsupported error.

Unset rotation and retention limits are disabled. Retention counts the tracked active/current file in file-count and committed-byte limits, but deletion only selects older unprotected files owned by the configured source; the tracked active/current file is never deleted to satisfy a retention limit. Duration rotation is checked before append using the incoming entry realtime and the active file head realtime. Call Log::enforce_retention() to apply age/count/byte retention without waiting for another append-triggered rotation or close. Call Log::close() to archive the current file and enforce retention; Drop only performs best-effort state persistence. Retention also runs once when a writer opens or creates the active file: existing-active reopen and LogOpenMode::Eager enforce it during construction, while lazy archived-only construction defers enforcement until the first append opens the active file, before the first entry is written.

Use Config::with_open_mode(LogOpenMode::Eager) to create/open the active file during construction, and Config::with_identity_mode(LogIdentityMode::Strict) plus Origin.machine_id and Config::with_boot_id() to require explicit identity. LogIdentityMode::Auto uses explicit IDs when provided and otherwise generates SDK-local IDs; it does not read host identity sources. Log::configured_directory(), Log::journal_directory(), Log::active_path(), Log::machine_id(), Log::boot_id(), and Log::source() expose the same directory/identity contract as the other SDKs. Lifecycle observers receive Created, Rotated, and RetainedDeleted events; Log::with_artifact_sizer() includes per-journal sidecar bytes in retained-size decisions.

write_entry_with_timestamps() accepts EntryTimestamps::source_realtime_usec for _SOURCE_REALTIME_TIMESTAMP injection and clamps non-progressing realtime and monotonic overrides forward. The low-level JournalWriter::add_entry() path preserves explicit caller-provided realtime and monotonic timestamps without clamping or rejecting them; callers using that raw API are responsible for not producing same-boot backward monotonic entries unless they are intentionally creating invalid fixtures. On reopen, Log seeds the monotonic clamp floor from a persisted chain tail only when the tail entry boot ID matches the current writer boot ID.

Log is a single-writer object; callers must serialize method calls on one instance. The journal file contract is one writer per file. Acquire journal_core::file::lock::WriterLock when the caller wants the optional cooperating-writer lock helper to reject another SDK writer for the same file.

Config::with_field_name_policy() selects the high-level writer field-name layer. The default FieldNamePolicy::Journald preserves trusted systemd fields such as _HOSTNAME and _TRANSPORT. FieldNamePolicy::JournalApp drops caller fields that journald would reject from untrusted applications and fails only when no caller fields remain. FieldNamePolicy::Raw accepts any non-empty field name that does not contain =, but RAW-mode files are not guaranteed to be accepted by stock systemd tooling. Producer-specific field transformations belong outside the SDK.
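
The three field-name layers can be approximated with the name rules systemd applies to journal fields: at most 64 bytes, only A-Z, 0-9, and underscore, no leading digit, and a leading underscore reserved for trusted fields. The sketch below is an assumption about how those rules map onto the policies described above. `SketchPolicy`, `is_journald_name`, and `accepts` are hypothetical names, and the SDK's actual checks may differ.

```rust
/// Hypothetical stand-in for the three policy layers named in the text.
#[derive(Clone, Copy)]
enum SketchPolicy {
    Journald,
    JournalApp,
    Raw,
}

/// systemd-style field-name check. `allow_protected` permits a leading
/// underscore, which journald reserves for trusted fields such as _HOSTNAME.
fn is_journald_name(name: &[u8], allow_protected: bool) -> bool {
    let Some(&first) = name.first() else {
        return false; // empty names are never valid
    };
    if name.len() > 64 || first.is_ascii_digit() {
        return false;
    }
    if first == b'_' && !allow_protected {
        return false;
    }
    name.iter()
        .all(|&b| b.is_ascii_uppercase() || b.is_ascii_digit() || b == b'_')
}

/// Decide whether a field name passes the given policy layer.
fn accepts(policy: SketchPolicy, name: &[u8]) -> bool {
    match policy {
        // Assumption: the default layer keeps trusted (underscore) fields.
        SketchPolicy::Journald => is_journald_name(name, true),
        // Assumption: the app layer applies the untrusted-client rules.
        SketchPolicy::JournalApp => is_journald_name(name, false),
        // From the text: any non-empty name that does not contain '='.
        SketchPolicy::Raw => !name.is_empty() && !name.contains(&b'='),
    }
}
```

In this sketch `_HOSTNAME` passes the Journald layer but not the JournalApp layer, while a name such as `\xffRAW` passes only the Raw layer.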

Journal files are created with systemd journald's 0640 default permissions. Use JournalFileOptions::with_file_mode() for direct-file writers or Config::with_file_mode() for directory writers when a consumer needs another mode. The override applies only to newly-created files; existing files keep their current filesystem permissions. POSIX modes remain subject to the process umask, matching systemd/open semantics. Non-POSIX platforms may ignore POSIX mode bits.
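
The interaction between a requested mode and the process umask is plain bit arithmetic. The sketch below shows the calculation and, separately, how a create mode is passed through the standard library on Unix. Both function names are hypothetical and neither is part of the SDK.

```rust
/// Permission bits that remain after the kernel applies the process umask
/// to the mode requested at file creation.
fn effective_mode(requested: u32, umask: u32) -> u32 {
    requested & !umask & 0o7777
}

/// Unix-only illustration of requesting a create mode with std.
/// The mode is only used when the file is newly created, and the kernel
/// still applies the umask to it.
#[cfg(unix)]
#[allow(dead_code)]
fn create_with_mode(path: &std::path::Path, mode: u32) -> std::io::Result<std::fs::File> {
    use std::os::unix::fs::OpenOptionsExt;
    std::fs::OpenOptions::new()
        .write(true)
        .create_new(true)
        .mode(mode)
        .open(path)
}
```

With the common umask 022 the 0640 default is unchanged, while a umask of 077 reduces it to 0600.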

Live-reader publication can be tuned when the consumer does not need immediate stock follow-reader wakeups:

let config = config.with_live_publish_every_entries(64);

A value of 1 is the default and publishes after every entry. A value of 0 disables explicit SDK live publication for poll/snapshot consumers. Any N > 1 publishes after every N entries. This setting does not affect fsync or durability.
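
The cadence rule can be written as a single predicate. This is a sketch of the documented behavior, not the SDK's code, and `should_publish` is a hypothetical name.

```rust
/// Decide whether the writer should publish to live readers after the
/// `entries_written`-th entry (counted from 1).
/// `publish_every == 0` disables explicit publication entirely.
fn should_publish(entries_written: u64, publish_every: u64) -> bool {
    publish_every != 0 && entries_written % publish_every == 0
}
```

For 200 appended entries, a cadence of 64 publishes three times (after entries 64, 128, and 192), a cadence of 1 publishes 200 times, and 0 never publishes.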

Binary-safe values:

log.write_entry(
    &[
        b"MESSAGE=sample with binary payload".as_slice(),
        b"BINARY_PAYLOAD=\x00\x01\x02\xff".as_slice(),
    ],
    None,
)?;
# Ok::<(), Box<dyn std::error::Error>>(())

Basic reader usage:

use journal::FileReader;

let mut reader = FileReader::open("/path/to/system.journal")?;
reader.add_match(b"PRIORITY=6");
reader.seek_head();

while reader.next()? {
    let entry = reader.get_entry()?;
    if let Some(message) = entry.get_str("MESSAGE") {
        println!("{message}");
    }
}
# Ok::<(), Box<dyn std::error::Error>>(())

Optimized single-file explorer usage:

use journal::{ExplorerQuery, FileReader};

let mut reader = FileReader::open("/path/to/system.journal")?;
let result = reader.explore(&ExplorerQuery {
    facets: vec![b"PRIORITY".to_vec()],
    limit: 0,
    ..ExplorerQuery::default()
})?;

if let Some(priority) = result.facets.get(b"PRIORITY".as_slice()) {
    for (value, count) in priority {
        println!("{} {count}", String::from_utf8_lossy(value));
    }
}
# Ok::<(), Box<dyn std::error::Error>>(())

The default first-value mode counts at most one value per selected field per row. Use ExplorerFieldMode::AllValues when a row may contain repeated values for a selected facet or histogram field and every duplicate value must count.
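
The difference between the two accounting modes shows up only on rows that repeat a field. The sketch below counts one facet over in-memory rows. `SketchFieldMode` and `count_facet` are hypothetical stand-ins that illustrate the semantics and do not reflect how the explorer traverses journal files.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the two explorer accounting modes.
#[derive(Clone, Copy, PartialEq)]
enum SketchFieldMode {
    FirstValue,
    AllValues,
}

/// Count values of `field` across rows, where each row is a list of
/// (field, value) pairs in entry order.
fn count_facet(
    rows: &[&[(&str, &str)]],
    field: &str,
    mode: SketchFieldMode,
) -> BTreeMap<String, u64> {
    let mut counts = BTreeMap::new();
    for row in rows {
        for (name, value) in row.iter() {
            if *name != field {
                continue;
            }
            *counts.entry(value.to_string()).or_insert(0) += 1;
            // First-value mode stops at the first occurrence in this row.
            if mode == SketchFieldMode::FirstValue {
                break;
            }
        }
    }
    counts
}
```

For a row carrying PRIORITY=6 and PRIORITY=3, first-value mode counts only 6, while all-values mode counts both.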

Explorer column catalogs are built from FIELD indexes. Do not use row traversal to discover columns in production; a comparison that needs debug_collect_column_fields_by_row_traversal has found a bug in the explorer or its column-catalog setup, not a valid operating mode.

Specialized callers can select an execution strategy:

use journal::{ExplorerFieldMode, ExplorerQuery, ExplorerStrategy, FileReader};

let mut reader = FileReader::open("/path/to/system.journal")?;
let result = reader.explore_with_strategy(
    &ExplorerQuery {
        facets: vec![b"PRIORITY".to_vec()],
        field_mode: ExplorerFieldMode::AllValues,
        use_source_realtime: false,
        limit: 0,
        ..ExplorerQuery::default()
    },
    ExplorerStrategy::Index,
)?;
# Ok::<(), Box<dyn std::error::Error>>(())

The index strategy is exact for its supported subset, but it is not a universal speedup. It can be much faster for narrow unfiltered all-values facets and histograms, and slower for many facets or selective filters. Use ExplorerStrategy::Compare when validating a query shape before relying on the index strategy; successful compare results include traversal and index timings and stats in ExplorerResult::comparison.

The default ExplorerAnchor::Auto chooses the natural scan start for the query direction. Use explicit Head, Tail, or Realtime(usec) anchors only for manual paging or when the caller intentionally wants a non-default start point.
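
The Auto rule described above can be expressed as a small resolution function. All types here (`SketchDirection`, `SketchAnchor`, `ScanStart`) and the function `resolve_start` are hypothetical. They mirror the documented behavior only and are not the SDK's types.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchDirection {
    Forward,
    Backward,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchAnchor {
    Auto,
    Head,
    Tail,
    Realtime(u64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ScanStart {
    Head,
    Tail,
    Realtime(u64),
}

/// Resolve where a scan begins. Explicit anchors are used as given; Auto picks
/// the lower time bound (or head) when scanning forward and the upper time
/// bound (or tail) when scanning backward.
fn resolve_start(
    anchor: SketchAnchor,
    direction: SketchDirection,
    lower_usec: Option<u64>,
    upper_usec: Option<u64>,
) -> ScanStart {
    match (anchor, direction) {
        (SketchAnchor::Head, _) => ScanStart::Head,
        (SketchAnchor::Tail, _) => ScanStart::Tail,
        (SketchAnchor::Realtime(usec), _) => ScanStart::Realtime(usec),
        (SketchAnchor::Auto, SketchDirection::Forward) => {
            lower_usec.map_or(ScanStart::Head, ScanStart::Realtime)
        }
        (SketchAnchor::Auto, SketchDirection::Backward) => {
            upper_usec.map_or(ScanStart::Tail, ScanStart::Realtime)
        }
    }
}
```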

For RAW-mode files, use the byte-keyed entry surface when field names are not guaranteed to be UTF-8:

if let Some(value) = entry.get_raw(b"\xffRAW") {
    assert_eq!(value, b"raw value");
}

for field in entry.raw_fields() {
    let name_bytes = field.name;
    let value_bytes = field.value;
}

File-backed journalctl:

cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
  --file fixtures/systemd/test-data/no-rtc/system.journal.zst \
  --head 1 \
  --output json

Repeated matches for the same field are OR alternatives. Matches for different fields are ANDed. A separate + argument creates an explicit disjunction:

cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
  --file ./sample.journal \
  PRIORITY=3 PRIORITY=4 + MESSAGE=boot
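
The match rules above can be modeled over in-memory entries as follows. This is a sketch of the libsystemd-style semantics, not the SDK's matcher: `parse_matches` and `entry_matches` are hypothetical helpers, values are compared for exact equality, and an empty match list is treated as matching every entry.

```rust
use std::collections::BTreeMap;

/// Split command-line match arguments into `+`-separated groups of
/// (field, value) terms. Arguments without '=' are ignored in this sketch.
fn parse_matches<'a>(args: &[&'a str]) -> Vec<Vec<(&'a str, &'a str)>> {
    let mut groups = vec![Vec::new()];
    for &arg in args {
        if arg == "+" {
            groups.push(Vec::new());
        } else if let Some((field, value)) = arg.split_once('=') {
            // `groups` always holds at least one group, so unwrap cannot fail.
            groups.last_mut().unwrap().push((field, value));
        }
    }
    groups
}

/// An entry matches when any group matches. Within a group, every field must
/// match (AND) and a field matches when any of its listed values is present (OR).
fn entry_matches(groups: &[Vec<(&str, &str)>], entry: &[(&str, &str)]) -> bool {
    let mut saw_group = false;
    for group in groups.iter().filter(|g| !g.is_empty()) {
        saw_group = true;
        let mut by_field: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
        for &(field, value) in group {
            by_field.entry(field).or_default().push(value);
        }
        let group_ok = by_field.iter().all(|(field, values)| {
            entry.iter().any(|(f, v)| f == field && values.contains(v))
        });
        if group_ok {
            return true;
        }
    }
    // No match terms at all: nothing restricts the result set.
    !saw_group
}
```

With the arguments from the command above, an entry with PRIORITY=4 matches through the first group, and an entry whose MESSAGE is exactly boot matches through the second.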

Realtime ranges, boot filters, and follow mode are supported for file-backed inputs:

cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
  --directory ./journals --boot=all --since @1700000000 --until @1700003600
cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
  --file ./active.journal --follow --no-tail --boot=all
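
The @ arguments above are seconds since the Unix epoch, while journal realtime timestamps are microseconds. The sketch below shows that conversion and a window test. `parse_epoch_usec` and `in_window` are hypothetical helpers that handle only the integer @seconds form, and inclusive bounds are an assumption based on stock journalctl behavior.

```rust
/// Parse an `@<seconds>` argument into microseconds since the Unix epoch.
/// Returns None for other timestamp syntaxes or on overflow.
fn parse_epoch_usec(arg: &str) -> Option<u64> {
    let secs: u64 = arg.strip_prefix('@')?.parse().ok()?;
    secs.checked_mul(1_000_000)
}

/// Check an entry's realtime timestamp against optional bounds.
/// Assumption: both bounds are inclusive.
fn in_window(realtime_usec: u64, since: Option<u64>, until: Option<u64>) -> bool {
    since.map_or(true, |s| realtime_usec >= s) && until.map_or(true, |u| realtime_usec <= u)
}
```

The two bounds in the first command are 3600 seconds apart, so they select a one-hour window.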