# Rust Journal SDK
This workspace contains pure-Rust systemd journal reader and writer components.
It does not link to libsystemd or other system journal libraries for SDK
behavior.
## crates.io Usage
The public Rust SDK package is `systemd-journal-sdk`. Use a Cargo dependency
alias if existing code should import it as `journal`:
```toml
[dependencies]
journal = { package = "systemd-journal-sdk", version = "0.7.2" }
```
The workspace also publishes project-prefixed lower-level packages for
consumers that need direct access to the same internal layers used by the SDK:
- `systemd-journal-sdk-common`
- `systemd-journal-sdk-core`
- `systemd-journal-sdk-registry`
- `systemd-journal-sdk-log-writer`
- `systemd-journal-sdk-index`
- `systemd-journal-sdk-engine`
Current writer scope:
- regular journal files by default and compact journal files with
`JournalFileOptions::with_compact(true)` or `Config::with_compact(true)`;
- uncompressed DATA objects by default;
- optional zstd, xz, and lz4-compressed DATA object writing through
`JournalFileOptions` and `journal::Config`, using systemd's 512-byte default
threshold and 8-byte minimum clamp;
- keyed hash tables using the journal file ID;
- byte-safe field values through `&[u8]` field payloads;
- direct-file writing through `journal_core`;
- high-level directory writing through `journal::Log`;
- systemd-compatible `0640` journal file permissions by default, configurable
for newly-created files through `JournalFileOptions::with_file_mode()` and
`Config::with_file_mode()`;
- chain active naming by default, with
`Config::with_strict_systemd_naming(true)` available for strict systemd
`<source>.journal` active naming;
- shared field-name policy layers for direct-file and directory writers:
default `FieldNamePolicy::Journald`, app-facing
`FieldNamePolicy::JournalApp`, and structure-level `FieldNamePolicy::Raw`;
- entry-count, file-size, and duration rotation;
- tracked journal-file-count, byte-size, and age retention;
- optional pure cross-SDK cooperative lockfile with stale-owner detection when
callers explicitly acquire `journal_core::file::lock::WriterLock`;
- Forward Secure Sealing TAG writing through `SealOptions`, including stock
`journalctl --verify --verify-key` coverage for sealed files generated by
this writer;
- FSS `SealOptions::start_usec` normalization to systemd's verification-key
epoch boundary, so unaligned source timestamps still produce sealed files that
stock `journalctl --verify --verify-key` can validate;
- low-level `EntryWriteOptions::seqnum(...)` and
`EntryWriteOptions::boot_id(...)` exact-regeneration support for preserving
ENTRY sequence gaps and per-entry boot IDs when rewriting existing journal
files. Leave them unset for normal auto-incrementing sequence numbers and the
writer-wide boot ID;
- native systemd writers do not participate in the SDK lock protocol, so keeping
them away from SDK-owned files remains an operational responsibility;
- live stock-reader validation for the current writer slice with
`journalctl --file`, `journalctl --file --follow --no-tail --boot=all`, and
libsystemd reader APIs, including live sequence-order checks;
- configurable explicit live-reader publication cadence through
`JournalWriter::set_live_publish_every_entries()` and
`Config::with_live_publish_every_entries()`, defaulting to systemd-compatible
publication after every entry.
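The compression threshold rule in the list above can be sketched as follows. This is a minimal illustration, not the SDK's code: the constant and function names are hypothetical, and the sketch assumes that payloads at or above the threshold are compression candidates.

```rust
/// Default DATA compression threshold quoted in the list above.
const DEFAULT_COMPRESS_THRESHOLD: u64 = 512;
/// Smallest threshold honored after clamping, as quoted above.
const MIN_COMPRESS_THRESHOLD: u64 = 8;

/// Resolve a caller-supplied threshold (hypothetical helper).
/// `None` selects the default; explicit values are clamped up to the minimum.
fn effective_threshold(requested: Option<u64>) -> u64 {
    match requested {
        None => DEFAULT_COMPRESS_THRESHOLD,
        Some(bytes) => bytes.max(MIN_COMPRESS_THRESHOLD),
    }
}

/// Assumption: a payload is a compression candidate when its length is at
/// or above the effective threshold.
fn is_compression_candidate(payload_len: u64, requested: Option<u64>) -> bool {
    payload_len >= effective_threshold(requested)
}
```

With these rules a requested threshold of 1 byte behaves like 8 bytes, and an unset threshold behaves like 512 bytes.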
Deferred scope:
- appending to arbitrary historical or systemd-created journal variants. In
particular, append-open on historical unkeyed-hash files is unsupported and
returns a controlled error before entry mutation;
- the imported legacy `jf` `journal_file::JournalWriter` remains available for
compatibility with that crate's public surface, but it is not the supported
production writer path. It also returns a controlled unsupported-file error
for unkeyed append targets instead of panicking. New writer integrations
should use `journal_core::file::JournalWriter` or the high-level
`journal::Log` directory writer;
- full systemd object-graph verification parity beyond the current repository
verification API.
Current reader scope:
- regular and compact journal files;
- `.journal`, `.journal~`, `.journal.zst`, and `.journal~.zst` files;
- zstd-compressed fixture files;
- zstd, lz4, and xz-compressed DATA objects through pure-Rust dependencies;
- directory reading across active and archived files with bounded recursive
traversal, symlink-cycle protection, and interleaved multi-file ordering,
including mixed regular/compact, compressed/uncompressed,
sealed/unsealed, and whole-file `.journal.zst` files in one directory;
- forward/backward iteration, cursors, realtime and monotonic timestamps,
seqnum metadata, field enumeration, binary field values, repeated field
values, stateful current-entry data enumeration, unique value enumeration,
and export/json/text formatting;
- byte-preserving RAW field-name access through `Entry::raw_fields()`,
`Entry::get_raw()`, and `Entry::get_raw_values()`;
`Entry.fields` and `Entry.field_values` are UTF-8 string-keyed convenience
maps and do not synthesize lossy names for non-UTF8 RAW field names;
- export byte output preserves non-UTF8 RAW field names; JSON output, field
enumeration, unique queries, and `get_data` facade helpers remain UTF-8
field-name surfaces;
- libsystemd-compatible facade functions for open file/directory/files, close,
seek head/tail/realtime/cursor, next/previous/skip, match groups,
current-entry data enumeration, field enumeration, unique value enumeration,
realtime/monotonic/seqnum/cursor metadata, and boot listing;
- facade cursor seeking follows libsystemd semantics: valid missing cursors are
accepted as seek locations, while `test_cursor` checks exact current position;
- current-entry facade data enumeration returns borrowed `FIELD=value` bytes
for the current DATA object, matching libsystemd-style validity until the
current row is reset or the reader advances; uncompressed DATA is returned
directly from the mmap-backed journal payload, while compressed DATA is copied
into row-owned stable storage so later compressed DATA reads cannot invalidate
earlier pointers from the same row;
- direct facade unique queries return language-native `(field, value)` pairs;
stateful unique enumeration returns full binary-safe `FIELD=value` payloads;
- `FileReader::visit_unique_values()` and
`DirectoryReader::visit_unique_values()` stream indexed unique values without
first materializing the full result set;
- `FileReader::explore()` provides an optimized single-file query surface for
log-explorer workloads: exact indexed filters, selected facet counters,
optional histogram, optional FTS, optional returned rows, and query counters.
It lazily classifies reusable DATA objects by DATA offset during candidate-row
traversal, groups facets that share the same effective filter set into one
traversal pass, and expands all fields only for returned rows.
`ExplorerAnchor::Auto` is the default: forward queries start from the lower
time bound or file head, while backward queries start from the upper time
bound or file tail. `ExplorerFieldMode::FirstValue` is the default explorer
accounting mode: one selected facet/histogram/source field contributes at
most one value per row, so traversal may stop after all required fields are
found. `ExplorerFieldMode::AllValues` is available when a caller needs exact
duplicate-value accounting and accepts the slower full-row scan. `explore()`
owns the reader position and replaces the reader match state while it runs;
callers should explicitly seek and reapply any manual matches before
continuing normal iteration after an explorer query;
- `FileReader::explore_with_strategy()` exposes explicit strategy selection.
`ExplorerStrategy::Traversal` is the default behavior used by `explore()`.
`ExplorerStrategy::Index` derives all-values facet and histogram counts from
FIELD/DATA indexes and DATA entry posting lists. It rejects default
first-value semantics, FTS, and source-realtime-bounded queries instead of
returning approximate results. `ExplorerStrategy::Compare` runs traversal and
index, fails if their logical outputs differ, and returns timing/counter
diagnostics in `ExplorerResult::comparison`. There is no automatic planner
because index aggregation is faster only for some query shapes;
- `journal::netdata` provides the Netdata-specific Rust function boundary over
the explorer. It is the SDK API intended to replace Netdata's generic
`systemd-journal.plugin` logs function. `NetdataJournalFunction::systemd_journal()`
runs a `systemd-journal` request JSON against a journal directory and returns
Netdata-shaped function JSON. This layer owns Netdata request parsing,
default facets, default display columns, histogram defaults, field
presentation transforms, row options, and zero-count vocabulary padding for
filtered requests. The default profile keeps UID/GID values as raw journal
data and does not resolve host user or group names. The separate
`NetdataJournalFunction::systemd_journal_plugin_compatible()` constructor
opts into host user/group name presentation to emulate Netdata's installed
plugin, with per-query UID/GID display caching so repeated values do not
repeatedly call host name-service lookups. This layer is intentionally
separate from the core journal file-format reader. Consumers that need
Netdata function control can use
`run_directory_request_json_with_options()` or
`run_directory_request_bytes_with_options()` with
`NetdataFunctionRunOptions` to supply a timeout, progress callback,
cancellation callback, and optional caller-owned `NetdataFunctionState`.
Progress is reported against the files selected for the query after source
and time-window preselection, including file-end progress for small or fast
files. Cancellation is checked before each selected file, during active
Explorer scans, and after file-end progress callbacks. The optional state hook
lets Netdata pass registry-provided source type/name metadata and persist
per-file learned journal-vs-source-realtime drift. Without state, the wrapper
falls back to journal headers and plugin-compatible filename classification
for built-in `__logs_sources` groups.
`NetdataFunctionConfig::source_selector_name` and `source_selector_help`
customize only the displayed selector label/help while preserving the
`__logs_sources` wire id. Sampling uses plugin-compatible sampled, unsampled,
and estimated counters for full-analysis sliced requests and is disabled for
data-only requests. The `query` request member uses Netdata `SIMPLE_PATTERN`
behavior: ordered `|` terms, leading `!` negative terms, escaped separators,
substring `*` parts, and case-insensitive matching. The SDK Netdata boundary
always executes indexed slice semantics. The `slice` request member is
retained in the normalized echo because it is part of the plugin request
shape; it does not select a slower non-slice fallback path. Cancellation and
no-change responses use Netdata's compact function error envelope; timeout
returns a partial table response;
- `src/internal/testcmd/netdata_function_wrapper` is a thin offline test adapter
over the SDK Netdata boundary. It exposes the same CLI shape as Netdata's
plugin test path:
`netdata_function_wrapper --test systemd-journal --dir <journal-dir>
--timeout <seconds> < <request.json>`. The request JSON is read from stdin
to avoid privileged file reads in test binaries. The comparison tools under
`../tests/netdata_function/` compare semantic function output against an
external `systemd-journal.plugin` binary. The wrapper has diagnostic-only
`--progress-jsonl`, `--cancel-immediately`, and `--cancel-after-progress`
switches to validate the SDK run-control API; production consumers should
call `journal::netdata` directly and wire callbacks to their own function
framework;
- default reader options use live/windowed mmap with a 32 MiB window. Smaller
windows are available for constrained environments, but high-cardinality
indexed queries can become remap-bound with very small windows;
- `--output export` uses systemd's size-prefixed binary field encoding and
blank-line entry separator;
- JSON output includes realtime and monotonic timestamps, preserves valid UTF-8
strings, and encodes binary values as arrays of unsigned bytes;
- libsystemd-style match behavior: AND between different fields, OR between
values for the same field, `SdJournalAddDisjunction()` for `+`, and
`SdJournalAddConjunction()` for explicit AND groups;
- a file-backed `journalctl` command under `src/cmd/journalctl` with
`--since`, `--until`, `--boot`, and `--follow` support for repository-backed
files and directories;
- verification APIs: `journal::verify_file()` for structural verification and
`journal::verify_file_with_key()` for sealed TAG/HMAC verification;
- a conformance adapter under `src/adapter`.
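The export encoding mentioned in the list above (size-prefixed binary fields and a blank-line entry separator) can be sketched as below. The function names are hypothetical and this is not the SDK's serializer. The text-versus-binary test is a simplified assumption: a value is written as text only when it is valid UTF-8 without control characters other than tab.

```rust
/// Append one field in export form (hypothetical helper).
fn write_export_field(out: &mut Vec<u8>, name: &[u8], value: &[u8]) {
    // Simplified assumption: text form only for printable UTF-8 values.
    let is_text = std::str::from_utf8(value)
        .map(|s| s.chars().all(|c| c == '\t' || !c.is_control()))
        .unwrap_or(false);
    out.extend_from_slice(name);
    if is_text {
        // Text form: NAME=value
        out.push(b'=');
        out.extend_from_slice(value);
    } else {
        // Binary form: NAME, newline, 64-bit little-endian length, raw bytes.
        out.push(b'\n');
        out.extend_from_slice(&(value.len() as u64).to_le_bytes());
        out.extend_from_slice(value);
    }
    out.push(b'\n');
}

/// Serialize one entry; the trailing empty line separates entries.
fn write_export_entry(out: &mut Vec<u8>, fields: &[(&[u8], &[u8])]) {
    for (name, value) in fields {
        write_export_field(out, name, value);
    }
    out.push(b'\n');
}
```

A reader of this stream can tell the two forms apart because a text field always contains `=` before its first newline, while a binary field's name line does not.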
Platform behavior:
- Linux is the validated reference runtime and keeps mmap-backed hot paths,
monotonic timestamps, Unix directory sync, and SIGBUS handling.
- FreeBSD and macOS builds use monotonic timestamps and the same pure file
reader/writer paths. Optional identity and lock helpers are separate from the
core file-format writer.
- Windows builds use unbiased interrupt time for automatic writer timestamps
and no-op directory fsync/SIGBUS hooks. Optional identity and lock helpers
are separate from the core file-format writer.
- Non-Linux build checks are compilation evidence only unless runtime evidence
from that OS is recorded separately. Files written on non-Linux targets must
still pass Linux stock `journalctl --verify --file` and repository
interoperability checks before production compatibility is claimed.
Reader limitations:
- `list_boots` uses file-level boot metadata in this slice;
- full systemd object-graph verification parity is tracked separately;
- daemon-only journalctl operations are not implemented.
Basic directory writer usage:
```rust
use journal::{Config, Log, Origin, RetentionPolicy, RotationPolicy, Source};
// Identify the writer; IDs left as `None` are resolved by the identity mode.
let origin = Origin {
    machine_id: None,
    namespace: None,
    source: Source::System,
};
// Rotate after 100,000 entries or one hour; retain at most 10 files or 7 days.
let config = Config::new(
    origin,
    RotationPolicy::default()
        .with_number_of_entries(100000)
        .with_duration_of_journal_file(std::time::Duration::from_secs(3600)),
    RetentionPolicy::default()
        .with_number_of_journal_files(10)
        .with_duration_of_journal_files(std::time::Duration::from_secs(7 * 24 * 3600)),
);
let mut log = Log::new("/var/log/journal-sdk", config)?;
// Each field is a byte-safe `FIELD=value` payload.
log.write_entry(
    &[
        b"MESSAGE=plugin started".as_slice(),
        b"PRIORITY=6".as_slice(),
        b"SYSLOG_IDENTIFIER=example-plugin".as_slice(),
    ],
    None,
)?;
log.sync()?;
// Closing archives the current file and enforces retention.
log.close()?;
# Ok::<(), Box<dyn std::error::Error>>(())
```
`Log` stores files below `<directory>/<machine-id>/`. By default the active file
uses the chain filename form
`<source>@<seqnum-id>-<head-seqnum>-<head-realtime>.journal`; call
`Config::with_strict_systemd_naming(true)` to use `<source>.journal` as the
active file.
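The two naming forms can be illustrated with a small formatter. This is a sketch with hypothetical function names, not the SDK's naming code. It assumes the seqnum ID is rendered as 32 lowercase hex digits and both numbers as 16-digit zero-padded hex, mirroring the way systemd names archived files.

```rust
/// Format a chain-style active file name (hypothetical helper).
/// Assumption: hex widths follow systemd's archived-file naming.
fn chain_file_name(
    source: &str,
    seqnum_id: [u8; 16],
    head_seqnum: u64,
    head_realtime_usec: u64,
) -> String {
    let id_hex: String = seqnum_id.iter().map(|b| format!("{b:02x}")).collect();
    format!("{source}@{id_hex}-{head_seqnum:016x}-{head_realtime_usec:016x}.journal")
}

/// Format the strict systemd active file name.
fn strict_file_name(source: &str) -> String {
    format!("{source}.journal")
}
```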
If strict naming opens a directory with a stale chain-named `ONLINE` active
file, it archives that file before creating `<source>.journal`, so the directory
does not keep parallel active files.
If an existing active file is rejected by the low-level append-open path as
unsupported, `Log` follows journald's reliable-open behavior: it uses readable
header metadata to continue sequence identity where possible, moves the old
active file to a collision-safe `*.journal~` disposed name, and creates a fresh
active file. Direct low-level append-open still returns an unsupported error.
Unset rotation and retention limits are disabled. Retention counts the tracked
active/current file in file-count and committed-byte limits, but deletion only
selects older unprotected files owned by the configured source; the tracked
active/current file is never deleted to satisfy a retention limit. Duration
rotation is checked before append using the incoming entry realtime and the
active file head realtime.
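The file-count part of that retention rule can be sketched as follows. The type and function names are hypothetical, ownership and protection checks are omitted, and oldest-first deletion order is an assumption rather than a documented guarantee.

```rust
/// Minimal stand-in for a tracked journal file (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct TrackedFile {
    name: String,
    head_realtime_usec: u64,
    is_active: bool,
}

/// Pick files to delete so that at most `max_files` remain.
/// The active file counts toward the limit but is never selected.
fn select_for_deletion(files: &[TrackedFile], max_files: usize) -> Vec<String> {
    let excess = files.len().saturating_sub(max_files);
    let mut candidates: Vec<&TrackedFile> =
        files.iter().filter(|f| !f.is_active).collect();
    // Assumption: the oldest archived files are removed first.
    candidates.sort_by_key(|f| f.head_realtime_usec);
    candidates
        .into_iter()
        .take(excess)
        .map(|f| f.name.clone())
        .collect()
}
```

Because the active file is excluded from the candidates, a limit lower than the number of protected files cannot be fully satisfied; the sketch deletes what it may and stops.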
Call `Log::enforce_retention()` to apply age/count/byte retention without
waiting for another append-triggered rotation or close. Call `Log::close()` to
archive the current file and enforce retention; `Drop` only performs best-effort
state persistence.
Retention also runs once when a writer opens or creates the active file:
existing-active reopen and `LogOpenMode::Eager` enforce it during construction,
while lazy archived-only construction defers enforcement until the first append
opens the active file, before the first entry is written.
Use `Config::with_open_mode(LogOpenMode::Eager)` to create/open the active file
during construction, and `Config::with_identity_mode(LogIdentityMode::Strict)`
plus `Origin.machine_id` and `Config::with_boot_id()` to require explicit
identity. `LogIdentityMode::Auto` uses explicit IDs when provided and otherwise
generates SDK-local IDs; it does not read host identity sources.
`Log::configured_directory()`, `Log::journal_directory()`,
`Log::active_path()`, `Log::machine_id()`, `Log::boot_id()`, and
`Log::source()` expose the same directory/identity contract as the other SDKs.
Lifecycle observers receive `Created`, `Rotated`, and `RetainedDeleted` events;
`Log::with_artifact_sizer()` includes per-journal sidecar bytes in retained-size
decisions. `write_entry_with_timestamps()` accepts
`EntryTimestamps::source_realtime_usec` for `_SOURCE_REALTIME_TIMESTAMP`
injection and clamps non-progressing realtime and monotonic overrides forward.
The low-level `JournalWriter::add_entry()` path preserves explicit
caller-provided realtime and monotonic timestamps without clamping or rejecting
them; callers using that raw API are responsible for not producing same-boot
backward monotonic entries unless they are intentionally creating invalid
fixtures. On reopen, `Log` seeds the monotonic clamp floor from a persisted
chain tail only when the tail entry boot ID matches the current writer boot ID.
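The forward clamp and the boot-scoped monotonic floor can be pictured with the sketch below. Both helpers are hypothetical, and the sketch assumes that "forward" means one microsecond past the previous value; the SDK's actual step is not stated here.

```rust
/// Return the timestamp to record (hypothetical helper).
/// Assumption: a non-progressing value is moved to `previous + 1`.
fn clamp_forward(previous: Option<u64>, candidate: u64) -> u64 {
    match previous {
        Some(prev) if candidate <= prev => prev.saturating_add(1),
        _ => candidate,
    }
}

/// Seed the monotonic floor from a persisted tail `(boot_id, monotonic_usec)`
/// only when the tail was written during the current boot.
fn monotonic_floor(
    tail: Option<([u8; 16], u64)>,
    current_boot_id: [u8; 16],
) -> Option<u64> {
    tail.filter(|(boot_id, _)| *boot_id == current_boot_id)
        .map(|(_, monotonic_usec)| monotonic_usec)
}
```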
`Log` is a single-writer object; callers must serialize method calls on one
instance. The journal file contract is one writer per file. Acquire
`journal_core::file::lock::WriterLock` when the caller wants the optional
cooperating-writer lock helper to reject another SDK writer for the same file.
`Config::with_field_name_policy()` selects the high-level writer field-name
layer. The default `FieldNamePolicy::Journald` preserves trusted systemd fields
such as `_HOSTNAME` and `_TRANSPORT`. `FieldNamePolicy::JournalApp` drops caller
fields that journald would reject from untrusted applications and fails only
when no caller fields remain. `FieldNamePolicy::Raw` accepts any non-empty
field name that does not contain `=`, but RAW-mode files are not guaranteed to
be accepted by stock systemd tooling. Producer-specific field transformations
belong outside the SDK.
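The three policy layers can be approximated with the sketch below. The enum and functions are hypothetical stand-ins, not the SDK's types. The journald-style check assumes journald's usual rule: 1 to 64 bytes of `A-Z`, `0-9`, and `_`, not starting with a digit, with a leading `_` allowed only for trusted fields.

```rust
/// Hypothetical stand-in for the three policy layers described above.
#[derive(Clone, Copy)]
enum PolicySketch {
    Journald,
    JournalApp,
    Raw,
}

/// Assumed journald-style field-name check.
fn journald_name_ok(name: &[u8], allow_protected: bool) -> bool {
    if name.is_empty() || name.len() > 64 {
        return false;
    }
    if name[0].is_ascii_digit() || (name[0] == b'_' && !allow_protected) {
        return false;
    }
    name.iter()
        .all(|&b| b.is_ascii_uppercase() || b.is_ascii_digit() || b == b'_')
}

fn name_ok(policy: PolicySketch, name: &[u8]) -> bool {
    match policy {
        PolicySketch::Journald => journald_name_ok(name, true),
        PolicySketch::JournalApp => journald_name_ok(name, false),
        // Raw: any non-empty name without `=`.
        PolicySketch::Raw => !name.is_empty() && !name.contains(&b'='),
    }
}

/// App-style filtering: drop rejected fields, fail only when none remain.
fn filter_app_fields<'a>(
    fields: &[(&'a [u8], &'a [u8])],
) -> Result<Vec<(&'a [u8], &'a [u8])>, &'static str> {
    let kept: Vec<_> = fields
        .iter()
        .copied()
        .filter(|(name, _)| name_ok(PolicySketch::JournalApp, name))
        .collect();
    if kept.is_empty() {
        Err("no caller fields remain")
    } else {
        Ok(kept)
    }
}
```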
Journal files are created with systemd journald's `0640` default permissions.
Use `JournalFileOptions::with_file_mode()` for direct-file writers or
`Config::with_file_mode()` for directory writers when a consumer needs another
mode. The override applies only to newly-created files; existing files keep
their current filesystem permissions. POSIX modes remain subject to the
process umask, matching systemd/open semantics. Non-POSIX platforms may ignore
POSIX mode bits.
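The umask interaction is ordinary POSIX arithmetic and can be checked with a small sketch. The helper name is hypothetical; it only models the bit masking that the operating system applies when a file is created.

```rust
/// Default journal file mode quoted above.
const DEFAULT_JOURNAL_MODE: u32 = 0o640;

/// Mode bits that survive the process umask at creation time.
fn effective_mode(requested: u32, umask: u32) -> u32 {
    requested & !umask & 0o7777
}
```

For example, the default `0640` is unchanged under a `022` umask but becomes `0600` under `077`.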
Live-reader publication can be tuned when the consumer does not need immediate
stock follow-reader wakeups:
```rust
let config = config.with_live_publish_every_entries(64);
```
`1` is the default and publishes after every entry. `0` disables explicit SDK
live publication for poll/snapshot consumers. `N > 1` publishes after every
`N` entries. This is not an `fsync` or durability setting.
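The cadence rule reads as a simple counter. The sketch below uses a hypothetical `PublishCadence` type to show the `0`, `1`, and `N > 1` cases; it does not model what publication itself does.

```rust
/// Hypothetical counter modelling the publication cadence.
struct PublishCadence {
    every: u64,
    pending: u64,
}

impl PublishCadence {
    fn new(every: u64) -> Self {
        Self { every, pending: 0 }
    }

    /// Record one appended entry; returns true when a publication is due.
    fn on_entry(&mut self) -> bool {
        if self.every == 0 {
            // 0 disables explicit publication entirely.
            return false;
        }
        self.pending += 1;
        if self.pending >= self.every {
            self.pending = 0;
            true
        } else {
            false
        }
    }
}
```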
Binary-safe values:
```rust
log.write_entry(
    &[
        b"MESSAGE=sample with binary payload".as_slice(),
        b"BINARY_PAYLOAD=\x00\x01\x02\xff".as_slice(),
    ],
    None,
)?;
# Ok::<(), Box<dyn std::error::Error>>(())
```
Basic reader usage:
```rust
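use journal::FileReader;
let mut reader = FileReader::open("/path/to/system.journal")?;
// Restrict iteration to entries carrying PRIORITY=6, then start at the head.
reader.add_match(b"PRIORITY=6");
reader.seek_head();
// The loop ends when `next()` reports no further matching entry.
while reader.next()? {
    let entry = reader.get_entry()?;
    if let Some(message) = entry.get_str("MESSAGE") {
        println!("{message}");
    }
}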
# Ok::<(), Box<dyn std::error::Error>>(())
```
Optimized single-file explorer usage:
```rust
use journal::{ExplorerQuery, FileReader};
let mut reader = FileReader::open("/path/to/system.journal")?;
// Request counters for the PRIORITY facet.
let result = reader.explore(&ExplorerQuery {
    facets: vec![b"PRIORITY".to_vec()],
    limit: 0,
    ..ExplorerQuery::default()
})?;
// Print each PRIORITY value with its count.
if let Some(priority) = result.facets.get(b"PRIORITY".as_slice()) {
    for (value, count) in priority {
        println!("{} {count}", String::from_utf8_lossy(value));
    }
}
# Ok::<(), Box<dyn std::error::Error>>(())
```
The default first-value mode counts at most one value per selected field per
row. Use `ExplorerFieldMode::AllValues` when a row may contain repeated values
for a selected facet or histogram field and every duplicate value must count.
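The difference between the two accounting modes shows up only on rows that repeat a field. The sketch below uses hypothetical types over in-memory rows; it illustrates the counting rule and is not the explorer's implementation.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the two accounting modes.
#[derive(Clone, Copy, PartialEq)]
enum FieldModeSketch {
    FirstValue,
    AllValues,
}

/// Count values of `facet`; each row is a list of (field, value) pairs.
fn count_facet(
    rows: &[Vec<(&str, &str)>],
    facet: &str,
    mode: FieldModeSketch,
) -> BTreeMap<String, u64> {
    let mut counts = BTreeMap::new();
    for row in rows {
        for (field, value) in row {
            if *field != facet {
                continue;
            }
            *counts.entry(value.to_string()).or_insert(0) += 1;
            // First-value mode stops at the first hit in this row.
            if mode == FieldModeSketch::FirstValue {
                break;
            }
        }
    }
    counts
}
```

For a row carrying `PRIORITY=6` and `PRIORITY=3`, first-value mode counts only `6`, while all-values mode counts both.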
Explorer column catalogs are built from FIELD indexes. Do not use row traversal
to discover columns in production; a comparison that needs
`debug_collect_column_fields_by_row_traversal` has found a bug in the explorer
or its column-catalog setup, not a valid operating mode.
Specialized callers can select an execution strategy:
```rust
use journal::{ExplorerFieldMode, ExplorerQuery, ExplorerStrategy, FileReader};
let mut reader = FileReader::open("/path/to/system.journal")?;
// The index strategy rejects first-value accounting and
// source-realtime-bounded queries, so both are set explicitly.
let result = reader.explore_with_strategy(
    &ExplorerQuery {
        facets: vec![b"PRIORITY".to_vec()],
        field_mode: ExplorerFieldMode::AllValues,
        use_source_realtime: false,
        limit: 0,
        ..ExplorerQuery::default()
    },
    ExplorerStrategy::Index,
)?;
# Ok::<(), Box<dyn std::error::Error>>(())
```
The index strategy is exact for its supported subset, but it is not a universal
speedup. It can be much faster for narrow unfiltered all-values facets and
histograms, and slower for many facets or selective filters. Use
`ExplorerStrategy::Compare` when validating a query shape before relying on the
index strategy; successful compare results include traversal and index timings
and stats in `ExplorerResult::comparison`.
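The compare idea (run both strategies, fail on any difference, report timings) can be sketched generically. The names below are hypothetical and the closures stand in for real queries; this is not how `ExplorerStrategy::Compare` is implemented.

```rust
use std::time::{Duration, Instant};

/// Hypothetical result of a successful comparison.
#[derive(Debug)]
struct ComparisonSketch<T> {
    result: T,
    traversal_time: Duration,
    index_time: Duration,
}

/// Run both strategies and fail when their outputs differ.
fn compare_strategies<T, F, G>(
    traversal: F,
    index: G,
) -> Result<ComparisonSketch<T>, String>
where
    T: PartialEq + std::fmt::Debug,
    F: FnOnce() -> T,
    G: FnOnce() -> T,
{
    let start = Instant::now();
    let from_traversal = traversal();
    let traversal_time = start.elapsed();

    let start = Instant::now();
    let from_index = index();
    let index_time = start.elapsed();

    if from_traversal != from_index {
        return Err(format!(
            "strategy mismatch: {from_traversal:?} != {from_index:?}"
        ));
    }
    Ok(ComparisonSketch { result: from_traversal, traversal_time, index_time })
}
```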
The default `ExplorerAnchor::Auto` chooses the natural scan start for the query
direction. Use explicit `Head`, `Tail`, or `Realtime(usec)` anchors only for
manual paging or when the caller intentionally wants a non-default start point.
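The automatic anchor choice can be written out as a small resolution function. The enums are hypothetical stand-ins for the SDK types, and the sketch only encodes the rule stated in this README: forward queries start at the lower time bound or the head, backward queries at the upper time bound or the tail.

```rust
/// Hypothetical stand-in for the anchor choices.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AnchorSketch {
    Auto,
    Head,
    Tail,
    Realtime(u64),
}

#[derive(Clone, Copy)]
enum DirectionSketch {
    Forward,
    Backward,
}

/// Resolve `Auto` into a concrete start point; explicit anchors pass through.
fn resolve_anchor(
    anchor: AnchorSketch,
    direction: DirectionSketch,
    since_usec: Option<u64>,
    until_usec: Option<u64>,
) -> AnchorSketch {
    match (anchor, direction) {
        (AnchorSketch::Auto, DirectionSketch::Forward) => {
            since_usec.map_or(AnchorSketch::Head, AnchorSketch::Realtime)
        }
        (AnchorSketch::Auto, DirectionSketch::Backward) => {
            until_usec.map_or(AnchorSketch::Tail, AnchorSketch::Realtime)
        }
        (explicit, _) => explicit,
    }
}
```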
For RAW-mode files, use the byte-keyed entry surface when field names are not
guaranteed to be UTF-8:
```rust
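// Look up one value by its exact byte name.
if let Some(value) = entry.get_raw(b"\xffRAW") {
    assert_eq!(value, b"raw value");
}
// Iterate every field with byte-preserving names and values.
for field in entry.raw_fields() {
    let name_bytes = field.name;
    let value_bytes = field.value;
}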
```
File-backed journalctl:
```sh
cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
    --file fixtures/systemd/test-data/no-rtc/system.journal.zst \
    --head 1 \
    --output json
```
Repeated matches for the same field are OR alternatives. Matches for different
fields are ANDed. A separate `+` argument creates an explicit disjunction:
```sh
cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
    --file ./sample.journal \
    PRIORITY=3 PRIORITY=4 + MESSAGE=boot
```
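The match rules above can be modelled over in-memory entries. This is a sketch with hypothetical helper names, not the SDK's match engine: each `+` starts a new group, values for the same field inside a group are alternatives, and different fields inside a group must all match.

```rust
use std::collections::BTreeMap;

/// One `+`-separated group: field name -> accepted values.
type MatchGroup = BTreeMap<String, Vec<String>>;

/// Parse arguments such as `PRIORITY=3 PRIORITY=4 + MESSAGE=boot`.
fn parse_matches(args: &[&str]) -> Vec<MatchGroup> {
    let mut groups = vec![MatchGroup::new()];
    for arg in args {
        if *arg == "+" {
            groups.push(MatchGroup::new());
        } else if let Some((field, value)) = arg.split_once('=') {
            groups
                .last_mut()
                .unwrap()
                .entry(field.to_string())
                .or_default()
                .push(value.to_string());
        }
    }
    // Drop empty groups left by stray `+` arguments.
    groups.retain(|g| !g.is_empty());
    groups
}

/// An entry matches when any group matches; a group matches when every field
/// in it has at least one accepted value present in the entry.
fn entry_matches(groups: &[MatchGroup], entry: &[(&str, &str)]) -> bool {
    groups.is_empty()
        || groups.iter().any(|group| {
            group.iter().all(|(field, values)| {
                entry
                    .iter()
                    .any(|(f, v)| f == field && values.iter().any(|x| x == v))
            })
        })
}
```

Under this model the command above selects entries with priority 3 or 4, plus any entry whose message is exactly `boot`.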
Realtime ranges, boot filters, and follow mode are supported for file-backed
inputs:
```sh
cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
    --directory ./journals --boot=all --since @1700000000 --until @1700003600
cargo run --manifest-path rust/Cargo.toml -p journalctl -- \
    --file ./active.journal --follow --no-tail --boot=all
```