# graphstream

> **Experimental.** graphstream is under active development and not yet stable. APIs will change without notice.

Journal replication engine for graph databases: logical WAL shipping via `.hadbj` segments to S3-compatible storage.

graphstream handles journal writing, segment rotation, upload/download, compaction, and segment replay for Kuzu/LadybugDB HA. Used by hakuzu for graph database high availability.

Part of the hadb ecosystem. Shared infrastructure (S3 via the `ObjectStore` trait, retry, circuit breaker) is provided by hadb-io.
## What it does

- Journal writing: Cypher queries + params written to `.hadbj` segment files via the hadb-changeset binary format with SHA-256 chain hashing.
- Segment rotation: Auto-rotate at a configurable size (default 4 MB). `SealForUpload` seals the current segment for upload.
- S3 upload: Sealed segments uploaded via the `hadb_io::ObjectStore` trait. Up to 4 concurrent uploads via `JoinSet`. Supports `UploadWithAck` for synchronous upload confirmation (used by hakuzu's `sync()` for graceful shutdown).
- Segment download: Followers poll S3 for new segments via `ObjectStore` and download incrementally.
- Journal reader: Streaming reads from segments with a 1 MB `BufReader`; streaming zstd decompression for compressed segments (encrypted segments use full-body decode).
- Compression + encryption: Optional zstd compression and ChaCha20-Poly1305 encryption per segment.
- Disk cache: Manifest-based upload tracking with age/size-based cleanup. Crash-safe (write `.tmp` + rename).
- Compaction: Streaming compaction merges segments without loading them all into memory. Background compaction triggers after N uploaded segments.
- O(1) recovery: 32-byte chain hash trailer on sealed segments. Recovery reads one header + trailer instead of scanning all entries.
- Retry + circuit breaker: Via hadb-io. Transient errors are retried with exponential backoff + jitter; a circuit breaker prevents hammering degraded endpoints.
- Prometheus metrics: Lock-free `AtomicU64` counters for entries, segments, uploads, downloads, errors, and bytes. `snapshot().to_prometheus()` emits text format.
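The lock-free metrics counters described above can be sketched roughly like this. A minimal sketch only: the struct, field, and metric names here are illustrative, not the crate's actual API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative subset of the journal metrics (names assumed).
#[derive(Default)]
struct JournalMetrics {
    entries_written: AtomicU64,
    segments_sealed: AtomicU64,
    bytes_uploaded: AtomicU64,
}

impl JournalMetrics {
    fn record_entry(&self) {
        // Relaxed ordering is sufficient: counters are independent and monotonic.
        self.entries_written.fetch_add(1, Ordering::Relaxed);
    }

    /// Render a point-in-time snapshot in Prometheus text exposition format.
    fn to_prometheus(&self) -> String {
        format!(
            "graphstream_entries_written {}\ngraphstream_segments_sealed {}\ngraphstream_bytes_uploaded {}\n",
            self.entries_written.load(Ordering::Relaxed),
            self.segments_sealed.load(Ordering::Relaxed),
            self.bytes_uploaded.load(Ordering::Relaxed),
        )
    }
}

fn main() {
    let m = JournalMetrics::default();
    m.record_entry();
    m.bytes_uploaded.fetch_add(4096, Ordering::Relaxed);
    print!("{}", m.to_prometheus());
}
```

Because every counter is an `AtomicU64`, writers never take a lock on the hot path; the snapshot is a consistent-enough read of each counter individually.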
## Architecture

```
Writer -> PendingEntry -> JournalWriter -> .hadbj segments -> Uploader -> ObjectStore (S3)
                               |                                  |
                    SegmentCache (manifest)            Background Compaction
                               |
Follower <- replay_entries <- JournalReader <- .hadbj segments <- ObjectStore Download
```
## Entry format (protobuf)

```protobuf
message JournalEntry {
  string query = 1;                  // Cypher query
  repeated ParamEntry params = 2;    // Named parameters
  int64 timestamp_ms = 3;            // Deterministic timestamp
  optional string uuid_override = 4; // Deterministic UUID
}
```
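On the Rust side, the message above maps to a struct shaped roughly like the following. This is a hand-written sketch, not the prost-generated code, and `ParamEntry`'s fields are not given in this README, so name/value strings are assumed:

```rust
/// Assumed shape of ParamEntry (its fields are not documented here).
#[derive(Debug, Clone, Default)]
pub struct ParamEntry {
    pub name: String,
    pub value: String,
}

/// Illustrative Rust shape of the JournalEntry protobuf message.
#[derive(Debug, Clone, Default)]
pub struct JournalEntry {
    pub query: String,                 // field 1: Cypher query
    pub params: Vec<ParamEntry>,       // field 2: named parameters
    pub timestamp_ms: i64,             // field 3: deterministic timestamp
    pub uuid_override: Option<String>, // field 4: deterministic UUID
}

fn main() {
    // The timestamp and UUID are carried in the entry so that followers
    // replay the query with the exact values the writer used.
    let entry = JournalEntry {
        query: "CREATE (p:Person {name: $name})".into(),
        params: vec![ParamEntry { name: "name".into(), value: "Ada".into() }],
        timestamp_ms: 1_700_000_000_000,
        uuid_override: None,
    };
    println!("{entry:?}");
}
```

Carrying `timestamp_ms` and `uuid_override` in the entry is what makes replay deterministic: followers reuse the leader's values instead of generating their own.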
## Segment format

Uses hadb-changeset's `.hadbj` binary format:

- Header (128 bytes): `HADBJ` magic + flags (sealed, compressed, chain hash) + sequence range + body length + checksums
- Body: binary entries (sequence + prev_hash + payload), zstd-compressed when sealed
- Trailer (sealed only): 32-byte SHA-256 chain hash for O(1) recovery and cross-segment integrity
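The chain-hash trailer works like this: each entry's link hashes the previous link together with the entry payload, and the trailer stores only the final link, so recovery can verify a sealed segment from one header + trailer read instead of rescanning every entry. A minimal sketch, with std's `DefaultHasher` standing in for SHA-256 purely for illustration:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fold a sequence of entry payloads into a single chain hash:
/// link(i) = H(link(i-1), payload(i)). Any tampered or reordered
/// entry changes the final link, which is what the trailer stores.
fn chain_hash(entries: &[&[u8]]) -> u64 {
    let mut prev = 0u64;
    for payload in entries {
        let mut h = DefaultHasher::new();
        prev.hash(&mut h);
        payload.hash(&mut h);
        prev = h.finish();
    }
    prev
}

fn main() {
    let a: &[&[u8]] = &[b"CREATE (n)", b"MATCH (n) SET n.x = 1"];
    let b: &[&[u8]] = &[b"CREATE (n)", b"MATCH (n) SET n.x = 2"];
    // Changing any entry changes the final chain hash.
    assert_ne!(chain_hash(a), chain_hash(b));
    // The same entries always produce the same chain hash.
    assert_eq!(chain_hash(a), chain_hash(a));
}
```

Cross-segment integrity follows the same idea: seeding a segment's chain with the previous segment's trailer links segments together end to end.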
## Usage

A sketch of the API shape (the original snippet's type and argument details are elided here; placeholders are marked):

```rust
use graphstream::spawn_journal_writer;
use graphstream::spawn_journal_uploader;

// Spawn journal writer
let state = /* journal writer state */;
let tx = spawn_journal_writer(state);

// Write an entry
tx.send(/* PendingEntry */)?;

// Spawn S3 uploader (streams sealed segments via ObjectStore)
let (upload_tx, upload_rx) = /* mpsc */ channel(/* capacity */);
let uploader = spawn_journal_uploader(/* ObjectStore, upload_rx, ... */);
```
## Dependencies

- hadb-changeset -- `.hadbj` binary format (encode, decode, seal, chain hashing, S3 key layout)
- hadb-io -- S3 via ObjectStore trait, retry, circuit breaker
- prost -- Protobuf serialization (journal entry payload)
- zstd -- Compression (sealed segments)
## Tests

Covers journal writing, chain hash integrity, streaming decompression, compaction, uploader (including ack), cache, and metrics.

```sh
CC=/opt/homebrew/opt/llvm/bin/clang CXX=/opt/homebrew/opt/llvm/bin/clang++ \
RUSTFLAGS="-L /opt/homebrew/opt/llvm/lib/c++" \
cargo test
```
## License
Apache-2.0