bitcoinleveldb-versionedit 0.1.19

LevelDB-compatible VersionEdit encoding/decoding and manipulation utilities used by bitcoin-rs for manifest and version metadata management.

bitcoinleveldb-versionedit

Low-level encoding/decoding and manipulation of LevelDB VersionEdit records, extracted from the bitcoin-rs project. This crate provides a faithful, byte-for-byte compatible Rust implementation of LevelDB's manifest version-edit logic as used by Bitcoin-like workloads.


Overview

LevelDB stores the evolution of its on-disk state (files per level, sequence-number metadata, compaction pointers, etc.) in a manifest file. Each record in the manifest is a VersionEdit: a compact, varint-encoded description of mutations to the logical database version.

This crate implements:

  • A VersionEdit struct mirroring LevelDB's internal representation
  • Deterministic encoding of VersionEdit into the manifest binary format
  • Robust decoding from manifest records back into a VersionEdit
  • Convenience APIs to:
    • Track added files (per level)
    • Track deleted files (per level)
    • Maintain compaction pointers
    • Maintain log / sequence-number bookkeeping
    • Derive human-readable debug summaries

It is designed to be interoperable with existing LevelDB/Bitcoin data, focusing on correctness of serialization and deterministic ordering.

This crate is not a full LevelDB implementation; it is targeted infrastructure for higher-level components (like VersionSet and the full storage engine) in bitcoin-rs.


Features

  • Binary compatibility with LevelDB manifest format

    • Uses varint32/varint64 and length-prefixed slices to match LevelDB's on-disk representation
    • Tags and field semantics match LevelDB's VersionEdit:
      • kComparator (tag 1)
      • kLogNumber (tag 2)
      • kNextFileNumber (tag 3)
      • kLastSequence (tag 4)
      • kCompactPointer (tag 5)
      • kDeletedFile (tag 6)
      • kNewFile (tag 7)
      • kPrevLogNumber (tag 9)
  • Deterministic encoding

    • Deleted-file entries are sorted by (level, file_number) prior to encoding, guaranteeing that encode_to -> decode_from -> encode_to produces bit-identical manifest bytes.
  • Convenient high-level mutation API

    • add_file(level, file, size, smallest, largest)
    • delete_file(level, file)
    • set_comparator_name, set_log_number, set_prev_log_number, set_next_file, set_last_sequence
    • set_compact_pointer(level, key)
  • Introspectable

    • debug_string() yields a multi-line, human-readable summary suitable for logging and debugging, including all scalar fields, compaction pointers, deletions, and new files.
  • Safe defaults & state reset

    • Default constructs an empty, "no-op" VersionEdit with all has_ flags cleared.
    • clear()/reset_core_state() allow reuse of a VersionEdit; scalars, flags, and file collections are reset, while compaction pointers are preserved.

Crate Status

  • License: MIT
  • Edition: Rust 2021
  • Repository: https://github.com/klebs6/bitcoin-rs
  • Intended users: implementers of LevelDB-compatible storage layers, Bitcoin node developers, and systems programmers requiring exact reproduction of LevelDB manifest semantics.

Core Data Structures

VersionEdit

pub struct VersionEdit {
    comparator:           String,
    log_number:           u64,
    prev_log_number:      u64,
    next_file_number:     u64,
    last_sequence:        SequenceNumber,
    has_comparator:       bool,
    has_log_number:       bool,
    has_prev_log_number:  bool,
    has_next_file_number: bool,
    has_last_sequence:    bool,
    compact_pointers:     Vec<(i32, InternalKey)>,
    deleted_files:        VersionEditDeletedFileSet,
    new_files:            Vec<(i32, FileMetaData)>,
}

pub type VersionEditDeletedFileSet = HashSet<(i32, u64)>;

Conceptually, a VersionEdit is a sparse patch to the current logical version:

  • Scalar metadata

    • comparator: name of the key comparator
    • log_number: current log file number
    • prev_log_number: previous log file number
    • next_file_number: global file-number allocator watermark
    • last_sequence: maximal sequence number visible after applying this edit
    • has_* flags: which of the above are present in this edit
  • Collections

    • compact_pointers: Vec<(level, InternalKey)>
    • deleted_files: HashSet<(level, file_number)>
    • new_files: Vec<(level, FileMetaData)>

The collections record per-level changes: compaction resume keys, removed files, and added files.

Helper functions

These implement the manifest's binary protocol for specific logical units:

pub fn get_level(input: &mut Slice, level: &mut i32) -> bool { ... }

pub fn get_internal_key(input: &mut Slice, key: &mut InternalKey) -> bool { ... }

  • get_level reads a LevelDB level (0..N) from a varint32-encoded field.
  • get_internal_key reads a length-prefixed slice and decodes it into an InternalKey.
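
As a rough illustration of what these helpers do, here is a minimal sketch over plain byte slices. It is not the crate's code: the `&mut &[u8]` cursor stands in for `Slice`, the `Option` return stands in for the `bool` plus out-parameter style, and `NUM_LEVELS = 7` is an assumption taken from stock LevelDB's `config::kNumLevels`.

```rust
/// Assumed level count (stock LevelDB uses 7); not read from this crate.
const NUM_LEVELS: u32 = 7;

/// Read a varint32 from the front of `input`, advancing it on success.
fn get_varint32(input: &mut &[u8]) -> Option<u32> {
    let data: &[u8] = *input;
    let mut result = 0u32;
    // A varint32 occupies at most 5 bytes, 7 payload bits each.
    for (i, &b) in data.iter().enumerate().take(5) {
        result |= u32::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            *input = &data[i + 1..];
            return Some(result);
        }
    }
    None
}

/// Read a level and reject values outside 0..NUM_LEVELS.
fn get_level(input: &mut &[u8]) -> Option<i32> {
    let v = get_varint32(input)?;
    if v < NUM_LEVELS { Some(v as i32) } else { None }
}

/// Read a varint32 length followed by that many bytes.
fn get_length_prefixed_slice<'a>(input: &mut &'a [u8]) -> Option<&'a [u8]> {
    let len = get_varint32(input)? as usize;
    let data: &'a [u8] = *input;
    if data.len() < len {
        return None;
    }
    let (head, tail) = data.split_at(len);
    *input = tail;
    Some(head)
}
```

In the real helper, the bytes returned by the length-prefixed read would then be handed to the InternalKey decoder.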

Encoding & Decoding Semantics

Encoding: VersionEdit::encode_to

impl VersionEdit {
    pub fn encode_to(&self, dst: *mut String) { ... }
}

  • Accepts a raw pointer to an owned String that serves as a byte buffer.
  • Serializes the VersionEdit fields into the LevelDB manifest wire format:
    • Scalars are emitted only if the corresponding has_* flag is true.
    • compact_pointers, deleted_files, and new_files are written sequentially.
  • deleted_files are sorted before they are written:

let mut deleted_files_sorted: Vec<(i32, u64)> =
    self.deleted_files().iter().copied().collect();
deleted_files_sorted.sort_unstable();

This guarantees deterministic encoding irrespective of the internal HashSet iteration order.
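
The effect of the sort can be checked in isolation. The sketch below uses only the standard library; `sorted_deleted_files` is a hypothetical helper name, not part of this crate.

```rust
use std::collections::HashSet;

/// Collect (level, file_number) pairs and sort them so that the result
/// does not depend on HashSet iteration order.
fn sorted_deleted_files(deleted: &HashSet<(i32, u64)>) -> Vec<(i32, u64)> {
    let mut v: Vec<(i32, u64)> = deleted.iter().copied().collect();
    // Tuples compare lexicographically: by level first, then file number.
    v.sort_unstable();
    v
}
```

Two sets holding the same pairs, built in different insertion orders, produce the same vector, which is what makes the encoded bytes reproducible.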

Safety model:

  • The method uses unsafe for the raw pointer; you must ensure:
    • dst is non-null and points to a valid String
    • The String outlives the call
    • No other reference to the String is in use while the call writes through the pointer

A higher-level wrapper can hide the raw pointer, for example by accepting a &mut String and casting it to *mut String internally.
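
One possible shape for such a wrapper is sketched below. `RawEncoder` is a hypothetical stand-in whose `encode_to` only mimics the raw-pointer signature described above; its body and the `encode_into` / `encode` names are illustrative assumptions, not this crate's API.

```rust
/// Hypothetical stand-in for a type with a raw-pointer encoder.
struct RawEncoder {
    log_number: u64,
}

impl RawEncoder {
    /// Mimics the `encode_to(&self, dst: *mut String)` shape.
    fn encode_to(&self, dst: *mut String) {
        // SAFETY: callers must pass a non-null pointer to a live String
        // that nothing else is accessing during the call.
        let out = unsafe { &mut *dst };
        out.push_str(&format!("log={}", self.log_number));
    }

    /// Safe wrapper: a `&mut String` is always non-null, valid, and
    /// exclusively borrowed, so the raw-pointer contract holds.
    fn encode_into(&self, dst: &mut String) {
        self.encode_to(dst as *mut String);
    }

    /// Convenience form that allocates the buffer itself.
    fn encode(&self) -> String {
        let mut s = String::new();
        self.encode_into(&mut s);
        s
    }
}
```

Callers of `encode_into` or `encode` never touch a raw pointer, so the unsafe contract is confined to one place.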

Decoding: VersionEdit::decode_from

impl VersionEdit {
    pub fn decode_from(&mut self, src: &Slice) -> Status { ... }
}

  • Resets the core scalar state and file collections before decoding.
  • Consumes a copy of the input Slice and incrementally parses tagged fields.
  • Each tag is matched against the LevelDB tag set; unknown or malformed tags result in a Status::corruption with contextual diagnostics.
  • Parsed values are routed through the higher-level mutation functions (set_*, add_file, delete_file, set_compact_pointer).

The loop structure is essentially:

while msg.is_none() && get_varint32(&mut input, &mut tag) {
    match tag {
        1 => { /* comparator */ }
        2 => { /* log number */ }
        3 => { /* next file number */ }
        4 => { /* last sequence */ }
        5 => { /* compact pointer */ }
        6 => { /* deleted file */ }
        7 => { /* new file */ }
        9 => { /* prev log number */ }
        _ => { msg = Some("unknown tag"); }
    }
}

Post-conditions:

  • On success: returns Status::ok() and a fully-populated VersionEdit.
  • On failure: returns a corruption Status indicating the failing component. The VersionEdit was reset before parsing began and may hold fields decoded before the error, so its contents should not be relied on.
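
To make the tag-dispatch idea concrete, here is a self-contained sketch that handles only the four numeric scalar tags (2, 3, 4, 9). It is a simplification under stated assumptions: `Scalars`, `decode_scalars`, and the `Result<_, String>` error type are hypothetical, input is a plain byte slice rather than a `Slice`, and tags are read with the same varint routine as values.

```rust
/// Scalar fields of an edit; `None` means "not present in this record".
#[derive(Debug, Default, PartialEq)]
struct Scalars {
    log_number: Option<u64>,
    next_file_number: Option<u64>,
    last_sequence: Option<u64>,
    prev_log_number: Option<u64>,
}

/// Read one varint from the front of `input`, advancing it on success.
fn get_varint64(input: &mut &[u8]) -> Option<u64> {
    let data: &[u8] = *input;
    let mut result = 0u64;
    for (i, &b) in data.iter().enumerate().take(10) {
        result |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            *input = &data[i + 1..];
            return Some(result);
        }
    }
    None
}

/// Parse tagged scalar fields until the input is exhausted.
fn decode_scalars(mut input: &[u8]) -> Result<Scalars, String> {
    let mut out = Scalars::default();
    while !input.is_empty() {
        let tag = get_varint64(&mut input).ok_or("truncated tag")?;
        // Pick the destination field for this tag; anything else is an error.
        let slot = match tag {
            2 => &mut out.log_number,
            3 => &mut out.next_file_number,
            4 => &mut out.last_sequence,
            9 => &mut out.prev_log_number,
            other => return Err(format!("unknown tag {}", other)),
        };
        let value = get_varint64(&mut input)
            .ok_or_else(|| format!("truncated value for tag {}", tag))?;
        *slot = Some(value);
    }
    Ok(out)
}
```

Using `Option` fields here plays the role of the has_* flags: a field that never appears in the record stays `None`.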

Public API Usage

Constructing a basic VersionEdit

use bitcoinleveldb_versionedit::VersionEdit;
use bitcoinleveldb_types::{InternalKey, SequenceNumber};
// `Slice` must also be in scope; import it from the crate that defines it in your build.

fn build_simple_edit() -> VersionEdit {
    let mut edit = VersionEdit::default();

    // set comparator name
    let cmp_name = Slice::from("leveldb.BytewiseComparator".as_bytes());
    edit.set_comparator_name(&cmp_name);

    // log / sequence metadata
    edit.set_log_number(42);
    edit.set_prev_log_number(41);
    edit.set_next_file(1000);
    edit.set_last_sequence(123_456 as SequenceNumber);

    edit
}

Adding a new file

fn add_new_sstable(
    edit: &mut VersionEdit,
    level: i32,
    file_number: u64,
    file_size: u64,
    smallest: &InternalKey,
    largest: &InternalKey,
) {
    edit.add_file(level, file_number, file_size, smallest, largest);
}

Preconditions (mirroring LevelDB's invariants):

  • smallest and largest must be the true extremal internal keys in the file.
  • The edit must not have been saved yet (see VersionSet::SaveTo() in LevelDB).

Deleting a file

fn mark_file_deleted(edit: &mut VersionEdit, level: i32, file_number: u64) {
    edit.delete_file(level, file_number);
}

Internally, this records (level, file_number) in deleted_files; each recorded pair is serialized as one kDeletedFile entry.
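
On the wire, each such entry is the tag followed by the level and the file number. The sketch below shows that layout using a `Vec<u8>` buffer; `encode_deleted_files` and `put_varint64` are illustrative helper names rather than this crate's functions, and the input slice is assumed to be sorted already.

```rust
/// Manifest tag for a deleted-file entry (kDeletedFile).
const K_DELETED_FILE: u32 = 6;

/// Append `v` as a varint: 7 bits per byte, low group first,
/// high bit set on every byte except the last.
fn put_varint64(dst: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        dst.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
    }
    dst.push(v as u8);
}

/// Emit one `tag, level, file_number` triple per deleted file.
fn encode_deleted_files(dst: &mut Vec<u8>, deleted: &[(i32, u64)]) {
    for &(level, file_number) in deleted {
        put_varint64(dst, u64::from(K_DELETED_FILE));
        put_varint64(dst, u64::from(level as u32)); // level is a varint32 on disk
        put_varint64(dst, file_number);
    }
}
```

For example, deleting file 57 at level 2 contributes the three bytes `06 02 39` to the record.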

Compaction pointers

fn update_compaction_pointer(
    edit: &mut VersionEdit,
    level: i32,
    key: &InternalKey,
) {
    edit.set_compact_pointer(level, key);
}

This denotes the logical resume key for future compactions at that level.

Debugging

fn log_version_edit(edit: &VersionEdit) {
    println!("{}", edit.debug_string());
}

Example output:

VersionEdit {
  Comparator: leveldb.BytewiseComparator
  LogNumber: 42
  PrevLogNumber: 41
  NextFile: 1000
  LastSeq: 123456
  CompactPointer: 1 userkey1@123
  DeleteFile: 2 57
  AddFile: 1 1001 1048576 smallest_key .. largest_key
}

Clearing and reusing a VersionEdit

fn reuse_edit(edit: &mut VersionEdit) {
    // Reset scalar state and file collections; compact_pointers remain.
    edit.clear();

    // Now you can repopulate it with new metadata and file deltas.
}

clear() simply delegates to reset_core_state(), which zeroes scalars, clears deleted_files and new_files, and resets has_* flags.


Binary Format Details

This crate encodes/decodes the same schema as canonical LevelDB:

  • Tags are varint32-encoded integers.
  • Levels are varint32-encoded unsigned integers, cast to i32 in memory.
  • File numbers and sizes use varint64.
  • Internal keys are serialized as a length-prefixed slice (len as varint32, followed by bytes) and then decoded via InternalKey::decode_from.
  • Comparator name is also a length-prefixed slice of UTF-8 bytes.
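
The primitives in this list can be sketched in a few lines. The function names below follow LevelDB's C++ helpers (PutVarint32, PutVarint64, PutLengthPrefixedSlice) in snake case; the `Vec<u8>` signatures are an assumption for illustration and may differ from what this crate uses internally.

```rust
/// Append `v` as a varint64: 7 payload bits per byte, least-significant
/// group first, continuation bit (0x80) on all bytes except the last.
fn put_varint64(dst: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        dst.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
    }
    dst.push(v as u8);
}

/// A varint32 uses the same byte layout, limited to 32-bit values.
fn put_varint32(dst: &mut Vec<u8>, v: u32) {
    put_varint64(dst, u64::from(v));
}

/// Append a length-prefixed slice: varint32 length, then the raw bytes.
fn put_length_prefixed_slice(dst: &mut Vec<u8>, bytes: &[u8]) {
    put_varint32(dst, bytes.len() as u32);
    dst.extend_from_slice(bytes);
}
```

Small values cost a single byte, which is why tags and levels are cheap to store; 300, for instance, encodes as the two bytes `AC 02`.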

Serialization order follows the order of fields in the VersionEdit and the order of entries in the compact_pointers and new_files vectors. The one exception is deleted_files, which is sorted explicitly so that the binary output is deterministic.

This determinism is critical when one wants to ensure that two logically identical VersionEdits result in the same manifest bytes, which facilitates:

  • Reproducible tests
  • Content-addressable storage and hashing
  • Stable replication and snapshot mechanics across nodes

Relationship to LevelDB and Bitcoin

In LevelDB, and therefore in Bitcoin Core's on-disk database layout, VersionEdit is the mechanism for describing structural changes to the set of SSTables. Bitcoin Core stores its UTXO set and block index in LevelDB databases; exact adherence to manifest semantics is mandatory if you need to:

  • Read or write existing Bitcoin Core databases
  • Implement alternative nodes that share storage layouts
  • Perform analysis or replay of historical LevelDB states from archived manifests

This crate intentionally mirrors the C++ LevelDB logic, with additional Rust idioms (e.g., Default, strong typing around Status, and improved logging).


Safety & Concurrency Considerations

  • The core APIs are &mut self and therefore not thread-safe by themselves; wrap in synchronization primitives (Mutex, etc.) if accessed concurrently.
  • encode_to uses a raw pointer. Incorrect usage can lead to undefined behavior. If you design higher-level APIs on top of this crate, you are encouraged to encapsulate this unsafety in a small, well-tested layer that exposes only safe abstractions.
  • decode_from trusts the length reported by the Slice; it validates structure but does not verify integrity or authenticity. For untrusted input, pair it with checksums or higher-level validation.
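
For the first point, sharing one edit between threads could look like the sketch below. `EditLike` is a hypothetical stand-in with a single `&mut self` method; the real VersionEdit would take its place, assuming it can be sent across threads.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a type whose mutators take `&mut self`.
#[derive(Default)]
struct EditLike {
    deleted: Vec<(i32, u64)>,
}

impl EditLike {
    fn delete_file(&mut self, level: i32, file: u64) {
        self.deleted.push((level, file));
    }
}

/// Record deletions from several threads through one shared, locked edit.
fn record_deletions_concurrently(files: Vec<u64>) -> Vec<(i32, u64)> {
    let shared = Arc::new(Mutex::new(EditLike::default()));
    let handles: Vec<_> = files
        .into_iter()
        .map(|f| {
            let shared = Arc::clone(&shared);
            // Each thread takes the lock only for the duration of one mutation.
            thread::spawn(move || shared.lock().unwrap().delete_file(0, f))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let mut out = shared.lock().unwrap().deleted.clone();
    out.sort_unstable(); // thread scheduling order is not deterministic
    out
}
```

The Mutex serializes the `&mut self` calls; the final sort is needed only because the order in which threads acquire the lock varies between runs.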

When to Use This Crate

Use this crate if you need:

  • Precise, LevelDB-compatible manifest handling in Rust
  • To interoperate with Bitcoin or other LevelDB-based systems at the storage format level
  • Deterministic, testable VersionEdit encoding/decoding

This crate is probably too low-level if you only need a high-level key-value database abstraction; in that case, integrate through whatever higher-layer VersionSet or storage API bitcoin-rs exposes.


License

This crate is distributed under the MIT license, consistent with the bitcoin-rs repository.


Provenance

This crate is part of the bitcoin-rs repository and focuses exclusively on the VersionEdit component of a LevelDB-compatible storage engine.