bitcoinleveldb-coding
Low-level, allocation-conscious encoders and decoders for LevelDB-style binary formats used in bitcoin-rs. This crate exposes pointer-based primitives for:
- Fixed-width little-endian integers (u32, u64)
- Varint-encoded integers (u32, u64)
- Length-prefixed slices
- Conversions between Slice and String/UTF‑8
The implementation is intentionally close to the original LevelDB C++ code, with Rust idioms where they do not compromise layout compatibility or performance.
Design goals
- Bit-level compatibility with LevelDB: Encodings are little-endian and follow LevelDB's varint and length-prefix conventions so data can be shared with existing LevelDB implementations.
- Zero extra allocation in hot paths: Pointer-based APIs allow writing directly into preallocated buffers and reading from raw memory without intermediate copies.
- Predictable performance: Varint encoders use simple branch patterns, and decoders operate in tight loops amenable to inlining and optimization.
- Logging-friendly: Functions are instrumented with trace!, debug!, and warn! calls (using the log facade or tracing-style macros, depending on the parent crate) to aid in debugging complex storage issues.
The crate is primarily intended as an internal component of the bitcoin-rs LevelDB port, but it can be used independently wherever LevelDB-like encodings are needed.
Encoding primitives
Fixed-width little-endian integers
These functions read/write 32-bit and 64-bit integers in little-endian order directly to/from raw pointers:
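use bitcoinleveldb_coding::{encode_fixed32, decode_fixed32};
// Write a 32-bit value into an 8-byte buffer
let mut buf = [0u8; 8];
unsafe { encode_fixed32(buf.as_mut_ptr(), 0x0102_0304) }; // illustrative value
assert_eq!(&buf[..4], &[0x04, 0x03, 0x02, 0x01]);
// Read it back
let v = unsafe { decode_fixed32(buf.as_ptr()) };
assert_eq!(v, 0x0102_0304);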
APIs:
- fn encode_fixed32(dst: *mut u8, value: u32)
- fn encode_fixed64(dst: *mut u8, value: u64)
- fn decode_fixed32(ptr: *const u8) -> u32
- fn decode_fixed64(ptr: *const u8) -> u64
These functions perform no bounds checking and are unsafe to call in a memory-safety sense. Callers must guarantee that dst/ptr points to at least 4 (for 32-bit) or 8 (for 64-bit) valid bytes.
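For comparison, the same byte layout can be produced with bounds-checked slice operations. The following is a minimal sketch, not part of this crate: encode_fixed32_checked and decode_fixed32_checked are hypothetical names, and the only assumption carried over from the text is the little-endian layout.

```rust
// Hypothetical safe counterparts to encode_fixed32 / decode_fixed32.
// They take slices instead of raw pointers, so the length check is explicit.

/// Writes `value` into the first 4 bytes of `dst` in little-endian order.
/// Returns `None` (writing nothing) if `dst` is shorter than 4 bytes.
fn encode_fixed32_checked(dst: &mut [u8], value: u32) -> Option<()> {
    let head = dst.get_mut(..4)?;
    head.copy_from_slice(&value.to_le_bytes());
    Some(())
}

/// Reads a little-endian u32 from the first 4 bytes of `src`.
/// Returns `None` if `src` is shorter than 4 bytes.
fn decode_fixed32_checked(src: &[u8]) -> Option<u32> {
    let mut head = [0u8; 4];
    head.copy_from_slice(src.get(..4)?);
    Some(u32::from_le_bytes(head))
}
```

The pointer-based functions skip exactly the length check that get(..4) performs here, which is why the caller has to guarantee it.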
Varint encoding
Varint encoding represents an integer using a base-128 scheme:
- Each byte carries 7 bits of payload in the low bits.
- The high bit (bit 7) is a continuation flag: 1 means another byte follows, 0 terminates the varint.
This is identical to the scheme used in LevelDB and many other storage systems. Values in [0, 2^7) fit in 1 byte, [2^7, 2^14) in 2 bytes, etc.
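As a worked example, 300 is 0b1_0010_1100 in binary. Its low 7 bits (0x2C) go out first with the continuation bit set, giving 0xAC, and the remaining bits (0x02) form the final byte. The sketch below reproduces this with a Vec<u8>; put_varint64_sketch is a hypothetical name used for illustration, not the crate's API.

```rust
/// Appends the base-128 varint encoding of `v` to `out`,
/// least-significant 7-bit group first.
fn put_varint64_sketch(out: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        // Low 7 bits as payload, high bit set: more bytes follow.
        out.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
    }
    // Final byte: high bit clear terminates the varint.
    out.push(v as u8);
}
```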
Pointer-based varint encoding
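use bitcoinleveldb_coding::{encode_varint32, encode_varint64};
let mut buf = [0u8; 16];
let start = buf.as_mut_ptr();
let end32 = unsafe { encode_varint32(start, 300) }; // illustrative value
let len32 = unsafe { end32.offset_from(start) };
let end64 = unsafe { encode_varint64(end32, 1 << 40) }; // illustrative value
let len64 = unsafe { end64.offset_from(end32) };
assert!(len32 <= 5);
assert!(len64 <= 10);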
APIs:
- fn encode_varint32(dst: *mut u8, v: u32) -> *mut u8
- fn encode_varint64(dst: *mut u8, v: u64) -> *mut u8
Both functions:
- Assume dst points to a buffer with enough capacity (≤ 5 bytes for u32, ≤ 10 bytes for u64).
- Return a pointer to the first byte after the encoded value.
The helper fn varint_length(v: u64) -> i32 computes the length (in bytes) of the varint encoding of v. This is useful when pre-sizing buffers:
use bitcoinleveldb_coding::varint_length;
let v: u64 = 1_000_000;
let len = varint_length(v);
assert!(len == 3); // 2^14 <= 1_000_000 < 2^21, so three 7-bit groups
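One plausible way to compute this length is to count how many 7-bit groups the value occupies. The sketch below assumes nothing about the crate's internals; varint_length_sketch is a hypothetical name, and it returns usize rather than the i32 of the crate's helper.

```rust
/// Number of bytes the base-128 varint encoding of `v` occupies.
fn varint_length_sketch(mut v: u64) -> usize {
    let mut len = 1; // even 0 takes one byte
    while v >= 0x80 {
        v >>= 7; // drop one 7-bit group
        len += 1;
    }
    len
}
```

The result is always between 1 and 10 for a u64, and between 1 and 5 when the value fits in a u32.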
String-backed varint and fixed-width encoding
Instead of working with raw pointers, you can append encodings directly into String buffers. This matches the original LevelDB design, where std::string served as a generic byte buffer.
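use bitcoinleveldb_coding::{put_varint32, put_fixed64};
let mut s = String::new();
unsafe {
    put_varint32(&mut s, 1000);
    put_fixed64(&mut s, 0x0102_0304_0506_0708); // illustrative value
}
let bytes = s.into_bytes();
// bytes now begins with the varint-encoded 1000, followed by 8 LE bytes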
APIs:
- fn put_varint32(dst: *mut String, v: u32)
- fn put_varint64(dst: *mut String, v: u64)
- fn put_fixed32(dst: *mut String, value: u32)
- fn put_fixed64(dst: *mut String, value: u64)
These functions:
- Treat String as an opaque byte buffer via String::as_mut_vec.
- Append encoded bytes; they do not clear or truncate existing data.
- Expose a raw *mut String interface because they are designed to be called from unsafe internals where borrowing rules are already enforced at a higher level.
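The append-only behavior can be illustrated with a plain Vec<u8>, which avoids the UTF-8 invariant that makes String::as_mut_vec unsafe. This is a sketch under that substitution; put_fixed32_vec and put_fixed64_vec are hypothetical names, not functions exported by this crate.

```rust
/// Appends `value` as 4 little-endian bytes; existing contents are kept.
fn put_fixed32_vec(out: &mut Vec<u8>, value: u32) {
    out.extend_from_slice(&value.to_le_bytes());
}

/// Appends `value` as 8 little-endian bytes; existing contents are kept.
fn put_fixed64_vec(out: &mut Vec<u8>, value: u64) {
    out.extend_from_slice(&value.to_le_bytes());
}
```

Starting from a buffer that already holds one byte, two successive calls leave that byte in place and grow the buffer by 4 and then 8 bytes.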
Decoding primitives with Slice
The crate interoperates with a Slice abstraction that behaves like a non-owning byte span with a cursor.
Varint decoding from pointer ranges
These functions decode varints from [p, limit) and either return a pointer to the first byte after the value or null() on failure.
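use bitcoinleveldb_coding::{encode_varint32, get_varint_32ptr};
let mut buf = [0u8; 8];
let start = buf.as_mut_ptr();
unsafe {
    let end = encode_varint32(start, 300); // illustrative value
    let mut value: u32 = 0;
    let next = get_varint_32ptr(start, end, &mut value);
    assert!(!next.is_null() && value == 300);
}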
APIs:
- fn get_varint_32ptr(p: *const u8, limit: *const u8, value: *mut u32) -> *const u8
- fn get_varint_32ptr_fallback(p: *const u8, limit: *const u8, value: *mut u32) -> *const u8
- fn get_varint_64ptr(p: *const u8, limit: *const u8, value: *mut u64) -> *const u8
get_varint_32ptr uses a fast path for single-byte varints, then falls back to the more general get_varint_32ptr_fallback for multi-byte values.
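The general multi-byte loop can be sketched over a slice, where the slice end plays the role of limit and None plays the role of the null return. This is an illustration of the decoding scheme, not the crate's code; get_varint32_sketch is a hypothetical name.

```rust
/// Decodes a varint32 from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends mid-varint or the varint runs past 5 bytes.
fn get_varint32_sketch(input: &[u8]) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    // A u32 needs at most 5 groups of 7 bits.
    for (i, &byte) in input.iter().take(5).enumerate() {
        result |= ((byte & 0x7f) as u32) << (7 * i);
        if byte & 0x80 == 0 {
            // Continuation bit clear: this was the last byte.
            return Some((result, i + 1));
        }
    }
    None
}
```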
Varint decoding from Slice
These functions parse a varint at the beginning of a Slice and advance the slice on success.
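use bitcoinleveldb_coding::{put_varint32, get_varint32};
use Slice; // pseudoname; use the actual path in the repo
let mut storage = String::new();
unsafe { put_varint32(&mut storage, 300) }; // illustrative value
let bytes = storage.into_bytes();
let mut slice = Slice::from_ptr_len(bytes.as_ptr(), bytes.len());
let mut out: u32 = 0;
let ok = unsafe { get_varint32(&mut slice, &mut out) };
assert!(ok);
assert_eq!(out, 300);
// slice has been advanced past the varint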
APIs:
- fn get_varint32(input: *mut Slice, value: *mut u32) -> bool
- fn get_varint64(input: *mut Slice, value: *mut u64) -> bool
Semantics:
- On success, return true, write the decoded value to *value, and call input.remove_prefix(consumed_bytes).
- On failure (overflow or not enough bytes), return false and leave input unchanged.
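The advance-on-success, unchanged-on-failure contract can be modeled with a mutable reference to a byte slice standing in for the Slice cursor. This is a sketch of the semantics only; read_varint64 is a hypothetical name, and Option replaces the bool-plus-out-parameter shape of the real API.

```rust
/// Decodes a varint64 from the front of `*input`.
/// On success the cursor is advanced past the varint; on failure it is
/// left exactly as it was.
fn read_varint64(input: &mut &[u8]) -> Option<u64> {
    let bytes: &[u8] = *input; // copy of the inner reference, not the data
    let mut result: u64 = 0;
    // A u64 needs at most 10 groups of 7 bits.
    for (i, &byte) in bytes.iter().take(10).enumerate() {
        result |= ((byte & 0x7f) as u64) << (7 * i);
        if byte & 0x80 == 0 {
            *input = &bytes[i + 1..]; // the remove_prefix step
            return Some(result);
        }
    }
    None
}
```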
Length-prefixed slices
Length-prefixed slices are encoded as:
- A u32 length L encoded as varint32.
- Followed by L raw bytes.
This format is omnipresent in LevelDB metadata (keys, values, and other structures).
Encoding length-prefixed slices
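use bitcoinleveldb_coding::put_length_prefixed_slice;
use Slice; // adjust path to actual crate
let mut s = String::new();
let data = b"hello world";
let slice = unsafe { Slice::from_ptr_len(data.as_ptr(), data.len()) };
unsafe { put_length_prefixed_slice(&mut s, &slice) };
// s now holds: varint32(len=11) + b"hello world"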
API:
fn put_length_prefixed_slice(dst: *mut String, value: &Slice)
Behavior:
- Panics are avoided: if length exceeds u32::MAX, the function logs an error and returns early.
- For zero-length slices, only the length varint (0) is written.
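The behavior described above can be sketched with a Vec<u8> as the destination. put_length_prefixed is a hypothetical name; where the crate logs an error and returns early, this sketch returns false instead, which is an assumption made to keep the example self-contained.

```rust
/// Appends `data` to `out` as varint32(len) followed by the raw bytes.
/// Returns false and writes nothing if the length does not fit in a u32.
fn put_length_prefixed(out: &mut Vec<u8>, data: &[u8]) -> bool {
    if data.len() > u32::MAX as usize {
        return false;
    }
    // Length header as a base-128 varint.
    let mut len = data.len() as u32;
    while len >= 0x80 {
        out.push(((len & 0x7f) as u8) | 0x80);
        len >>= 7;
    }
    out.push(len as u8);
    // Payload follows the header unchanged.
    out.extend_from_slice(data);
    true
}
```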
Decoding length-prefixed slices
From a mutable Slice cursor:
use bitcoinleveldb_coding::get_length_prefixed_slice;
use Slice;
// suppose input points at a length-prefixed slice
let mut input: Slice = /* ... */;
let mut out: Slice = Slice::default(); // or uninitialized according to actual API
let ok = unsafe { get_length_prefixed_slice(&mut input, &mut out) };
if ok { /* use out */ }
From raw pointers with an explicit limit:
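use bitcoinleveldb_coding::get_length_prefixed_slice_with_limit;
use Slice;
let buf: &[u8] = /* ... */;
let mut out: Slice = Slice::default();
let next = unsafe { get_length_prefixed_slice_with_limit(buf.as_ptr(), buf.as_ptr().add(buf.len()), &mut out) };
if !next.is_null() { /* use out */ }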
APIs:
- fn get_length_prefixed_slice(input: *mut Slice, result: *mut Slice) -> bool
- fn get_length_prefixed_slice_with_limit(p: *const u8, limit: *const u8, result: *mut Slice) -> *const u8
Both validate that the declared length does not exceed the available bytes.
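The length validation can be sketched as follows: decode the varint32 header, then refuse to hand out a payload longer than the bytes that remain. get_length_prefixed is a hypothetical name, and the (payload, rest) return shape is an assumption chosen for a safe, slice-based illustration.

```rust
/// Splits `input` into (payload, rest) where payload is the length-prefixed
/// slice at the front. Returns `None` if the header is malformed or the
/// declared length exceeds the available bytes.
fn get_length_prefixed(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let mut len: u32 = 0;
    let mut header = None;
    // Decode the varint32 length (at most 5 bytes).
    for (i, &byte) in input.iter().take(5).enumerate() {
        len |= ((byte & 0x7f) as u32) << (7 * i);
        if byte & 0x80 == 0 {
            header = Some(i + 1);
            break;
        }
    }
    let body = &input[header?..];
    // Reject a declared length that runs past the end of the buffer.
    if len as usize > body.len() {
        return None;
    }
    Some(body.split_at(len as usize))
}
```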
Slice to UTF‑8 conversion
For debugging or higher-level string handling, slice_to_utf8 converts a Slice into an owned String using from_utf8_lossy semantics:
use bitcoinleveldb_coding::slice_to_utf8;
use Slice;
let bytes = b"example";
let slice = unsafe { Slice::from_ptr_len(bytes.as_ptr(), bytes.len()) };
let s = slice_to_utf8(&slice);
assert_eq!(s, "example");
API:
fn slice_to_utf8(slice: &Slice) -> String
Behavior:
- If the slice is empty or has a null data pointer, returns an empty String.
- Invalid UTF‑8 sequences are replaced with the Unicode replacement character; this is deliberate to avoid panics in low-level diagnostics.
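The lossy behavior comes from the standard library's String::from_utf8_lossy. A small sketch over a plain byte slice shows the replacement in action; bytes_to_utf8_lossy is a hypothetical helper, not the crate's slice_to_utf8.

```rust
/// Converts raw bytes to an owned String, replacing invalid UTF-8
/// sequences with U+FFFD instead of failing.
fn bytes_to_utf8_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

A lone 0xFF byte, which is never valid in UTF-8, comes out as a single replacement character while the surrounding ASCII is preserved.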
Safety and invariants
Almost all functions in this crate are unsafe to use indirectly because they operate on raw pointers or manipulate String internals.
Callers must ensure:
- Pointers (*const u8 / *mut u8) point to valid, appropriately sized memory.
- limit pointers in decoding functions delimit the actual readable range; p <= limit and the region [p, limit) must remain valid for the duration of the call.
- Slice values obey their own invariants: data() and size() reflect a valid contiguous region.
- No concurrent mutable aliasing of the same String or Slice occurs across threads without synchronization.
The crate itself does not attempt to enforce Rust's aliasing rules; it assumes that higher-level abstractions (e.g., the LevelDB table code) orchestrate these invariants.
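One way for a caller to satisfy the [p, limit) invariant is to derive both pointers from the same live slice and to check the ordering before calling into pointer-based code. The helpers below are hypothetical and only illustrate that bookkeeping.

```rust
/// Derives a (p, limit) pair from a slice. Both pointers stay valid
/// only for as long as `buf` is borrowed.
fn ptr_range(buf: &[u8]) -> (*const u8, *const u8) {
    let r = buf.as_ptr_range();
    (r.start, r.end)
}

/// Number of readable bytes in [p, limit), or `None` if p > limit.
/// Only compares addresses; it never dereferences either pointer.
fn readable_len(p: *const u8, limit: *const u8) -> Option<usize> {
    (limit as usize).checked_sub(p as usize)
}
```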
Relationship to mathematics and bit-level representation
Varint encoding is effectively a representation of a non-negative integer in base 128 with a self-delimiting prefix code:
- Let v be a non-negative integer.
- While v >= 128, emit v mod 128 (7 bits) with the continuation bit set to 1, then replace v with floor(v / 128).
- For the final byte, emit v mod 128 (which now equals v) with continuation bit 0.
This yields a prefix-free code over u64 with the following length function:
\[ \ell(v) = 1 + \left\lfloor \log_{128} v \right\rfloor \quad (v > 0), \quad \ell(0) = 1. \]
By encoding smaller integers with fewer bytes, storage layouts benefit significantly when keys and lengths are typically small (common in LevelDB metadata and in many Bitcoin-related indices).
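The length function can be cross-checked against a bit-length formulation: a value with b significant bits needs ceil(b / 7) bytes, which equals 1 + floor(log_128 v) for v > 0. The sketch below computes it without a loop; varint_len_from_bits is a hypothetical name.

```rust
/// Varint length of `v` computed from its bit length:
/// ceil(bits / 7), with 0 treated as a 1-bit value.
fn varint_len_from_bits(v: u64) -> u32 {
    let bits = 64 - (v | 1).leading_zeros();
    (bits + 6) / 7
}
```

For example, 2^14 - 1 has 14 significant bits and takes 2 bytes, while 2^14 has 15 bits and takes 3.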
Integration within bitcoin-rs
This crate lives in the bitcoin-rs monorepo and is designed to be used by the LevelDB-compatible storage layer that underpins components such as block indexes, UTXO sets, or other key-value stores.
Typical usage pattern:
- Serialize structured metadata into a String or Vec<u8> using put_* APIs.
- Store that byte sequence in LevelDB or a LevelDB-compatible backend.
- Deserialize on load using get_* pointer or Slice-based APIs.
Because the encodings match the canonical C++ LevelDB representation, databases can be shared between Rust and C++ nodes without reindexing.
Crate metadata
- Name: bitcoinleveldb-coding
- Version: 0.1.19
- Edition: 2021
- License: MIT
- Repository: https://github.com/klebs6/bitcoin-rs
- Authors: klebs <none>
This crate is intended for advanced users who are comfortable reasoning about memory safety, binary layout, and cross-language interoperability.