FWOB
FWOB is a Rust implementation of the Fixed-Width Ordered Binary format family.
The project provides two format versions:
- FWOB v1 for compact fixed-width ordered files.
- FWOB v2, a fixed-page compressed format for high-performance random access, range queries, and bulk append workloads.
FWOB v2 keeps page addressing arithmetic: every page has the same fixed size, while each page may contain a variable number of fixed-width frames. A page is a fixed-size on-disk container with an 80-byte header, a compressed payload, and zero padding.
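The page layout above can be illustrated with a little arithmetic. This is a minimal sketch, not the fwob-v2 implementation: `page_offset`, `page_padding`, and the `file_header_len` parameter are hypothetical names, and the assumption that pages sit back to back after a file header is mine. Only the 80-byte page header comes from the description above.

```rust
/// Size of the per-page header described above.
const PAGE_HEADER_LEN: u64 = 80;

/// Byte offset of page `index`, assuming pages are stored back to back
/// after a file header of `file_header_len` bytes (an assumption of this sketch).
/// Returns `None` on arithmetic overflow.
fn page_offset(file_header_len: u64, page_size: u64, index: u64) -> Option<u64> {
    index.checked_mul(page_size)?.checked_add(file_header_len)
}

/// Number of zero-padding bytes left in a page after its header and
/// compressed payload, or `None` if the payload does not fit in the page.
fn page_padding(page_size: u64, payload_len: u64) -> Option<u64> {
    page_size
        .checked_sub(PAGE_HEADER_LEN)?
        .checked_sub(payload_len)
}
```

Under these assumptions, locating a page needs no index lookup: the offset depends only on the page number and the fixed page size.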
Workspace
- fwob-core: shared schema, frame, key, reader/writer handles, service traits, and error types.
- fwob-derive: derive macro for strongly typed fixed-width frames.
- fwob-v1: FWOB v1 reader, writer, verifier, and compatibility tests.
- fwob-v2: compressed fixed-page FWOB v2 reader and writer.
- fwob: the primary library facade with auto-detecting Reader, Writer, Editor, Maintenance, and Organizer APIs, plus the command-line tool.
The logical Rust API is documented in docs/api.md.
Repair can promote complete ordered frames or pages left beyond the committed
count by an interrupted write, while truncating partial or invalid tails.
Installation
Install the command-line tool from crates.io:
Library crates are available separately as fwob, fwob-core, fwob-derive,
fwob-v1, and fwob-v2.
Library Quick Start
use ;
use ;
let schema = new?;
let mut writer = create_v2?;
writer.append_frame?;
writer.finish?;
let mut reader = open?;
let first = reader.first_frame?.expect;
assert_eq!;
# Ok::
See docs/api.md for reading, appending, editing, maintenance,
organization, typed-frame, and format-specific examples.
Typed frames map ordinary Rust structs directly to the stored schema:
use ;
use FwobFrame;
let mut writer = create_v2?;
writer.append?;
writer.finish?;
let mut reader = open?;
assert_eq!;
let mut editor = open?;
editor.delete_ranges?;
# Ok::
Fixed-width UTF-8 fields can use fwob_core::FixedString<N>. Values are
space-padded to exactly N bytes and rejected when their encoded byte length
exceeds the declared width.
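The padding rule can be sketched independently of the crate. `pad_fixed` below is a hypothetical helper written for illustration; it is not the `fwob_core::FixedString<N>` API, whose constructor names and error type are not shown in this README.

```rust
/// Hypothetical helper mirroring the behavior described above:
/// space-pad to exactly `N` bytes, and reject values whose UTF-8
/// encoding is longer than `N` bytes.
fn pad_fixed<const N: usize>(value: &str) -> Result<[u8; N], String> {
    let bytes = value.as_bytes();
    if bytes.len() > N {
        return Err(format!(
            "value is {} bytes, field width is {}",
            bytes.len(),
            N
        ));
    }
    // Start from all spaces, then overwrite the prefix with the value.
    let mut out = [b' '; N];
    out[..bytes.len()].copy_from_slice(bytes);
    Ok(out)
}
```

Note that the limit is measured in encoded bytes, not characters: a two-byte character such as `é` fits a width of 2 but is rejected at a width of 1.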
The typed API also re-exports fwob_core::Decimal with the legacy 16-byte
decimal representation.
Ordered keys may be integers, f32, f64, or Decimal.
String-table fields may use StringIndex8, StringIndex16, StringIndex
(32-bit), or StringIndex64.
On v2, integer fields may declare Unix epoch semantics with
#[fwob(timestamp = "seconds")] or the millisecond, microsecond, and
nanosecond variants. Table and Markdown output render those fields as UTC.
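To illustrate what the timestamp units mean for an integer field, the sketch below normalizes a raw epoch value into whole seconds plus nanoseconds, which is the usual first step before formatting a UTC time. `TimestampUnit` and `split_epoch` are hypothetical names for this example and are not taken from the fwob crates.

```rust
/// Hypothetical unit enum for this sketch; the crate's own types may differ.
#[derive(Clone, Copy)]
enum TimestampUnit {
    Seconds,
    Milliseconds,
    Microseconds,
    Nanoseconds,
}

/// Split a raw Unix epoch value into (whole seconds, sub-second nanoseconds).
/// Euclidean division keeps the nanosecond part non-negative for values
/// before the epoch.
fn split_epoch(value: i64, unit: TimestampUnit) -> (i64, u32) {
    let per_second: i64 = match unit {
        TimestampUnit::Seconds => 1,
        TimestampUnit::Milliseconds => 1_000,
        TimestampUnit::Microseconds => 1_000_000,
        TimestampUnit::Nanoseconds => 1_000_000_000,
    };
    let secs = value.div_euclid(per_second);
    let sub = value.rem_euclid(per_second); // always in 0..per_second
    let nanos = sub * (1_000_000_000 / per_second);
    (secs, nanos as u32)
}
```

For example, 1500 milliseconds splits into 1 second and 500,000,000 nanoseconds, and -1 millisecond splits into -1 second and 999,000,000 nanoseconds.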
Command Examples
fwob create and fwob concat refuse to overwrite an existing output. Pass
--force (or --overwrite) to replace it explicitly.
Append and concat assume every input file is internally valid. They validate
cross-file schema, string-table, and key-boundary compatibility without rescanning
each complete input. Run fwob verify FILE first when input corruption is a
concern. Mixed v1/v2 concat warns when v1's missing semantic metadata requires a
relaxed comparison. V2 output preserves available v2 semantics; forced v1 output
drops them because v1 has no semantic metadata slot.
fwob info summarizes FWOB files as a padded table. With no paths it lists
immediate *.fwob files in the current directory. Each supplied path may be a
file or directory; directory discovery is non-recursive. Add table, md,
csv, or jsonl to select the output format. The summary includes path, format,
title, frame type, key-field index, field count, frame length/count, boundary
keys, raw frame bytes, and physical-to-raw ratio.
fwob convert accepts file-to-file, file-to-directory, and
directory-to-directory conversion. Directory input discovers immediate
*.fwob files, preserves each filename, and creates the output directory when
needed. For file input, a nonexistent extensionless output is treated as a
directory; an explicit output filename should use the .fwob extension. Files
are converted concurrently; --parallelism N sets the worker
limit and defaults to the logical CPU count. Progress lines may interleave on
stderr and include the input filename, while each file's structured stdout
summary is printed atomically.
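The output-path rule for file input can be sketched as a small predicate. `output_is_directory` is a hypothetical name, and the handling of an output path that already exists (trust what is on disk) is an assumption of this sketch rather than documented behavior.

```rust
use std::path::Path;

/// Decide whether a convert output path should be treated as a directory.
/// A nonexistent path without an extension counts as a directory, as
/// described above. For an existing path this sketch simply asks the
/// filesystem (an assumption, not documented behavior).
fn output_is_directory(output: &Path) -> bool {
    if output.exists() {
        output.is_dir()
    } else {
        output.extension().is_none()
    }
}
```

Under this rule a nonexistent `out` is created as a directory, while a nonexistent `out.fwob` is written as a single file, which is why an explicit output filename should carry the .fwob extension.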
V2 writes use the same defaults across convert, append, concat, delete, and
split: zstd level 6, columnar-basic encoding, and estimate-shrink packing.
New v2 outputs use 512 KiB pages unless another page size is supplied; append
and delete retain the existing file's fixed page size. Create, convert, and
concat default to v2 output; pass v1 explicitly when v1 output is required.
Convert, append, concat, split, and delete write progress diagnostics to stderr
and keep structured TOML on stdout. Mutation summaries contain one
operation-specific section followed by [parameters], [packing],
[compression], and [page_stats]; the operation section includes
elapsed_seconds.
Positional tokens are case-sensitive. For example, v2, zstd, and 1MiB
are tokens; V2, ZSTD, and 1MIB are treated as paths or values rather than
their lowercase token forms.
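Case-sensitive matching of this kind amounts to an exact string comparison. The sketch below is illustrative only: `Positional` and `classify` are hypothetical names, the token set is a partial subset taken from this README, and the real command-line parser may be structured differently.

```rust
/// Hypothetical classification of a positional argument.
#[derive(Debug, PartialEq)]
enum Positional<'a> {
    Format(&'a str),
    Codec(&'a str),
    /// Anything that is not an exact token match, e.g. a path or value.
    Other(&'a str),
}

/// Match tokens exactly, with no case folding, so `V2` and `ZSTD`
/// fall through to `Other`.
fn classify(arg: &str) -> Positional<'_> {
    match arg {
        "v1" | "v2" => Positional::Format(arg),
        "zstd" | "lz4" | "uncompressed" => Positional::Codec(arg),
        _ => Positional::Other(arg),
    }
}
```

Because there is no case folding, a file that happens to be named `V2` or `ZSTD` can still be passed as a path without being mistaken for a token.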
Tuning Parameters
| Parameter | What It Controls | Typical Values |
|---|---|---|
| page-size token | Fixed physical page size. Integer with B, KB, KiB, MB, or MiB; range 1KiB..16MiB. | 512KiB (default), 1MB, 1MiB, 2MiB |
| codec token | Page compression codec. | zstd (default), lz4, smallest, uncompressed |
| --zstd-level | zstd compression level. Affects write/convert speed heavily, read speed lightly. | 3, 6 (default), 9, 12, 15, 19 |
| encoding token | Page payload layout before compression. smallest tries columnar-basic and columnar-delta per page and stores the winning concrete encoding in page metadata. | row-raw, columnar-basic (default), columnar-delta, smallest |
| page-packing token | Packing strategy for compressed pages. | estimate-shrink (default), tight-fit |
| compress-partial-page token | Compress the final partial output page instead of leaving the final non-overflowing remainder raw. | omitted (default), present |
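As an illustration of the page-size token grammar, here is a minimal parsing sketch. `parse_page_size` is a hypothetical function, not the tool's parser. It assumes that KB and MB are decimal multiples (1,000 and 1,000,000 bytes), that a unit suffix is required, and that both ends of the 1KiB..16MiB range are allowed; none of these details are confirmed by this README.

```rust
/// Parse a page-size token such as `512KiB` into a byte count.
/// Returns `None` for unknown suffixes, missing digits, overflow,
/// or sizes outside the assumed inclusive range 1 KiB to 16 MiB.
fn parse_page_size(token: &str) -> Option<u64> {
    const MIN: u64 = 1_024;
    const MAX: u64 = 16 * 1_024 * 1_024;

    // Split at the first non-digit; a token with no suffix is rejected.
    let digits_end = token.find(|c: char| !c.is_ascii_digit())?;
    let (digits, suffix) = token.split_at(digits_end);

    // Suffixes are matched exactly, so `1MIB` is not a size token.
    let multiplier: u64 = match suffix {
        "B" => 1,
        "KB" => 1_000,
        "KiB" => 1_024,
        "MB" => 1_000_000,
        "MiB" => 1_024 * 1_024,
        _ => return None,
    };

    let bytes = digits.parse::<u64>().ok()?.checked_mul(multiplier)?;
    (MIN..=MAX).contains(&bytes).then_some(bytes)
}
```

With these assumptions, `512KiB` parses to 524,288 bytes and `1MB` to 1,000,000 bytes, while `1MIB`, `512B`, and `32MiB` are rejected.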