Crate vsf


§VSF (Versatile Storage Format)

Self-describing binary format with hierarchical structure, strong typing, and cryptographic primitives.

§Features

  • Self-describing: Type markers embedded in the data stream
  • Hierarchical: Offset-based seeking, unlimited nesting depth
  • Strongly-typed: Primitives (u0-u7, i3-i7, f32, f64, complex), tensors, Spirix Scalars and Circles
  • Cryptographic: Built-in BLAKE3 hashing and Ed25519 signing
  • Eagle Time: universal timestamps
  • Huffman text compression: ~2× compression over UTF-8 for strings

§Core Type System

§Primitives

  • Integers: u0-u7 (unsigned), i3-i7 (signed)
  • IEEE Floats: f5 (f32), f6 (f64)
  • IEEE Complex: j5 (Complex<f32>), j6 (Complex<f64>)
  • Spirix: s33-s77 (Scalar), c33-c77 (Circle)
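The pairings above (f5 = f32, f6 = f64) and the `u4{400u16}` example later in this page suggest that the digit after a type letter is a power-of-two exponent for the bit width. The following is a minimal sketch of that reading. The `bit_width` helper is hypothetical and not part of the crate, and the sub-byte widths implied for u0-u2 are an inference rather than something this page states.

```rust
/// Hypothetical helper: map an ASCII size digit ('0'..='7') to a bit width,
/// assuming width = 2^digit (so '3' -> 8 bits, '5' -> 32 bits, '6' -> 64 bits).
fn bit_width(size_digit: u8) -> Option<u32> {
    match size_digit {
        b'0'..=b'7' => Some(1u32 << (size_digit - b'0')),
        _ => None, // not a size digit under this assumption
    }
}

fn main() {
    for digit in b'0'..=b'7' {
        println!("{} -> {:?} bits", digit as char, bit_width(digit));
    }
}
```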

§Tensors

  • Contiguous (t): Row-major multi-dimensional arrays (1D-4D)
  • Strided (q): Non-contiguous views with explicit stride
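
To make the difference between the two layouts concrete, here is a small sketch of how element offsets are usually computed for row-major and strided storage. It uses only the standard library. The helper names are illustrative and do not come from the crate, and strides are assumed to be counted in elements rather than bytes.

```rust
/// Strides (in elements) of a contiguous row-major tensor with this shape.
fn row_major_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last axis backwards; the last axis has stride 1.
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Flat element offset of `index` under explicit strides.
fn offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}

fn main() {
    let strides = row_major_strides(&[3, 4]);
    println!("row-major strides for 3x4: {:?}", strides);
    println!("offset of [1, 2]: {}", offset(&[1, 2], &strides));
    // A transposed (4x3) view of the same buffer just swaps the strides.
    println!("same element via transposed view: {}", offset(&[2, 1], &[1, 4]));
}
```

A contiguous tensor can derive its strides from the shape alone, while a strided view has to store them, which is presumably why the two get separate type markers.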

§Metadata and Labels

VSF uses labels within sections for metadata:

  • l: Label text - identifies a field within a section (e.g., “shutter_speed”, “author”)
  • Section fields can contain multiple values: (label:value1,value2,value3)
  • Sections can contain hierarchical fields: [dImaging (lshutter_speed:f6{0.01})(laperture:f5{2.8})]

Other Metadata Types:

  • x: Huffman compressed Unicode text strings
  • e: Eagle Time (seconds since lunar landing)
  • d: Data type identifier
  • o: Byte offsets
  • b: Byte lengths
  • n: Counts
  • g: Cryptographic signatures
  • h: Cryptographic hashes

§File Structure

VSF files follow a hierarchical structure:

RÅ<                                  Magic number + header start
  z3{5}                        Format version (FIRST - determines encoding)
  y3{5}               Backward compatibility version
  b#{header length}                  Header size (now we know how to encode it!)
  ef5{current time as f32}           Eagle Time timestamp of the last edit (f32, ~2min precision)
  hp3{31}{provenance hash}            Provenance: BLAKE3 hash of content (required, always 32 bytes)
  ge{64}{signature}                  Ed25519 signature over the entire file AFTER the provenance hash is patched in (optional; a file must carry either ge or hb)
  hb{31}{rolling_hash}               Rolling: BLAKE3 of current state with History (optional)
  k#{key}                            File-level encryption key (optional)
  n#{field count}                    Number of fields
  (d3{9}raw_image:h#{hash},o#{offset},b#{size},n#{count})     Field with values
  (d3{9}thumbnail:h#{hash},o#{offset},b#{size},n#{count})
  ...
>                                    Header end

[(section_fields...)...]           Section data at the offset given for raw_image. If a section is unencrypted and starts within 1 MB of the header, the section name, field count, and length may be omitted; otherwise all three are required.
[d3{9}thumbnailn#{number of fields}b#{length of section}(section_fields...)...]

Hash Strategy (Always BLAKE3):

  • hp (hash provenance): Content identity - BLAKE3 hash of immutable content. Required. Computed with hp field as zeros, then filled in. Creates stable identifier for original content.
  • ge (signature): Optional Ed25519 signature. When signing, compute hp, sign it, then replace hp bytes with ge signature.
  • hb (hash rolling): Current file state - Optional BLAKE3 hash that includes the History section and updates whenever History updates. Useful for tracking how a mutable file evolves. A file must carry either ge or hb.

Provenance Verification: To verify a file’s provenance, zero the hp and signature/rolling hash fields and compute BLAKE3 - it will match the stored hp if original. If present, verify the ge signature against hp to authenticate the creator.
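
The zero-then-hash procedure can be sketched as follows. This is an illustration under stated assumptions, not the crate's verification code. The byte ranges of the hp, ge, and hb slots are assumed to be known already. `toy_hash` is a deliberately trivial stand-in so the example runs with the standard library only, and a real implementation would call BLAKE3 instead. All function names here are hypothetical.

```rust
use std::ops::Range;

/// Stand-in for BLAKE3 so the sketch needs no external crate. NOT cryptographic.
fn toy_hash(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] = out[i % 32].wrapping_mul(31).wrapping_add(*b);
    }
    out
}

/// Hash a copy of `file` in which every listed byte range is replaced by zeros.
fn hash_with_zeroed(file: &[u8], zeroed: &[Range<usize>], hash: fn(&[u8]) -> [u8; 32]) -> [u8; 32] {
    let mut scratch = file.to_vec();
    for r in zeroed {
        scratch[r.clone()].fill(0);
    }
    hash(&scratch)
}

/// `fields[0]` is the 32-byte hp slot; any further ranges are ge/hb slots.
/// Returns true when the recomputed digest equals the stored hp bytes.
fn verify_provenance(file: &[u8], fields: &[Range<usize>], hash: fn(&[u8]) -> [u8; 32]) -> bool {
    let stored = &file[fields[0].clone()];
    hash_with_zeroed(file, fields, hash).as_slice() == stored
}

fn main() {
    // Toy layout: 4 header bytes, a 32-byte hp slot (zeros for now), then content.
    let mut file = b"HEAD".to_vec();
    let hp = file.len()..file.len() + 32;
    file.extend_from_slice(&[0u8; 32]);
    file.extend_from_slice(b"immutable content");

    // Writer side: hash with the slot zeroed, then patch the digest in.
    let digest = hash_with_zeroed(&file, &[hp.clone()], toy_hash);
    file[hp.clone()].copy_from_slice(&digest);
    println!("intact:   {}", verify_provenance(&file, &[hp.clone()], toy_hash));

    // Flip one content bit and check again.
    file[40] ^= 1;
    println!("tampered: {}", verify_provenance(&file, &[hp], toy_hash));
}
```

Working on a scratch copy keeps the stored hp bytes available for the final comparison and leaves the caller's buffer untouched.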

Terminology:

  • Header: Everything between RÅ< and >
  • Provenance primitives: Version, timestamp, hash, signature (NOT wrapped in ())
  • Header field: Section pointer (d"name" o b n) with POSITIONAL values (no : or ,)
  • Section: Actual data blocks after the header, located at specified offsets
  • Section field: Individual (field:value) or (field:v0,v1) entries within a section
  • ? and {}: ? indicates length (ASCII 0-Z), {} indicates binary data

The : and , separators in label records make the format human-readable in hex editors and aid in forensics and corruption analysis with minimal overhead.

§Section Flattening Example

A section with hierarchical fields for camera metadata:

[d{Imaging}
  (l{shutter_speed}:f6{0.01})      // 1/100s as f64
  (l{aperture}:f5{2.8})            // f/2.8 as f32
  (l{iso}:u4{400})                 // ISO 400
]

Which flattens to:

'[' + 'd' + '3' + {7u8} + "Imaging" +
'(' + 'l' + '3' + {13u8} + "shutter_speed" + ':' + 'f' + '6' + {0.01f64} + ')' +
'(' + 'l' + '3' + {8u8} + "aperture"      + ':' + 'f' + '5' + {2.8f32} + ')' +
'(' + 'l' + '3' + {3u8} + "iso" + ':' + 'u' + '4'+ {400u16} + ')' + ']'

Where ‘char’ indicates a single byte character, and “string” indicates ASCII text bytes.

And the final flattened byte stream is:

[d3{0x07}Imaging(l3{0x0D}shutter_speed:f6{0x7B 14 AE 47 E1 7A 84 3F})(l3{0x08}aperture:f5{0x33 33 33 40})(l3{0x03}iso:u4{0x01 90})]

Each section field is enclosed by ()’s and always starts with a text identifier (l marker + ASCII string), followed by : and its value(s) separated by ,. Section fields are flattened sequentially, creating a self-describing stream.

§Optional History Section (Will change heavily as design matures)

For applications requiring detailed tracking beyond the immutable creation timestamp:

[dHistory
 (ef6{1234567890.5},hb{256}{hash_at_creation},ltool:x{Lumis},lversion:z{0.1.2},lhost:x{workstation-sea})
 (ef6{1234567920.3},hb{256}{hash_after_modify},ltool:x{Photon},laction:x{modified},lhost:x{laptop-pdx})
 (ef6{1234567950.1},hb{256}{hash_after_access},laction:x{accessed},lhost:x{phone-mobile})
]

Each history entry records the file’s hb hash at that point in time, creating a verifiable chain of file states. To verify history integrity, recompute hb for each historical state by truncating the History section to that entry.

Which flattens to:

'[' + 'd' + '1' + {7u8} + "History" +
'(' + 'e' + 'f' + '6' + {1234567890.5f64} + ',' +
      'h' + 'b' + '3' + {32u8} + {32 bytes BLAKE3 hash} + ',' +
      'l' + '1' + {4u8} + "tool" + ':' + 'x' + '1' + {5u8} + "Lumis" + ',' +
      'l' + '1' + {7u8} + "version" + ':' + 'z' + '1' + {5u8} + "0.1.2" + ',' +
      'l' + '1' + {4u8} + "host" + ':' + 'x' + '2' + {15u8} + "workstation-sea" + ')' +
'(' + 'e' + 'f' + '6' + {1234567920.3f64} + ',' +
      'h' + 'b' + '3' + {32u8} + {32 bytes BLAKE3 hash} + ',' +
      'l' + '1' + {4u8} + "tool" + ':' + 'x' + '1' + {6u8} + "Photon" + ',' +
      'l' + '1' + {6u8} + "action" + ':' + 'x' + '1' + {8u8} + "modified" + ',' +
      'l' + '1' + {4u8} + "host" + ':' + 'x' + '1' + {10u8} + "laptop-pdx" + ')' +
'(' + 'e' + 'f' + '6' + {1234567950.1f64} + ',' +
      'h' + 'b' + '3' + {32u8} + {32 bytes BLAKE3 hash} + ',' +
      'l' + '1' + {6u8} + "action" + ':' + 'x' + '1' + {8u8} + "accessed" + ',' +
      'l' + '1' + {4u8} + "host" + ':' + 'x' + '1' + {12u8} + "phone-mobile" + ')' + ']'

Each history entry is a complete event enclosed in ()’s with timestamp, tool, action, and context. The History section has its own hash in the header label record for integrity verification, but is NOT included in hp (the provenance hash of immutable content). It IS included in hb (the rolling file hash).


§Quick Start

use vsf::{VsfType, VsfBuilder, Tensor, parse};

// Encode a tensor
let tensor = Tensor::new(vec![3, 4], vec![1u16, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]);
let encoded = VsfType::t_u4(tensor).flatten();

// Decode it back
let mut ptr = 0;
let decoded = parse(&encoded, &mut ptr).unwrap();

// Build a complete VSF file with header
let vsf_file = VsfBuilder::new()
    .add_section("metadata", vec![
        ("width".to_string(), VsfType::u(1920, false)),
        ("height".to_string(), VsfType::u(1080, false)),
    ])
    .add_unboxed("pixels", vec![0xFF; 1024])
    .build()
    .unwrap();


§Eagle Time Formats

Eagle Time counts seconds since 1969-07-20 20:17:40 UTC (Apollo 11 lunar landing). Always coordinated, no timezones, no daylight saving. One universal time standard.

  • ef5: 32-bit float (f32) - ~2 minute precision, compact for most uses
  • ef6: 64-bit float (f64) - microsecond precision, for high-accuracy timestamps

The choice of format doesn't change the epoch or the length of a second: both count SI seconds from the same lunar landing moment. Only the precision differs.
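
As a rough illustration of where the two precision figures come from, the sketch below converts a Unix timestamp to Eagle Time and measures the gap between adjacent f32 values at that magnitude. It uses the standard library only and does not call the crate's own `datetime_to_eagle_time` or `eagle_time_nanos`. The function names and the constant are hypothetical, and the conversion inherits Unix time's treatment of leap seconds, which is a simplification.

```rust
/// Unix timestamp of the Eagle Time epoch, 1969-07-20 20:17:40 UTC.
const EAGLE_EPOCH_UNIX: f64 = -14_182_940.0;

/// Seconds since the lunar landing for a given Unix timestamp (leap seconds ignored).
fn unix_to_eagle(unix_secs: f64) -> f64 {
    unix_secs - EAGLE_EPOCH_UNIX
}

/// Gap between `x` and the next representable f32 above it.
fn f32_spacing(x: f32) -> f32 {
    f32::from_bits(x.to_bits() + 1) - x
}

fn main() {
    let eagle = unix_to_eagle(1_700_000_000.0); // an arbitrary instant in late 2023
    let ef5 = eagle as f32;
    println!("ef6 value: {eagle}");
    println!("ef5 value: {ef5} (adjacent f32 values are {} s apart)", f32_spacing(ef5));
}
```

For timestamps between roughly 2003 and 2037 the f32 spacing is 128 seconds, which matches the ~2 minute figure quoted for ef5.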

§Parsing and Encoding

Element-level parsing:

use vsf::parse;
let data = vec![b'u', b'3', 42];
let mut ptr = 0;
let value = parse(&data, &mut ptr)?;  // Parses one VsfType element

Header encoding (VsfHeader):

use vsf::file_format::VsfHeader;
let mut header = VsfHeader::new(version, backward_compat);
header.add_field(field);
let bytes = header.encode()?;  // Encodes header to bytes

Note: VsfHeader::decode() is not yet implemented. To parse headers, use element-level parse() to read individual fields. A future schema system will provide type-safe header and section parsing with automatic validation.
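
The cursor convention used by parse(), where one `ptr` is advanced past each element so that calls can be chained, can be illustrated with a hand-written decoder for unsigned elements. This sketch is not the crate's parser. `parse_unsigned` is a hypothetical helper that handles only u3 through u6, and it assumes most-significant-byte-first payloads, following the `u4{0x01 90}` bytes printed in the flattening example.

```rust
/// Hypothetical decoder for one unsigned element: 'u', a size digit '3'..='6',
/// then 1, 2, 4, or 8 payload bytes. On success `ptr` moves past the element;
/// on failure it is left unchanged.
fn parse_unsigned(data: &[u8], ptr: &mut usize) -> Option<u64> {
    let start = *ptr;
    if *data.get(start)? != b'u' {
        return None;
    }
    let digit = *data.get(start + 1)?;
    if !(b'3'..=b'6').contains(&digit) {
        return None;
    }
    let n_bytes = 1usize << (digit - b'3');
    let payload = data.get(start + 2..start + 2 + n_bytes)?;
    // Assumed byte order: most significant byte first.
    let value = payload.iter().fold(0u64, |acc, &b| (acc << 8) | b as u64);
    *ptr = start + 2 + n_bytes;
    Some(value)
}

fn main() {
    // Two elements back to back: u3{42} followed by u4{400}.
    let data = [b'u', b'3', 42, b'u', b'4', 0x01, 0x90];
    let mut ptr = 0;
    while ptr < data.len() {
        match parse_unsigned(&data, &mut ptr) {
            Some(v) => println!("value {v}, cursor now at {ptr}"),
            None => break,
        }
    }
}
```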

§Parsing APIs: Two Tiers

VSF provides two parsing approaches for sections, each suited to different use cases:

§Low-Level: VsfSection::parse() (file_format.rs)

Schema-agnostic parsing that extracts raw data without validation:

use vsf::VsfSection;

let mut ptr = 0;
let section = VsfSection::parse(&bytes, &mut ptr)?;
// Returns VsfSection with name and Vec<VsfField>
// No schema required, no validation performed

Use when:

  • Reading unknown/arbitrary VSF data
  • Debugging or inspecting files
  • Building tooling that handles any section type
  • You don’t have or need a schema

§High-Level: SectionBuilder::parse() (schema/section.rs)

Schema-validated parsing for type-safe workflows:

use vsf::schema::{SectionSchema, SectionBuilder, TypeConstraint};

let schema = SectionSchema::new("camera")
    .field("iso", TypeConstraint::AnyUnsigned)
    .field("shutter", TypeConstraint::AnyFloat);

let builder = SectionBuilder::parse(schema, &section_bytes)?;
// Validates section name matches schema
// Validates each field against type constraints
// Returns SectionBuilder for modify → re-encode workflow

Use when:

  • You know the expected structure
  • Type safety and validation matter
  • You need to modify and re-encode sections
  • Building applications with defined schemas

Both parse the same [d"name"(d"field":value)...] binary format—SectionBuilder adds schema enforcement on top of the low-level parsing.

§Module Structure

  • types - Core type definitions (VsfType, Tensor, EagleTime, WorldCoord)
  • encoding - Binary serialization (exponential-width integers, flatten)
  • decoding - Binary parsing with parse() function
  • file_format - VSF file headers and sections (VsfHeader, VsfSection)
  • vsf_builder - High-level builder for complete files
  • schema - Type-safe section schemas with field validation and parse→modify→encode
  • verification - Cryptographic hashing and signing
  • crypto_algorithms - Algorithm identifiers for hashes, signatures, keys, MACs
  • decrypt - Decryption utilities (requires crypto feature)
  • text_encoding - Huffman compression for Unicode strings (requires text feature)
  • colour - Colourspace conversions (VSF RGB, Rec.2020, sRGB, XYZ)
  • builders - Domain-specific builders (RAW images)
  • inspect - Inspection and formatting utilities (requires inspect feature)

Re-exports§

pub use types::datetime_to_eagle_time;
pub use types::eagle_time_nanos;
pub use types::EagleTime;
pub use types::EtType;
pub use types::LayoutOrder;
pub use types::StridedTensor;
pub use types::Tensor;
pub use types::VsfType;
pub use types::WorldCoord;
pub use colour::convert::ColourFormat;
pub use colour::convert::RgbLinear;
pub use colour::convert::RgbaLinear;
pub use encoding::EncodeNumber;
pub use encoding::EncodeNumberInclusive;
pub use decoding::parse;
pub use file_format::validate_name;
pub use file_format::HeaderField;
pub use file_format::VsfField;
pub use file_format::VsfHeader;
pub use file_format::VsfSection;
pub use vsf_builder::SectionMeta;
pub use vsf_builder::VsfBuilder;
pub use builders::build_raw_image;
pub use builders::lumis_raw_capture;
pub use builders::parse_raw_image;
pub use builders::Aperture;
pub use builders::BlackLevel;
pub use builders::CalibrationHash;
pub use builders::CameraBuilder;
pub use builders::CameraSettings;
pub use builders::CfaPattern;
pub use builders::ExposureCompensation;
pub use builders::FlashFired;
pub use builders::FocalLength;
pub use builders::FocusDistance;
pub use builders::IsoSpeed;
pub use builders::LensBuilder;
pub use builders::LensInfo;
pub use builders::Magic9;
pub use builders::Manufacturer;
pub use builders::MeteringMode;
pub use builders::ModelName;
pub use builders::ParsedRawImage;
pub use builders::RawImageBuilder;
pub use builders::RawMetadata;
pub use builders::RawMetadataBuilder;
pub use builders::SerialNumber;
pub use builders::ShutterTime;
pub use builders::WhiteLevel;

Modules§

builders
High-level builders for common VSF use cases
colour
VSF Colourspace Library
crypto_algorithms
Cryptographic algorithm identifiers for VSF hash, signature, and key types
decoding
VSF Decoding Module
encoding
VSF Encoding Module
file_format
VSF file format with headers and hierarchical fields
schema
VSF Schema System - Type-safe section and field validation
types
VSF Type System
verification
VSF verification functions for hashing and signing
vsf_builder
High-level builder for VSF files

Constants§

VSF_BACKWARD_COMPAT
Backward compatibility version (oldest version this implementation can read)
VSF_VERSION
Current VSF format version