§Docufort
Docufort is an append-only file format with built-in error correction and recovery.
It allows partially written data (caused by power loss, improper shutdown, etc.) to be recovered and kept consistent.
§Features
- ECC: Error Correction Codes are used to correct errors in the data.
- Compression: Data can be compressed before being written.
- Recovery: If a block is corrupt, it can be recovered if the ECC data is intact.
- Integrity: The file format has a hash of each block to ensure data integrity.
Error correction serves as both a checksum and self-healing corruption protection for the header portions of the file, and is optional for stored content. The default corrects up to 2 errors in every 251 bytes of data; enable the corresponding crate feature to change this.
This library provides a trait that handles all hashing, compression, and decompression for the implementer, so these steps are transparent to the caller.
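The default of 2 correctable errors per 251 data bytes is consistent with a Reed-Solomon code over 255-byte codewords, where 4 parity bytes protect each 251-byte chunk. The following is a minimal sketch of the resulting storage overhead under that assumption. The constant values and the `ecc_overhead` helper are illustrative and are not taken from the crate.

```rust
/// Assumed parity bytes per codeword; corrects up to ECC_LEN / 2 byte errors.
const ECC_LEN: usize = 4;
/// Assumed data bytes per 255-byte codeword.
const DATA_SIZE: usize = 255 - ECC_LEN;

/// Hypothetical helper: number of parity bytes needed to protect `data_len`
/// bytes when the data is split into chunks of at most DATA_SIZE bytes.
fn ecc_overhead(data_len: usize) -> usize {
    // Ceiling division: a partially filled final chunk still needs full parity.
    let chunks = (data_len + DATA_SIZE - 1) / DATA_SIZE;
    chunks * ECC_LEN
}
```

Under these assumptions, protecting content costs about 1.6% extra space (4 bytes per 251), which is one reason ECC on content can reasonably be left optional.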
§File Format
The file format is roughly as follows:
- Magic Number: 8 bytes, “docufort”
- Version: 2 bytes, “V1”
- ECC Length: 1 byte, the length of the ECC data used in the file.
- Block[]: A block is a set of headers and content.
- Header: A header is a timestamp and a type byte.
- Content: The content of a block.
- Hash: A hash of the block.
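The fixed file header in the layout above can be checked with a few byte comparisons. This is a minimal sketch, assuming the magic number and version are stored as plain ASCII and that the ECC length is the byte at offset 10. It does not apply any error correction to the header bytes. `HeaderError` and `parse_file_header` are hypothetical names, not part of the crate's API.

```rust
/// Assumed from the layout above: 8-byte magic number, 2-byte version,
/// 1-byte ECC length.
const MAGIC_NUMBER: &[u8; 8] = b"docufort";
const VERSION: &[u8; 2] = b"V1";
const FILE_HEADER_LEN: usize = 8 + 2 + 1;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
    BadVersion,
}

/// Hypothetical helper: validates the fixed file header and returns the
/// ECC length byte that follows the version.
fn parse_file_header(buf: &[u8]) -> Result<u8, HeaderError> {
    if buf.len() < FILE_HEADER_LEN {
        return Err(HeaderError::TooShort);
    }
    if &buf[..8] != &MAGIC_NUMBER[..] {
        return Err(HeaderError::BadMagic);
    }
    if &buf[8..10] != &VERSION[..] {
        return Err(HeaderError::BadVersion);
    }
    Ok(buf[10])
}
```

A caller would read the first `FILE_HEADER_LEN` bytes of the file, pass them to `parse_file_header`, and use the returned ECC length when decoding the blocks that follow.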
§Toolbox
This library is more of a toolbox, and requires proper wrapping to be useful. The purpose of exposing everything is to allow others to implement their own strategies per the spec. This library is sort of a reference implementation for the spec.
Modules§
- content_reader
- This module provides a helper function to find all the content written between two timestamps.
- core
- Core trait and structs for dealing with docufort format.
- ecc
- Error correction code (ECC) functions for encoding and decoding data. You shouldn’t need to use any of these functions directly.
- integrity
- This module contains the integrity check function for a docufort file.
- io_retry
- I/O Retry System
- read
- This module should follow the inverse of the write module. We always write to the file if we find errors reading system messages. Hence the Read + Write trait bounds for the RW generic that represents the docufort file.
- recovery
- This module contains functions for recovering the end of a docufort file.
- retry_writer
- MIGRATE TO THE io_retry module.
- write
- The format for a Docufort file is simple, consisting of three distinct message types. The primary point of interaction for users is the ‘Content’ message.
Structs§
Enums§
- ComponentTag
- Represents the different read components.
- CorruptDataSegment
- HeaderTag
- Represents our different block types for matching against.
- ReadWriteError
- A ReadWriterError for problems occurring during operations.
Constants§
- A_BLOCK
- Tag for an Atomic Block (b'A') with no ECC on content.
- B_BLOCK
- Tag for a Best Effort Block (b'B')
- CON_TAG
- First byte tag for the ‘Content’ message with no ECC on content.
- DATA_SIZE
- ECC_LEN
- END_TAG
- First byte tag for the ‘End Block’ message.
- FILE_HEADER_LEN
- MAGIC_NUMBER(8) + Ver(2) + ECC_LEN(1)
- HASH_AND_ECC_LEN
- HASH(20) + ECC_LEN
- HASH_LEN
- HASH(20)
- HAS_ECC
- Bit flag indicating the presence of ECC data.
- HEADER_LEN
- TYPE(1) + TS(8) + DATA(4)
- IS_COMP
- Bit flag indicating the content is compressed.
- MAGIC_NUMBER
- Magic Number for the file format: “docufort”
- MN_ECC
- MN_ECC_LEN
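The HEADER_LEN breakdown of TYPE(1) + TS(8) + DATA(4) describes a 13-byte message header. The following sketch packs and unpacks that layout. The `RawHeader` struct, its field names, the two helper functions, and the little-endian byte order are all assumptions made for illustration; the crate's own types and encoding may differ.

```rust
/// Per the constant list: TYPE(1) + TS(8) + DATA(4).
const HEADER_LEN: usize = 1 + 8 + 4;

/// Hypothetical in-memory view of one header; field names are illustrative.
#[derive(Debug, PartialEq)]
struct RawHeader {
    tag: u8,        // type byte, e.g. b'A' for an atomic block
    timestamp: u64, // 8-byte timestamp
    data: u32,      // 4-byte data field
}

/// Packs a header into HEADER_LEN bytes. Little-endian is an assumption.
fn encode_header(h: &RawHeader) -> [u8; HEADER_LEN] {
    let mut out = [0u8; HEADER_LEN];
    out[0] = h.tag;
    out[1..9].copy_from_slice(&h.timestamp.to_le_bytes());
    out[9..13].copy_from_slice(&h.data.to_le_bytes());
    out
}

/// Inverse of `encode_header`, using the same assumed byte order.
fn decode_header(buf: &[u8; HEADER_LEN]) -> RawHeader {
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&buf[1..9]);
    let mut data = [0u8; 4];
    data.copy_from_slice(&buf[9..13]);
    RawHeader {
        tag: buf[0],
        timestamp: u64::from_le_bytes(ts),
        data: u32::from_le_bytes(data),
    }
}
```

Keeping the header at a fixed length means a reader can always consume exactly HEADER_LEN bytes (plus any ECC bytes) before deciding how to interpret what follows.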