Struct netidx_archive::ArchiveWriter

pub struct ArchiveWriter { /* fields omitted */ }

This reads and writes the netidx archive format (as written by the “record” command in the tools). The archive format is intended to be a compact format for storing recordings of netidx data for long term storage and access. It uses memory mapped IO for performance and memory efficiency, so the maximum file size is limited to what fits in a usize.

Files begin with a file header, which consists of the string “netidx archive” followed by the file format version. Currently there is 1 version, and the version number is 0.

Following the header is a series of records. Every record begins with a RecordHeader, which is followed by a data item, except in the case of the end of archive record, which is not followed by a data item.

Items are written to the file using a two phase commit scheme to allow detection of possibly corrupted data. Initially, items are marked as uncommitted, and only upon a successful flush to disk are they then marked as committed.
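The two phase scheme described above can be sketched with a small in-memory model. This is only an illustration: `Log`, `Status`, and the `sync` closure are hypothetical names standing in for the memory mapped file and its flush, and are not part of the netidx_archive API.

```rust
use std::io;

/// Commit state of one record (hypothetical model, not the crate's types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Uncommitted,
    Committed,
}

/// Toy append-only log standing in for the memory mapped archive.
struct Log {
    records: Vec<(Status, Vec<u8>)>,
}

impl Log {
    fn new() -> Self {
        Log { records: Vec::new() }
    }

    /// Phase one: append the record, marked uncommitted.
    fn add(&mut self, data: &[u8]) {
        self.records.push((Status::Uncommitted, data.to_vec()));
    }

    /// Phase two: once `sync` (a stand-in for flushing to disk) succeeds,
    /// mark every pending record committed. On failure nothing is marked.
    fn flush<F>(&mut self, sync: F) -> io::Result<()>
    where
        F: FnOnce() -> io::Result<()>,
    {
        if self.records.iter().all(|(s, _)| *s == Status::Committed) {
            return Ok(()); // everything is already committed
        }
        sync()?;
        for (status, _) in self.records.iter_mut() {
            *status = Status::Committed;
        }
        Ok(())
    }

    /// Records a reader may trust: only the committed ones.
    fn committed(&self) -> Vec<&[u8]> {
        self.records
            .iter()
            .filter(|(s, _)| *s == Status::Committed)
            .map(|(_, d)| d.as_slice())
            .collect()
    }
}
```

In this model a record added with `add` stays invisible to `committed` until a `flush` whose `sync` closure returns `Ok(())`. If the sync fails, the records remain uncommitted, which is what lets a later reader detect data that may not have reached the disk intact.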

When an archive is opened read-only, an index of its contents is built in memory so that any part of it can be accessed quickly by timestamp. As a result, there is some memory overhead.

To allow the full state at any point to be reconstructed without decoding the entire file up to that point, there are two types of data records. Image records contain the entire state of every archived value at a given time. Delta records contain only the values that changed since the last delta record. The full state of the values at a given time t can be constructed by seeking to the nearest image record before t, and then processing all the delta records from there up to t.
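The reconstruction procedure can be sketched as follows. The `Batch` enum, the `state_at` function, and the use of plain `u64` path ids and `i64` values are assumptions made for illustration; the real archive stores encoded netidx values and reads them from the memory map.

```rust
use std::collections::HashMap;

/// Hypothetical decoded record: (path id, value) pairs.
#[derive(Debug, Clone)]
enum Batch {
    /// Complete state of every archived value.
    Image(Vec<(u64, i64)>),
    /// Only the values that changed.
    Delta(Vec<(u64, i64)>),
}

/// Rebuild the state at time `t` from `records`, which must be sorted by
/// timestamp (microseconds in this sketch).
fn state_at(records: &[(u64, Batch)], t: u64) -> HashMap<u64, i64> {
    // Ignore everything after t.
    let end = records.partition_point(|(ts, _)| *ts <= t);
    let upto = &records[..end];
    // Seek to the latest image at or before t. With no image at all we
    // must start from the beginning of the file.
    let start = upto
        .iter()
        .rposition(|(_, b)| matches!(b, Batch::Image(_)))
        .unwrap_or(0);
    let mut state = HashMap::new();
    for (_, batch) in &upto[start..] {
        match batch {
            Batch::Image(vals) => {
                state.clear();
                state.extend(vals.iter().copied());
            }
            Batch::Delta(vals) => state.extend(vals.iter().copied()),
        }
    }
    state
}
```

The `unwrap_or(0)` branch is the costly case: with no image record available, every delta from the start of the file has to be replayed.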

Because data sets vary in requirements and size, the writing of image records is configurable in the archiver (e.g. write 1 image per 512 MiB of deltas). It is not required to write any image records; however, without them, reconstructing the state at any point requires processing the entire file before that point.

To prevent data corruption, the underlying file is locked for exclusive access using the advisory file locking mechanism present in the OS (e.g. flock on unix). If the file is modified by a process that does not respect the advisory lock, data corruption could result.

The record header is 8 bytes. A data record starts with a LEB128 encoded item counter, followed by that number of items. Path ids are also LEB128 encoded. So, for example, in an archive containing 1 path, a batch with 1 u64 data item would look like this:

8 byte header
1 byte item count
1 byte path id
1 byte type tag
8 byte u64

19 bytes (11 bytes of overhead, about 58%)

Better overheads can be achieved with larger batches, as should naturally happen on busier systems. For example, a batch of 128 u64s looks like this:

8 byte header
2 byte item count (128 needs two LEB128 bytes)
(1 byte path id, 1 byte type tag, 8 byte u64) * 128

1290 bytes (266 bytes of overhead, about 21%)
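The size arithmetic can be checked with a short sketch. It assumes standard unsigned LEB128 (7 payload bits per byte), the 8 byte header and 1 byte type tag stated above, and that every path id in the batch encodes to the same length as the `path_id` argument. The function names are hypothetical and not part of the crate. Under these assumptions a count of 128 occupies two bytes, so the 128 item batch comes to 1290 bytes with 266 bytes of overhead.

```rust
/// Number of bytes the unsigned LEB128 encoding of `v` occupies.
fn leb128_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

const HEADER_LEN: usize = 8; // record header size stated in the text
const TYPE_TAG_LEN: usize = 1; // type tag size stated in the text
const U64_LEN: usize = 8; // payload of one u64 value

/// Returns (total bytes, overhead bytes) for a batch of `items` u64 values.
/// Assumes each path id takes as many bytes as `path_id` does.
fn batch_size(items: usize, path_id: u64) -> (usize, usize) {
    let per_item = leb128_len(path_id) + TYPE_TAG_LEN + U64_LEN;
    let total = HEADER_LEN + leb128_len(items as u64) + items * per_item;
    let payload = items * U64_LEN;
    (total, total - payload)
}
```

Calling `batch_size(1, 0)` gives `(19, 11)` and `batch_size(128, 0)` gives `(1290, 266)`. The fixed header is amortized over more items as the batch grows, while the per item path id and type tag remain.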

Implementations

Open the specified archive for read/write access. If the file does not exist then a new archive will be created.

Flush uncommitted changes to disk, mark all flushed records as committed, and update the end of archive marker. Does nothing if everything is already committed.

Allocate path ids for any of the specified paths that don’t already have one, and write a path mappings record containing the new assignments.

Add a data batch to the archive. If image is true then it will be marked as an image batch, and should contain a value for every subscribed path whether it changed or not. Otherwise it will be marked as a delta batch, and should contain only values that changed since the last delta batch. This method will fail if any of the path ids in the batch are unknown.

Batch timestamps are monotonically increasing, with a granularity of 1us. As such, one should avoid writing “spurious” batches, and generally, for efficiency and correctness, write as few batches as possible.
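One way to see why spurious batches hurt is to model how a strictly increasing microsecond timestamp could be assigned. The sketch below is an assumption about a possible mechanism, not the crate's actual implementation; `BatchClock` is a hypothetical name.

```rust
/// Hypothetical timestamp source: microseconds, strictly increasing.
struct BatchClock {
    last_us: Option<u64>,
}

impl BatchClock {
    fn new() -> Self {
        BatchClock { last_us: None }
    }

    /// Return a timestamp for a new batch given the wall clock reading
    /// `now_us`. If the clock has not advanced past the previous batch,
    /// step forward by one microsecond to stay strictly increasing.
    fn next(&mut self, now_us: u64) -> u64 {
        let ts = match self.last_us {
            Some(last) if now_us <= last => last + 1,
            _ => now_us,
        };
        self.last_us = Some(ts);
        ts
    }
}
```

In this model, several batches written within the same microsecond receive timestamps that run ahead of the wall clock, so a burst of unnecessary batches makes the recorded times less accurate as well as costing a record header each.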

Create an archive reader from this writer by creating a read-only duplicate of the memory map.

If you need lots of readers, it’s best to create just one using this method and then clone it; that way the same memory map can be shared by all the readers.

Trait Implementations

Executes the destructor for this type.
