Struct netidx_protocols::archive::ArchiveWriter

pub struct ArchiveWriter { /* fields omitted */ }

This type reads and writes the netidx archive format (as written by the "record" command in the tools). The archive format is intended to be a compact format for storing recordings of netidx data for long term storage and access. It uses memory mapped IO for performance and memory efficiency; as a consequence, the file size is limited to what a usize can address.

Files begin with a file header, which consists of the string "netidx archive" followed by the file format version. There is currently one version, numbered 0.

Following the header is a series of records. Every record begins with a RecordHeader, which is followed by a data item, except in the case of the end of archive record, which is not followed by a data item.

Items are written to the file using a two-phase commit scheme to allow detection of possibly corrupted data. Initially, items are marked as uncommitted; only upon a successful flush to disk are they marked as committed.
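The exact on-disk encoding of the commit flag is not shown here, but the general pattern looks roughly like the following toy sketch (an illustration of the technique, not the actual netidx record layout; the single status byte and file handling are assumptions):

use std::fs::OpenOptions;
use std::io::{Seek, SeekFrom, Write};

// A toy two-phase commit: write the record marked uncommitted (status 0),
// flush the payload to disk, then flip the status byte to 1 and flush
// again. A crash before the second flush leaves the record visibly
// uncommitted rather than silently corrupt.
fn append_record(path: &str, payload: &[u8]) -> std::io::Result<()> {
    let mut f = OpenOptions::new().create(true).append(true).open(path)?;
    let status_pos = f.seek(SeekFrom::End(0))?; // where the status byte lands
    f.write_all(&[0u8])?;                       // phase 1: uncommitted marker
    f.write_all(payload)?;
    f.sync_all()?;                              // payload is now durable
    // reopen without append so the seek below controls where we write
    let mut f = OpenOptions::new().write(true).open(path)?;
    f.seek(SeekFrom::Start(status_pos))?;
    f.write_all(&[1u8])?;                       // phase 2: mark committed
    f.sync_all()?;
    Ok(())
}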

When an archive is opened read-only, an index of its contents is built in memory so that any part of it can be accessed quickly by timestamp. As a result, there is some memory overhead.

In order to facilitate full reconstruction of the state at any point, without requiring the entire file up to that point to be decoded, there are two types of data records: image records, which contain the entire state of every archived value at a given time, and delta records, which contain only the values that changed since the last delta record. The full state of the values at a given time t can be constructed by seeking to the nearest image record before t, and then processing all the delta records up to t, as sketched below.
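For example, here is a minimal in-memory model of that playback algorithm (illustrative only; the Record type and value representation are stand-ins, not this crate's reader API):

use std::collections::HashMap;

type PathId = u64;

// A stand-in for the archive's two kinds of data records.
enum Record {
    Image { ts: u64, state: HashMap<PathId, i64> },
    Delta { ts: u64, changes: Vec<(PathId, i64)> },
}

// Reconstruct the full state at time `t`: start from the last image at or
// before `t` (or from the beginning, if no image exists), then apply every
// delta up to `t` in order.
fn state_at(records: &[Record], t: u64) -> HashMap<PathId, i64> {
    let start = records
        .iter()
        .rposition(|r| matches!(r, Record::Image { ts, .. } if *ts <= t))
        .unwrap_or(0);
    let mut state = HashMap::new();
    for r in &records[start..] {
        match r {
            Record::Image { ts, state: s } if *ts <= t => state = s.clone(),
            Record::Delta { ts, changes } if *ts <= t => {
                state.extend(changes.iter().copied())
            }
            _ => break, // records are timestamp ordered; we are past t
        }
    }
    state
}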

Because data sets vary in requirements and size, the writing of image records is configurable in the archiver (e.g. write one image per 512 MiB of deltas). It is not required to write any image records; however, without them, reconstructing the state at any point will require processing the entire file before that point.

To prevent data corruption the underlying file is locked for exclusive access using the advisory file locking mechanism of the OS (e.g. flock on unix). If the file is modified without respecting the advisory lock, data corruption could result.
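The locking happens inside the library; the sketch below only illustrates the mechanism itself, using the third-party fs2 crate (an assumption made for illustration, not necessarily what netidx uses internally):

use fs2::FileExt; // fs2 = "0.4": a common wrapper over flock/LockFileEx
use std::fs::OpenOptions;

fn open_locked(path: &str) -> std::io::Result<std::fs::File> {
    let f = OpenOptions::new().read(true).write(true).create(true).open(path)?;
    // Advisory exclusive lock: a second cooperating process gets an error
    // here instead of scribbling over the archive. Processes that ignore
    // advisory locks are not stopped, which is the caveat noted above.
    f.try_lock_exclusive()?;
    Ok(f)
}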

The record header is 8 bytes. A data record starts with a LEB128 encoded item count, followed by that many items. Path ids are also LEB128 encoded. So, for example, in an archive containing one path, a batch with one u64 data item would look like this:

8 byte header
1 byte item count
1 byte path id
1 byte type tag
8 byte u64
----------------
19 bytes (11 bytes of overhead, ~58%)

Better overheads can be achieved with larger batches, as should naturally happen on busier systems. For example, a batch of 128 u64s looks like this:

8 byte header
2 byte item count (LEB128 takes two bytes for values >= 128)
(1 byte path id
 1 byte type tag
 8 byte u64) * 128
------------------
1290 bytes (266 bytes of overhead, ~21%)
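These figures are easy to check mechanically. A small self-contained sketch (leb128_len and batch_size are our own helpers, not part of the crate, and path ids are assumed to fit in one LEB128 byte):

// Mechanical check of the overhead figures above.
fn leb128_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

fn batch_size(items: u64) -> usize {
    // 8 byte record header + LEB128 item count
    // + per item: 1 byte path id + 1 byte type tag + 8 byte u64
    8 + leb128_len(items) + items as usize * (1 + 1 + 8)
}

fn main() {
    assert_eq!(batch_size(1), 19);     // 11 bytes of overhead, ~58%
    assert_eq!(batch_size(128), 1290); // 266 bytes of overhead, ~21%
}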

Implementations

impl ArchiveWriter

pub fn open(path: impl AsRef<FilePath>) -> Result<Self>

Open the specified archive for read/write access; if the file does not exist, a new archive will be created.

pub fn flush(&mut self) -> Result<()>

Flush uncommitted changes to disk, mark all flushed records as committed, and update the end of archive marker. Does nothing if everything is already committed.

pub fn add_paths<'a>(
    &'a mut self,
    paths: impl IntoIterator<Item = &'a Path>
) -> Result<()>

Allocate path ids for any of the specified paths that don't already have one, and write a path mappings record containing the new assignments.

pub fn add_batch(
    &mut self,
    image: bool,
    timestamp: Timestamp,
    batch: &Pooled<Vec<BatchItem>>
) -> Result<()>

Add a data batch to the archive. If image is true then the batch will be marked as an image batch, and should contain a value for every subscribed path whether it changed or not; otherwise it will be marked as a delta batch, and should contain only values that changed since the last delta batch. This method will fail if any of the path ids in the batch are unknown.

Batch timestamps are monotonically increasing, with a granularity of 1µs. As such, one should avoid writing "spurious" batches, and generally, for efficiency and correctness, write as few batches as possible. A typical write sequence is sketched below.
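Putting open, add_paths, add_batch, and flush together (a hypothetical sketch: mk_timestamp and mk_batch stand in for whatever produces a Timestamp and a Pooled<Vec<BatchItem>> in a real recorder, since their constructors are not shown on this page):

use anyhow::Result;
use netidx::path::Path;
use netidx_protocols::archive::ArchiveWriter;

fn record_once() -> Result<()> {
    // opens the archive, creating it if it does not exist
    let mut w = ArchiveWriter::open("example.archive")?;
    let path = Path::from("/solar/battery/voltage");
    // assign an id to any new path and persist the mapping
    w.add_paths(std::iter::once(&path))?;
    let (ts, batch) = (mk_timestamp(), mk_batch(&w, &path)); // hypothetical helpers
    // false: a delta batch containing only changed values
    w.add_batch(false, ts, &batch)?;
    // records are only marked committed once this flush succeeds
    w.flush()?;
    Ok(())
}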

pub fn id_for_path(&self, path: &Path) -> Option<Id>

pub fn path_for_id(&self, id: &Id) -> Option<&Path>

pub fn capacity(&self) -> usize

pub fn len(&self) -> usize

pub fn block_size(&self) -> usize

pub fn reader(&self) -> Result<ArchiveReader>

Create an archive reader from this writer by creating a read-only duplicate of the memory map.

If you need lots of readers it's best to create just one using this method and then clone it; that way the same memory map can be shared by all the readers.
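For example (a sketch, assuming an already open writer and that ArchiveReader implements Clone, as the advice above implies):

use anyhow::Result;
use netidx_protocols::archive::{ArchiveReader, ArchiveWriter};

fn make_readers(w: &ArchiveWriter) -> Result<Vec<ArchiveReader>> {
    let reader = w.reader()?; // duplicate the memory map once
    // clones share that map, so they are cheap
    Ok((0..4).map(|_| reader.clone()).collect())
}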

Trait Implementations

impl Drop for ArchiveWriter

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>, 

impl<T> Pointable for T

type Init = T

The type for initializers.

impl<T> Same<T> for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<V, T> VZip<V> for T where
    V: MultiLane<T>,