Struct netidx_archive::ArchiveWriter

pub struct ArchiveWriter { /* fields omitted */ }

This reads and writes the netidx archive format (as written by the “record” command in the tools). The archive format is intended to be a compact format for long-term storage of, and access to, recordings of netidx data. It uses memory-mapped IO for performance and memory efficiency, and as such the file size is limited to usize::MAX bytes.

Files begin with a file header, which consists of the string “netidx archive” followed by the file format version. Currently there is 1 version, and the version number is 0.

Following the header is a series of records. Every record begins with a RecordHeader, which is followed by a data item, except for the end-of-archive record, which is not followed by a data item.

Items are written to the file using a two phase commit scheme to allow detection of possibly corrupted data. Initially, items are marked as uncommitted, and only upon a successful flush to disk are they then marked as committed.
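The commit scheme described above can be illustrated with a minimal in-memory sketch. Everything in it is an assumption made for illustration: the one-byte flag, the one-byte length prefix, and the function names are hypothetical and do not reflect the actual RecordHeader layout or the crate's API. The reader here simply stops at the first uncommitted record, which is one possible policy.

```rust
/// Hypothetical flag values; the real RecordHeader layout is not shown here.
const UNCOMMITTED: u8 = 0;
const COMMITTED: u8 = 1;

/// Phase one: append a record as [flag, len, payload...] marked uncommitted.
/// Returns the offset of the flag byte so it can be updated later.
fn append_uncommitted(buf: &mut Vec<u8>, payload: &[u8]) -> usize {
    assert!(payload.len() < 256, "sketch only handles short payloads");
    let pos = buf.len();
    buf.push(UNCOMMITTED);
    buf.push(payload.len() as u8);
    buf.extend_from_slice(payload);
    pos
}

/// Phase two: flip the flag. A real writer would do this only after the
/// payload has been successfully flushed to disk.
fn commit(buf: &mut [u8], pos: usize) {
    buf[pos] = COMMITTED;
}

/// Return the payloads of committed records, stopping at the first record
/// that is uncommitted or truncated.
fn read_committed(buf: &[u8]) -> Vec<&[u8]> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos + 2 <= buf.len() {
        let len = buf[pos + 1] as usize;
        let end = pos + 2 + len;
        if buf[pos] != COMMITTED || end > buf.len() {
            break;
        }
        out.push(&buf[pos + 2..end]);
        pos = end;
    }
    out
}
```

With this sketch, a record that was appended but never committed (for example because the process died before the flush completed) is not returned by `read_committed`.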

When an archive is opened read-only, an index of its contents is built in memory so that any part of it can be accessed quickly by timestamp. As a result, there is some memory overhead.

To allow the state at any point to be reconstructed without decoding the entire file up to that point, there are two types of data records. Image records contain the entire state of every archived value at a given time, and delta records contain only the values that changed since the last delta record. The full state of the values at a given time t can be constructed by seeking to the nearest image record that is before t, and then processing all the delta records from there up to t.
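The reconstruction procedure can be sketched as follows. This is a minimal illustration and not the crate's implementation: the `Record` enum, the `u64` timestamps, the `u32` path ids, and `state_at` are all hypothetical stand-ins for already-decoded records. The sketch also treats a record stamped exactly at t as visible, which is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a decoded data record.
#[derive(Debug, Clone)]
enum Record {
    /// Full state of every archived value at `ts`.
    Image { ts: u64, values: HashMap<u32, u64> },
    /// Only the values that changed at `ts`.
    Delta { ts: u64, changes: Vec<(u32, u64)> },
}

impl Record {
    fn ts(&self) -> u64 {
        match self {
            Record::Image { ts, .. } | Record::Delta { ts, .. } => *ts,
        }
    }
}

/// Rebuild the state at time `t` from records sorted by timestamp.
fn state_at(records: &[Record], t: u64) -> HashMap<u32, u64> {
    // Only records at or before `t` contribute to the result.
    let end = records.partition_point(|r| r.ts() <= t);
    let visible = &records[..end];
    // Seek to the latest image in that range. Without any image we have to
    // start from the beginning and replay every delta.
    let start = visible
        .iter()
        .rposition(|r| matches!(r, Record::Image { .. }))
        .unwrap_or(0);
    let mut state = HashMap::new();
    for r in &visible[start..] {
        match r {
            Record::Image { values, .. } => state = values.clone(),
            Record::Delta { changes, .. } => {
                for (id, v) in changes {
                    state.insert(*id, *v);
                }
            }
        }
    }
    state
}
```

The fallback to index 0 shows why an archive without image records is more expensive to query: every delta before t has to be replayed.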

Because data sets vary in requirements and size, the writing of image records is configurable in the archiver (e.g. write 1 image per 512 MiB of deltas). It is not required to write any image records; however, without them, reconstructing the state at any point requires processing the entire file before that point.

To prevent data corruption, the underlying file is locked for exclusive access using the advisory file locking mechanism present in the OS (e.g. flock on unix). If the file is modified independently of advisory locking, data corruption could result.

The record header is 8 bytes. A data record starts with a LEB128 encoded item counter, followed by that number of items. Path ids are also LEB128 encoded. So, for example, in an archive containing 1 path, a batch with 1 u64 data item would look like:

8 byte header
1 byte item count
1 byte path id
1 byte type tag
8 byte u64

19 bytes (11 bytes of overhead 57%)

Lower overhead can be achieved with larger batches, as should naturally happen on busier systems. For example, a batch of 128 u64s looks like:

8 byte header
1 byte item count
(1 byte path id
1 byte type tag
8 byte u64) * 128

1289 bytes (265 bytes of overhead 20%)
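The size arithmetic can be checked with a short sketch. It uses the field widths quoted in the layouts above as fixed constants; the constant and function names are hypothetical and are not part of the crate. `leb128_len` is included only to show how many bytes a standard unsigned LEB128 value occupies.

```rust
// Field widths taken from the layouts above (all in bytes).
const HEADER: usize = 8;
const ITEM_COUNT: usize = 1;
const PATH_ID: usize = 1;
const TYPE_TAG: usize = 1;
const U64_PAYLOAD: usize = 8;

/// Returns (total bytes, overhead bytes) for a batch of `items` u64 values.
fn batch_bytes(items: usize) -> (usize, usize) {
    let total = HEADER + ITEM_COUNT + items * (PATH_ID + TYPE_TAG + U64_PAYLOAD);
    // Overhead is everything that is not the u64 payload itself.
    let overhead = total - items * U64_PAYLOAD;
    (total, overhead)
}

/// Overhead as a percentage of the total batch size.
fn overhead_pct(items: usize) -> f64 {
    let (total, overhead) = batch_bytes(items);
    100.0 * overhead as f64 / total as f64
}

/// Bytes needed to encode `n` as standard unsigned LEB128
/// (7 payload bits per byte).
fn leb128_len(mut n: u64) -> usize {
    let mut len = 1;
    while n >= 0x80 {
        n >>= 7;
        len += 1;
    }
    len
}
```

Under these assumptions `batch_bytes(1)` gives 19 bytes total with 11 bytes of overhead, and `batch_bytes(128)` gives 1289 bytes total with 265 bytes of overhead. Note that in standard unsigned LEB128 a value of 128 occupies two bytes (`leb128_len(128)` is 2), so if the item counter follows that encoding, the real size of a 128 item batch may be one byte larger than the fixed-width figure.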

Implementations

impl ArchiveWriter

pub fn open(path: impl AsRef<FilePath>) -> Result<Self>

pub fn flush(&mut self) -> Result<()>

pub fn add_paths<'a>( &'a mut self, paths: impl IntoIterator<Item = &'a Path>) -> Result<()>

pub fn add_batch( &mut self, image: bool, timestamp: Timestamp, batch: &Pooled<Vec<BatchItem>>) -> Result<()>

pub fn id_for_path(&self, path: &Path) -> Option<Id>

pub fn path_for_id(&self, id: &Id) -> Option<&Path>

pub fn capacity(&self) -> usize

pub fn len(&self) -> usize

pub fn block_size(&self) -> usize

pub fn reader(&self) -> Result<ArchiveReader>

Trait Implementations

impl Drop for ArchiveWriter

fn drop(&mut self)

Auto Trait Implementations

impl RefUnwindSafe for ArchiveWriter

impl Send for ArchiveWriter

impl Sync for ArchiveWriter

impl Unpin for ArchiveWriter

impl UnwindSafe for ArchiveWriter

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized,

pub fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

pub fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

pub fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

pub fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

pub fn into(self) -> U

impl<T> Pointable for T

pub const ALIGN: usize

type Init = T

pub unsafe fn init(init: <T as Pointable>::Init) -> usize

pub unsafe fn deref<'a>(ptr: usize) -> &'a T

pub unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

pub unsafe fn drop(ptr: usize)

impl<T> Same<T> for T

type Output = T

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<V, T> VZip<V> for T where V: MultiLane<T>,

pub fn vzip(self) -> V
