Struct netidx_protocols::archive::ArchiveWriter
This reads and writes the netidx archive format (as written by the
"record" command in the tools). The archive format is intended to
be a compact format for storing recordings of netidx data for
long-term storage and access. It uses memory-mapped IO for
performance and memory efficiency, and as such file size is limited
to usize.
Files begin with a file header, which consists of the string "netidx archive" followed by the file format version. Currently there is 1 version, and the version number is 0.
Following the header is a series of records. Every record begins with a RecordHeader, which is followed by a data item, except in the case of the end of archive record, which is not followed by a data item.
Items are written to the file using a two phase commit scheme to allow detection of possibly corrupted data. Initially, items are marked as uncommitted, and only upon a successful flush to disk are they then marked as committed.
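The commit scheme described above can be illustrated with a toy in-memory log. This is only a sketch: the `ToyLog` type, the one-byte flag, and the one-byte length prefix are hypothetical and are not the archive's actual record layout, and a plain `Vec<u8>` stands in for the memory-mapped file.

```rust
const UNCOMMITTED: u8 = 0;
const COMMITTED: u8 = 1;

/// Toy log (hypothetical layout): each record is [flag: 1 byte][len: 1 byte][payload].
struct ToyLog {
    buf: Vec<u8>,
    pending: Vec<usize>, // offsets of flag bytes that are not yet committed
}

impl ToyLog {
    fn new() -> Self {
        ToyLog { buf: Vec::new(), pending: Vec::new() }
    }

    /// Phase 1: append the record, marked as uncommitted.
    fn append(&mut self, payload: &[u8]) {
        assert!(payload.len() <= u8::MAX as usize);
        self.pending.push(self.buf.len());
        self.buf.push(UNCOMMITTED);
        self.buf.push(payload.len() as u8);
        self.buf.extend_from_slice(payload);
    }

    /// Phase 2: once the data is durable (a real writer would flush to disk
    /// at this point), mark every pending record as committed.
    fn flush(&mut self) {
        for off in self.pending.drain(..) {
            self.buf[off] = COMMITTED;
        }
    }

    /// Payloads of committed records, stopping at the first record that is
    /// uncommitted or truncated.
    fn committed(&self) -> Vec<&[u8]> {
        let mut out = Vec::new();
        let mut pos = 0;
        while pos + 2 <= self.buf.len() {
            let len = self.buf[pos + 1] as usize;
            if self.buf[pos] != COMMITTED || pos + 2 + len > self.buf.len() {
                break;
            }
            out.push(&self.buf[pos + 2..pos + 2 + len]);
            pos += 2 + len;
        }
        out
    }
}

fn main() {
    let mut log = ToyLog::new();
    log.append(b"first");
    println!("before flush: {} committed", log.committed().len());
    log.flush();
    println!("after flush: {} committed", log.committed().len());
}
```

A reader that trusts only committed records never sees an item whose write may have been interrupted, which is the property the two phase scheme is after.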
When an archive is opened read-only, an index of its contents is built in memory so that any part of it can be accessed quickly by timestamp. As a result, there is some memory overhead.
In order to facilitate full reconstruction of the state at any
point without having to decode the entire file up to that point,
there are two types of data records. Image records contain the
entire state of every archived value at a given time, and delta
records contain only values that changed since the last delta
record. The full state of the values can be constructed at a given
time t by seeking to the nearest image record that is before t, and
then processing all the delta records up to t.
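The reconstruction procedure can be sketched as follows. The `Record` struct, its integer timestamps, and the `(path id, value)` items are hypothetical stand-ins for decoded archive records, not the crate's types, and the sketch treats "before t" as "at or before t".

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a decoded data record.
#[derive(Debug)]
struct Record {
    timestamp: u64,         // illustrative unit, e.g. microseconds
    image: bool,            // true for an image record, false for a delta
    items: Vec<(u32, i64)>, // (path id, value)
}

/// Rebuild the state at time `t` from records sorted by timestamp.
fn state_at(records: &[Record], t: u64) -> HashMap<u32, i64> {
    // Only records at or before t are relevant.
    let end = records.partition_point(|r| r.timestamp <= t);
    let upto = &records[..end];
    // Start at the nearest image at or before t. With no image available,
    // every record from the beginning of the file must be processed.
    let start = upto.iter().rposition(|r| r.image).unwrap_or(0);
    let mut state = HashMap::new();
    for r in &upto[start..] {
        for &(id, value) in &r.items {
            state.insert(id, value);
        }
    }
    state
}

fn main() {
    let records = vec![
        Record { timestamp: 10, image: true, items: vec![(1, 100), (2, 200)] },
        Record { timestamp: 20, image: false, items: vec![(1, 101)] },
        Record { timestamp: 30, image: true, items: vec![(1, 101), (2, 200)] },
        Record { timestamp: 40, image: false, items: vec![(2, 206)] },
    ];
    // Only the image at 30 and the delta at 40 are applied here.
    println!("{:?}", state_at(&records, 45).get(&2));
}
```

The `unwrap_or(0)` branch is the case where no image has been written, in which everything before t has to be replayed.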
Because data sets vary in requirements and size, the writing of image records is configurable in the archiver (e.g. write 1 image per 512 MiB of deltas). It is not required to write any image records; however, without them, reconstructing the state at any point requires processing the entire file before that point.
To prevent data corruption, the underlying file is locked for exclusive access using the advisory file locking mechanism present in the OS (e.g. flock on unix). If the file is modified independently of advisory locking, it could cause data corruption.
The record header is 8 bytes. A data record starts with a LEB128 encoded item counter, and then a number of items. Path ids are also LEB128 encoded. So, for example, in an archive containing 1 path, a batch with 1 u64 data item would look like:
8 byte header
1 byte item count
1 byte path id
1 byte type tag
8 byte u64
19 bytes total (11 bytes of overhead, 57%)
Better overheads can be achieved with larger batches, as should naturally happen on busier systems. For example, a batch of 128 u64s looks like:
8 byte header
1 byte item count
(1 byte path id, 1 byte type tag, 8 byte u64) * 128
1289 bytes total (265 bytes of overhead, 20%)
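The size arithmetic above can be checked with a small estimator. This is a sketch under assumptions: the helper names are made up for illustration, the item count and path ids are taken to be standard unsigned LEB128 (7 payload bits per byte), and every item in the batch is a u64 with the same path id.

```rust
/// Bytes needed to encode `n` as unsigned LEB128 (7 payload bits per byte).
fn leb128_len(mut n: u64) -> usize {
    let mut len = 1;
    while n >= 0x80 {
        n >>= 7;
        len += 1;
    }
    len
}

const RECORD_HEADER_LEN: usize = 8; // record header size given in the text
const TYPE_TAG_LEN: usize = 1; // type tag size given in the text
const U64_LEN: usize = 8;

/// Estimated (total bytes, overhead bytes) for a batch of `count` u64 items
/// that all use `path_id`.
fn u64_batch_size(count: u64, path_id: u64) -> (usize, usize) {
    let payload = count as usize * U64_LEN;
    let per_item_overhead = leb128_len(path_id) + TYPE_TAG_LEN;
    let overhead =
        RECORD_HEADER_LEN + leb128_len(count) + count as usize * per_item_overhead;
    (payload + overhead, overhead)
}

fn main() {
    for count in [1u64, 128, 1024] {
        let (total, overhead) = u64_batch_size(count, 0);
        let pct = 100.0 * overhead as f64 / total as f64;
        println!("{} items: {} bytes, {} overhead ({:.1}%)", count, total, overhead, pct);
    }
}
```

For a single item this reproduces the 19 byte figure. For 128 items the sketch reports 1290 bytes rather than 1289, because in standard unsigned LEB128 the value 128 no longer fits in one byte; whether the archive encodes its counter exactly this way is an assumption here.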
Implementations
impl ArchiveWriter
pub fn open(path: impl AsRef<FilePath>) -> Result<Self>
Open the specified archive for read/write access. If the file does not exist then a new archive will be created.
pub fn flush(&mut self) -> Result<()>
Flush uncommitted changes to disk, mark all flushed records as committed, and update the end of archive marker. Does nothing if everything is already committed.
pub fn add_paths<'a>(
&'a mut self,
paths: impl IntoIterator<Item = &'a Path>
) -> Result<()>
Allocate path ids for any of the specified paths that don't already have one, and write a path mappings record containing the new assignments.
pub fn add_batch(
&mut self,
image: bool,
timestamp: Timestamp,
batch: &Pooled<Vec<BatchItem>>
) -> Result<()>
Add a data batch to the archive. If image is true then it
will be marked as an image batch, and should contain a value
for every subscribed path whether it changed or not. Otherwise
it will be marked as a delta batch, and should contain only
values that changed since the last delta batch. This method
will fail if any of the path ids in the batch are unknown.
Batch timestamps are monotonically increasing, with a granularity of 1 us. As such, one should avoid writing "spurious" batches, and generally for efficiency and correctness write as few batches as possible.
pub fn id_for_path(&self, path: &Path) -> Option<Id>
pub fn path_for_id(&self, id: &Id) -> Option<&Path>
pub fn capacity(&self) -> usize
pub fn len(&self) -> usize
pub fn block_size(&self) -> usize
pub fn reader(&self) -> Result<ArchiveReader>
Create an archive reader from this writer by creating a read-only duplicate of the memory map.
If you need lots of readers, it's best to create just one using this method and then clone it. That way the same memory map can be shared by all the readers.
Trait Implementations
impl Drop for ArchiveWriter
Auto Trait Implementations
impl RefUnwindSafe for ArchiveWriter
impl Send for ArchiveWriter
impl Sync for ArchiveWriter
impl Unpin for ArchiveWriter
impl UnwindSafe for ArchiveWriter
Blanket Implementations
impl<T> Any for T where
T: 'static + ?Sized,
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> BorrowMut<T> for T where
T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
impl<T> From<T> for T
impl<T, U> Into<U> for T where
U: From<T>,
impl<T> Pointable for T
pub const ALIGN: usize
type Init = T
The type for initializers.
pub unsafe fn init(init: <T as Pointable>::Init) -> usize
pub unsafe fn deref<'a>(ptr: usize) -> &'a T
pub unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T
pub unsafe fn drop(ptr: usize)
impl<T> Same<T> for T
type Output = T
Should always be Self
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
impl<V, T> VZip<V> for T where
V: MultiLane<T>,