Struct shardio::ShardWriter

pub struct ShardWriter<T, S = DefaultSort> where
    T: 'static + Send + Serialize,
    S: SortKey<T>,
    <S as SortKey<T>>::Key: 'static + Send + Ord + Serialize + Clone
{ /* fields omitted */ }

Write a stream of data items of type T to disk, in the sort order defined by S.

Data is buffered up to item_buffer_size items, then sorted, block-compressed, and written to disk. When the ShardWriter is dropped or has finish() called on it, it flushes the remaining items to disk, writes an index of the chunk data, and closes the file.
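
The buffer, sort, and chunk cycle described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: ToyWriter and its methods are hypothetical names, not shardio's implementation, compression and file I/O are left out, and the "index" is reduced to the first and last key of each chunk.

```rust
/// Toy model (hypothetical, not shardio's code): collect up to
/// `item_buffer_size` items, sort them, then cut the sorted run into
/// chunks of at most `disk_chunk_size` items.
struct ToyWriter {
    item_buffer_size: usize,
    disk_chunk_size: usize,
    buffer: Vec<u32>,
    chunks: Vec<Vec<u32>>, // stands in for the chunks written to disk
}

impl ToyWriter {
    fn new(item_buffer_size: usize, disk_chunk_size: usize) -> Self {
        assert!(item_buffer_size > 0 && disk_chunk_size > 0);
        ToyWriter {
            item_buffer_size,
            disk_chunk_size,
            buffer: Vec::new(),
            chunks: Vec::new(),
        }
    }

    /// Buffer one item; flush once the buffer is full.
    fn push(&mut self, item: u32) {
        self.buffer.push(item);
        if self.buffer.len() >= self.item_buffer_size {
            self.flush();
        }
    }

    /// Sort the buffered items and emit them as chunks.
    fn flush(&mut self) {
        self.buffer.sort();
        for chunk in self.buffer.chunks(self.disk_chunk_size) {
            self.chunks.push(chunk.to_vec());
        }
        self.buffer.clear();
    }

    /// Flush the remainder and return (first key, last key) per chunk,
    /// a simplified stand-in for the index written at the end of the file.
    fn finish(mut self) -> Vec<(u32, u32)> {
        self.flush();
        self.chunks.iter().map(|c| (c[0], c[c.len() - 1])).collect()
    }
}
```

In this model each flush produces its own sorted run, so key ranges of chunks from different flushes can overlap. A larger item_buffer_size gives fewer, longer runs.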

The get_sender() method returns a ShardSender that must be used to send items to the writer. You must close each ShardSender by dropping it or calling its finish() method, or data may be lost. Every ShardSender must be dropped or finished before calling ShardWriter::finish or dropping the ShardWriter.
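
The reason senders have to be closed first can be illustrated by analogy with the standard library's channels, where the receiving side only learns that the stream is complete once every sender handle has been dropped. The sketch below uses std::sync::mpsc only. It is not shardio's internal mechanism, and collect_from_producers is a hypothetical helper.

```rust
use std::sync::mpsc;
use std::thread;

/// Collect the items sent by `n_producers` threads, each holding its own
/// clone of the sender. Analogy only: uses std channels, not shardio.
fn collect_from_producers(n_producers: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel::<u32>();

    let handles: Vec<_> = (0..n_producers)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..3 {
                    tx.send(id * 10 + i).expect("receiver is alive");
                }
                // `tx` is dropped here, signalling that this producer is done.
            })
        })
        .collect();

    // Drop the original handle too; otherwise the loop below never ends.
    drop(tx);

    // Iteration stops only after all sender handles have been dropped.
    let mut received: Vec<u32> = rx.iter().collect();
    for h in handles {
        h.join().expect("producer panicked");
    }
    received.sort();
    received
}
```

With two producers, the function returns the six values 0, 1, 2, 10, 11, 12 once both threads have dropped their handles.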

Sorting

Items are sorted according to the Ord implementation of type S::Key. Type S, which implements the SortKey trait, maps items of type T to their sort key of type S::Key. By default the sort key is the data item itself, and the DefaultSort implementation of SortKey is the identity function.
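
To make the role of S concrete, here is a minimal sketch of the pattern using a simplified stand-in trait. KeyOf, Identity, ByPos, Read and sort_by_key_type are hypothetical names introduced for illustration. The real SortKey trait's method signature may differ between shardio versions, so check the crate documentation before implementing it.

```rust
/// Hypothetical stand-in for a sort-key trait: maps an item to its key.
trait KeyOf<T> {
    type Key: Ord;
    fn key(item: &T) -> Self::Key;
}

/// Identity key: the item itself is the key (mirrors the DefaultSort idea).
struct Identity;

impl<T: Ord + Clone> KeyOf<T> for Identity {
    type Key = T;
    fn key(item: &T) -> T {
        item.clone()
    }
}

/// Example item type; the derived Ord compares `name` first, then `pos`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Read {
    name: String,
    pos: u32,
}

/// Custom key: order reads by position only.
struct ByPos;

impl KeyOf<Read> for ByPos {
    type Key = u32;
    fn key(item: &Read) -> u32 {
        item.pos
    }
}

/// Sort items by the key that the type parameter `S` extracts.
fn sort_by_key_type<T, S: KeyOf<T>>(items: &mut [T]) {
    items.sort_by_key(|item| S::key(item));
}
```

Calling sort_by_key_type::<Read, ByPos> orders reads by position, while sort_by_key_type::<Read, Identity> falls back to the item's own Ord implementation. The sort order is chosen by a type parameter and not by a runtime argument, which is the same design the S parameter of ShardWriter expresses.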

Implementations

impl<T, S> ShardWriter<T, S> where
    T: 'static + Send + Serialize,
    S: SortKey<T>,
    <S as SortKey<T>>::Key: 'static + Send + Ord + Serialize + Clone

pub fn new<P: AsRef<Path>>(
    path: P,
    sender_buffer_size: usize,
    disk_chunk_size: usize,
    item_buffer_size: usize
) -> Result<ShardWriter<T, S>, Error>

Create a writer for storing data items of type T.

Arguments

  • path - Path to newly created output file
  • sender_buffer_size - Number of items to buffer on the sending thread before transferring data to the writer. Each transfer to the writer requires one channel send and one allocation. Set to ~16 or 32 if you’re sending items very rapidly (>100k/s).
  • disk_chunk_size - Number of items to store in each chunk on disk. Controls the tradeoff between indexing overhead and the granularity of reads into the sorted dataset. When reading, shardio must iterate from the start of a chunk to access an item.
  • item_buffer_size - Number of items to buffer before sorting, chunking and writing items to disk. More buffering causes each chunk to cover a smaller interval of key space (allowing for more efficient reading), but requires more memory.
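
The disk_chunk_size tradeoff can be illustrated with a toy lookup over a single sorted run. This is a sketch under stated assumptions, not shardio's reader: lookup_cost is a hypothetical function, keys are assumed distinct, disk_chunk_size is assumed to be greater than zero, and the index holds only the first key of each chunk.

```rust
/// Toy lookup over sorted data split into chunks of `disk_chunk_size` items.
/// Returns (number of index entries, items scanned to reach `target`),
/// or None if `target` is not present. Hypothetical model, not shardio code.
fn lookup_cost(sorted: &[u32], disk_chunk_size: usize, target: u32) -> Option<(usize, usize)> {
    // One index entry per chunk: the first key in that chunk.
    let index: Vec<u32> = sorted.chunks(disk_chunk_size).map(|c| c[0]).collect();

    // Last chunk whose first key is <= target.
    let chunk_no = index
        .partition_point(|&first| first <= target)
        .checked_sub(1)?;
    let chunk = sorted.chunks(disk_chunk_size).nth(chunk_no)?;

    // A reader has to walk the chunk from its start to reach the item.
    let offset = chunk.iter().position(|&k| k == target)?;
    Some((index.len(), offset + 1))
}
```

For the keys 0..100 and target 49, a chunk size of 10 needs 10 index entries and scans 10 items, while a chunk size of 50 needs only 2 index entries but scans 50 items. Smaller chunks mean a larger index and finer-grained reads.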

pub fn get_sender(&self) -> ShardSender<T, S>

Get a ShardSender. It can be sent to another thread that is generating data.

pub fn finish(&mut self) -> Result<usize, Error>

Call finish() explicitly if you want to detect I/O errors in the writer; a writer that is simply dropped has no way to return them.

Trait Implementations

impl<T, S> Drop for ShardWriter<T, S> where
    S: SortKey<T>,
    <S as SortKey<T>>::Key: 'static + Send + Ord + Serialize + Clone,
    T: Send + Serialize

Auto Trait Implementations

impl<T, S> RefUnwindSafe for ShardWriter<T, S> where
    S: RefUnwindSafe

impl<T, S> Send for ShardWriter<T, S> where
    S: Send

impl<T, S> Sync for ShardWriter<T, S> where
    S: Sync

impl<T, S> Unpin for ShardWriter<T, S> where
    S: Unpin

impl<T, S> UnwindSafe for ShardWriter<T, S> where
    S: UnwindSafe

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>, 

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.