Struct ShardedWriter

pub struct ShardedWriter<FKey, FNameFile>
where FNameFile: Fn(&str, usize) -> String,
{ /* private fields */ }

Implementations

impl<FKey, FNameFile> ShardedWriter<FKey, FNameFile>
where FKey: Fn(&StringRecord) -> String, FNameFile: Fn(&str, usize) -> String,

ShardedWriter::new

Creates a new writer.

You must specify the directory into which the output will be written, a function that extracts the shard key from a csv StringRecord, and a function that names the output files. The file naming function receives the shard key and a zero-based number indicating how many files have been created for this shard.

This function returns an error if the output directory can’t be created.

let writer = ShardedWriter::new(
    "./foo-sharded/",
    |record| record.get(7).unwrap_or("_unknown").to_string(),
    |shard, seq| format!("{}-file{}.csv", shard, seq)
)?;

pub fn with_output_splitting(self, output_splitting: FileSplitting) -> Self

Specifies when sharded output files should be split.
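
To illustrate how splitting could interact with the file naming function, here is a minimal sketch that uses only the standard library and not this crate. RowSplitter, max_rows, name_file, and files_for_rows are hypothetical names, and the per-file row limit (assumed to be at least 1) is only a stand-in for whatever FileSplitting actually configures.

```rust
/// Hypothetical helper: tracks rows written to the current file of one
/// shard and decides when to roll over to the next file.
struct RowSplitter {
    max_rows: usize,     // assumed limit: rows allowed per output file (>= 1)
    rows_in_file: usize, // rows written to the current file so far
    seq: usize,          // zero-based number of the current file for this shard
}

impl RowSplitter {
    fn new(max_rows: usize) -> Self {
        RowSplitter { max_rows, rows_in_file: 0, seq: 0 }
    }

    /// Records one written row and returns the sequence number of the
    /// file that row belongs to.
    fn record_row(&mut self) -> usize {
        if self.rows_in_file == self.max_rows {
            // The current file is full: start the next one.
            self.seq += 1;
            self.rows_in_file = 0;
        }
        self.rows_in_file += 1;
        self.seq
    }
}

/// Same shape as the naming closure passed to ShardedWriter::new.
fn name_file(shard: &str, seq: usize) -> String {
    format!("{}-file{}.csv", shard, seq)
}

/// File name that each of `rows` consecutive rows of one shard lands in.
fn files_for_rows(shard: &str, rows: usize, max_rows: usize) -> Vec<String> {
    let mut splitter = RowSplitter::new(max_rows);
    (0..rows)
        .map(|_| name_file(shard, splitter.record_row()))
        .collect()
}
```

With a limit of two rows per file, five rows of one shard would be spread over three files in this sketch, with the sequence number passed to the naming function increasing by one at each rollover.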

pub fn with_delimiter(self, delimiter: u8) -> Self

Sets the field delimiter used for output files. The default is ‘,’ (passed as the byte b',').

pub fn on_file_completion(self, f: fn(&Path, &str)) -> Self

Sets an optional function that will be called when individual files are completed, either because they have been split by the number of rows or bytes or because processing is complete and the values are being dropped.
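
Because the parameter is a plain fn pointer, the callback cannot capture local variables. The sketch below shows one way to keep state anyway, using a static counter. It uses only the standard library: CompletionHook, count_completed, and COMPLETED_FILES are hypothetical names, and treating the &str argument as the shard key is an assumption that the signature alone does not confirm.

```rust
use std::path::Path;
use std::sync::atomic::{AtomicUsize, Ordering};

// A plain `fn` pointer cannot capture local variables, so shared state
// has to live somewhere global, such as a static counter.
static COMPLETED_FILES: AtomicUsize = AtomicUsize::new(0);

/// Callback with the `fn(&Path, &str)` shape. The meaning of the second
/// argument is an assumption here (treated as the shard key).
fn count_completed(path: &Path, shard_key: &str) {
    COMPLETED_FILES.fetch_add(1, Ordering::SeqCst);
    eprintln!("finished {} (shard {})", path.display(), shard_key);
}

/// Hypothetical stand-in for the writer: stores the optional callback
/// and invokes it whenever a file is finished.
struct CompletionHook {
    on_complete: Option<fn(&Path, &str)>,
}

impl CompletionHook {
    fn file_finished(&self, path: &Path, shard_key: &str) {
        // Nothing happens when no callback was registered.
        if let Some(callback) = self.on_complete {
            callback(path, shard_key);
        }
    }
}
```

A function such as count_completed could then be passed by name, for example on_file_completion(count_completed).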

pub fn on_create_file(self, f: fn(&Path) -> Result<Box<dyn Write>>) -> Self

Takes a function that specifies how to create output files. Because the parameter is a plain fn pointer, a closure can be passed only if it captures nothing.

The function receives the Path of the output file to be created. If you don’t provide your own way to create output files, the default implementation simply creates a new BufWriter for the output file, which is the same as:

let my_sharded_writer = my_sharded_writer.on_create_file(|path| Ok(Box::new(BufWriter::new(File::create(path)?))));

This function may be useful if, for example, you want to inject gzip compression into the output writer.
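
As a rough sketch of a custom factory, the function below creates any missing parent directories and uses a larger buffer than the default. It relies only on the standard library: create_buffered is a hypothetical name, the 1 MiB capacity is arbitrary, and the return type assumes that the Result in the signature is compatible with std::io::Result. A compressing factory would follow the same shape, wrapping the File in an encoder from a compression crate before boxing it.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Factory with the `fn(&Path) -> Result<Box<dyn Write>>` shape,
/// assuming `Result` is compatible with `std::io::Result`.
fn create_buffered(path: &Path) -> io::Result<Box<dyn Write>> {
    // Create missing parent directories; skip the empty parent that a
    // bare file name such as "out.csv" reports.
    if let Some(parent) = path.parent().filter(|p| !p.as_os_str().is_empty()) {
        fs::create_dir_all(parent)?;
    }
    // Arbitrary 1 MiB buffer instead of BufWriter's default capacity.
    let writer = BufWriter::with_capacity(1 << 20, File::create(path)?);
    Ok(Box::new(writer))
}
```

Since create_buffered is an ordinary function, it could be passed by name, for example on_create_file(create_buffered).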

pub fn process_file(&mut self, filename: &str) -> Result<usize, Error>

Processes the input file at filename, creating output files according to the shard key function.

This function will fail if the output directory or an output file can’t be created or if a row can’t be written. It can also fail if it is called multiple times with files that have different column counts.

On success, the number of records written is returned.
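
The column-count failure mentioned above can be pictured as a guard that remembers the width of the first record and rejects any later record with a different width, even across separate calls. This is only an illustration using the standard library: ColumnGuard and its String error are hypothetical, and the crate's real check and error type may differ.

```rust
/// Hypothetical guard that remembers the column count of the first
/// record it sees and rejects later records with a different count.
#[derive(Default)]
struct ColumnGuard {
    expected: Option<usize>,
}

impl ColumnGuard {
    fn check(&mut self, columns: usize) -> Result<(), String> {
        match self.expected {
            // First record seen: its width becomes the expected width.
            None => {
                self.expected = Some(columns);
                Ok(())
            }
            Some(n) if n == columns => Ok(()),
            Some(n) => Err(format!("expected {} columns, found {}", n, columns)),
        }
    }
}
```

Because the guard outlives a single call, a second input file with a different number of columns would be rejected on its first record.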

pub fn process_csv<T: Read>( &mut self, csv_reader: &mut Reader<T>, ) -> Result<usize, Error>

Processes the input reader, creating output files as appropriate.

This function will fail if the output directory or an output file can’t be created or if a row can’t be written. It can also fail if it is called multiple times with files that have different column counts.

On success, the number of records written is returned.

pub fn process_reader(&mut self, reader: impl Read) -> Result<usize, Error>

Processes CSV data from a reader implementing std::io::Read, creating output files as appropriate.

pub fn process_iter<T>(&mut self, records: T) -> Result<usize, Error>
where T: IntoIterator<Item = StringRecord>,

Iterates over every record, calculating the shard key for each, getting or creating the shard file, and writing the record.
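
The loop described above can be sketched with the standard library alone. In this illustration a Vec<String> stands in for csv's StringRecord and an in-memory map stands in for the per-shard output files. shard_records and keys_seen are hypothetical names and do not reflect how the crate is implemented.

```rust
use std::collections::BTreeMap;

/// Groups records by shard key. Each record is a Vec of fields (a
/// stand-in for StringRecord); the map stands in for the shard files.
fn shard_records<I, F>(records: I, key_of: F) -> BTreeMap<String, Vec<Vec<String>>>
where
    I: IntoIterator<Item = Vec<String>>,
    F: Fn(&[String]) -> String,
{
    let mut shards: BTreeMap<String, Vec<Vec<String>>> = BTreeMap::new();
    for record in records {
        // 1. Calculate the shard key for this record.
        let key = key_of(record.as_slice());
        // 2. Get the shard for that key, creating it on first sight,
        //    then append ("write") the record to it.
        shards.entry(key).or_default().push(record);
    }
    shards
}

/// Keys seen so far, in sorted order because the sketch uses a BTreeMap.
fn keys_seen(shards: &BTreeMap<String, Vec<Vec<String>>>) -> Vec<String> {
    shards.keys().cloned().collect()
}
```

The key function here mirrors the closure passed to ShardedWriter::new: it reads one field and falls back to a default when the field is missing, for example |r| r.get(1).cloned().unwrap_or_else(|| "_unknown".to_string()).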

pub fn is_shard_key_seen(&self, key: &str) -> bool

Checks if key has been seen in the processed data.

pub fn shard_keys_seen(&self) -> Vec<String>

Returns a vec of all keys that have been seen.

Trait Implementations

impl<FKey, FNameFile> Debug for ShardedWriter<FKey, FNameFile>
where FNameFile: Fn(&str, usize) -> String,

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

impl<FKey, FNameFile> Freeze for ShardedWriter<FKey, FNameFile>
where FKey: Freeze,

impl<FKey, FNameFile> !RefUnwindSafe for ShardedWriter<FKey, FNameFile>

impl<FKey, FNameFile> !Send for ShardedWriter<FKey, FNameFile>

impl<FKey, FNameFile> !Sync for ShardedWriter<FKey, FNameFile>

impl<FKey, FNameFile> Unpin for ShardedWriter<FKey, FNameFile>
where FKey: Unpin,

impl<FKey, FNameFile> !UnwindSafe for ShardedWriter<FKey, FNameFile>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.