pub struct StreamingDeletionVectorWriter<'a, W: Write> { /* private fields */ }Expand description
A streaming writer for deletion vectors.
This writer allows for writing multiple deletion vectors to a single file in a streaming fashion, which is memory-efficient for distributed workloads where deletion vectors are generated on executors.
§Format
The writer produces deletion vector files in the Delta Lake format:
- The first byte of the file is a version byte (currently 1)
- Each DV is prefixed with a 4-byte size (big-endian) of the serialized data
- Followed by a 4-byte magic number (0x64485871, little-endian)
- Followed by the serialized 64-bit Roaring Bitmap
- Followed by a 4-byte CRC32 checksum (big-endian) of the serialized data
§Examples
use delta_kernel::actions::deletion_vector_writer::{StreamingDeletionVectorWriter, KernelDeletionVector};
let mut buffer = Vec::new();
let mut writer = StreamingDeletionVectorWriter::new(&mut buffer);
let mut dv = KernelDeletionVector::new();
dv.add_deleted_row_indexes([1, 5, 10]);
let descriptor = writer.write_deletion_vector(dv)?;
writer.finalize()?;Implementations§
Source§impl<'a, W: Write> StreamingDeletionVectorWriter<'a, W>
impl<'a, W: Write> StreamingDeletionVectorWriter<'a, W>
Sourcepub fn new(writer: &'a mut W) -> Self
pub fn new(writer: &'a mut W) -> Self
Create a new streaming deletion vector writer.
§Arguments
writer- A mutable reference to any type implementingstd::io::Write.
§Examples
let mut buffer = Vec::new();
let writer = StreamingDeletionVectorWriter::new(&mut buffer);Sourcepub fn write_deletion_vector(
&mut self,
deletion_vector: impl DeletionVector,
) -> DeltaResult<DeletionVectorWriteResult>
pub fn write_deletion_vector( &mut self, deletion_vector: impl DeletionVector, ) -> DeltaResult<DeletionVectorWriteResult>
Write a deletion vector to the underlying writer.
This method can be called multiple times to write multiple deletion vectors to the same writer. The caller is responsible for keeping track of which deletion vector corresponds to which data file.
§Arguments
deletion_vector- The deletion vector to write
§Returns
A DeletionVectorWriteResult containing the offset, size, and cardinality
of the written deletion vector.
§Errors
Returns an error if:
- The writer fails to write data
- The deletion vector cannot be serialized
- The offset or size would overflow an i32
§Examples
let mut dv = KernelDeletionVector::new();
dv.add_deleted_row_indexes([1, 5, 10]);
let descriptor = writer.write_deletion_vector(dv)?;
println!("Written DV at offset {} with size {}", descriptor.offset, descriptor.size_in_bytes);Sourcepub fn finalize(self) -> DeltaResult<()>
pub fn finalize(self) -> DeltaResult<()>
Finalize all writes and flush the underlying writer.
This method should be called after all deletion vectors have been written. After calling this method, the writer should not be used anymore.
§Errors
Returns an error if flushing the writer fails.
§Examples
writer.write_deletion_vector(dv1)?;
writer.write_deletion_vector(dv2)?;
writer.finalize()?;Auto Trait Implementations§
impl<'a, W> Freeze for StreamingDeletionVectorWriter<'a, W>
impl<'a, W> RefUnwindSafe for StreamingDeletionVectorWriter<'a, W>where
W: RefUnwindSafe,
impl<'a, W> Send for StreamingDeletionVectorWriter<'a, W>where
W: Send,
impl<'a, W> Sync for StreamingDeletionVectorWriter<'a, W>where
W: Sync,
impl<'a, W> Unpin for StreamingDeletionVectorWriter<'a, W>
impl<'a, W> UnsafeUnpin for StreamingDeletionVectorWriter<'a, W>
impl<'a, W> !UnwindSafe for StreamingDeletionVectorWriter<'a, W>
Blanket Implementations§
Source§impl<T> AsAny for T
impl<T> AsAny for T
Source§fn any_ref(&self) -> &(dyn Any + Sync + Send + 'static)
fn any_ref(&self) -> &(dyn Any + Sync + Send + 'static)
dyn Any reference to the object: Read moreSource§fn as_any(self: Arc<T>) -> Arc<dyn Any + Sync + Send>
fn as_any(self: Arc<T>) -> Arc<dyn Any + Sync + Send>
Arc<dyn Any> reference to the object: Read moreSource§fn into_any(self: Box<T>) -> Box<dyn Any + Sync + Send>
fn into_any(self: Box<T>) -> Box<dyn Any + Sync + Send>
Box<dyn Any>: Read moreSource§fn type_name(&self) -> &'static str
fn type_name(&self) -> &'static str
std::any::type_name, since Any does not provide it and
Any::type_id is useless as a debugging aid (its Debug is just a mess of hex digits).Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> Instrument for T
impl<T> Instrument for T
Source§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
Source§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> PolicyExt for Twhere
T: ?Sized,
impl<T> PolicyExt for Twhere
T: ?Sized,
Source§impl<KernelType, ArrowType> TryIntoArrow<ArrowType> for KernelTypewhere
ArrowType: TryFromKernel<KernelType>,
impl<KernelType, ArrowType> TryIntoArrow<ArrowType> for KernelTypewhere
ArrowType: TryFromKernel<KernelType>,
Source§fn try_into_arrow(self) -> Result<ArrowType, ArrowError>
fn try_into_arrow(self) -> Result<ArrowType, ArrowError>
default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature arrow-conversion only.Source§impl<KernelType, ArrowType> TryIntoKernel<KernelType> for ArrowTypewhere
KernelType: TryFromArrow<ArrowType>,
impl<KernelType, ArrowType> TryIntoKernel<KernelType> for ArrowTypewhere
KernelType: TryFromArrow<ArrowType>,
Source§fn try_into_kernel(self) -> Result<KernelType, ArrowError>
fn try_into_kernel(self) -> Result<KernelType, ArrowError>
default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature arrow-conversion only.