pub struct FileWriter { /* private fields */ }
Implementations
impl FileWriter
pub fn try_new( object_writer: Box<dyn Writer>, schema: LanceSchema, options: FileWriterOptions, ) -> Result<Self>
Create a new FileWriter with a desired output schema
pub fn new_lazy( object_writer: Box<dyn Writer>, options: FileWriterOptions, ) -> Self
Create a new FileWriter without a desired output schema
The output schema will be set based on the first batch of data to arrive. If no data arrives and the writer is finished then the write will fail.
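The lazy-schema behavior described above can be sketched with a toy model. This is not the Lance implementation: `LazyWriter`, `Batch`, and `FinishError` are hypothetical stand-ins, and a list of column names stands in for a real schema. The sketch only shows the two stated rules: the schema is taken from the first batch, and finishing without any data fails.

```rust
// Toy model only: `LazyWriter`, `Batch`, and `FinishError` are hypothetical
// stand-ins, not types from the lance crates.
#[derive(Debug, Clone)]
pub struct Batch {
    pub columns: Vec<String>, // column names stand in for a full schema
    pub num_rows: u64,
}

#[derive(Debug, PartialEq)]
pub enum FinishError {
    /// No batch ever arrived, so no schema could be inferred.
    NoSchema,
}

#[derive(Debug, Default)]
pub struct LazyWriter {
    schema: Option<Vec<String>>, // unset until the first batch arrives
    rows_written: u64,
}

impl LazyWriter {
    pub fn new_lazy() -> Self {
        Self::default()
    }

    pub fn write_batch(&mut self, batch: &Batch) {
        // The first batch fixes the output schema.
        if self.schema.is_none() {
            self.schema = Some(batch.columns.clone());
        }
        self.rows_written += batch.num_rows;
    }

    pub fn schema(&self) -> Option<&[String]> {
        self.schema.as_deref()
    }

    pub fn finish(&self) -> Result<u64, FinishError> {
        if self.schema.is_some() {
            Ok(self.rows_written)
        } else {
            Err(FinishError::NoSchema)
        }
    }
}
```

In this model, calling `finish` on a fresh `LazyWriter` returns `Err(FinishError::NoSchema)`, while writing at least one batch first makes it return the accumulated row count.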
pub fn with_page_metadata_spill( self, object_store: Arc<ObjectStore>, path: Path, ) -> Self
Spill page metadata to a sidecar file instead of accumulating in memory.
This can dramatically reduce memory usage when many writers are open
concurrently (e.g. IVF shuffle with thousands of partition writers).
The sidecar file is created lazily on the first page write. The caller
is responsible for cleaning up path (e.g. by placing it in a temp
directory that is removed via RAII).
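The RAII cleanup mentioned above could look roughly like the following. This is a sketch under stated assumptions: `TempDirGuard` is a hypothetical helper, it works on the local filesystem with `std::path` types (not the object-store `Path` that `with_page_metadata_spill` takes), and it uses only the standard library. The idea is that the sidecar file lives inside a directory whose guard deletes it on drop.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical guard: owns a directory and removes it, with everything
/// inside it (such as a spill sidecar file), when the guard is dropped.
pub struct TempDirGuard {
    path: PathBuf,
}

impl TempDirGuard {
    /// Creates `<system temp dir>/<name>`; `name` is chosen by the caller.
    pub fn create(name: &str) -> io::Result<Self> {
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Location for a sidecar file inside the guarded directory.
    pub fn sidecar_path(&self, file_name: &str) -> PathBuf {
        self.path.join(file_name)
    }
}

impl Drop for TempDirGuard {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored because drop cannot return them.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Keeping the guard alive for as long as the writer is open, and dropping it after `finish` or `abort`, removes the sidecar without explicit cleanup code. The `tempfile` crate's `TempDir` offers the same drop-based cleanup with collision-free names.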
pub async fn create_file_with_batches( store: &ObjectStore, path: &Path, schema: Schema, batches: impl Iterator<Item = RecordBatch> + Send, options: FileWriterOptions, ) -> Result<usize>
Write a series of record batches to a new file
Returns the number of rows written
pub fn version(&self) -> LanceFileVersion
Returns the format version that will be used when writing the file
pub async fn write_batches( &mut self, batches: impl Iterator<Item = &RecordBatch>, ) -> Result<()>
Schedule batches of data to be written to the file
pub async fn write_batch(&mut self, batch: &RecordBatch) -> Result<()>
Schedule a batch of data to be written to the file
Note: the future returned by this method may complete before the data has been fully flushed to the file (some data may still be in the data cache or the I/O cache).
pub fn add_schema_metadata( &mut self, key: impl Into<String>, value: impl Into<String>, )
Add a metadata entry to the schema
Sometimes the metadata is not known until after the data has been written.
This method lets you alter the schema metadata at that point. It must be
called before finish is called.
pub fn initialize_with_external_metadata( &mut self, schema: Schema, column_metadata: Vec<ColumnMetadata>, rows_written: u64, )
Prepare the writer when column data and metadata were produced externally.
This is useful for flows that copy already-encoded pages (e.g., binary copy
during compaction) where the column buffers have been written directly and we
only need to write the footer and schema metadata. The provided
column_metadata must describe the buffers already persisted by the
underlying ObjectWriter, and rows_written should reflect the total number
of rows in those buffers.
pub async fn add_global_buffer(&mut self, buffer: Bytes) -> Result<u32>
Adds a global buffer to the file
The global buffer can contain arbitrary bytes and is written to disk immediately. This method returns the index of the new global buffer. Indices start at 1 and increase by 1 each time this method is called.
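The index contract can be illustrated with a small in-memory model. `GlobalBuffers` is a hypothetical stand-in, not the writer's internal structure, and treating slot 0 as reserved is an assumption made here only to explain why caller-visible indices start at 1.

```rust
/// Hypothetical in-memory stand-in for a file's global buffer table.
pub struct GlobalBuffers {
    /// (offset, length) of each buffer within `data`.
    entries: Vec<(u64, u64)>,
    data: Vec<u8>,
}

impl GlobalBuffers {
    pub fn new() -> Self {
        // Assumption: slot 0 is reserved, so caller buffers start at index 1.
        Self {
            entries: vec![(0, 0)],
            data: Vec::new(),
        }
    }

    /// Appends the bytes immediately and returns the new buffer's index.
    pub fn add_global_buffer(&mut self, buffer: &[u8]) -> u32 {
        let offset = self.data.len() as u64;
        self.data.extend_from_slice(buffer);
        self.entries.push((offset, buffer.len() as u64));
        (self.entries.len() - 1) as u32
    }

    /// Looks a buffer up again by the index returned from `add_global_buffer`.
    pub fn get(&self, index: u32) -> Option<&[u8]> {
        let (offset, len) = *self.entries.get(index as usize)?;
        Some(&self.data[offset as usize..(offset + len) as usize])
    }
}
```

A caller would typically keep the returned index (for example, in schema metadata) so that a reader can locate the buffer later.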
pub async fn finish(&mut self) -> Result<u64>
Finishes writing the file
This method will wait until all data has been flushed to the file. Then it will write the file metadata and the footer. It will not return until all data has been flushed and the file has been closed.
Returns the total number of rows written
pub async fn abort(&mut self)
pub async fn tell(&mut self) -> Result<u64>
pub fn field_id_to_column_indices(&self) -> &[(u32, u32)]
Auto Trait Implementations
impl Freeze for FileWriter
impl !RefUnwindSafe for FileWriter
impl Send for FileWriter
impl !Sync for FileWriter
impl Unpin for FileWriter
impl UnsafeUnpin for FileWriter
impl !UnwindSafe for FileWriter
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.