Struct FileWriter 

pub struct FileWriter { /* private fields */ }

Implementations

impl FileWriter

pub fn try_new(object_writer: Box<dyn Writer>, schema: LanceSchema, options: FileWriterOptions) -> Result<Self>

Creates a new FileWriter with the given output schema.

pub fn new_lazy(object_writer: Box<dyn Writer>, options: FileWriterOptions) -> Self

Creates a new FileWriter without specifying an output schema.

The output schema is set from the first batch of data to arrive. If the writer is finished before any data arrives, the write fails.
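The rule above can be pictured with a small self-contained model. This is only a sketch: ToyBatch and ToyLazyWriter are hypothetical stand-ins that do not use the Lance API, and the rejection of a later batch with different columns is an assumption of the toy, not documented behavior.

```rust
/// Hypothetical stand-in for a record batch: column names plus a row count.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyBatch {
    pub columns: Vec<String>,
    pub num_rows: u64,
}

/// Toy model of a writer whose schema is unknown until the first batch arrives.
#[derive(Debug, Default)]
pub struct ToyLazyWriter {
    schema: Option<Vec<String>>,
    rows_written: u64,
}

impl ToyLazyWriter {
    /// No schema is supplied up front.
    pub fn new_lazy() -> Self {
        Self::default()
    }

    /// The first batch fixes the schema; later batches are checked against it
    /// (the check is an assumption made for this toy model).
    pub fn write_batch(&mut self, batch: &ToyBatch) -> Result<(), String> {
        let schema = self.schema.get_or_insert_with(|| batch.columns.clone());
        if *schema != batch.columns {
            return Err(format!(
                "schema mismatch: expected {:?}, got {:?}",
                schema, batch.columns
            ));
        }
        self.rows_written += batch.num_rows;
        Ok(())
    }

    /// Finishing without any data is an error because no schema was ever set.
    pub fn finish(&mut self) -> Result<u64, String> {
        if self.schema.is_none() {
            return Err("no data was written, so no schema is available".to_string());
        }
        Ok(self.rows_written)
    }
}
```

In this model, calling finish on a fresh writer returns an error, while writing two three-row batches and then finishing returns a row count of 6.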

pub fn with_page_metadata_spill(self, object_store: Arc<ObjectStore>, path: Path) -> Self

Spills page metadata to a sidecar file instead of accumulating it in memory.

This can substantially reduce memory usage when many writers are open concurrently (e.g. an IVF shuffle with thousands of partition writers). The sidecar file is created lazily on the first page write. The caller is responsible for cleaning up the file at path (e.g. by placing it in a temp directory that is removed via RAII).
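The RAII cleanup suggested above might look like the following sketch. It assumes the sidecar lives on the local filesystem and uses only the standard library; ScratchDir and sidecar_path are hypothetical names, and turning the resulting std path into the ObjectStore and Path arguments that with_page_metadata_spill expects is not shown.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical RAII guard: creates a directory and removes it, together with
/// everything inside it, when the guard is dropped.
pub struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    /// Creates the directory (and any missing parents).
    pub fn create(path: impl Into<PathBuf>) -> io::Result<Self> {
        let path = path.into();
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    /// Location of the scratch directory itself.
    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Builds a path for a sidecar file inside the scratch directory.
    /// The file itself is not created here.
    pub fn sidecar_path(&self, file_name: &str) -> PathBuf {
        self.path.join(file_name)
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup: drop cannot return an error, so failures are ignored.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Keeping the guard alive for as long as the writers are open means any sidecar files under it are removed when the guard goes out of scope, including on early returns.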

pub async fn create_file_with_batches(store: &ObjectStore, path: &Path, schema: Schema, batches: impl Iterator<Item = RecordBatch> + Send, options: FileWriterOptions) -> Result<usize>

Writes a series of record batches to a new file.

Returns the number of rows written.

pub fn version(&self) -> LanceFileVersion

Returns the format version that will be used when writing the file.

pub async fn write_batches(&mut self, batches: impl Iterator<Item = &RecordBatch>) -> Result<()>

Schedules batches of data to be written to the file.

pub async fn write_batch(&mut self, batch: &RecordBatch) -> Result<()>

Schedules a batch of data to be written to the file.

Note: the future returned by this method may complete before the data has been fully flushed to the file (some data may still be in the data cache or the I/O cache).
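A standard-library analogy may help with the note above. It does not reflect Lance internals: std::io::BufWriter also reports a successful write while the bytes are still held in a buffer, and only flush pushes them to the underlying sink. The helper name buffered_write_demo is hypothetical.

```rust
use std::io::{BufWriter, Write};

/// Writes `payload` through a BufWriter and returns how many bytes the
/// underlying sink holds right after the write and again after the flush.
pub fn buffered_write_demo(payload: &[u8]) -> std::io::Result<(usize, usize)> {
    // The buffer is larger than the payload, so the write stays in the buffer.
    let mut writer = BufWriter::with_capacity(payload.len() + 1, Vec::new());

    // The write succeeds, but the sink has not received anything yet.
    writer.write_all(payload)?;
    let visible_before_flush = writer.get_ref().len();

    // Flushing moves the buffered bytes into the sink.
    writer.flush()?;
    let visible_after_flush = writer.get_ref().len();

    Ok((visible_before_flush, visible_after_flush))
}
```

For a five-byte payload the helper returns (0, 5): the write was accepted first and became visible in the sink only after the flush, which is the same reason data should not be assumed durable until finish has returned.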

pub fn add_schema_metadata(&mut self, key: impl Into<String>, value: impl Into<String>)

Adds a metadata entry to the schema.

Some metadata is not known until after the data has been written. This method lets you alter the schema metadata at that point. It must be called before finish is called.
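The ordering requirement can be illustrated with a toy model. This is a sketch under stated assumptions: ToyMetadataWriter is hypothetical and not part of Lance, and its choice to silently leave late entries out of the footer is an assumption, since the text above does not say what the real writer does when the method is called after finish.

```rust
use std::collections::BTreeMap;

/// Toy writer: schema metadata is kept in memory and serialized only by `finish`.
#[derive(Debug, Default)]
pub struct ToyMetadataWriter {
    schema_metadata: BTreeMap<String, String>,
    rows_written: u64,
    /// Footer text produced by `finish`; `None` while the writer is still open.
    footer: Option<String>,
}

impl ToyMetadataWriter {
    /// Stands in for writing a batch of `n` rows.
    pub fn write_rows(&mut self, n: u64) {
        self.rows_written += n;
    }

    /// Records a metadata entry; it reaches the footer only if added before `finish`.
    pub fn add_schema_metadata(&mut self, key: impl Into<String>, value: impl Into<String>) {
        self.schema_metadata.insert(key.into(), value.into());
    }

    /// Serializes the metadata collected so far into the footer, once.
    pub fn finish(&mut self) -> u64 {
        let footer = self
            .schema_metadata
            .iter()
            .map(|(k, v)| format!("{}={}", k, v))
            .collect::<Vec<_>>()
            .join(";");
        if self.footer.is_none() {
            self.footer = Some(footer);
        }
        self.rows_written
    }

    /// The footer as written, if `finish` has run.
    pub fn footer(&self) -> Option<&str> {
        self.footer.as_deref()
    }
}
```

An entry such as a row count that is computed while writing can be added just before finish and appears in the footer; an entry added afterwards does not, because the footer has already been produced.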

pub fn initialize_with_external_metadata(&mut self, schema: Schema, column_metadata: Vec<ColumnMetadata>, rows_written: u64)

Prepares the writer when column data and metadata were produced externally.

This is useful for flows that copy already-encoded pages (e.g. binary copy during compaction), where the column buffers have been written directly and only the footer and schema metadata remain to be written. The provided column_metadata must describe the buffers already persisted by the underlying ObjectWriter, and rows_written should reflect the total number of rows in those buffers.

pub async fn add_global_buffer(&mut self, buffer: Bytes) -> Result<u32>

Adds a global buffer to the file.

The global buffer can contain arbitrary bytes. It is written to disk immediately. This method returns the index of the global buffer; indices start at 1 and increase by 1 each time this method is called.
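The 1-based indexing can be sketched with a toy store. ToyGlobalBuffers and its get method are hypothetical and only model the index arithmetic described above; they do not use the Lance API or write anything to a file.

```rust
/// Toy store that hands out 1-based indices, mirroring the numbering described above.
#[derive(Debug, Default)]
pub struct ToyGlobalBuffers {
    buffers: Vec<Vec<u8>>,
}

impl ToyGlobalBuffers {
    /// Stores the buffer and returns its index: 1 for the first call, 2 for the next, and so on.
    pub fn add_global_buffer(&mut self, buffer: Vec<u8>) -> u32 {
        self.buffers.push(buffer);
        self.buffers.len() as u32
    }

    /// Looks a buffer up by the index returned from `add_global_buffer`.
    /// Index 0 is never handed out by this toy, so it yields `None`.
    pub fn get(&self, index: u32) -> Option<&[u8]> {
        let slot = index.checked_sub(1)?;
        self.buffers.get(slot as usize).map(|b| b.as_slice())
    }
}
```

A caller that stores the returned index (for example in schema metadata) has to keep the offset in mind when mapping it back to a zero-based position.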

pub async fn finish(&mut self) -> Result<u64>

Finishes writing the file.

This method waits until all data has been flushed to the file, then writes the file metadata and the footer. It does not return until the file has been closed.

Returns the total number of rows written.

pub async fn abort(&mut self)

pub async fn tell(&mut self) -> Result<u64>

pub fn field_id_to_column_indices(&self) -> &[(u32, u32)]

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.