pub struct WriteParams {
pub max_rows_per_file: usize,
pub max_rows_per_group: usize,
pub max_bytes_per_file: usize,
pub mode: WriteMode,
pub store_params: Option<ObjectStoreParams>,
pub progress: Arc<dyn WriteFragmentProgress>,
pub commit_handler: Option<Arc<dyn CommitHandler>>,
pub data_storage_version: Option<LanceFileVersion>,
pub enable_stable_row_ids: bool,
pub enable_v2_manifest_paths: bool,
pub session: Option<Arc<Session>>,
pub auto_cleanup: Option<AutoCleanupParams>,
pub skip_auto_cleanup: bool,
pub transaction_properties: Option<Arc<HashMap<String, String>>>,
pub initial_bases: Option<Vec<BasePath>>,
pub target_bases: Option<Vec<u32>>,
pub target_base_names_or_paths: Option<Vec<String>>,
}
Dataset Write Parameters
Fields
max_rows_per_file: usize
Max number of records per file.
max_rows_per_group: usize
Max number of rows per row group.
max_bytes_per_file: usize
Max file size in bytes.
This is a soft limit. The actual file size may exceed this value by a few megabytes, because the footer still has to be flushed after the limit is detected.
This limit is checked after each group is written, so if max_rows_per_group is set to a large value, the limit may be exceeded by a large amount.
The default is 90 GB. If you are using an object store such as S3, we currently have a hard 100 GB limit.
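The soft-limit behavior described above can be illustrated with a small standalone model. This is a sketch only, not Lance's writer: `simulate_files`, the fixed `footer_bytes`, and the `>=` comparison are all assumptions made for illustration. It shows why a file can overshoot the limit by up to one full group plus the footer.

```rust
/// Hypothetical model of a writer that enforces a soft byte limit.
/// `group_sizes` are the on-disk sizes of successive row groups.
/// Returns the final size of each file that would be produced.
fn simulate_files(group_sizes: &[u64], max_bytes_per_file: u64, footer_bytes: u64) -> Vec<u64> {
    let mut files = Vec::new();
    let mut current = 0u64;
    for &group in group_sizes {
        // The whole group is written before the limit is checked.
        current += group;
        // Assumption: the file is closed once the running size reaches the limit.
        if current >= max_bytes_per_file {
            // The footer is still flushed after the limit is detected.
            files.push(current + footer_bytes);
            current = 0;
        }
    }
    if current > 0 {
        // Close the last, partially filled file.
        files.push(current + footer_bytes);
    }
    files
}
```

With a limit of 100 bytes and a 1-byte footer, groups of 40 bytes give files of 121 and 41 bytes, while groups of 250 bytes give files of 251 bytes each: the larger the group, the larger the overshoot.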
mode: WriteMode
Write mode.
store_params: Option<ObjectStoreParams>
progress: Arc<dyn WriteFragmentProgress>
commit_handler: Option<Arc<dyn CommitHandler>>
If present, the dataset will use this handler to update the latest version.
If not set, the default is chosen based on the object store. Generally this will be RenameCommitHandler, unless the object store does not support atomic renames (e.g. S3).
If a custom object store is provided (via store_params.object_store), then this must also be provided.
data_storage_version: Option<LanceFileVersion>
The format version to use when writing data.
Newer versions are more efficient but the data can only be read by more recent versions of lance.
If not specified then the latest stable version will be used.
enable_stable_row_ids: bool
Experimental: if set to true, the writer will use stable row ids. These row ids are stable after compaction operations, but not after updates. This makes compaction more efficient, since with stable row ids no secondary indices need to be updated to point to new row ids.
enable_v2_manifest_paths: bool
If set to true, and this is a new dataset, uses the new v2 manifest paths.
These allow constant-time lookups for the latest manifest on object storage.
This parameter has no effect on existing datasets. To migrate an existing
dataset, use the super::Dataset::migrate_manifest_paths_v2 method.
Default is true.
session: Option<Arc<Session>>
auto_cleanup: Option<AutoCleanupParams>
If Some and this is a new dataset, old dataset versions will be
automatically cleaned up according to the parameters set out in
AutoCleanupParams. This parameter has no effect on existing datasets.
To add auto-cleanup to an existing dataset, use Dataset::update_config
to set lance.auto_cleanup.interval and lance.auto_cleanup.older_than.
Both parameters must be set to invoke auto-cleanup.
skip_auto_cleanup: bool
If true, skip auto-cleanup during commits. This should be set to true for high-frequency writes to improve performance. It is also useful if the writer does not have delete permissions, since the cleanup would only be attempted and log a failure anyway. Default is false.
transaction_properties: Option<Arc<HashMap<String, String>>>
Configuration key-value pairs for this write operation. These can include commit messages, engine information, etc. This properties map is persisted as part of the Transaction object.
initial_bases: Option<Vec<BasePath>>
New base paths to register in the manifest during dataset creation. Each BasePath must have a properly assigned, non-zero ID. Only used in CREATE/OVERWRITE modes for manifest registration. IDs should be assigned by the caller before passing them to WriteParams.
target_bases: Option<Vec<u32>>
Target base IDs for writing data files. When provided, all new data files will be written to bases with these IDs. Used in all modes (CREATE, APPEND, OVERWRITE) to specify where data should be written. The IDs must correspond to either:
- IDs in initial_bases (for CREATE/OVERWRITE modes)
- IDs already registered in the existing dataset manifest (for APPEND mode)
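The ID rules for initial_bases and target_bases can be sketched as a standalone check. This is an illustrative model, not the validation Lance performs: the `Mode` enum is a hypothetical stand-in for WriteMode, and `validate_target_bases` and its error strings are invented for this example.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for WriteMode, used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Create,
    Append,
    Overwrite,
}

/// Checks that every target base ID is known for the given mode.
/// `initial_base_ids` models the IDs carried by initial_bases;
/// `manifest_base_ids` models the IDs already registered in the manifest.
fn validate_target_bases(
    mode: Mode,
    target_bases: &[u32],
    initial_base_ids: &[u32],
    manifest_base_ids: &[u32],
) -> Result<(), String> {
    // Each initial base must carry a non-zero ID assigned by the caller.
    if initial_base_ids.contains(&0) {
        return Err("initial base with unassigned (zero) ID".to_string());
    }
    // CREATE/OVERWRITE resolve against initial_bases, APPEND against the manifest.
    let allowed: HashSet<u32> = match mode {
        Mode::Create | Mode::Overwrite => initial_base_ids.iter().copied().collect(),
        Mode::Append => manifest_base_ids.iter().copied().collect(),
    };
    for id in target_bases {
        if !allowed.contains(id) {
            return Err(format!("unknown target base ID {}", id));
        }
    }
    Ok(())
}
```

In this model, an APPEND that targets an ID present only in initial_bases is rejected, because the ID must already be registered in the existing manifest.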
target_base_names_or_paths: Option<Vec<String>>
Target base names or paths as unresolved strings. These are resolved to IDs at builder execution time, when the dataset context is available.
Implementations
impl WriteParams
pub fn with_storage_version(version: LanceFileVersion) -> Self
Create a new WriteParams with the given storage version. The other fields are set to their default values.
pub fn storage_version_or_default(&self) -> LanceFileVersion
pub fn store_registry(&self) -> Arc<ObjectStoreRegistry>
pub fn with_transaction_properties(self, properties: HashMap<String, String>) -> Self
Set the transaction properties for this WriteParams.
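The `with_*` methods take `self` by value and return `Self`, so they can be chained. The sketch below mirrors that shape on a hypothetical `Params` struct rather than on WriteParams itself. Wrapping the map in `Some(Arc::new(..))` is an assumption inferred from the field type, and the `commit_message` key is an invented example.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical two-field stand-in for WriteParams.
#[derive(Debug, Default, Clone)]
struct Params {
    transaction_properties: Option<Arc<HashMap<String, String>>>,
    skip_auto_cleanup: bool,
}

impl Params {
    /// Consuming setter: takes ownership of `self` and returns the updated value.
    fn with_transaction_properties(self, properties: HashMap<String, String>) -> Self {
        Self {
            // Assumption: the map is shared behind an Arc, matching the field type.
            transaction_properties: Some(Arc::new(properties)),
            ..self
        }
    }
}

/// Builds a value starting from the defaults and chaining the setter.
fn build() -> Params {
    let mut props = HashMap::new();
    // The key and value are illustrative only.
    props.insert("commit_message".to_string(), "nightly load".to_string());
    Params::default().with_transaction_properties(props)
}
```

Fields that are not touched by the setter keep their default values, which is the same pattern with_storage_version describes for the other fields.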
pub fn with_initial_bases(self, bases: Vec<BasePath>) -> Self
Set the initial_bases for this WriteParams.
This specifies new base paths to register in the manifest during dataset creation. Each BasePath must have a properly assigned ID (non-zero) before calling this method. Only used in CREATE/OVERWRITE modes for manifest registration.
pub fn with_target_bases(self, base_ids: Vec<u32>) -> Self
Set the target_bases for this WriteParams.
This specifies the base IDs where data files should be written. The IDs must correspond to either:
- IDs in initial_bases (for CREATE/OVERWRITE modes)
- IDs already registered in the existing dataset manifest (for APPEND mode)
pub fn with_target_base_names_or_paths(self, references: Vec<String>) -> Self
Store target base names or paths for deferred resolution.
This method stores the references in the target_base_names_or_paths field, to be resolved at write execution time when the dataset manifest is available.
Resolution will try to match:
- initial_bases by name
- initial_bases by path
- existing manifest by name
- existing manifest by path
Arguments
references - Vector of base names or paths to be resolved later
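The deferred resolution described above can be sketched as a lookup that tries each match in turn. This is a hedged illustration, not Lance's resolver: `Base` is a hypothetical stand-in for BasePath, `resolve` is an invented helper, and treating the list order as the precedence order is an assumption.

```rust
/// Hypothetical stand-in for BasePath, used only in this sketch.
struct Base {
    id: u32,
    name: Option<String>,
    path: String,
}

/// Resolves one name-or-path reference to a base ID.
/// `initial` models initial_bases; `manifest` models bases already registered.
fn resolve(reference: &str, initial: &[Base], manifest: &[Base]) -> Option<u32> {
    initial
        .iter()
        // 1. initial_bases by name
        .find(|b| b.name.as_deref() == Some(reference))
        // 2. initial_bases by path
        .or_else(|| initial.iter().find(|b| b.path == reference))
        // 3. existing manifest by name
        .or_else(|| manifest.iter().find(|b| b.name.as_deref() == Some(reference)))
        // 4. existing manifest by path
        .or_else(|| manifest.iter().find(|b| b.path == reference))
        .map(|b| b.id)
}
```

Under this assumed precedence, a name that appears in both initial_bases and the existing manifest resolves to the initial_bases entry, and a reference that matches nothing yields `None`.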
Trait Implementations
impl Clone for WriteParams
fn clone(&self) -> WriteParams
fn clone_from(&mut self, source: &Self)
impl Debug for WriteParams
Auto Trait Implementations
impl Freeze for WriteParams
impl !RefUnwindSafe for WriteParams
impl Send for WriteParams
impl Sync for WriteParams
impl Unpin for WriteParams
impl !UnwindSafe for WriteParams
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
impl<T> CloneToUninit for T where T: Clone
impl<T> Downcast for T where T: Any
impl<T> DowncastSend for T
impl<T> DowncastSync for T
impl<T> FmtForward for T
impl<T> Instrument for T
impl<T> IntoEither for T
impl<T> Pipe for T where T: ?Sized
impl<T> Pointable for T
impl<T> PolicyExt for T where T: ?Sized
impl<T> Tap for T