pub struct WriteParams {
pub max_rows_per_file: usize,
pub max_rows_per_group: usize,
pub max_bytes_per_file: usize,
pub mode: WriteMode,
pub store_params: Option<ObjectStoreParams>,
pub base_store_params: Option<HashMap<String, ObjectStoreParams>>,
pub progress: Arc<dyn WriteFragmentProgress>,
pub write_progress: Option<WriteProgressFn>,
pub commit_handler: Option<Arc<dyn CommitHandler>>,
pub data_storage_version: Option<LanceFileVersion>,
pub enable_stable_row_ids: bool,
pub enable_v2_manifest_paths: bool,
pub session: Option<Arc<Session>>,
pub auto_cleanup: Option<AutoCleanupParams>,
pub skip_auto_cleanup: bool,
pub transaction_properties: Option<Arc<HashMap<String, String>>>,
pub initial_bases: Option<Vec<BasePath>>,
pub target_bases: Option<Vec<u32>>,
pub target_base_names_or_paths: Option<Vec<String>>,
pub allow_external_blob_outside_bases: bool,
pub external_blob_mode: ExternalBlobMode,
pub blob_pack_file_size_threshold: Option<usize>,
}
Dataset Write Parameters
Fields
max_rows_per_file: usize
Max number of records per file.
max_rows_per_group: usize
Max number of rows per row group.
max_bytes_per_file: usize
Max file size in bytes.
This is a soft limit. The actual file size may exceed this value by a few megabytes, because the footer still has to be flushed after the limit is detected.
The limit is checked after each group is written, so if max_rows_per_group is set to a large value, the limit may be exceeded by a large amount.
The default is 90 GB. On object stores such as S3, there is currently a hard limit of 100 GB.
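To make the soft-limit behaviour concrete, here is a minimal sketch; it is not Lance's actual writer. It assumes a hypothetical helper, split_groups_into_files, that receives the byte size of each group, writes every group in full, and checks the limit only afterwards. Footer bytes are ignored.

```rust
/// Hypothetical model of a soft byte limit (not part of the lance API).
/// Returns the final size of each file that would be produced.
fn split_groups_into_files(group_sizes: &[usize], max_bytes_per_file: usize) -> Vec<usize> {
    let mut files = Vec::new();
    let mut current = 0usize;
    for &size in group_sizes {
        // The whole group is written before the limit is consulted.
        current += size;
        if current >= max_bytes_per_file {
            // The limit is detected only now, so the file may already exceed it.
            files.push(current);
            current = 0;
        }
    }
    // Any remaining bytes form the last, partially filled file.
    if current > 0 {
        files.push(current);
    }
    files
}
```

In this model, four 40-byte groups with a 100-byte limit give files of 120 and 40 bytes: the first file overshoots by part of a group. A single 250-byte group overshoots by 150 bytes, which is why large groups can exceed the limit by a large amount.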
mode: WriteMode
Write mode.
store_params: Option<ObjectStoreParams>
base_store_params: Option<HashMap<String, ObjectStoreParams>>
progress: Arc<dyn WriteFragmentProgress>
write_progress: Option<WriteProgressFn>
Optional callback invoked after each batch is written.
Receives cumulative WriteStats so callers can render a progress bar
or compute throughput. The callback must be cheap and non-blocking;
spawn a task if you need async work.
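One way to keep such a callback cheap is to have it only record the latest cumulative value and let another thread or task read it. The sketch below illustrates that pattern with the standard library only. StatsSnapshot and ProgressFn are hypothetical stand-ins; the real WriteStats fields and WriteProgressFn signature may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the cumulative stats handed to the callback.
#[derive(Debug, Clone, Copy)]
struct StatsSnapshot {
    bytes_written: u64,
}

/// Assumed callback shape, for illustration only.
type ProgressFn = Arc<dyn Fn(&StatsSnapshot) + Send + Sync>;

/// Builds a callback that does a single atomic store per invocation,
/// plus a handle that a progress bar can poll from elsewhere.
fn atomic_progress() -> (ProgressFn, Arc<AtomicU64>) {
    let latest = Arc::new(AtomicU64::new(0));
    let sink = Arc::clone(&latest);
    let callback: ProgressFn = Arc::new(move |stats: &StatsSnapshot| {
        // Stats are cumulative, so overwriting the previous value is enough.
        sink.store(stats.bytes_written, Ordering::Relaxed);
    });
    (callback, latest)
}
```

The callback never blocks or allocates; rendering and throughput calculations happen wherever the returned handle is read.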
commit_handler: Option<Arc<dyn CommitHandler>>
If present, the dataset will use this to update the latest version.
If not set, the default will be based on the object store. Generally this will be RenameCommitHandler unless the object store does not handle atomic renames (e.g. S3).
If a custom object store is provided (via store_params.object_store), then this must also be provided.
data_storage_version: Option<LanceFileVersion>
The format version to use when writing data.
Newer versions are more efficient, but the data can only be read by more recent versions of Lance.
If not specified, the latest stable version will be used.
enable_stable_row_ids: bool
Experimental: if set to true, the writer will use stable row ids. These row ids are stable after compaction operations, but not after updates. This makes compaction more efficient, since with stable row ids no secondary indices need to be updated to point to new row ids.
enable_v2_manifest_paths: bool
If set to true, and this is a new dataset, uses the new v2 manifest paths.
These allow constant-time lookups for the latest manifest on object storage.
This parameter has no effect on existing datasets. To migrate an existing
dataset, use the super::Dataset::migrate_manifest_paths_v2 method.
Default is true.
session: Option<Arc<Session>>
auto_cleanup: Option<AutoCleanupParams>
If Some and this is a new dataset, old dataset versions will be
automatically cleaned up after commits according to the parameters set
out in AutoCleanupParams. This parameter has no effect on existing
datasets. To add auto-cleanup to an existing dataset, use
Dataset::update_config to set lance.auto_cleanup.interval and
lance.auto_cleanup.older_than. Both parameters must be set to invoke
auto-cleanup.
Defaults to None (auto-cleanup disabled). Enabling it makes every
interval-th commit run a full cleanup pass, which lists and reads every
manifest in the dataset even when nothing is old enough to delete; on
object stores this adds noticeable per-commit latency that grows with the
version count. Prefer calling Dataset::cleanup_old_versions explicitly
when you actually need to reclaim space.
skip_auto_cleanup: bool
If true, skip auto cleanup during commits. This should be set to true for high-frequency writes to improve performance. It is also useful if the writer does not have delete permissions, since the cleanup would only attempt the deletion and log a failure anyway. Default is false.
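The interaction between the cleanup interval and skip_auto_cleanup can be sketched as a small predicate. This is an illustrative model only: it assumes that "every interval-th commit" means the committed version number is a multiple of the interval, which the text above does not confirm, and runs_cleanup is a hypothetical name.

```rust
/// Hypothetical model of when a commit triggers a cleanup pass.
/// `interval` is None when auto-cleanup is not configured.
fn runs_cleanup(version: u64, interval: Option<u64>, skip_auto_cleanup: bool) -> bool {
    match interval {
        // Assumption: cleanup fires when the version is a multiple of the interval.
        Some(n) if n > 0 && !skip_auto_cleanup => version % n == 0,
        // Not configured, zero interval, or explicitly skipped by the writer.
        _ => false,
    }
}
```

Under this model, a writer that sets skip_auto_cleanup never pays the cleanup cost, even on commits where another writer with the same dataset configuration would.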
transaction_properties: Option<Arc<HashMap<String, String>>>
Configuration key-value pairs for this write operation. These can include commit messages, engine information, etc. The properties map is persisted as part of the Transaction object.
initial_bases: Option<Vec<BasePath>>
New base paths to register in the manifest during dataset creation. Each BasePath must have a properly assigned ID (non-zero). Only used in CREATE/OVERWRITE modes for manifest registration. IDs should be assigned by the caller before passing to WriteParams.
target_bases: Option<Vec<u32>>
Target base IDs for writing data files. When provided, all new data files will be written to bases with these IDs. Used in all modes (CREATE, APPEND, OVERWRITE) to specify where data should be written. The IDs must correspond to either:
- IDs in initial_bases (for CREATE/OVERWRITE modes)
- IDs already registered in the existing dataset manifest (for APPEND mode)
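The rule in the list above can be expressed as a small validation function. This is a sketch under stated assumptions, not the library's implementation: Mode is a hypothetical stand-in for WriteMode, and bases are reduced to plain u32 IDs.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the write mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Create,
    Append,
    Overwrite,
}

/// Returns Err with the first target ID that is not allowed for the mode.
fn validate_target_bases(
    mode: Mode,
    targets: &[u32],
    initial_base_ids: &[u32],
    manifest_base_ids: &[u32],
) -> Result<(), u32> {
    // CREATE/OVERWRITE draw from initial_bases; APPEND from the existing manifest.
    let allowed: HashSet<u32> = match mode {
        Mode::Create | Mode::Overwrite => initial_base_ids.iter().copied().collect(),
        Mode::Append => manifest_base_ids.iter().copied().collect(),
    };
    match targets.iter().find(|id| !allowed.contains(*id)) {
        Some(&bad) => Err(bad),
        None => Ok(()),
    }
}
```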
target_base_names_or_paths: Option<Vec<String>>
Target base names or paths as strings (unresolved). These will be resolved to IDs when the write operation executes. Resolution happens at builder execution time when dataset context is available.
allow_external_blob_outside_bases: bool
Allow writing external blob URIs that cannot be mapped to any registered non-dataset-root base path. When disabled, such rows are rejected.
external_blob_mode: ExternalBlobMode
The strategy used when writing external blob URIs.
blob_pack_file_size_threshold: Option<usize>
Maximum size in bytes for blob v2 pack (.blob) sidecar files. When a pack file reaches this size, a new one is started. If not set, defaults to 1 GiB.
Implementations
impl WriteParams
pub fn with_storage_version(version: LanceFileVersion) -> WriteParams
Create a new WriteParams with the given storage version. The other fields are set to their default values.
pub fn storage_version_or_default(&self) -> LanceFileVersion
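The pattern behind these two methods can be sketched with struct update syntax and Option defaults. FileVersion and Params below are hypothetical stand-ins with placeholder variants and default values; they do not reflect the real LanceFileVersion variants or WriteParams defaults.

```rust
/// Hypothetical stand-in for a file format version; V2 plays the role
/// of "latest stable" in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
enum FileVersion {
    V1,
    #[default]
    V2,
}

/// Hypothetical, reduced parameter struct.
#[derive(Debug, Clone, PartialEq)]
struct Params {
    max_rows_per_file: usize,
    version: Option<FileVersion>,
}

impl Default for Params {
    fn default() -> Self {
        // Placeholder default, chosen only for illustration.
        Self { max_rows_per_file: 1_000, version: None }
    }
}

impl Params {
    /// Sets the version and leaves every other field at its default.
    fn with_storage_version(version: FileVersion) -> Self {
        Self { version: Some(version), ..Default::default() }
    }

    /// Falls back to the default version when none was specified.
    fn storage_version_or_default(&self) -> FileVersion {
        self.version.unwrap_or_default()
    }
}
```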
pub fn store_registry(&self) -> Arc<ObjectStoreRegistry>
pub fn with_base_store_params(
    self,
    base_path: impl AsRef<str>,
    store_params: ObjectStoreParams,
) -> WriteParams
Set exact runtime object store params for a registered base path.
These params are used as-is for that base. The write-level default
store_params remain the fallback for bases without an explicit binding.
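The lookup-with-fallback behaviour described here can be illustrated as follows. This is a minimal sketch, not the library's code: StoreParams is a hypothetical stand-in for ObjectStoreParams, and params_for_base is an invented helper name.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for object store parameters.
#[derive(Debug, Clone, PartialEq)]
struct StoreParams {
    label: String,
}

/// Picks the params for a base: an explicit per-base binding is used as-is,
/// otherwise the write-level default (if any) applies.
fn params_for_base<'a>(
    base_path: &str,
    per_base: &'a HashMap<String, StoreParams>,
    default: Option<&'a StoreParams>,
) -> Option<&'a StoreParams> {
    per_base.get(base_path).or(default)
}
```

Note that in this model the per-base entry replaces the default entirely; the two are not merged, which matches the "used as-is" wording above.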
pub fn with_transaction_properties(
    self,
    properties: HashMap<String, String>,
) -> WriteParams
Set the transaction properties for this WriteParams.
pub fn with_initial_bases(self, bases: Vec<BasePath>) -> WriteParams
Set the initial_bases for this WriteParams.
This specifies new base paths to register in the manifest during dataset creation. Each BasePath must have a properly assigned ID (non-zero) before calling this method. Only used in CREATE/OVERWRITE modes for manifest registration.
pub fn with_target_bases(self, base_ids: Vec<u32>) -> WriteParams
Set the target_bases for this WriteParams.
This specifies the base IDs where data files should be written. The IDs must correspond to either:
- IDs in initial_bases (for CREATE/OVERWRITE modes)
- IDs already registered in the existing dataset manifest (for APPEND mode)
pub fn with_target_base_names_or_paths(
    self,
    references: Vec<String>,
) -> WriteParams
Store target base names or paths for deferred resolution.
This method stores the references in the target_base_names_or_paths field
to be resolved later at execution time when the dataset manifest is available.
Resolution will happen at write execution time and will try to match:
- initial_bases by name
- initial_bases by path
- existing manifest by name
- existing manifest by path
Arguments
references - Vector of base names or paths to be resolved later
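The matching steps listed above can be sketched as a chain of lookups. This sketch assumes the list is tried in order and that the first match wins, which the text does not state explicitly. Base is a hypothetical stand-in for BasePath with only the fields needed here.

```rust
/// Hypothetical, reduced stand-in for a registered base path.
#[derive(Debug, Clone, PartialEq)]
struct Base {
    id: u32,
    name: Option<String>,
    path: String,
}

/// Resolves a name or path to a base ID, trying initial bases before
/// the existing manifest and names before paths (assumed order).
fn resolve_reference(reference: &str, initial: &[Base], manifest: &[Base]) -> Option<u32> {
    let by_name = |bases: &[Base]| {
        bases
            .iter()
            .find(|b| b.name.as_deref() == Some(reference))
            .map(|b| b.id)
    };
    let by_path = |bases: &[Base]| bases.iter().find(|b| b.path == reference).map(|b| b.id);

    by_name(initial)
        .or_else(|| by_path(initial))
        .or_else(|| by_name(manifest))
        .or_else(|| by_path(manifest))
}
```

Because resolution is deferred, a reference that matches nothing only surfaces as a failure when the write executes; in this sketch that case is modelled by None.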
pub fn with_allow_external_blob_outside_bases(self, allow: bool) -> WriteParams
Configure whether external blobs outside registered bases are allowed.
pub fn with_external_blob_mode(self, mode: ExternalBlobMode) -> WriteParams
Configure how external blob URIs are handled during writes.
pub fn with_blob_pack_file_size_threshold(self, max_bytes: usize) -> WriteParams
Set the maximum size in bytes for blob v2 pack (.blob) sidecar files.
Trait Implementations
impl Clone for WriteParams
fn clone(&self) -> WriteParams
fn clone_from(&mut self, source: &Self)
impl Debug for WriteParams
impl Default for WriteParams
fn default() -> WriteParams
Auto Trait Implementations
impl !RefUnwindSafe for WriteParams
impl !UnwindSafe for WriteParams
impl Freeze for WriteParams
impl Send for WriteParams
impl Sync for WriteParams
impl Unpin for WriteParams
impl UnsafeUnpin for WriteParams