pub struct DefaultJsonHandler<E: TaskExecutor> { /* private fields */ }Available on (crate features
default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature default-engine-base only.Implementations§
Source§impl<E: TaskExecutor> DefaultJsonHandler<E>
impl<E: TaskExecutor> DefaultJsonHandler<E>
pub fn new(store: Arc<DynObjectStore>, task_executor: Arc<E>) -> Self
Sourcepub fn with_buffer_size(self, buffer_size: usize) -> Self
pub fn with_buffer_size(self, buffer_size: usize) -> Self
Set the maximum number read requests to buffer in memory at once in Self::read_json_files().
Defaults to 1000.
Memory constraints can be imposed by constraining the buffer size and batch size. Note that overall memory usage is proportional to the product of these two values.
- Batch size governs the size of RecordBatches yielded in each iteration of the stream
- Buffer size governs the number of concurrent tasks (which equals the size of the buffer
Sourcepub fn with_batch_size(self, batch_size: usize) -> Self
pub fn with_batch_size(self, batch_size: usize) -> Self
Limit the number of rows per batch. That is, for batch_size = N, then each RecordBatch yielded by the stream will have at most N rows.
Defaults to 1000 rows (json objects).
See Decoder::with_buffer_size for details on constraining memory usage with buffer size and batch size.
Trait Implementations§
Source§impl<E: Debug + TaskExecutor> Debug for DefaultJsonHandler<E>
impl<E: Debug + TaskExecutor> Debug for DefaultJsonHandler<E>
Source§impl<E: TaskExecutor> JsonHandler for DefaultJsonHandler<E>
impl<E: TaskExecutor> JsonHandler for DefaultJsonHandler<E>
Source§fn parse_json(
&self,
json_strings: Box<dyn EngineData>,
output_schema: SchemaRef,
) -> DeltaResult<Box<dyn EngineData>>
fn parse_json( &self, json_strings: Box<dyn EngineData>, output_schema: SchemaRef, ) -> DeltaResult<Box<dyn EngineData>>
Parse the given json strings and return the fields requested by output schema as columns in
EngineData.
json_strings MUST be a single column batch of engine data, and the column type must be stringSource§fn read_json_files(
&self,
files: &[FileMeta],
physical_schema: SchemaRef,
predicate: Option<PredicateRef>,
) -> DeltaResult<FileDataReadResultIterator>
fn read_json_files( &self, files: &[FileMeta], physical_schema: SchemaRef, predicate: Option<PredicateRef>, ) -> DeltaResult<FileDataReadResultIterator>
Read and parse the JSON format file at given locations and return the data as EngineData with
the columns requested by physical schema. Note: The
FileDataReadResultIterator must emit
data from files in the order that files is given. For example if files [“a”, “b”] is provided,
then the engine data iterator must first return all the engine data from file “a”, then all
the engine data from file “b”. Moreover, for a given file, all of its EngineData and
constituent rows must be in order that they occur in the file. Consider a file with rows
(1, 2, 3). The following are legal iterator batches:
iter: [EngineData(1, 2), EngineData(3)]
iter: [EngineData(1), EngineData(2, 3)]
iter: [EngineData(1, 2, 3)]
The following are illegal batches:
iter: [EngineData(3), EngineData(1, 2)]
iter: [EngineData(1), EngineData(3, 2)]
iter: [EngineData(2, 1, 3)] Read moreSource§fn write_json_file(
&self,
path: &Url,
data: Box<dyn Iterator<Item = DeltaResult<FilteredEngineData>> + Send + '_>,
overwrite: bool,
) -> DeltaResult<()>
fn write_json_file( &self, path: &Url, data: Box<dyn Iterator<Item = DeltaResult<FilteredEngineData>> + Send + '_>, overwrite: bool, ) -> DeltaResult<()>
Atomically (!) write a single JSON file. Each row of the input data should be written as a
new JSON object appended to the file. this write must:
(1) serialize the data to newline-delimited json (each row is a json object literal)
(2) write the data to storage atomically (i.e. if the file already exists, fail unless the
overwrite flag is set) Read more
Auto Trait Implementations§
impl<E> Freeze for DefaultJsonHandler<E>
impl<E> !RefUnwindSafe for DefaultJsonHandler<E>
impl<E> Send for DefaultJsonHandler<E>
impl<E> Sync for DefaultJsonHandler<E>
impl<E> Unpin for DefaultJsonHandler<E>
impl<E> !UnwindSafe for DefaultJsonHandler<E>
Blanket Implementations§
Source§impl<T> AsAny for T
impl<T> AsAny for T
Source§fn any_ref(&self) -> &(dyn Any + Sync + Send + 'static)
fn any_ref(&self) -> &(dyn Any + Sync + Send + 'static)
Obtains a
dyn Any reference to the object: Read moreSource§fn as_any(self: Arc<T>) -> Arc<dyn Any + Sync + Send>
fn as_any(self: Arc<T>) -> Arc<dyn Any + Sync + Send>
Obtains an
Arc<dyn Any> reference to the object: Read moreSource§fn into_any(self: Box<T>) -> Box<dyn Any + Sync + Send>
fn into_any(self: Box<T>) -> Box<dyn Any + Sync + Send>
Converts the object to
Box<dyn Any>: Read moreSource§fn type_name(&self) -> &'static str
fn type_name(&self) -> &'static str
Convenient wrapper for
std::any::type_name, since Any does not provide it and
Any::type_id is useless as a debugging aid (its Debug is just a mess of hex digits).§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value. Read more
Source§impl<T> Instrument for T
impl<T> Instrument for T
Source§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
Source§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> PolicyExt for Twhere
T: ?Sized,
impl<T> PolicyExt for Twhere
T: ?Sized,
Source§impl<KernelType, ArrowType> TryIntoArrow<ArrowType> for KernelTypewhere
ArrowType: TryFromKernel<KernelType>,
impl<KernelType, ArrowType> TryIntoArrow<ArrowType> for KernelTypewhere
ArrowType: TryFromKernel<KernelType>,
Source§fn try_into_arrow(self) -> Result<ArrowType, ArrowError>
fn try_into_arrow(self) -> Result<ArrowType, ArrowError>
Available on (crate features
default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature arrow-conversion only.Source§impl<KernelType, ArrowType> TryIntoKernel<KernelType> for ArrowTypewhere
KernelType: TryFromArrow<ArrowType>,
impl<KernelType, ArrowType> TryIntoKernel<KernelType> for ArrowTypewhere
KernelType: TryFromArrow<ArrowType>,
Source§fn try_into_kernel(self) -> Result<KernelType, ArrowError>
fn try_into_kernel(self) -> Result<KernelType, ArrowError>
Available on (crate features
default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature arrow-conversion only.