DefaultJsonHandler

Struct DefaultJsonHandler 

Source
pub struct DefaultJsonHandler<E: TaskExecutor> { /* private fields */ }
Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature default-engine-base only.

Implementations§

Source§

impl<E: TaskExecutor> DefaultJsonHandler<E>

Source

pub fn new(store: Arc<DynObjectStore>, task_executor: Arc<E>) -> Self

Source

pub fn with_buffer_size(self, buffer_size: usize) -> Self

Set the maximum number of read requests to buffer in memory at once in Self::read_json_files().

Defaults to 1000.

Memory constraints can be imposed by constraining the buffer size and batch size. Note that overall memory usage is proportional to the product of these two values.

  1. Batch size governs the size of RecordBatches yielded in each iteration of the stream.
  2. Buffer size governs the number of concurrent tasks (which equals the size of the buffer).
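
As a rough illustration of the proportionality described above, the product of the two settings can be computed directly. The helper below is hypothetical (it is not part of delta_kernel), and it assumes that each buffered task holds at most one batch of rows, which the documentation does not state explicitly. Treat the result as a scaling figure rather than an exact memory bound.

```rust
/// Hypothetical helper, not part of delta_kernel.
///
/// Returns `buffer_size * batch_size`, the quantity that overall memory
/// usage is said to be proportional to. Returns `None` if the product
/// overflows `usize`.
fn buffered_row_budget(buffer_size: usize, batch_size: usize) -> Option<usize> {
    buffer_size.checked_mul(batch_size)
}
```

With the documented defaults (buffer size 1000, batch size 1000), `buffered_row_budget(1000, 1000)` returns `Some(1_000_000)`. Halving either setting halves the figure.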
Source

pub fn with_batch_size(self, batch_size: usize) -> Self

Limit the number of rows per batch. That is, for batch_size = N, each RecordBatch yielded by the stream will have at most N rows.

Defaults to 1000 rows (JSON objects).

See Self::with_buffer_size for details on constraining memory usage with buffer size and batch size.
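
The "at most N rows" guarantee can be pictured with a small stand-alone sketch. The function below is hypothetical and works on plain slices instead of RecordBatches. It assumes batches are filled greedily, which is only one way to satisfy the limit and is not confirmed to be what the handler does.

```rust
/// Hypothetical illustration, not part of delta_kernel.
///
/// Splits `rows` into consecutive batches of at most `batch_size` rows,
/// preserving row order. Panics if `batch_size` is zero.
fn split_into_batches<T: Clone>(rows: &[T], batch_size: usize) -> Vec<Vec<T>> {
    assert!(batch_size > 0, "batch_size must be greater than zero");
    rows.chunks(batch_size).map(|chunk| chunk.to_vec()).collect()
}
```

For five rows and a batch size of 2, this yields three batches of sizes 2, 2, and 1. Every batch respects the limit, and only the last one may be smaller.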

Trait Implementations§

Source§

impl<E: Debug + TaskExecutor> Debug for DefaultJsonHandler<E>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl<E: TaskExecutor> JsonHandler for DefaultJsonHandler<E>

Source§

fn parse_json( &self, json_strings: Box<dyn EngineData>, output_schema: SchemaRef, ) -> DeltaResult<Box<dyn EngineData>>

Parse the given JSON strings and return the fields requested by the output schema as columns in EngineData. json_strings MUST be a single-column batch of engine data, and the column type must be string.
Source§

fn read_json_files( &self, files: &[FileMeta], physical_schema: SchemaRef, predicate: Option<PredicateRef>, ) -> DeltaResult<FileDataReadResultIterator>

Read and parse the JSON files at the given locations and return the data as EngineData with the columns requested by the physical schema.

Note: The FileDataReadResultIterator must emit data from files in the order in which files is given. For example, if files ["a", "b"] is provided, then the engine data iterator must first return all the engine data from file "a", then all the engine data from file "b". Moreover, for a given file, all of its EngineData and constituent rows must be in the order in which they occur in the file.

Consider a file with rows (1, 2, 3). The following are legal iterator batches:

  iter: [EngineData(1, 2), EngineData(3)]
  iter: [EngineData(1), EngineData(2, 3)]
  iter: [EngineData(1, 2, 3)]

The following are illegal batches:

  iter: [EngineData(3), EngineData(1, 2)]
  iter: [EngineData(1), EngineData(3, 2)]
  iter: [EngineData(2, 1, 3)]
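
The ordering rule for a single file can be expressed as a simple check: concatenating the batches in iteration order must reproduce the file's rows exactly. The function below is a hypothetical sketch, not part of delta_kernel, that models each EngineData batch as a `Vec` of row values.

```rust
/// Hypothetical illustration, not part of delta_kernel.
///
/// Returns true when the batches, concatenated in the order the iterator
/// yields them, contain exactly the rows of the file in file order.
fn preserves_file_order<T: PartialEq>(file_rows: &[T], batches: &[Vec<T>]) -> bool {
    batches.iter().flatten().eq(file_rows.iter())
}
```

For a file with rows (1, 2, 3), the batching `[vec![1, 2], vec![3]]` passes this check, while `[vec![3], vec![1, 2]]` and `[vec![1], vec![3, 2]]` fail it.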
Source§

fn write_json_file( &self, path: &Url, data: Box<dyn Iterator<Item = DeltaResult<FilteredEngineData>> + Send + '_>, overwrite: bool, ) -> DeltaResult<()>

Atomically (!) write a single JSON file. Each row of the input data should be written as a new JSON object appended to the file. This write must:

  1. serialize the data to newline-delimited JSON (each row is a JSON object literal)
  2. write the data to storage atomically (i.e. if the file already exists, fail unless the overwrite flag is set)
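
The two requirements above can be sketched for a local filesystem using only the standard library. This is not how DefaultJsonHandler is implemented (it writes through an object store). The function name, the pre-serialized `String` rows, and the fixed `.tmp` sibling file are all assumptions made for illustration, and a real implementation would need a unique temporary name to be safe under concurrent writers.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical local-filesystem sketch, not part of delta_kernel.
///
/// `rows` are assumed to be already-serialized JSON object literals.
fn write_ndjson_file(path: &Path, rows: &[String], overwrite: bool) -> io::Result<()> {
    // (1) Newline-delimited JSON: one object literal per line.
    let mut body = String::new();
    for row in rows {
        body.push_str(row);
        body.push('\n');
    }

    // (2) Stage the full contents in a sibling temp file so that readers
    // never observe a partially written target.
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, body.as_bytes())?;

    let result = if overwrite {
        // Rename replaces any existing file at `path`.
        fs::rename(&tmp, path)
    } else {
        // Hard-linking fails if `path` already exists.
        fs::hard_link(&tmp, path)
    };

    // Best-effort cleanup; after a successful rename the temp file is gone.
    let _ = fs::remove_file(&tmp);
    result
}
```

Calling this twice on the same path with `overwrite` set to `false` succeeds the first time and returns an error the second time. Passing `true` replaces the existing file.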

Auto Trait Implementations§

Blanket Implementations§

§

impl<T> Any for T
where T: 'static + ?Sized,

§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> AsAny for T
where T: Any + Send + Sync,

Source§

fn any_ref(&self) -> &(dyn Any + Sync + Send + 'static)

Obtains a dyn Any reference to the object.
Source§

fn as_any(self: Arc<T>) -> Arc<dyn Any + Sync + Send>

Obtains an Arc<dyn Any> reference to the object.
Source§

fn into_any(self: Box<T>) -> Box<dyn Any + Sync + Send>

Converts the object to Box<dyn Any>.
Source§

fn type_name(&self) -> &'static str

Convenient wrapper for std::any::type_name, since Any does not provide it and Any::type_id is useless as a debugging aid (its Debug is just a mess of hex digits).
§

impl<T> Borrow<T> for T
where T: ?Sized,

§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
§

impl<T> BorrowMut<T> for T
where T: ?Sized,

§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
§

impl<T> From<T> for T

§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
§

impl<T, U> Into<U> for T
where U: From<T>,

§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> PolicyExt for T
where T: ?Sized,

Source§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.
Source§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<KernelType, ArrowType> TryIntoArrow<ArrowType> for KernelType
where ArrowType: TryFromKernel<KernelType>,

Source§

fn try_into_arrow(self) -> Result<ArrowType, ArrowError>

Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature arrow-conversion only.
Source§

impl<KernelType, ArrowType> TryIntoKernel<KernelType> for ArrowType
where KernelType: TryFromArrow<ArrowType>,

Source§

fn try_into_kernel(self) -> Result<KernelType, ArrowError>

Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature arrow-conversion only.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.