Struct arrow_odbc::OdbcReaderBuilder
pub struct OdbcReaderBuilder { /* private fields */ }
Creates instances of OdbcReader based on odbc_api::Cursor.
Using a builder pattern instead of passing structs with all required arguments to the
constructors of OdbcReader allows arrow_odbc to introduce new parameters to fine tune the
creation and behavior of the readers without breaking the code of downstream applications.
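A minimal sketch of the intended use; how the cursor is obtained (connection, query) is left out, and the helper function is purely illustrative:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReaderBuilder};

// Sketch: build a reader over a cursor obtained elsewhere, e.g. from
// `odbc_api::Connection::execute`, and drain the resulting record batches.
fn read_batches(cursor: impl Cursor) -> Result<(), Box<dyn std::error::Error>> {
    let reader = OdbcReaderBuilder::new().build(cursor)?;
    for batch in reader {
        // Each item is a `Result<RecordBatch, ArrowError>`.
        let batch = batch?;
        println!("fetched a batch with {} rows", batch.num_rows());
    }
    Ok(())
}
```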
Implementations
impl OdbcReaderBuilder
pub fn new() -> Self
pub fn with_max_num_rows_per_batch(&mut self, max_num_rows_per_batch: usize) -> &mut Self
Limits the maximum number of rows which are fetched in a single roundtrip to the data source.
Higher numbers lower the IO overhead and may speed up your runtime, but they also require
larger preallocated buffers and use more memory. This value defaults to 65535, the maximum
value of a u16. Some ODBC drivers use a 16-bit integer to count rows, so staying at this
default avoids overflows. The IO savings from going above that number are estimated to be
small; your mileage may vary, of course.
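For example (a sketch; the figure of 1000 rows is an arbitrary choice for a table with wide rows):

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: fetch fewer rows per roundtrip to keep the transit buffer small.
fn build_reader<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new()
        .with_max_num_rows_per_batch(1000)
        .build(cursor)
}
```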
pub fn with_max_bytes_per_batch(&mut self, max_bytes_per_batch: usize) -> &mut Self
In addition to the row limit you may specify an upper bound in bytes for allocating the
transit buffer. This is useful if you do not know the database schema, or your code has to
work with different ones, but you know the amount of memory in your machine. This limit is
applied in addition to OdbcReaderBuilder::with_max_num_rows_per_batch: whichever of the two
leads to a smaller buffer is used. This defaults to 512 MiB.
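A sketch combining both limits (the 256 MiB budget is an assumption about your machine):

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: the transit buffer holds at most 65535 rows, but never more than
// 256 MiB; whichever limit yields the smaller buffer wins.
fn build_capped<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new()
        .with_max_num_rows_per_batch(65_535)
        .with_max_bytes_per_batch(256 * 1024 * 1024)
        .build(cursor)
}
```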
pub fn with_schema(&mut self, schema: SchemaRef) -> &mut Self
Describes the types of the Arrow arrays in the resulting record batches. It is also used to
determine the C data type requested from the data source. If this is not explicitly set, the
type is inferred from the schema information provided by the ODBC driver. A reason for
setting it explicitly could be that you have superior knowledge about your data compared to
the ODBC driver. E.g. a type for an unsigned byte (u8) is not part of the ODBC standard,
therefore the driver might at best be able to tell you that the column is an i8. If you still
want u8s in the resulting array, you need to specify the schema manually. Also, many drivers
struggle with reporting nullability correctly and just report every column as nullable.
Explicitly specifying a schema can compensate for such shortcomings where it turns out to be
relevant.
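A sketch of such an override; the column names and types are assumptions about your table:

```rust
use std::sync::Arc;
use arrow_odbc::{
    arrow::datatypes::{DataType, Field, Schema},
    odbc_api::Cursor,
    OdbcReader, OdbcReaderBuilder,
};

// Sketch: read a column the driver would report as Int8 as UInt8 instead,
// and declare `id` as non-nullable despite what the driver reports.
fn build_with_schema<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("tiny_value", DataType::UInt8, true),
    ]));
    OdbcReaderBuilder::new().with_schema(schema).build(cursor)
}
```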
pub fn with_max_text_size(&mut self, max_text_size: usize) -> &mut Self
An upper limit for the size of buffers bound to variadic text columns of the data source.
This limit does not (directly) apply to the size of the created Arrow buffers, but rather to
the buffers used for the data in transit. Use this option if you have e.g. VARCHAR(MAX)
fields in your database schema. Without an upper limit in such a case, the ODBC driver of
your data source is asked for the maximum size of an element and is likely to answer with
either 0 or a value way larger than any actual entry in the column. If you cannot adapt your
database schema, this limit might be what you are looking for. On Windows systems the limit
is counted in 2-byte units, as Windows utilizes a UTF-16 encoding, so it translates roughly
to the size in letters. On non-Windows systems it is the size in bytes, and the data source
is assumed to utilize a UTF-8 encoding. If this method is not called, no upper limit is set
and the maximum element size reported by ODBC is used to determine buffer sizes.
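A sketch; the limit of 4096 is an assumed application-specific maximum:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: cap the transit buffer size per text element, e.g. for
// VARCHAR(MAX) columns whose reported maximum size is useless.
fn build_text_limited<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new().with_max_text_size(4096).build(cursor)
}
```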
pub fn with_max_binary_size(&mut self, max_binary_size: usize) -> &mut Self
An upper limit for the size of buffers bound to variadic binary columns of the data source.
This limit does not (directly) apply to the size of the created Arrow buffers, but rather to
the buffers used for the data in transit. Use this option if you have e.g. VARBINARY(MAX)
fields in your database schema. Without an upper limit in such a case, the ODBC driver of
your data source is asked for the maximum size of an element and is likely to answer with
either 0 or a value way larger than any actual entry in the column. If you cannot adapt your
database schema, this limit might be what you are looking for. It is the maximum size in
bytes of a binary element. If this method is not called, no upper limit is set and the
maximum element size reported by ODBC is used to determine buffer sizes.
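A sketch; the 1 MiB limit is an assumption about your payloads:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: allow at most 1 MiB per binary element in the transit buffer,
// e.g. for VARBINARY(MAX) columns.
fn build_binary_limited<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new()
        .with_max_binary_size(1024 * 1024)
        .build(cursor)
}
```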
pub fn with_fallibale_allocations(&mut self, fallibale_allocations: bool) -> &mut Self
Set this to true in order to trigger a crate::ColumnFailure::TooLarge error instead of a
panic in case the buffers can not be allocated due to their size. This may have a performance
cost for constructing the reader. false by default.
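A sketch of handling the failure instead of panicking:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReaderBuilder};

// Sketch: surface an oversized buffer allocation as an error value.
fn try_build<C: Cursor>(cursor: C) {
    let result = OdbcReaderBuilder::new()
        .with_fallibale_allocations(true)
        .build(cursor);
    match result {
        Ok(_reader) => { /* fetch batches as usual */ }
        Err(error) => eprintln!("could not create reader: {error}"),
    }
}
```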
pub fn with_shims(&mut self, quirks: Quirks) -> &mut Self
Shims are workarounds which can make arrow_odbc use different implementations in order to compensate for ODBC drivers which violate the ODBC specification.
This crate currently has a workaround for drivers which return memory garbage instead of indicators when bulk fetching variadic columns.
pub fn build<C>(&self, cursor: C) -> Result<OdbcReader<C>, Error>
where
    C: Cursor,
Constructs an OdbcReader which consumes the given cursor. The cursor will also be used
to infer the Arrow schema if one has not been supplied explicitly.
Parameters
cursor: ODBC cursor used to fetch batches from the data source. The constructor will bind buffers to this cursor in order to perform bulk fetches from the source. This is usually faster than fetching results row by row, as it saves roundtrips to the database. The type of these buffers will be inferred from the Arrow schema. Not every Arrow type is supported, though.
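A sketch draining the reader into a Vec of record batches (the helper is illustrative):

```rust
use arrow_odbc::{
    arrow::record_batch::RecordBatch, odbc_api::Cursor, OdbcReaderBuilder,
};

// Sketch: `OdbcReader` is an iterator over `Result<RecordBatch, ArrowError>`,
// so it can be collected directly.
fn fetch_all<C: Cursor>(cursor: C) -> Result<Vec<RecordBatch>, Box<dyn std::error::Error>> {
    let reader = OdbcReaderBuilder::new().build(cursor)?;
    let batches = reader.collect::<Result<Vec<_>, _>>()?;
    Ok(batches)
}
```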
Trait Implementations
impl Clone for OdbcReaderBuilder
fn clone(&self) -> OdbcReaderBuilder
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.