Struct arrow_odbc::OdbcReaderBuilder
pub struct OdbcReaderBuilder { /* private fields */ }
Creates instances of OdbcReader based on odbc_api::Cursor.
Using a builder pattern instead of passing structs with all required arguments to the
constructors of OdbcReader allows arrow_odbc to introduce new parameters to fine tune the
creation and behavior of the readers without breaking the code of downstream applications.
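A minimal sketch of the intended use; how the cursor is obtained (connection, query) is left out, and the helper function is purely illustrative:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReaderBuilder};

// Sketch: build a reader over a cursor obtained elsewhere, e.g. from
// `odbc_api::Connection::execute`, and drain the resulting record batches.
fn read_batches(cursor: impl Cursor) -> Result<(), Box<dyn std::error::Error>> {
    let reader = OdbcReaderBuilder::new().build(cursor)?;
    for batch in reader {
        // Each item is a `Result<RecordBatch, ArrowError>`.
        let batch = batch?;
        println!("fetched a batch with {} rows", batch.num_rows());
    }
    Ok(())
}
```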
Implementations
impl OdbcReaderBuilder
pub fn new() -> Self
pub fn with_max_num_rows_per_batch(&mut self, max_num_rows_per_batch: usize) -> &mut Self
Limits the maximum number of rows which are fetched in a single roundtrip to the data source.
Higher numbers lower the IO overhead and may speed up your runtime, but they also require
larger preallocated buffers and use more memory. This value defaults to 65535, the maximum
value of a u16. Some ODBC drivers use a 16-bit integer to count rows, so staying at this
default avoids overflows. The IO savings from going above that number are estimated to be
small; your mileage may vary, of course.
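For example (a sketch; the figure of 1000 rows is an arbitrary choice for a table with wide rows):

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: fetch fewer rows per roundtrip to keep the transit buffer small.
fn build_reader<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new()
        .with_max_num_rows_per_batch(1000)
        .build(cursor)
}
```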
pub fn with_max_bytes_per_batch(&mut self, max_bytes_per_batch: usize) -> &mut Self
In addition to the row limit you may specify an upper bound in bytes for allocating the
transit buffer. This is useful if you do not know the database schema, or your code has to
work with different ones, but you know the amount of memory in your machine. This limit is
applied in addition to OdbcReaderBuilder::with_max_num_rows_per_batch: whichever of the two
leads to a smaller buffer is used. This defaults to 512 MiB.
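A sketch combining both limits (the 256 MiB budget is an assumption about your machine):

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: the transit buffer holds at most 65535 rows, but never more than
// 256 MiB; whichever limit yields the smaller buffer wins.
fn build_capped<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new()
        .with_max_num_rows_per_batch(65_535)
        .with_max_bytes_per_batch(256 * 1024 * 1024)
        .build(cursor)
}
```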
pub fn with_schema(&mut self, schema: SchemaRef) -> &mut Self
Describes the types of the Arrow arrays in the resulting record batches. It is also used to
determine the C data type requested from the data source. If this is not explicitly set, the
type is inferred from the schema information provided by the ODBC driver. A reason for
setting it explicitly could be that you have superior knowledge about your data compared to
the ODBC driver. E.g. a type for an unsigned byte (u8) is not part of the ODBC standard,
therefore the driver might at best be able to tell you that the column is an i8. If you still
want u8s in the resulting array, you need to specify the schema manually. Also, many drivers
struggle with reporting nullability correctly and just report every column as nullable.
Explicitly specifying a schema can compensate for such shortcomings where it turns out to be
relevant.
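A sketch of such an override; the column names and types are assumptions about your table:

```rust
use std::sync::Arc;
use arrow_odbc::{
    arrow::datatypes::{DataType, Field, Schema},
    odbc_api::Cursor,
    OdbcReader, OdbcReaderBuilder,
};

// Sketch: read a column the driver would report as Int8 as UInt8 instead,
// and declare `id` as non-nullable despite what the driver reports.
fn build_with_schema<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("tiny_value", DataType::UInt8, true),
    ]));
    OdbcReaderBuilder::new().with_schema(schema).build(cursor)
}
```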
pub fn with_max_text_size(&mut self, max_text_size: usize) -> &mut Self
An upper limit for the size of buffers bound to variadic text columns of the data source.
This limit does not (directly) apply to the size of the created Arrow buffers, but rather to
the buffers used for the data in transit. Use this option if you have e.g. VARCHAR(MAX)
fields in your database schema. Without an upper limit in such a case, the ODBC driver of
your data source is asked for the maximum size of an element and is likely to answer with
either 0 or a value way larger than any actual entry in the column. If you cannot adapt your
database schema, this limit might be what you are looking for. On Windows systems the limit
is counted in 2-byte units, as Windows utilizes a UTF-16 encoding, so it translates roughly
to the size in letters. On non-Windows systems it is the size in bytes, and the data source
is assumed to utilize a UTF-8 encoding. If this method is not called, no upper limit is set
and the maximum element size reported by ODBC is used to determine buffer sizes.
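A sketch; the limit of 4096 is an assumed application-specific maximum:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: cap the transit buffer size per text element, e.g. for
// VARCHAR(MAX) columns whose reported maximum size is useless.
fn build_text_limited<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new().with_max_text_size(4096).build(cursor)
}
```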
pub fn with_max_binary_size(&mut self, max_binary_size: usize) -> &mut Self
An upper limit for the size of buffers bound to variadic binary columns of the data source.
This limit does not (directly) apply to the size of the created Arrow buffers, but rather to
the buffers used for the data in transit. Use this option if you have e.g. VARBINARY(MAX)
fields in your database schema. Without an upper limit in such a case, the ODBC driver of
your data source is asked for the maximum size of an element and is likely to answer with
either 0 or a value way larger than any actual entry in the column. If you cannot adapt your
database schema, this limit might be what you are looking for. It is the maximum size in
bytes of a binary element. If this method is not called, no upper limit is set and the
maximum element size reported by ODBC is used to determine buffer sizes.
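A sketch; the 1 MiB limit is an assumption about your payloads:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReader, OdbcReaderBuilder};

// Sketch: allow at most 1 MiB per binary element in the transit buffer,
// e.g. for VARBINARY(MAX) columns.
fn build_binary_limited<C: Cursor>(cursor: C) -> Result<OdbcReader<C>, arrow_odbc::Error> {
    OdbcReaderBuilder::new()
        .with_max_binary_size(1024 * 1024)
        .build(cursor)
}
```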
pub fn with_fallibale_allocations(&mut self, fallibale_allocations: bool) -> &mut Self
Set this to true in order to trigger a crate::ColumnFailure::TooLarge error instead of a
panic in case the buffers can not be allocated due to their size. This may have a performance
cost for constructing the reader. false by default.
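A sketch of handling the failure instead of panicking:

```rust
use arrow_odbc::{odbc_api::Cursor, OdbcReaderBuilder};

// Sketch: surface an oversized buffer allocation as an error value.
fn try_build<C: Cursor>(cursor: C) {
    let result = OdbcReaderBuilder::new()
        .with_fallibale_allocations(true)
        .build(cursor);
    match result {
        Ok(_reader) => { /* fetch batches as usual */ }
        Err(error) => eprintln!("could not create reader: {error}"),
    }
}
```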
pub fn with_shims(&mut self, quirks: Quirks) -> &mut Self
Shims are workarounds which can make arrow_odbc use different implementations in order to compensate for ODBC drivers which violate the ODBC specification.
This crate currently has a workaround for drivers which return memory garbage instead of indicators when bulk fetching variadic columns.
pub fn build<C>(&self, cursor: C) -> Result<OdbcReader<C>, Error>
where
    C: Cursor,
Constructs an OdbcReader which consumes the given cursor. The cursor will also be used
to infer the Arrow schema if one has not been supplied explicitly.
Parameters
cursor: ODBC cursor used to fetch batches from the data source. The constructor will bind buffers to this cursor in order to perform bulk fetches from the source. This is usually faster than fetching results row by row, as it saves roundtrips to the database. The type of these buffers will be inferred from the Arrow schema. Not every Arrow type is supported, though.
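A sketch draining the reader into a Vec of record batches (the helper is illustrative):

```rust
use arrow_odbc::{
    arrow::record_batch::RecordBatch, odbc_api::Cursor, OdbcReaderBuilder,
};

// Sketch: `OdbcReader` is an iterator over `Result<RecordBatch, ArrowError>`,
// so it can be collected directly.
fn fetch_all<C: Cursor>(cursor: C) -> Result<Vec<RecordBatch>, Box<dyn std::error::Error>> {
    let reader = OdbcReaderBuilder::new().build(cursor)?;
    let batches = reader.collect::<Result<Vec<_>, _>>()?;
    Ok(batches)
}
```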
Trait Implementations
impl Clone for OdbcReaderBuilder
fn clone(&self) -> OdbcReaderBuilder
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.