Crate sqlx_exasol

Source
Expand description

A database driver for Exasol to be used with the Rust sqlx framework.

§Crate Features flags

  • etl - enables the usage ETL jobs without TLS encryption.
  • etl_native_tls - enables the etl feature and adds TLS encryption through native-tls1
  • etl_rustls - enables the etl feature and adds TLS encryption through rustls1
  • compression - enables compression support (for both connections and ETL jobs)
  • uuid - enables support for the uuid crate
  • chrono - enables support for the chrono crate types
  • rust_decimal - enables support for the rust_decimal type
  • migrate - enables the use of migrations and testing (just like in other sqlx drivers).

§Comparison to native sqlx drivers

Since the driver is used through sqlx and it implements the interfaces there, it can do all the drivers shipped with sqlx do, with some caveats:

  • Limitations

    • no compile-time query check support2
    • no sqlx-cli support2
    • no locking migrations support3
    • no column nullability checks4
  • Additions

    • array-like parameter binding in queries, thanks to the columnar nature of the Exasol database
    • performant & parallelizable ETL IMPORT/EXPORT jobs in CSV format through HTTP Transport (see the etl module for more details)

§Connection string

The connection string is expected to be an URL with the exa:// scheme, e.g: exa://sys:exasol@localhost:8563.

Connection options:

  • access-token: Use an access token for login instead of credentials
  • refresh-token: Use a refresh token for login instead of credentials
  • protocol-version: Select a specific protocol version to use
  • ssl-mode: Select a specifc SSL behavior. See: ExaSslMode
  • ssl-ca: Use a certain certificate authority
  • ssl-cert: Use a certain certificate
  • ssl-key: Use a specific SSL key
  • statement-cache-capacity: Set the capacity of the LRU prepared statements cache
  • fetch-size: Sets the size of data chunks when retrieving result sets
  • query-timeout: The query timeout amount, in seconds. 0 means no timeout
  • compression: Boolean representing whether use compression
  • feedback-interval: Interval at which Exasol sends keep-alive Pong frames

ExaConnectOptions can also be constructed in code through its builder method, which returns a ExaConnectOptionsBuilder.

§Supported Rust datatypes

§Supported Exasol datatypes:

All Exasol datatypes are supported in some way, also depdending on the additional types used through feature flags.

The GEOMETRY type does not have a correspondent Rust datatype. One could be introduced in future versions of the driver, but for now they can be encoded/decoded to String.

§HTTP Transport

Functionality that allows performant data import/export by creation of one-shot HTTP servers to which Exasol connects to (at most one per node), thus balancing the load.

The data is always in CSV format and job configuration can be done through the etl::ImportBuilder and etl::ExportBuilder structs. The workers implement futures_io::AsyncWrite and futures_io::AsyncRead respectively, providing great flexibility in terms of how the data is processed.

The general flow of an ETL job is:

  • build the job through etl::ImportBuilder or etl::ExportBuilder
  • concurrently wait on the query execution future (typically from the main thread) and on worker operations (async tasks can be spawned in multi-threaded runtimes to further parallelize the workload).
  • when all the workers are done (readers reach EOF, while writers require an explicit close()) the job ends and the query execution future returns.
  • an error/timeout issue results in the query execution future or a worker throwing an error, therefore consider joining the tasks and aborting them if an error is thrown somewhere.

§Examples

Using the driver for regular database interactions:

use std::env;

use sqlx_exasol::*;

let pool = ExaPool::connect(&env::var("DATABASE_URL").unwrap()).await?;
let mut con = pool.acquire().await?;

sqlx::query("CREATE SCHEMA RUST_DOC_TEST")
    .execute(&mut *con)
    .await?;

Array-like parameter binding, also featuring the ExaIter adapter. An important thing to note is that the parameter sets must be of equal length, otherwise an error is thrown:

use std::{collections::HashSet, env};

use sqlx_exasol::*;

let pool = ExaPool::connect(&env::var("DATABASE_URL").unwrap()).await?;
let mut con = pool.acquire().await?;

let params1 = vec![1, 2, 3];
let params2 = HashSet::from([1, 2, 3]);

sqlx::query("INSERT INTO MY_TABLE VALUES (?, ?)")
    .bind(&params1)
    .bind(ExaIter::from(&params2))
    .execute(&mut *con)
    .await?;

An EXPORT - IMPORT ETL data pipe.

use std::env;

use futures_util::{
    future::{try_join, try_join3, try_join_all},
    AsyncReadExt, AsyncWriteExt, TryFutureExt,
};
use sqlx_exasol::{etl::*, *};

async fn pipe(mut reader: ExaExport, mut writer: ExaImport) -> anyhow::Result<()> {
    let mut buf = vec![0; 5120].into_boxed_slice();
    let mut read = 1;

    while read > 0 {
        // Readers return EOF when there's no more data.
        read = reader.read(&mut buf).await?;
        // Write data to Exasol
        writer.write_all(&buf[..read]).await?;
    }

    // Writes, unlike readers, MUST be closed to signal we won't send more data to Exasol
    writer.close().await?;
    Ok(())
}

let pool = ExaPool::connect(&env::var("DATABASE_URL").unwrap()).await?;
let mut con1 = pool.acquire().await?;
let mut con2 = pool.acquire().await?;

// Build EXPORT job
let (export_fut, readers) = ExportBuilder::new(ExportSource::Table("TEST_ETL"))
    .build(&mut con1)
    .await?;

// Build IMPORT job
let (import_fut, writers) = ImportBuilder::new("TEST_ETL").build(&mut con2).await?;

// Use readers and writers in some futures
let transport_futs = std::iter::zip(readers, writers).map(|(r, w)| pipe(r, w));

// Execute the EXPORT and IMPORT query futures along with the worker futures
let (export_res, import_res, _) = try_join3(
    export_fut.map_err(From::from),
    import_fut.map_err(From::from),
    try_join_all(transport_futs),
)
.await?;

assert_eq!(export_res.rows_affected(), import_res.rows_affected());

§Footnotes

1: There is unfortunately no way to automagically choose a crate’s feature flags based on its dependencies feature flags, so the TLS backend has to be manually selected. While nothing prevents you from using, say native-tls with sqlx and rustls with Exasol ETL jobs, it might be best to avoid compiling two different TLS backends. Therefore, consider choosing the sqlx and sqlx-exasol feature flags in a consistent manner.

2: The sqlx API powering the compile-time query checks and the sqlx-cli tool is not public. Even if it were, the drivers that are incorporated into sqlx are hardcoded in the part of the code that handles the compile-time driver decision logic.
The main problem from what I can gather is that there’s no easy way of defining a plugin system in Rust at the moment, hence the hardcoding.

3: Exasol has no advisory or database locks and simple, unnested, transactions are unfortunately not enough to define a mechanism so that concurrent migrations do not collide. This does not pose a problem when migrations are run sequentially or do not act on the same database objects.

4: Exasol does not provide the information of whether a column is nullable or not, so the driver cannot implicitly decide whether a NULL value can go into a certain database column or not until it actually tries.

Modules§

etl
This module provides the building blocks for creating IMPORT and EXPORT jobs. These are represented by a query that gets executed concurrently with some ETL workers, both of which are obtained by building the ETL job. The data format is always CSV, but there are some customizations that can be done on the builders such as row or column separator, etc.

Structs§

ExaArguments
Implementor of Arguments.
ExaAttributes
Struct representing attributes related to the connection with the Exasol server. These can either be returned by an explicit getAttributes call or as part of any response.
ExaColumn
Implementor of Column.
ExaConnectOptions
Options for connecting to the Exasol database. Implementor of ConnectOptions.
ExaConnectOptionsBuilder
Builder for ExaConnectOptions.
ExaConnection
A connection to the Exasol database. Implementor of Connection.
ExaDatabaseError
An error directly issued by the Exasol database. Implementor of DatabaseError.
ExaIter
Adapter allowing any iterator of encodable values to be passed as a parameter set / array to Exasol.
ExaQueryResult
A query result containing the number of affected rows.
ExaRow
Struct representing a result set row. Implementor of Row.
ExaStatement
Implementor of Statement.
ExaTransactionManager
Implementor of [TransactionManager].
ExaTypeInfo
Information about an Exasol data type and implementor of TypeInfo.
ExaValue
Implementor of Value.
ExaValueRef
Implementor of ValueRef.
Exasol
Implementor of Database.
Months
A duration in calendar months, analog to chrono::Months. Unlike chrono::Months, this type can represent a negative duration.
SessionInfo
Struct representing database information returned after establishing a connection.

Enums§

ExaSslMode
Options for controlling the desired security state of the connection to the Exasol server.
ProtocolVersion
Enum listing the protocol versions that can be used when establishing a websocket connection to Exasol. Defaults to the highest defined protocol version and falls back to the highest protocol version supported by the server.

Traits§

ExaExecutor
An alias for Executor<'_, Database = Exasol>.

Type Aliases§

ExaPool
An alias for Pool, specialized for Exasol.
ExaPoolOptions
An alias for PoolOptions, specialized for Exasol.