Crate arrow_tiberius


Apache Arrow and SQL Server bridge through Tiberius.

arrow-tiberius bridges Apache Arrow and Microsoft SQL Server through the Tiberius TDS driver. The crate is designed around a bidirectional boundary: Arrow schemas and RecordBatch values can be planned and written to SQL Server, and future read-side APIs can map SQL Server metadata and rows back to Arrow.

The v0.1 API implements the Arrow-to-SQL Server write path first: plan an Arrow schema for SQL Server, render deterministic DDL, inspect structured diagnostics, and bulk load one or more record batches. SQL Server-to-Arrow reads are reserved for a later release.

§Quick Start

Plan an Arrow schema and render CREATE TABLE SQL:

use arrow_schema::{DataType, Field, Schema};
use arrow_tiberius::{
    MssqlProfile, PlanOptions, TableName, create_table_sql_from_mappings,
    plan_arrow_schema_to_mssql_mappings,
};

let schema = Schema::new(vec![
    Field::new("id", DataType::Int64, false),
    Field::new("name", DataType::Utf8, true),
]);

let outcome = plan_arrow_schema_to_mssql_mappings(
    &schema,
    MssqlProfile::sql_server_2016_compat_100(),
    PlanOptions::default(),
)?;

let table = TableName::new("dbo", "people")?;
let ddl = create_table_sql_from_mappings(&table, outcome.value());
assert!(ddl.contains("CREATE TABLE [dbo].[people]"));
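
The assertion above expects bracket-quoted identifiers in the rendered DDL. As a rough illustration of that quoting style, the sketch below applies the standard T-SQL bracket rule (wrap in `[` and `]`, double any embedded `]`). The helper names `quote_bracketed` and `qualified_name` are hypothetical and are not part of this crate; the crate's own identifier handling may differ.

```rust
/// Quote one SQL Server identifier part with brackets, doubling any `]`
/// so the closing delimiter cannot be terminated early (the same rule
/// T-SQL's QUOTENAME applies for bracket quoting).
/// Hypothetical helper, not an arrow-tiberius API.
fn quote_bracketed(part: &str) -> String {
    format!("[{}]", part.replace(']', "]]"))
}

/// Join a schema name and a table name into a two-part bracketed name.
/// Hypothetical helper, not an arrow-tiberius API.
fn qualified_name(schema: &str, table: &str) -> String {
    format!("{}.{}", quote_bracketed(schema), quote_bracketed(table))
}
```

Under these assumptions, `qualified_name("dbo", "people")` yields `[dbo].[people]`, which matches the substring checked in the example.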

Write a planned batch to an existing table:

use arrow_array::RecordBatch;
use arrow_tiberius::{
    BulkWriter, MssqlProfile, PlanOptions, TableName, WriteBackend,
    WriteOptions, plan_arrow_schema_to_mssql_mappings,
};
use futures_util::io::{AsyncRead, AsyncWrite};

async fn write_batch<S>(
    client: &mut tiberius::Client<S>,
    batch: &RecordBatch,
) -> arrow_tiberius::Result<()>
where
    S: AsyncRead + AsyncWrite + Unpin + Send,
{
    let outcome = plan_arrow_schema_to_mssql_mappings(
        batch.schema().as_ref(),
        MssqlProfile::sql_server_2016_compat_100(),
        PlanOptions::default(),
    )?;

    let mut writer = BulkWriter::new(
        client,
        TableName::new("dbo", "people")?,
        outcome.value().to_vec(),
        WriteOptions {
            backend: WriteBackend::DirectRawBulk,
            ..WriteOptions::default()
        },
    )
    .await?;

    writer.write_batch(batch).await?;
    writer.finish().await?;
    Ok(())
}

BulkWriter validates target table metadata before writing. It does not create tables automatically; callers can use create_table_sql_from_mappings when they want this crate to produce a table definition.
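
Because table creation is left to the caller, one possible approach is to guard the generated DDL so it only runs when the table is missing. The sketch below is plain string handling around a CREATE TABLE statement; `create_if_missing` is a hypothetical helper rather than a crate API, and executing the resulting batch against a Tiberius client is left out.

```rust
/// Wrap a CREATE TABLE statement so it only runs when the table does not
/// exist yet. `quoted_table` is assumed to be an already bracket-quoted
/// two-part name such as `[dbo].[people]`.
/// Hypothetical helper, not an arrow-tiberius API.
fn create_if_missing(quoted_table: &str, create_table_sql: &str) -> String {
    // The name is embedded in an N'...' string literal, so single quotes
    // are doubled to keep the literal well-formed.
    let literal = quoted_table.replace('\'', "''");
    // OBJECT_ID(..., N'U') returns NULL when no user table has that name.
    format!(
        "IF OBJECT_ID(N'{}', N'U') IS NULL\nBEGIN\n{}\nEND",
        literal, create_table_sql
    )
}
```

The string returned by create_table_sql_from_mappings could be passed as `create_table_sql`. The guard only checks that a table with that name exists; it does not verify that an existing table matches the planned columns, which is the check BulkWriter performs before writing.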

§Main Modules

  • schema plans Arrow fields into SQL Server column mappings and DDL metadata.
  • mssql contains SQL Server identifiers, profiles, types, and DDL helpers.
  • diagnostic exposes structured planning and runtime diagnostics.
  • write contains write policies, backend selection, and BulkWriter.

§Writer Backends

WriteBackend::Auto is the default selection and currently resolves to WriteBackend::DirectRawBulk. WriteBackend::DirectRawBulk is the optimized direct Arrow-to-TDS path for supported mappings. WriteBackend::BaselineTokenRow remains available as a compatibility and reference path through Tiberius TokenRow bulk load. WriteBackend::DirectFramedBulk uses the direct row encoder through Tiberius framed writes.
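
The selection rule described above can be pictured with a small stand-in enum. This is a sketch only: `BackendChoice` and its `resolve` method are hypothetical, they merely mirror the variant names listed here, and the crate's real resolution logic is not shown in this documentation.

```rust
/// Local stand-in mirroring the variant names described above.
/// This is NOT the crate's own `WriteBackend` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum BackendChoice {
    /// Default selection; resolved to a concrete backend before writing.
    #[default]
    Auto,
    DirectRawBulk,
    BaselineTokenRow,
    DirectFramedBulk,
}

impl BackendChoice {
    /// Resolve `Auto` to the backend it currently maps to; explicit
    /// choices pass through unchanged.
    fn resolve(self) -> BackendChoice {
        match self {
            BackendChoice::Auto => BackendChoice::DirectRawBulk,
            explicit => explicit,
        }
    }
}
```

Since the text says Auto "currently" resolves to DirectRawBulk, callers that depend on a specific path may prefer to name the backend explicitly in WriteOptions, as the Quick Start example does.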

§SQL Server Compatibility

The initial profile is MssqlProfile::sql_server_2016_compat_100, which targets SQL Server 2016 with database compatibility level 100.

§Tiberius Dependency Model

This crate depends on the published tiberius-raw-bulk package, renamed to the crate name tiberius. Downstream crates that construct the Tiberius client passed to BulkWriter should declare the same package and version so that both crates share one set of types:

[dependencies]
arrow-tiberius = "0.1"
tiberius = { package = "tiberius-raw-bulk", version = "=0.12.3-raw-bulk.13", default-features = false, features = [
    "tds73",
    "winauth",
    "native-tls",
] }

Depending on upstream tiberius separately pulls in a second, distinct crate. Its Client is a different type from the one BulkWriter expects, even though the names match, so it cannot be passed to BulkWriter.
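
The reason is Rust's nominal typing: two types with the same name and shape are still unrelated when they come from different crates. The sketch below imitates that situation with two modules standing in for two crates. All names in it (`upstream_tiberius`, `raw_bulk_tiberius`, `accepts_raw_bulk`, `same_type`) are hypothetical and exist only for illustration.

```rust
use std::any::TypeId;

// Two modules stand in for two separately compiled crates that each
// expose a type named `Client`. Both module names are hypothetical.
mod upstream_tiberius {
    pub struct Client;
}
mod raw_bulk_tiberius {
    pub struct Client;
}

/// Stand-in for an API that, like BulkWriter, is written against one
/// specific crate's `Client`.
fn accepts_raw_bulk(_client: &raw_bulk_tiberius::Client) -> &'static str {
    "accepted"
}

/// True only when both type parameters are the very same type.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

// Calling accepts_raw_bulk(&upstream_tiberius::Client) would be rejected
// by the compiler with a mismatched-types error, even though the two
// `Client` definitions look identical.
```

Using the renamed dependency shown above keeps a single `Client` type in the dependency graph, so the value constructed downstream is the same type the writer was compiled against.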

§Feature Flags

  • bench-profile: benchmark-only direct write profiling hooks.
  • integration-tests: SQL Server integration tests that are normally run through cargo xtask sqlserver-test.

Docs.rs is configured to build with all features so feature-gated public items are visible in API documentation. Normal library use does not require either feature.

Re-exports§

pub use arrow::ArrowFieldRef;
pub use diagnostic::Diagnostic;
pub use diagnostic::DiagnosticCode;
pub use diagnostic::DiagnosticSet;
pub use diagnostic::DiagnosticSeverity;
pub use diagnostic::FieldRef;
pub use diagnostic::PlanOutcome;
pub use error::Error;
pub use error::Result;
pub use mssql::CompatibilityLevel;
pub use mssql::CreateTableOptions;
pub use mssql::Identifier;
pub use mssql::IdentifierPolicy;
pub use mssql::MssqlColumn;
pub use mssql::MssqlProfile;
pub use mssql::MssqlTimePrecision;
pub use mssql::MssqlType;
pub use mssql::MssqlTypeLength;
pub use mssql::MssqlVersion;
pub use mssql::TableName;
pub use mssql::create_table_sql;
pub use schema::SchemaMapping;
pub use schema::create_table_sql_from_mappings;
pub use schema::mssql_columns_from_mappings;
pub use schema::plan_arrow_schema_to_mssql_mappings;
pub use write::BinaryPolicy;
pub use write::BulkWriter;
pub use write::Date64Policy;
pub use write::Decimal256Policy;
pub use write::DecimalPolicy;
pub use write::FloatPolicy;
pub use write::NanosecondPolicy;
pub use write::PlanOptions;
pub use write::SchemaCheck;
pub use write::StringPolicy;
pub use write::TimezonePolicy;
pub use write::UInt64Policy;
pub use write::WriteBackend;
pub use write::WriteOptions;
pub use write::WriteStats;

Modules§

arrow
Arrow-side schema metadata.
diagnostic
Structured diagnostics for planning and writing.
error
Error types for arrow-tiberius.
mssql
MSSQL-side schema metadata, identifiers, profile, and DDL helpers.
schema
Bidirectional Arrow/MSSQL schema mapping.
write
Write-path options and conversion policies.