Apache Arrow and SQL Server bridge through Tiberius.
arrow-tiberius bridges Apache Arrow and Microsoft SQL Server through the
Tiberius TDS driver. The crate is designed around a bidirectional boundary:
Arrow schemas and RecordBatch values can be planned and written to SQL
Server, and future read-side APIs can map SQL Server metadata and rows back
to Arrow.
The v0.1 API implements the Arrow-to-SQL Server write path first: plan an Arrow schema for SQL Server, render deterministic DDL, inspect structured diagnostics, and bulk load one or more record batches. SQL Server-to-Arrow reads are reserved for a later release.
§Quick Start
Plan an Arrow schema and render CREATE TABLE SQL:
    use arrow_schema::{DataType, Field, Schema};
    use arrow_tiberius::{
        MssqlProfile, PlanOptions, TableName, create_table_sql_from_mappings,
        plan_arrow_schema_to_mssql_mappings,
    };

    let schema = Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("name", DataType::Utf8, true),
    ]);
    let outcome = plan_arrow_schema_to_mssql_mappings(
        &schema,
        MssqlProfile::sql_server_2016_compat_100(),
        PlanOptions::default(),
    )?;
    let table = TableName::new("dbo", "people")?;
    let ddl = create_table_sql_from_mappings(&table, outcome.value());
    assert!(ddl.contains("CREATE TABLE [dbo].[people]"));

Write a planned batch to an existing table:
    use arrow_array::RecordBatch;
    use arrow_tiberius::{
        BulkWriter, MssqlProfile, PlanOptions, TableName, WriteBackend,
        WriteOptions, plan_arrow_schema_to_mssql_mappings,
    };
    use futures_util::io::{AsyncRead, AsyncWrite};

    async fn write_batch<S>(
        client: &mut tiberius::Client<S>,
        batch: &RecordBatch,
    ) -> arrow_tiberius::Result<()>
    where
        S: AsyncRead + AsyncWrite + Unpin + Send,
    {
        let outcome = plan_arrow_schema_to_mssql_mappings(
            batch.schema().as_ref(),
            MssqlProfile::sql_server_2016_compat_100(),
            PlanOptions::default(),
        )?;
        let mut writer = BulkWriter::new(
            client,
            TableName::new("dbo", "people")?,
            outcome.value().to_vec(),
            WriteOptions {
                backend: WriteBackend::DirectRawBulk,
                ..WriteOptions::default()
            },
        )
        .await?;
        writer.write_batch(batch).await?;
        writer.finish().await?;
        Ok(())
    }

BulkWriter validates target table metadata before writing. It does not
create tables automatically; callers can use create_table_sql_from_mappings
when they want this crate to produce a table definition.
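The first example asserts that the rendered DDL names the table as [dbo].[people]. As a rough illustration of that bracket-delimited form, the sketch below quotes each name part the way T-SQL delimited identifiers work, doubling any closing bracket inside a name. The helper names quote_ident and quote_table are hypothetical, and this is not the crate's implementation, which may validate or reject such names rather than escape them.

```rust
/// Hypothetical helper, not part of arrow-tiberius: wraps one identifier
/// part in square brackets and doubles any `]` it contains.
fn quote_ident(part: &str) -> String {
    format!("[{}]", part.replace(']', "]]"))
}

/// Hypothetical helper: renders a two-part `[schema].[table]` name.
fn quote_table(schema: &str, table: &str) -> String {
    format!("{}.{}", quote_ident(schema), quote_ident(table))
}
```

Calling quote_table("dbo", "people") yields the same two-part name that the Quick Start assertion looks for.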
§Main Modules
- schema plans Arrow fields into SQL Server column mappings and DDL metadata.
- mssql contains SQL Server identifiers, profiles, types, and DDL helpers.
- diagnostic exposes structured planning and runtime diagnostics.
- write contains write policies, backend selection, and BulkWriter.
§Writer Backends
WriteBackend::Auto is the default selection and currently resolves to
WriteBackend::DirectRawBulk.
WriteBackend::DirectRawBulk is the optimized direct Arrow-to-TDS path for
supported mappings. WriteBackend::BaselineTokenRow remains available as a
compatibility and reference path through Tiberius TokenRow bulk load.
WriteBackend::DirectFramedBulk uses the direct row encoder through
Tiberius framed writes.
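The way Auto resolves to a concrete backend can be pictured with a small stand-in enum. This is only a sketch: Backend and resolve are hypothetical names that mirror the variant names above, not the crate's own types, and the Auto mapping reflects the current behavior described here, which may change between releases.

```rust
/// Stand-in for the crate's `WriteBackend`. Variant names follow the text
/// above; the type itself is an assumption made for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Backend {
    Auto,
    DirectRawBulk,
    DirectFramedBulk,
    BaselineTokenRow,
}

impl Backend {
    /// Resolves `Auto` to a concrete backend; explicit choices pass through.
    fn resolve(self) -> Backend {
        match self {
            Backend::Auto => Backend::DirectRawBulk,
            other => other,
        }
    }
}
```

With the crate itself, a specific backend is selected through the backend field of WriteOptions, as in the Quick Start example.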
§SQL Server Compatibility
The initial profile is MssqlProfile::sql_server_2016_compat_100, which
targets SQL Server 2016 with database compatibility level 100.
§Tiberius Dependency Model
This crate depends on the published tiberius-raw-bulk package under the crate
name tiberius. Downstream crates that construct the Tiberius client passed
to BulkWriter should use the same package identity:
    [dependencies]
    arrow-tiberius = "0.1"
    tiberius = { package = "tiberius-raw-bulk", version = "=0.12.3-raw-bulk.13", default-features = false, features = [
        "tds73",
        "winauth",
        "native-tls",
    ] }

Depending on upstream tiberius separately creates a distinct crate type and
will not produce a client compatible with BulkWriter.
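To see why two packages with the same API still yield incompatible clients, the sketch below uses two modules as stand-ins for two separately resolved packages. The module names are hypothetical. It shows only that Rust treats identically shaped types from different origins as different types, which is the general rule behind the mismatch described above.

```rust
use std::any::TypeId;

// Two modules stand in for two separately resolved packages
// (hypothetical names chosen for this illustration).
mod upstream_tiberius {
    pub struct Client;
}

mod raw_bulk_tiberius {
    pub struct Client;
}

/// Returns true only if both stand-in `Client` types are the same Rust type.
fn same_client_type() -> bool {
    TypeId::of::<upstream_tiberius::Client>() == TypeId::of::<raw_bulk_tiberius::Client>()
}
```

Here same_client_type() returns false even though both structs are declared identically, so a function that expects one Client cannot accept the other.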
§Feature Flags
- bench-profile: benchmark-only direct write profiling hooks.
- integration-tests: SQL Server integration tests that are normally run through cargo xtask sqlserver-test.
Docs.rs is configured to build with all features so feature-gated public items are visible in API documentation. Normal library use does not require either feature.
§More Documentation
Re-exports§
pub use arrow::ArrowFieldRef;
pub use diagnostic::Diagnostic;
pub use diagnostic::DiagnosticCode;
pub use diagnostic::DiagnosticSet;
pub use diagnostic::DiagnosticSeverity;
pub use diagnostic::FieldRef;
pub use diagnostic::PlanOutcome;
pub use error::Error;
pub use error::Result;
pub use mssql::CompatibilityLevel;
pub use mssql::CreateTableOptions;
pub use mssql::Identifier;
pub use mssql::IdentifierPolicy;
pub use mssql::MssqlColumn;
pub use mssql::MssqlProfile;
pub use mssql::MssqlTimePrecision;
pub use mssql::MssqlType;
pub use mssql::MssqlTypeLength;
pub use mssql::MssqlVersion;
pub use mssql::TableName;
pub use mssql::create_table_sql;
pub use schema::SchemaMapping;
pub use schema::create_table_sql_from_mappings;
pub use schema::mssql_columns_from_mappings;
pub use schema::plan_arrow_schema_to_mssql_mappings;
pub use write::BinaryPolicy;
pub use write::BulkWriter;
pub use write::Date64Policy;
pub use write::Decimal256Policy;
pub use write::DecimalPolicy;
pub use write::FloatPolicy;
pub use write::NanosecondPolicy;
pub use write::PlanOptions;
pub use write::SchemaCheck;
pub use write::StringPolicy;
pub use write::TimezonePolicy;
pub use write::UInt64Policy;
pub use write::WriteBackend;
pub use write::WriteOptions;
pub use write::WriteStats;
Modules§
- arrow: Arrow-side schema metadata.
- diagnostic: Structured diagnostics for planning and writing.
- error: Error types for arrow-tiberius.
- mssql: MSSQL-side schema metadata, identifiers, profile, and DDL helpers.
- schema: Bidirectional Arrow/MSSQL schema mapping.
- write: Write-path options and conversion policies.