Snowflake Connector
Who is this crate for? Developers who want to deserialize Snowflake query results into Rust types and auto-generate Rust structs from Snowflake tables. This crate focuses on type safety and encourages you to handle errors.
RustRover is HIGHLY encouraged when using the derive feature (enabled by default). VS Code and other code editors may report false-positive proc_macro errors, although builds will still work fine.
Usage
Add the following line to Cargo.toml:
snowflake-connector = "0.3"
Right now, only key pair authentication is supported.
You can pass the paths to your private and public key using SnowflakeConnector::try_new_from_file, or pass them directly using SnowflakeConnector::try_new.
Dev Setup
Place your public and private keys in a folder, and pass their paths to SnowflakeConnector::try_new_from_file.
Make sure the keys are ignored by version control. You do not want to commit your keys to a repository.
Derive Features
snowflake-connector will attempt to auto-generate table structs for you. There are two requirements:
- The SNOWFLAKE_PATH environment variable must be defined
- snowflake_config.toml must reside in the folder SNOWFLAKE_PATH points to
If SNOWFLAKE_PATH is not defined, the auto-generation of tables is simply skipped. This is useful for environments that build the crate but should not regenerate tables.
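The skip behaviour can be pictured as a lookup that only yields a config path when the variable is set. The sketch below is illustrative only: `config_path` and `config_path_from_env` are hypothetical helpers, not part of snowflake-connector, and the crate's actual build logic may differ.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not part of snowflake-connector): returns where
/// `snowflake_config.toml` would be looked up, or `None` when
/// `SNOWFLAKE_PATH` is unset and table generation should be skipped.
fn config_path(snowflake_path: Option<&str>) -> Option<PathBuf> {
    // No variable, no generation: the caller simply skips the step.
    let dir = snowflake_path?;
    Some(PathBuf::from(dir).join("snowflake_config.toml"))
}

/// Reads the real environment variable and delegates to `config_path`.
fn config_path_from_env() -> Option<PathBuf> {
    let value = std::env::var("SNOWFLAKE_PATH").ok();
    config_path(value.as_deref())
}
```

Keeping the environment lookup separate from the path logic makes the skip case easy to exercise without touching the process environment.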
Example snowflake_config.toml:
= "./keys/local/rsa_key.p8"
= "./keys/local/rsa_key.pub"
= "FIRST-LAST"
= "FIRST-LAST"
= "USER"
= "PUBLIC"
= "SNOWFLAKE_LEARNING_WH"
# First database we want to load tables from
[[]]
= "SNOWFLAKE_LEARNING_DB" # Database name
# Tables from first database
[[]]
= "USER_SAMPLE_DATA_FROM_S3.MENU" # Schema.Table name
# By default, all numbers are signed, below marks certain columns as unsigned
= ["menu_id", "menu_type_id", "menu_item_id"]
# Custom struct below that will be parsed from json
[]
= "crate::snowflake::metrics::Metrics" # Full path to struct (must implement `serde::Deserialize`)
# Custom enums for columns below, array contains all the possible values for said column,
# each array element generates an enum variant
[]
= [
"Variant_1", # MenuType::Variant1
"Variant 2", # MenuType::Variant2
"VARIANT 3", # MenuType::Variant3
"Variant4", # MenuType::Variant4
]
# Second database we want to load tables from
[[]]
= "SNOWFLAKE_SAMPLE_DATA" # Database name
[[]]
= "TPCH_SF1.ORDERS" # Schema.Table name
This will create a snowflake_tables.rs file under the SNOWFLAKE_PATH folder which will contain two tables:
- snowflake_learning_db::Menu
- snowflake_sample_data::Orders
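The comments in the example config map column values such as "Variant_1" and "VARIANT 3" to MenuType::Variant1 and MenuType::Variant3. One normalization that is consistent with those four examples is sketched below. This is an assumption about the naming rule, not the crate's actual implementation, and `variant_name` is a hypothetical helper.

```rust
/// Hypothetical helper: turns a column value into a PascalCase identifier.
/// It splits on anything that is not an ASCII letter or digit, then
/// capitalizes the first character of each word and lowercases the rest.
fn variant_name(value: &str) -> String {
    value
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|word| !word.is_empty())
        .map(|word| {
            let mut chars = word.chars();
            match chars.next() {
                Some(first) => {
                    first.to_ascii_uppercase().to_string()
                        + &chars.as_str().to_ascii_lowercase()
                }
                None => String::new(),
            }
        })
        .collect()
}
```

Note that this rule would also lowercase interior capitals (for example, an input with mixed case inside one word), which the real generator may or may not do.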
There are two ways to regenerate Snowflake tables:
- touch or modify the snowflake_config.toml file
- or run cargo clean and then cargo build to force a rebuild of dependencies
BE WARY OF AUTO-GENERATED CODE DURING CODE REVIEWS. Someone malicious may inject their own code into the auto-generated file. If you are someone trusted, regenerating the tables on your end and committing them into the branch is wise, or better yet, set up an automated process.
Generating tables will send a query to Snowflake for every table to retrieve its metadata.
How it Works
The example below is not tested, but you get the gist:
use *;
// Manually creating tables instead of auto-generating.
// Fields must be in order of columns!
// Enum must implement DeserializeFromStr!
// Snowflake sends each cell as a string,
// convert the string to the appropriate type!
// Manually creating table with more control
// Specify which error to use
Snowflake returns every value as a string. Implement DeserializeFromStr for types that can be parsed from a string. Add the SnowflakeDeserialize derive attribute to a struct to allow SnowflakeConnector to convert the data to that type.
As of now, the order of the fields must match the order of the columns. Reading the struct top to bottom, the first field maps to the first column and the last field maps to the last column; otherwise deserializing will fail.
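To make the column-order requirement concrete, here is a minimal, self-contained sketch that does not use the crate at all. `FromCell` is a hypothetical stand-in for DeserializeFromStr (whose real signature is not shown here and may differ), and `MenuRow`, `item_name`, and `parse_row` are invented for illustration.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the crate's `DeserializeFromStr`; the real
/// trait's signature may differ.
trait FromCell: Sized {
    fn from_cell(cell: &str) -> Result<Self, String>;
}

impl FromCell for u32 {
    fn from_cell(cell: &str) -> Result<Self, String> {
        u32::from_str(cell).map_err(|e| format!("bad u32 {cell:?}: {e}"))
    }
}

impl FromCell for String {
    fn from_cell(cell: &str) -> Result<Self, String> {
        Ok(cell.to_owned())
    }
}

/// Illustrative row type; fields are declared in column order.
#[derive(Debug, PartialEq)]
struct MenuRow {
    menu_id: u32,      // column 1
    item_name: String, // column 2 (hypothetical column)
}

/// Converts one row of string cells into a `MenuRow`, positionally.
fn parse_row(cells: &[&str]) -> Result<MenuRow, String> {
    // Require exactly two cells, bound in column order.
    let [menu_id, item_name] = cells else {
        return Err(format!("expected 2 columns, got {}", cells.len()));
    };
    Ok(MenuRow {
        menu_id: u32::from_cell(menu_id)?,
        item_name: String::from_cell(item_name)?,
    })
}
```

Because the mapping is positional, a row whose cells arrive as ["Taco", "7"] fails at the first field: "Taco" cannot be parsed as a u32.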
It is encouraged to make your crate as type safe as you can. Instead of using strings for your warehouses or databases, consider creating enums that implement ToString, and use them instead of strings. This way, you know which warehouses or databases are available, can change their string representation in one place, and remove any obsolete values from your codebase entirely.
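A minimal sketch of that idea, using the warehouse and database names from the example config above. The enum and variant names are illustrative, not something the crate provides. Implementing Display gives you ToString for free through the standard library's blanket implementation.

```rust
use std::fmt;

/// Warehouses this codebase is allowed to use (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Warehouse {
    Learning,
}

/// Databases this codebase is allowed to use (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Database {
    LearningDb,
    SampleData,
}

impl fmt::Display for Warehouse {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The string representation lives in exactly one place.
        let name = match self {
            Warehouse::Learning => "SNOWFLAKE_LEARNING_WH",
        };
        f.write_str(name)
    }
}

impl fmt::Display for Database {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Database::LearningDb => "SNOWFLAKE_LEARNING_DB",
            Database::SampleData => "SNOWFLAKE_SAMPLE_DATA",
        };
        f.write_str(name)
    }
}
```

Removing a variant then turns every remaining use of the obsolete warehouse or database into a compile error, rather than a stale string that fails at query time.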