Pinot Client Rust
A Rust library to query Apache Pinot.
Installing Pinot
To install Pinot locally, follow the Pinot Quickstart guide, then start the batch quickstart:
bin/quick-start-batch.sh
Alternatively, use the Dockerized Pinot database orchestrated by this repository's docker-compose.yaml file.
Examples
Check out the client library GitHub repo
Start up the Dockerized Pinot database
make prepare-pinot
Build and run an example application to query from Pinot
Usage
Create a Pinot Connection
A Pinot client can be initialized from either of the following:
- A ZooKeeper path.
let client = client_from_zookeeper;
- A list of broker addresses.
let client = client_from_broker_list;
Asynchronous Queries
An asynchronous connection can be established with pinot_client_rust::async_connection::AsyncConnection,
which offers equivalents of the synchronous instantiation methods described above.
Query Pinot
Please see this example for your reference.
Code snippet:
Response Format
Query responses are represented by one of two broker response structures.
SQL queries return SqlResponse, whose generic parameter accepts any type implementing the
FromRow trait, whereas PQL queries return PqlResponse.
SqlResponse contains a Table, the holder for SQL query data, whereas PqlResponse contains
AggregationResults and SelectionResults, the holders for PQL query data.
Exceptions for a given request for both SqlResponse and PqlResponse are stored in the Exception array.
Stats for a given request for both SqlResponse and PqlResponse are stored in ResponseStats.
Common
Exception is defined as:
/// Pinot exception.
ResponseStats is defined as:
/// ResponseStats carries all stats returned by a query.
PQL
PqlResponse is defined as:
/// PqlResponse is the data structure for broker response to a PQL query.
SQL
SqlResponse is defined as:
/// SqlResponse is the data structure for a broker response to an SQL query.
Table is defined as:
/// Table is the holder for SQL queries.
Schema is defined as:
/// Schema is response schema with a bimap to allow easy name <-> index retrieval
There are multiple functions defined for Schema, like:
fn get_column_count(&self) -> usize;
fn get_column_name(&self, column_index: usize) -> Result<&str>;
fn get_column_index(&self, column_name: &str) -> Result<usize>;
fn get_column_data_type(&self, column_index: usize) -> Result<DataType>;
fn get_column_data_type_by_name(&self, column_name: &str) -> Result<DataType>;
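The two-way name and index lookup described above can be illustrated with a minimal sketch. It is not the crate's implementation: `SketchSchema`, its `String` error type, and the use of plain strings for data types are assumptions made so the example is self-contained. Only the method names mirror the signatures listed above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a response schema; not the crate's `Schema`.
#[derive(Debug)]
pub struct SketchSchema {
    // Index -> (column name, data type name), in response order.
    columns: Vec<(String, String)>,
    // Column name -> index, the reverse direction of the lookup.
    index_by_name: HashMap<String, usize>,
}

impl SketchSchema {
    /// Builds both lookup directions from (name, type) pairs.
    pub fn new(columns: Vec<(&str, &str)>) -> Self {
        let columns: Vec<(String, String)> = columns
            .into_iter()
            .map(|(name, ty)| (name.to_string(), ty.to_string()))
            .collect();
        let index_by_name: HashMap<String, usize> = columns
            .iter()
            .enumerate()
            .map(|(i, (name, _))| (name.clone(), i))
            .collect();
        Self { columns, index_by_name }
    }

    pub fn get_column_count(&self) -> usize {
        self.columns.len()
    }

    /// Index -> name; fails when the index is out of range.
    pub fn get_column_name(&self, column_index: usize) -> Result<&str, String> {
        self.columns
            .get(column_index)
            .map(|(name, _)| name.as_str())
            .ok_or_else(|| format!("invalid column index: {}", column_index))
    }

    /// Name -> index; fails when the column is unknown.
    pub fn get_column_index(&self, column_name: &str) -> Result<usize, String> {
        self.index_by_name
            .get(column_name)
            .copied()
            .ok_or_else(|| format!("invalid column name: {}", column_name))
    }

    /// Resolves the name to an index first, then reads the type at that index.
    pub fn get_column_data_type_by_name(&self, column_name: &str) -> Result<&str, String> {
        let index = self.get_column_index(column_name)?;
        Ok(self.columns[index].1.as_str())
    }
}
```

With a schema built from `vec![("name", "STRING"), ("age", "INT")]`, looking up `"age"` yields index 1 and looking up index 1 yields `"age"`, while unknown names and out-of-range indices return an error instead of panicking.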
DataType is defined as:
/// Pinot native types
FromRow is defined as:
/// FromRow represents any structure which can deserialize
/// the Table.rows json field provided a `Schema`
In addition to being implemented by DataRow, FromRow is also implemented by all implementors
of serde::de::Deserialize. This works by first deserializing the response to json; then, before
each row is deserialized into its final form, the row is replaced by a json map of column name to value.
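The substitution step, in which a positional row becomes a map keyed by column name, might look roughly like the sketch below. This is an illustration under assumptions rather than the crate's code: `SketchValue` is a hypothetical reduced stand-in for a json value, and `row_to_map` is a hypothetical helper name.

```rust
use std::collections::BTreeMap;

/// Hypothetical json-like value, standing in for a real json type.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchValue {
    Null,
    Number(f64),
    Text(String),
}

/// Pairs each column name with the value at the same position in the row.
/// Returns an error when the row and the column list differ in length.
pub fn row_to_map(
    column_names: &[&str],
    row: Vec<SketchValue>,
) -> Result<BTreeMap<String, SketchValue>, String> {
    if column_names.len() != row.len() {
        return Err(format!(
            "expected {} values, found {}",
            column_names.len(),
            row.len()
        ));
    }
    Ok(column_names
        .iter()
        .map(|name| name.to_string())
        .zip(row)
        .collect())
}
```

Once a row has this keyed shape, a struct whose field names match the column names can be deserialized from it directly, which is why any Deserialize implementor can serve as a row type.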
Additionally, there are a number of serde deserializer functions provided to deserialize complex pinot types:
/// Converts Pinot timestamps into `Vec<DateTime<Utc>>` using `deserialize_timestamps_from_json()`.
fn deserialize_timestamps<'de, D>(deserializer: D) -> std::result::Result<Vec<DateTime<Utc>>, D::Error>...
/// Converts Pinot timestamps into `DateTime<Utc>` using `deserialize_timestamp_from_json()`.
pub fn deserialize_timestamp<'de, D>(deserializer: D) -> std::result::Result<DateTime<Utc>, D::Error>...
/// Converts Pinot hex strings into `Vec<Vec<u8>>` using `deserialize_bytes_array_from_json()`.
pub fn deserialize_bytes_array<'de, D>(deserializer: D) -> std::result::Result<Vec<Vec<u8>>, D::Error>...
/// Converts Pinot hex string into `Vec<u8>` using `deserialize_bytes_from_json()`.
pub fn deserialize_bytes<'de, D>(deserializer: D) -> std::result::Result<Vec<u8>, D::Error>...
/// Deserializes json potentially packaged into a string by calling `deserialize_json_from_json()`.
pub fn deserialize_json<'de, D>(deserializer: D) -> std::result::Result<Value, D::Error>
For example usage, please refer to this example
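To make the hex string conversions concrete, here is a minimal standard-library sketch of decoding a hex string into bytes. `decode_hex` and `decode_hex_array` are hypothetical helpers written for illustration; they are not the crate's `deserialize_bytes_from_json()` or `deserialize_bytes_array_from_json()`, and the crate's error handling may differ.

```rust
/// Decodes a hex string such as "c0ffee" into raw bytes.
/// Hypothetical helper; not part of the crate's API.
pub fn decode_hex(hex: &str) -> Result<Vec<u8>, String> {
    // Reject anything that is not a hex digit up front, which also
    // guarantees the string is ASCII and safe to slice by byte offset.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("hex string contains a non-hex character".to_string());
    }
    if hex.len() % 2 != 0 {
        return Err(format!("odd-length hex string: {} characters", hex.len()));
    }
    // Each pair of hex digits becomes one byte.
    (0..hex.len())
        .step_by(2)
        .map(|i| {
            u8::from_str_radix(&hex[i..i + 2], 16)
                .map_err(|e| format!("invalid hex at offset {}: {}", i, e))
        })
        .collect()
}

/// Decodes every hex string in a multi-value column, failing on the first bad entry.
pub fn decode_hex_array(values: &[&str]) -> Result<Vec<Vec<u8>>, String> {
    values.iter().map(|v| decode_hex(v)).collect()
}
```

Collecting an iterator of `Result` values into a single `Result<Vec<_>, _>` stops at the first error, so one malformed entry fails the whole column rather than producing a partial result.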
DataRow is defined as:
/// A row of `Data`
Data is defined as:
/// Typed Pinot data
There are multiple functions defined for Data, like:
fn data_type(&self) -> DataType;
fn get_int(&self) -> Result<i32>;
fn get_long(&self) -> Result<i64>;
fn get_float(&self) -> Result<f32>;
fn get_double(&self) -> Result<f64>;
fn get_boolean(&self) -> Result<bool>;
fn get_timestamp(&self) -> Result<DateTime<Utc>>;
fn get_string(&self) -> Result<&str>;
fn get_json(&self) -> Result<&Value>;
fn get_bytes(&self) -> Result<&Vec<u8>>;
fn is_null(&self) -> bool;
In addition to a row count, DataRow also provides convenience counterparts to the functions above that take a column index.
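The relationship between typed getters on a value and index-based getters on a row can be sketched as follows. `SketchData` and `SketchRow` are hypothetical, reduced stand-ins with only a few variants and a `String` error type; they are not the crate's `Data` and `DataRow`, and `len` is an assumed name for the row count.

```rust
/// Hypothetical, reduced stand-in for a typed value; not the crate's `Data`.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchData {
    Int(i32),
    Long(i64),
    Text(String),
    Null,
}

impl SketchData {
    /// Each getter succeeds only for the matching variant.
    pub fn get_int(&self) -> Result<i32, String> {
        match self {
            SketchData::Int(v) => Ok(*v),
            other => Err(format!("expected Int, found {:?}", other)),
        }
    }

    pub fn get_long(&self) -> Result<i64, String> {
        match self {
            SketchData::Long(v) => Ok(*v),
            other => Err(format!("expected Long, found {:?}", other)),
        }
    }

    pub fn get_string(&self) -> Result<&str, String> {
        match self {
            SketchData::Text(v) => Ok(v.as_str()),
            other => Err(format!("expected Text, found {:?}", other)),
        }
    }

    pub fn is_null(&self) -> bool {
        matches!(self, SketchData::Null)
    }
}

/// Hypothetical row of values; not the crate's `DataRow`.
pub struct SketchRow {
    row: Vec<SketchData>,
}

impl SketchRow {
    pub fn new(row: Vec<SketchData>) -> Self {
        Self { row }
    }

    /// Number of values in the row.
    pub fn len(&self) -> usize {
        self.row.len()
    }

    // Shared bounds check used by every index-based getter.
    fn get(&self, column_index: usize) -> Result<&SketchData, String> {
        self.row
            .get(column_index)
            .ok_or_else(|| format!("invalid column index: {}", column_index))
    }

    pub fn get_int(&self, column_index: usize) -> Result<i32, String> {
        self.get(column_index)?.get_int()
    }

    pub fn get_string(&self, column_index: usize) -> Result<&str, String> {
        self.get(column_index)?.get_string()
    }

    pub fn is_null(&self, column_index: usize) -> Result<bool, String> {
        Ok(self.get(column_index)?.is_null())
    }
}
```

In this sketch the row-level getters only add a bounds check and then delegate to the value-level getters, so a wrong index and a wrong type both surface as errors rather than panics.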