Pinot Client Rust
A Rust library to query Apache Pinot.
Installing Pinot
To install Pinot locally, follow the Pinot Quickstart link to install Pinot and start the batch quickstart:
bin/quick-start-batch.sh
Alternatively, the Docker-contained Pinot database orchestrated by this repository's docker-compose.yaml file may be used.
Examples
Check out the client library GitHub repo
Start up the Docker-contained Pinot database
make prepare-pinot
Build and run an example application to query from Pinot
Usage
Create a Pinot Connection
A Pinot client can be initialized through:
- Zookeeper Path.
let client = client_from_zookeeper;
- A list of broker addresses.
let client = client_from_broker_list;
Asynchronous Queries
An asynchronous connection can be established with pinot_client_rust::async_connection::AsyncConnection, for which there exist equivalents to the synchronous instantiation methods described above.
Query Pinot
Please see this example for your reference.
Code snippet:
Response Format
Query responses are defined by one of two broker response structures.
SQL queries return SqlBrokerResponse, whose generic parameter is supported by all structs implementing the FromRow trait, whereas PQL queries return PqlBrokerResponse.
SqlBrokerResponse contains a ResultTable, the holder for SQL query data, whereas PqlBrokerResponse contains AggregationResults and SelectionResults, the holders for PQL query data.
Exceptions for a given request for both SqlBrokerResponse and PqlBrokerResponse are stored in the Exception array.
Stats for a given request for both SqlBrokerResponse and PqlBrokerResponse are stored in ResponseStats.
Common
Exception is defined as:
/// Pinot exception.
ResponseStats is defined as:
/// ResponseStats carries all stats returned by a query.
PQL
PqlBrokerResponse is defined as:
/// PqlBrokerResponse is the data structure for broker response to a PQL query.
SQL
SqlBrokerResponse is defined as:
/// SqlBrokerResponse is the data structure for a broker response to an SQL query.
ResultTable is defined as:
/// ResultTable is the holder for SQL queries.
RespSchema is defined as:
/// RespSchema is response schema with a bimap to allow easy name <-> index retrieval
There are multiple functions defined for RespSchema, like:
fn get_column_count(&self) -> usize;
fn get_column_name(&self, column_index: usize) -> Result<&str>;
fn get_column_index(&self, column_name: &str) -> Result<usize>;
fn get_column_data_type(&self, column_index: usize) -> Result<DataType>;
fn get_column_data_type_by_name(&self, column_name: &str) -> Result<DataType>;
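To illustrate the name <-> index lookup that these functions expose, here is a minimal, dependency-free sketch. `SketchSchema` is a hypothetical stand-in written for this README, not the crate's actual `RespSchema`; it keeps a `Vec` for index -> name and a `HashMap` for name -> index, and uses `String` as the error type purely for simplicity.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `RespSchema`; not the crate's real definition.
#[derive(Debug)]
pub struct SketchSchema {
    /// Column names in positional order (index -> name).
    names: Vec<String>,
    /// Reverse lookup (name -> index).
    indices: HashMap<String, usize>,
}

impl SketchSchema {
    /// Builds both lookup directions from an ordered list of column names.
    pub fn new(column_names: &[&str]) -> Self {
        let names: Vec<String> = column_names.iter().map(|n| n.to_string()).collect();
        let indices = names
            .iter()
            .enumerate()
            .map(|(index, name)| (name.clone(), index))
            .collect();
        SketchSchema { names, indices }
    }

    /// Number of columns in the schema.
    pub fn get_column_count(&self) -> usize {
        self.names.len()
    }

    /// Name of the column at `column_index`, or an error if out of range.
    pub fn get_column_name(&self, column_index: usize) -> Result<&str, String> {
        self.names
            .get(column_index)
            .map(|name| name.as_str())
            .ok_or_else(|| format!("invalid column index: {}", column_index))
    }

    /// Index of the column called `column_name`, or an error if unknown.
    pub fn get_column_index(&self, column_name: &str) -> Result<usize, String> {
        self.indices
            .get(column_name)
            .copied()
            .ok_or_else(|| format!("invalid column name: {}", column_name))
    }
}
```

With `SketchSchema::new(&["name", "age"])`, calling `get_column_index("age")` and `get_column_name(1)` resolve to each other, while an unknown name or an out-of-range index yields an `Err`.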
DataType is defined as:
/// Pinot native types
FromRow is defined as:
/// FromRow represents any structure which can deserialize
/// the ResultTable.rows json field provided a `RespSchema`
In addition to being implemented by DataRow, FromRow is also implemented by all implementors of serde::de::Deserialize. This is achieved by first deserializing the response to JSON and then, before each row is deserialized into its final form, substituting a JSON map of column name to value for the row.
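The substitution step can be pictured with the following sketch, which pairs the positional values of one row with the schema's column names. `row_to_named_map` is a hypothetical helper written for illustration, not a function of this crate, and it uses plain strings in place of JSON values to stay dependency-free.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of pinot_client_rust): pairs each positional
/// value of a row with its column name. Values are plain strings here; the
/// crate itself works with JSON values.
pub fn row_to_named_map(
    column_names: &[&str],
    row: &[&str],
) -> Result<BTreeMap<String, String>, String> {
    // A row that does not match the schema width cannot be mapped reliably.
    if column_names.len() != row.len() {
        return Err(format!(
            "schema has {} columns but row has {} values",
            column_names.len(),
            row.len()
        ));
    }
    // Zip names and values so each value becomes addressable by column name.
    Ok(column_names
        .iter()
        .zip(row.iter())
        .map(|(name, value)| (name.to_string(), value.to_string()))
        .collect())
}
```

A map keyed by column name is what lets a struct with named fields be deserialized from a row that arrives as a positional array.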
Additionally, there are a number of serde deserializer functions provided to deserialize complex Pinot types:
/// Converts Pinot timestamps into `Vec<DateTime<Utc>>` using `deserialize_timestamps_from_json()`.
fn deserialize_timestamps<'de, D>(deserializer: D) -> std::result::Result<Vec<DateTime<Utc>>, D::Error>...
/// Converts Pinot timestamps into `DateTime<Utc>` using `deserialize_timestamp_from_json()`.
pub fn deserialize_timestamp<'de, D>(deserializer: D) -> std::result::Result<DateTime<Utc>, D::Error>...
/// Converts Pinot hex strings into `Vec<Vec<u8>>` using `deserialize_bytes_array_from_json()`.
pub fn deserialize_bytes_array<'de, D>(deserializer: D) -> std::result::Result<Vec<Vec<u8>>, D::Error>...
/// Converts Pinot hex string into `Vec<u8>` using `deserialize_bytes_from_json()`.
pub fn deserialize_bytes<'de, D>(deserializer: D) -> std::result::Result<Vec<u8>, D::Error>...
/// Deserializes json potentially packaged into a string by calling `deserialize_json_from_json()`.
pub fn deserialize_json<'de, D>(deserializer: D) -> std::result::Result<Value, D::Error>
For example usage, please refer to this example
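As a rough picture of what converting a hex string into `Vec<u8>` involves, here is a standalone sketch using only the standard library. `decode_hex` and `nibble` are hypothetical names; this is not the crate's implementation of `deserialize_bytes`, and the `String` error type is an assumption made for brevity.

```rust
/// Hypothetical helper: decodes a hex string such as "ab01" into bytes.
pub fn decode_hex(hex: &str) -> Result<Vec<u8>, String> {
    let bytes = hex.as_bytes();
    // Each output byte needs exactly two hex digits.
    if bytes.len() % 2 != 0 {
        return Err(format!("odd-length hex string: {} characters", bytes.len()));
    }
    bytes
        .chunks(2)
        .enumerate()
        .map(|(i, pair)| -> Result<u8, String> {
            let hi = nibble(pair[0])
                .ok_or_else(|| format!("invalid hex digit at offset {}", 2 * i))?;
            let lo = nibble(pair[1])
                .ok_or_else(|| format!("invalid hex digit at offset {}", 2 * i + 1))?;
            // Combine the high and low 4-bit halves into one byte.
            Ok((hi << 4) | lo)
        })
        .collect()
}

/// Maps one ASCII hex digit to its 4-bit value; anything else yields `None`.
fn nibble(byte: u8) -> Option<u8> {
    (byte as char).to_digit(16).map(|digit| digit as u8)
}
```

Validating each digit individually rejects inputs such as "+f" or "zz" instead of silently accepting a sign character.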
DataRow is defined as:
/// A row of `Data`
Data is defined as:
/// Typed Pinot data
There are multiple functions defined for Data, like:
fn data_type(&self) -> DataType;
fn get_int(&self) -> Result<i32>;
fn get_long(&self) -> Result<i64>;
fn get_float(&self) -> Result<f32>;
fn get_double(&self) -> Result<f64>;
fn get_boolean(&self) -> Result<bool>;
fn get_timestamp(&self) -> Result<DateTime<Utc>>;
fn get_string(&self) -> Result<&str>;
fn get_json(&self) -> Result<&Value>;
fn get_bytes(&self) -> Result<&Vec<u8>>;
fn is_null(&self) -> bool;
In addition to a row count, DataRow also provides convenience counterparts to the functions above that take a column index.
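The relationship between typed values and their by-index row counterparts might look like the sketch below. `SketchData` and `SketchRow` are hypothetical, reduced stand-ins for `Data` and `DataRow` (only a few variants, `String` errors, and an assumed `get_row_count` name); the real types may differ in shape and naming.

```rust
/// Hypothetical, reduced stand-in for `Data` with only a few variants.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchData {
    Int(i32),
    Long(i64),
    Text(String),
    Null,
}

impl SketchData {
    /// Returns the value if this is an `Int`, otherwise a type-mismatch error.
    pub fn get_int(&self) -> Result<i32, String> {
        match self {
            SketchData::Int(value) => Ok(*value),
            other => Err(format!("expected Int, found {:?}", other)),
        }
    }

    /// Returns the value if this is `Text`, otherwise a type-mismatch error.
    pub fn get_string(&self) -> Result<&str, String> {
        match self {
            SketchData::Text(value) => Ok(value.as_str()),
            other => Err(format!("expected Text, found {:?}", other)),
        }
    }

    pub fn is_null(&self) -> bool {
        matches!(self, SketchData::Null)
    }
}

/// Hypothetical stand-in for `DataRow`: an ordered list of typed values.
pub struct SketchRow {
    values: Vec<SketchData>,
}

impl SketchRow {
    pub fn new(values: Vec<SketchData>) -> Self {
        SketchRow { values }
    }

    /// Number of values held by this row (name assumed for the sketch).
    pub fn get_row_count(&self) -> usize {
        self.values.len()
    }

    /// Bounds-checked access shared by the by-index accessors below.
    fn get_data(&self, column_index: usize) -> Result<&SketchData, String> {
        self.values
            .get(column_index)
            .ok_or_else(|| format!("invalid column index: {}", column_index))
    }

    /// By-index counterpart of `SketchData::get_int`.
    pub fn get_int(&self, column_index: usize) -> Result<i32, String> {
        self.get_data(column_index)?.get_int()
    }

    /// By-index counterpart of `SketchData::get_string`.
    pub fn get_string(&self, column_index: usize) -> Result<&str, String> {
        self.get_data(column_index)?.get_string()
    }

    /// By-index counterpart of `SketchData::is_null`.
    pub fn is_null(&self, column_index: usize) -> Result<bool, String> {
        Ok(self.get_data(column_index)?.is_null())
    }
}
```

Each row accessor first resolves the column index and then delegates to the typed getter, so both an out-of-range index and a type mismatch surface as an `Err`.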