datafusion-bigtable 0.1.0

# Datafusion-Bigtable
Bigtable data source for [Apache Arrow Datafusion](https://github.com/apache/arrow-datafusion)

## Run SQL on Bigtable

This crate implements a Bigtable data source and executor for Datafusion. It is built on top of the gRPC client [tonic](https://github.com/hyperium/tonic).

## Quick Start

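```rust
use std::sync::Arc;

use datafusion::arrow::datatypes::{DataType, Field};
use datafusion::error::Result;
use datafusion::prelude::ExecutionContext;
// Module path assumed; check the crate docs for the exact location.
use datafusion_bigtable::datasource::BigtableDataSource;

#[tokio::main] // assumes the tokio runtime with the "macros" feature
async fn main() -> Result<()> {
    let bigtable_datasource = BigtableDataSource::new(
        "emulator".to_owned(),                               // project
        "dev".to_owned(),                                    // instance
        "weather_balloons".to_owned(),                       // table
        "measurements".to_owned(),                           // column family
        vec!["_row_key".to_owned()],                         // table_partition_cols
        "#".to_owned(),                                      // table_partition_separator
        vec![Field::new("pressure", DataType::Utf8, false)], // qualifiers
        true,                                                // only_read_latest
    ).await.unwrap();

    let mut ctx = ExecutionContext::new();
    ctx.register_table("weather_balloons", Arc::new(bigtable_datasource)).unwrap();

    // Fetch a single row by its exact row key.
    ctx.sql(r#"SELECT "_row_key", pressure, "_timestamp" FROM weather_balloons WHERE "_row_key" = 'us-west2#3698#2021-03-05-1200'"#).await?.collect().await?;
    Ok(())
}
```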

## Roadmap

### Bigtable

- ✅ UTF8 string
- ✅ 64-bit big-endian signed integer
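
Bigtable stores cell values as raw bytes, so the two supported types above correspond to two ways of decoding a cell. The following is a minimal sketch using only the standard library; the helper names `decode_utf8_cell` and `decode_i64_cell` are hypothetical and are not part of this crate's API.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: interpret a cell value as a UTF-8 string.
/// Returns None if the bytes are not valid UTF-8.
fn decode_utf8_cell(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Hypothetical helper: interpret a cell value as a 64-bit
/// big-endian signed integer.
/// Returns None unless the cell is exactly 8 bytes long.
fn decode_i64_cell(bytes: &[u8]) -> Option<i64> {
    let array = <[u8; 8]>::try_from(bytes).ok()?;
    Some(i64::from_be_bytes(array))
}
```

For instance, the bytes produced by `42i64.to_be_bytes()` decode back to `42`, while a cell that is not exactly 8 bytes long yields `None` rather than a silently wrong number.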

### SQL

- ✅ select by `"_row_key" =`
- ✅ select by `"_row_key" IN`
- ✅ select by `"_row_key" BETWEEN`
- ✅ select by composite row keys `=`
- ✅ select by composite row keys `IN`
- ✅ select by composite row keys `BETWEEN` (supported only on the last column in `table_partition_cols`)
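
To illustrate the composite row key cases above: a row key such as `us-west2#3698#2021-03-05-1200` can be read as several partition values joined by the `table_partition_separator`. The sketch below assumes that layout; the helpers `compose_row_key`, `split_row_key`, and `between_on_last` are hypothetical and not this crate's API. Because Bigtable sorts rows lexicographically by key, fixing the leading columns and ranging over the last one gives a contiguous key range, which is a plausible reason `BETWEEN` is limited to the last column.

```rust
/// Hypothetical helper: join partition column values into one row key.
fn compose_row_key(parts: &[&str], separator: &str) -> String {
    parts.join(separator)
}

/// Hypothetical helper: split a row key back into its partition values.
fn split_row_key<'a>(row_key: &'a str, separator: &str) -> Vec<&'a str> {
    row_key.split(separator).collect()
}

/// Hypothetical helper: build the inclusive (start, end) row keys for a
/// query that fixes the leading columns with `=` and applies BETWEEN
/// to the last column only.
fn between_on_last(
    prefix: &[&str],
    low: &str,
    high: &str,
    separator: &str,
) -> (String, String) {
    let mut start = prefix.to_vec();
    start.push(low);
    let mut end = prefix.to_vec();
    end.push(high);
    (
        compose_row_key(&start, separator),
        compose_row_key(&end, separator),
    )
}
```

With a prefix of `["us-west2", "3698"]` and bounds `2021-03-05-1200` and `2021-03-05-1204`, the start and end keys share the prefix `us-west2#3698#`, so every key between them belongs to the same region and balloon.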

### General

- ✅ Projection pushdown
- [ ] Predicate pushdown
    + [Value range](https://cloud.google.com/bigtable/docs/using-filters#value-range)
    + [Value regex](https://cloud.google.com/bigtable/docs/using-filters#value-regex)
    + [Timestamp range](https://cloud.google.com/bigtable/docs/using-filters#timestamp-range)
- [ ] Multi-threaded or partition-aware execution
- [ ] Production-ready Bigtable SDK in Rust

Note: datafusion-bigtable provides the physical executor for Datafusion. Aggregations, `GROUP BY`, and joins are implemented and handled by Datafusion itself.