timeseries-table-cli 0.1.1

Command-line tool for creating, managing, and querying time-series tables (powered by DataFusion).
timeseries-table-cli-0.1.1 is not a library.

tstable

A command-line tool for creating, managing, and querying time-series tables—without writing any Rust code.

What is this?

This CLI lets you work with time-series tables: a structured way to store and query time-indexed data (like stock prices, sensor readings, or log events) on top of Parquet files.

Instead of manually managing scattered Parquet files, you get:

  • A table abstraction that tracks all your data segments
  • SQL queries powered by Apache DataFusion
  • Automatic schema tracking and overlap detection
  • An interactive shell for exploratory analysis

Installation

From crates.io

cargo install timeseries-table-cli --bin tstable

From a local clone

git clone https://github.com/mag1cfrog/timeseries-table-format.git
cd timeseries-table-format
cargo install --path crates/timeseries-table-cli

Verify installation

tstable --help

Quick start

# 1. Create a table for hourly stock bars
tstable create \
  --table ./my_stocks \
  --time-column timestamp \
  --bucket 1h \
  --entity symbol

# 2. Add some data (any Parquet file with a timestamp column)
tstable append \
  --table ./my_stocks \
  --parquet ./data/aapl_bars.parquet

# 3. Query with SQL
tstable query \
  --table ./my_stocks \
  --sql "SELECT symbol, COUNT(*) FROM my_stocks GROUP BY symbol"

Commands

create — Create a new table

Creates an empty time-series table. The schema is automatically inferred when you append the first data segment.

tstable create \
  --table ./data/my_table \
  --time-column timestamp \
  --bucket 1h \
  --timezone America/New_York \
  --entity symbol
Flag Required Description
--table Path where the table will be created
--time-column Name of the timestamp column in your data
--bucket Time granularity for indexing (see below)
--timezone IANA timezone (e.g., America/New_York, UTC)
--entity Entity column(s) for partitioning, repeatable

Bucket values: 1s, 1m, 5m, 15m, 1h, 1d

What are entity columns?
If your data has multiple "things" (like stock symbols or sensor IDs), specify them with --entity. This helps with coverage tracking and future optimizations.


append — Add data to a table

Appends a Parquet file as a new segment. The file must have the timestamp column defined when you created the table.

tstable append \
  --table ./data/my_table \
  --parquet ./incoming/new_data.parquet
Flag Required Description
--table Path to an existing table
--parquet Path to the Parquet file to append
--time-column Override timestamp column (default: from table metadata)
--timing Print elapsed time

Notes:

  • The Parquet file is copied/moved into the table's data/ directory
  • Overlapping time ranges with existing segments will cause an error
  • Schema must be compatible with existing data (if any)

query — Run SQL queries

Execute SQL queries against your table using DataFusion.

tstable query \
  --table ./data/my_table \
  --sql "SELECT * FROM my_table WHERE timestamp > '2024-01-01' LIMIT 10"
Flag Required Description
--table Path to the table
--sql SQL query to execute
--max-rows Max rows to display (default: 10, use 0 for unlimited)
--format Output format: csv (default) or jsonl
--output Write results to a file instead of stdout
--explain Show the query execution plan
--timing Print elapsed time
--pager Pipe output through less -S for horizontal scrolling

Table name in SQL:
The table is registered under its directory name. For ./data/my_table, use my_table in your SQL.

Examples:

# Show all data (no row limit)
tstable query \
  --table ./stocks \
  --sql "SELECT * FROM stocks" \
  --max-rows 0

# Export to JSON Lines
tstable query \
  --table ./stocks \
  --sql "SELECT symbol, close FROM stocks WHERE symbol = 'AAPL'" \
  --format jsonl \
  --output aapl.jsonl

# See the query plan
tstable query \
  --table ./stocks \
  --sql "SELECT * FROM stocks WHERE timestamp > '2024-06-01'" \
  --explain

shell — Interactive mode

Opens an interactive shell that keeps the table loaded in memory. Great for exploratory analysis.

tstable shell --table ./data/my_table

If you omit --table, the shell will prompt you for a path (and can create a new table interactively).

Flag Description
--table Path to a table (optional—will prompt if omitted)
--history Path to command history file

Shell commands:

Command Description
query <sql> Run a SQL query
query --max-rows 100 <sql> Query with options
explain <sql> Show query execution plan
append <parquet_path> Append a new segment
refresh Reload table state from disk
\timing Toggle elapsed time display
\pager Toggle pager output
alias <name> Set a shorter table name for queries
alias --clear Reset to default table name
clear Clear screen
help Show all commands
exit Exit the shell

Query flags in shell:

query [--max-rows N] [--format csv|jsonl] [--output PATH] [--timing] [--explain] [--] <sql>

Use -- before your SQL if it starts with -- (to avoid flag parsing issues).


Example: Stock market data

Here's a complete workflow for managing daily stock bars:

# Create a table for daily bars, partitioned by symbol
tstable create \
  --table ./market_data/daily_bars \
  --time-column date \
  --bucket 1d \
  --entity symbol \
  --timezone America/New_York

# Append historical data
tstable append \
  --table ./market_data/daily_bars \
  --parquet ./downloads/spy_2023.parquet

tstable append \
  --table ./market_data/daily_bars \
  --parquet ./downloads/spy_2024.parquet

# Query: Find the highest closing prices
tstable query \
  --table ./market_data/daily_bars \
  --sql "
    SELECT symbol, date, close
    FROM daily_bars
    WHERE close = (SELECT MAX(close) FROM daily_bars)
  "

# Interactive exploration
tstable shell --table ./market_data/daily_bars

Output formats

CSV (default)

symbol,date,open,high,low,close,volume
AAPL,2024-01-02,185.50,186.20,184.80,185.90,50000000
AAPL,2024-01-03,186.00,187.10,185.50,186.80,48000000

JSON Lines (--format jsonl)

{"symbol":"AAPL","date":"2024-01-02","open":185.50,"high":186.20,"low":184.80,"close":185.90,"volume":50000000}
{"symbol":"AAPL","date":"2024-01-03","open":186.00,"high":187.10,"low":185.50,"close":186.80,"volume":48000000}

Tips

  • Row limit: By default, only 10 rows are displayed. Use --max-rows 0 to see everything, or --output file.csv to save full results.

  • Table names with special characters: If your table directory has spaces or hyphens, quote it in SQL: SELECT * FROM "my-table".

  • Time filtering: DataFusion supports standard SQL timestamp syntax:

    WHERE timestamp > '2024-01-01T00:00:00Z'
    WHERE timestamp BETWEEN '2024-01-01' AND '2024-06-30'
    
  • Refreshing in shell: If another process appends data while you're in the shell, run refresh to see the new segments.


Related