scatters 0.0.0

A CLI to instantly turn tabular data and audio files into interactive HTML scatter plots.

data_plotter

A Rust CLI that turns tabular data and audio files into interactive, self-contained HTML scatter plots powered by ECharts.

  • Inputs: CSV, Parquet, JSON/JSONL/NDJSON, and audio (WAV/MP3/FLAC)
  • Output: One HTML file per input, with zoom/pan, legend, and toolbox — no server required
  • Directory mode: Pass a folder to process all supported files recursively
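As a rough illustration of directory mode, the sketch below walks a folder recursively and keeps only files whose extension appears in the input list above. The helper names (`is_supported`, `collect_inputs`) and the case-insensitive extension match are assumptions made for this example, not the crate's actual code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Extensions taken from the "Inputs" bullet above.
const SUPPORTED: [&str; 8] = [
    "csv", "parquet", "json", "jsonl", "ndjson", "wav", "mp3", "flac",
];

/// Hypothetical helper: true if `path` ends in a supported extension.
/// Matching case-insensitively is an assumption of this sketch.
fn is_supported(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| SUPPORTED.iter().any(|s| ext.eq_ignore_ascii_case(s)))
        .unwrap_or(false)
}

/// Hypothetical helper: collect supported files under `root`, descending
/// into subdirectories. A single file path is accepted as well.
fn collect_inputs(root: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    if root.is_dir() {
        for entry in fs::read_dir(root)? {
            collect_inputs(&entry?.path(), out)?;
        }
    } else if is_supported(root) {
        out.push(root.to_path_buf());
    }
    Ok(())
}
```

Calling `collect_inputs(Path::new("path/to/folder"), &mut files)` would fill `files` with every supported input below that folder.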

Build and Run

Prerequisites: Rust toolchain with Cargo

  • Build (debug):
      cargo build
  • Build (release):
      cargo build --release
  • Show CLI help:
      cargo run -- --help

Usage

Basic examples:

# Plot all numeric columns; auto-detect X when possible
cargo run -- sample/sample.csv

# Write outputs to a directory
cargo run -- sample/sample.csv -o plots/

# Set the X-axis explicitly
cargo run -- data.csv --index timestamp

# Choose specific Y columns
cargo run -- data.csv -c sensor_a,sensor_b

# Use first column as X and set a custom title
cargo run -- data.csv --use-first-column --title "My Custom Plot"

# Process an entire directory recursively
cargo run -- path/to/folder -o plots/

# Audio files (mono or multi-channel)
cargo run -- audio.wav

# Disable dynamic Y autoscaling (keep initial padded range)
cargo run -- sample/sample.csv --no-autoscale-y

Where outputs go:

  • With -o/--output-dir, files are saved under that directory as <stem>.html.
  • Without it, each plot is saved next to its input file.
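The two output rules above can be sketched with `std::path`. The function name `output_path` is hypothetical, and the sketch assumes that `<stem>` means the input file name with its last extension removed.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper mirroring the two rules above.
fn output_path(input: &Path, output_dir: Option<&Path>) -> PathBuf {
    // Build "<stem>.html" from the input file name.
    let mut name = input.file_stem().unwrap_or_default().to_os_string();
    name.push(".html");
    match output_dir {
        // With -o/--output-dir: save under that directory.
        Some(dir) => dir.join(name),
        // Without it: save next to the input file.
        None => input.with_file_name(name),
    }
}
```

Under this sketch, `sample/sample.csv` with `-o plots/` maps to `plots/sample.html`, and without `-o` it maps to `sample/sample.html`. Inputs that share a stem would map to the same output name here; how the tool itself treats that case is not covered by this example.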

CLI

A tool to generate interactive scatter plots from various data formats.

Usage: data_plotter.exe [OPTIONS] <INPUT_PATH>

Arguments:
  <INPUT_PATH>  The input file or folder to scan for data

Options:
  -o, --output-dir <OUTPUT_DIR>  Directory to save the generated HTML plots. Defaults to saving next to each input file
      --index <INDEX>            Name of the column to use as the index (X-axis). Highest priority for index selection
      --use-first-column         Use the first column of the data as the index. Overridden by --index
  -c, --columns <COLUMNS>        Comma-separated list of columns to plot (Y-axis). If not provided, all numeric columns will be plotted
      --title <TITLE>            A custom title for the plot. Defaults to the input filename
      --no-autoscale-y           Disable dynamic Y-axis autoscaling on zoom (keeps initial Y range)
  -h, --help                     Print help
  -V, --version                  Print version

How X and Y are chosen

X-axis selection priority:

  1. --index <name>
  2. --use-first-column
  3. If a column named sample_index exists (audio), use it
  4. First datetime column (string columns may be auto-cast to datetime)
  5. Fallback to a generated row_index
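The priority order above can be read as a chain of early returns. The following is a minimal sketch, not the crate's implementation: `ColumnInfo`, `XChoice`, and `choose_x` are names invented for illustration, and the sketch does not check that the column passed to --index actually exists.

```rust
/// Hypothetical stand-in for a loaded column: its name and whether it
/// holds (or was cast to) datetime values.
struct ColumnInfo {
    name: String,
    is_datetime: bool,
}

#[derive(Debug, PartialEq)]
enum XChoice {
    /// Use an existing column as the X-axis.
    Column(String),
    /// No candidate found: fall back to a generated row_index.
    GeneratedRowIndex,
}

fn choose_x(columns: &[ColumnInfo], index: Option<&str>, use_first_column: bool) -> XChoice {
    // 1. --index <name> has the highest priority.
    if let Some(name) = index {
        return XChoice::Column(name.to_string());
    }
    // 2. --use-first-column.
    if use_first_column {
        if let Some(first) = columns.first() {
            return XChoice::Column(first.name.clone());
        }
    }
    // 3. A column named sample_index (produced for audio inputs).
    if columns.iter().any(|c| c.name == "sample_index") {
        return XChoice::Column("sample_index".to_string());
    }
    // 4. The first datetime column.
    if let Some(c) = columns.iter().find(|c| c.is_datetime) {
        return XChoice::Column(c.name.clone());
    }
    // 5. Fallback.
    XChoice::GeneratedRowIndex
}
```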

Y-axis selection:

  • If -c/--columns is provided, those columns are used
  • Otherwise, all numeric columns except the chosen X column
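A small sketch of the Y-axis rule, under stated assumptions: `Col` and `choose_y` are hypothetical names, and trimming whitespace around each comma-separated entry is a choice made for this example rather than documented behavior.

```rust
/// Hypothetical stand-in for a loaded column.
struct Col {
    name: String,
    is_numeric: bool,
}

/// `requested` is the raw value of -c/--columns, if given.
fn choose_y(columns: &[Col], x_name: &str, requested: Option<&str>) -> Vec<String> {
    match requested {
        // Explicit list: split on commas and use the names as given.
        Some(list) => list
            .split(',')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect(),
        // Default: every numeric column except the chosen X column.
        None => columns
            .iter()
            .filter(|c| c.is_numeric && c.name != x_name)
            .map(|c| c.name.clone())
            .collect(),
    }
}
```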

Supported formats and behavior

  • CSV, Parquet
  • JSON, JSONL/NDJSON (.json files are read as JSON Lines; convert array-of-objects JSON to NDJSON first)
  • Audio: WAV/MP3/FLAC via Symphonia
    • Only the file's default track is decoded
    • The resulting DataFrame has two columns: sample_index (X) and amplitude (Y)
    • For multi-channel audio, samples from all channels are currently interleaved into the single amplitude series
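To make the interleaving concrete, the sketch below shows how an interleaved buffer would map to (sample_index, amplitude) rows, and how per-channel series could be recovered from it afterwards. Both function names are hypothetical, and the sketch assumes sample_index simply counts interleaved samples.

```rust
/// Rows as described above: one (sample_index, amplitude) pair per
/// interleaved sample, regardless of channel.
fn to_rows(interleaved: &[f32]) -> Vec<(usize, f32)> {
    interleaved.iter().copied().enumerate().collect()
}

/// Hypothetical post-processing: split an interleaved buffer
/// (L0, R0, L1, R1, ...) into one series per channel.
/// Trailing samples that do not fill a whole frame are dropped.
fn deinterleave(interleaved: &[f32], channels: usize) -> Vec<Vec<f32>> {
    assert!(channels > 0, "channel count must be positive");
    let mut out = vec![Vec::new(); channels];
    for frame in interleaved.chunks_exact(channels) {
        for (ch, &sample) in frame.iter().enumerate() {
            out[ch].push(sample);
        }
    }
    out
}
```

For a stereo file, even rows of the amplitude series belong to the first channel and odd rows to the second, so the plot shows both channels alternating along one series.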

Notes on parsing and type inference:

  • String columns that look like datetimes are only cast to datetime if at least one value parses successfully. This avoids selecting an "all-null" datetime axis and producing empty plots.
  • String columns that look numeric are trimmed and parsed into Float64 if at least ~50% of their values parse successfully. This helps with CSVs that lack headers or contain numeric values padded with whitespace.
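Both rules above follow the same pattern: parse every value, then accept the cast only if enough values succeeded. The sketch below uses plain slices instead of DataFrames; the function names are hypothetical, the threshold is taken as exactly 50% for illustration, and the datetime parser is left as a caller-supplied closure.

```rust
/// Numeric rule: trim each value, parse as f64, and accept the cast only
/// if at least half of the values parsed. Failed values become None.
fn try_cast_numeric(values: &[&str]) -> Option<Vec<Option<f64>>> {
    if values.is_empty() {
        return None;
    }
    let parsed: Vec<Option<f64>> = values
        .iter()
        .map(|v| v.trim().parse::<f64>().ok())
        .collect();
    let ok = parsed.iter().filter(|p| p.is_some()).count();
    if ok * 2 >= values.len() {
        Some(parsed)
    } else {
        None
    }
}

/// Datetime rule: accept the cast only if at least one value parsed, so an
/// all-null column is never chosen as the X-axis. `parse` stands in for
/// whatever datetime parser is used.
fn try_cast_with<T>(
    values: &[&str],
    parse: impl Fn(&str) -> Option<T>,
) -> Option<Vec<Option<T>>> {
    let parsed: Vec<Option<T>> = values.iter().map(|v| parse(v)).collect();
    if parsed.iter().any(Option::is_some) {
        Some(parsed)
    } else {
        None
    }
}
```

With this sketch, a column such as `[" 1.5", "2 ", "n/a"]` is cast (two of three values parse), while `["a", "b", "3"]` stays a string column.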

Output HTML

  • A single HTML file with the plotted data embedded; the ECharts library itself is loaded from a CDN
  • Interactive features: zoom/pan (dataZoom), legend scroll, save-as-image, restore
  • X-axis type is set automatically (time, category, or value) based on the chosen X series
  • Themes: dark (default) and white (enable with --white-theme)
  • Numeric formatting: -m/--max-decimals controls decimals; -1 disables the limit; scientific notation when appropriate

Project structure (brief)

  • src/cli.rs — command-line interface (clap)
  • src/lib.rs — orchestration (discover files, process each, write output)
  • src/data_loader.rs — load DataFrames; audio decoding; best-effort datetime casting
  • src/processing.rs — select X/Y series and plot title
  • src/plotter.rs — generate HTML and embed series as JSON for ECharts
  • src/error.rs — error types

For additional repository-specific guidance, see WARP.md.