polars-view 0.31.0

A fast and interactive viewer for CSV, Json and Parquet data.
Documentation

PolarsView

Crates.io Documentation License Rust

Polars View

A fast and interactive viewer for CSV, JSON (including Newline-Delimited JSON - NDJSON), and Apache Parquet files, built with Polars and egui.

This project is inspired by and initially forked from the parqbench project.

Features

  • Fast Data Handling: Leverages the high-performance Polars DataFrame library for efficient loading, processing, and querying.
  • Multiple File Format Support:
    • Load data from: CSV, JSON, NDJSON, Parquet.
    • Save data as: CSV, JSON, NDJSON, Parquet (via "Save As...").
  • Interactive Table View:
    • Sorting: Click column headers to sort the entire DataFrame (cycles through Not Sorted → Descending → Ascending → Not Sorted). Sort operations are performed asynchronously.
    • Column Header Styles: Toggle between a styled, text-wrapping header ("Enhanced Header" = true) or a simpler, single-line button header ("Enhanced Header" = false) via the Format panel.
    • Column Header Height: Adjust the vertical padding of the header row via the "Header Padding" setting in the Format panel.
    • Column Sizing: Choose between automatically sizing columns to content (Format > Auto Col Width = true) or using faster initial calculated widths (Auto Col Width = false).
  • SQL Querying: Filter and transform data using Polars' SQL interface. Specify the query in the "Query" panel and click "Apply SQL Commands". Queries execute asynchronously.
  • Configuration Panels:
    • Metadata: Displays file information (row count, column count).
    • Schema: Shows column names and their Polars data types (right-click column name to copy).
    • Format:
      • Alignment: Customize text alignment (Left, Center, Right) for different data types.
      • Decimals: Control the number of decimal places displayed for float columns.
      • Auto Col Width: Toggle automatic column width adjustment.
      • Enhanced Header: Toggle the visual style and click behavior of table headers.
      • Header Padding: Adjust extra vertical space in the header row (when Enhanced Header is enabled).
    • Query:
      • SQL Query: Enter SQL commands (default table name: AllData).
      • Remove Null Cols: Option to automatically drop columns containing only null values upon loading or applying SQL.
      • Schema Inference Rows: (CSV/JSON/NDJSON) Control rows used for schema detection.
      • CSV Delimiter: Specify the delimiter (auto-detection attempted on load).
      • Null Values (CSV): Define custom strings (comma-separated) to be interpreted as nulls (e.g., "", "NA", <N/A>).
      • SQL Examples: Provides context-aware SQL command suggestions based on the loaded data schema.
  • Drag and Drop: Load files by dragging and dropping them onto the application window.
  • Asynchronous Operations: Uses Tokio for non-blocking file loading, saving, sorting, and SQL execution, keeping the UI responsive. Data state updates (load, sort, format) happen asynchronously and results are seamlessly integrated back into the UI.
  • Robust Error Handling: Utilizes a custom PolarsViewError enum and displays errors clearly in a non-blocking notification window.
  • Theming: Switch between Light and Dark themes via the menu bar.

Simplified Architecture

PolarsView uses eframe and egui for the immediate-mode GUI, polars for efficient data manipulation, and tokio for handling asynchronous background tasks (like file I/O, sorting, SQL execution). Communication between the UI and background tasks is managed via tokio::sync::oneshot channels.

Building and Running

  1. Prerequisites:

    • Rust and Cargo (latest stable version recommended, minimum version 1.85 and edition 2024 as defined in Cargo.toml).
  2. Clone the Repository:

    git clone https://github.com/claudiofsr/polars-view.git
    cd polars-view
    
  3. Build and Install (Release Mode):

    # Build with default features ('format-simple')
    cargo b -r && cargo install --path=.
    
    # --- Feature Example ---
    # Build with the alternative 'format-special' feature enabled.
    cargo b -r && cargo install --path=. --features format-special
    

    This compiles the application in release mode (optimized) and installs the binary (polars-view) into your Cargo bin directory (~/.cargo/bin/ by default), making it available in your PATH.

  4. Run:

    polars-view [path_to_file] [options]
    
    • If [path_to_file] is provided, the application will attempt to load it on startup. Supported formats: .csv, .json, .ndjson, .parquet.

    • Use polars-view --help for a detailed list of available command-line options (e.g., --delimiter, --query, --table-name, --null-values, --remove-null-cols).

    • Logging/Tracing: Control log output using the RUST_LOG environment variable:

      • export RUST_LOG=info (General information)
      • export RUST_LOG=debug (Detailed information for debugging)
      • export RUST_LOG=trace (Very detailed, for granular debugging)
      • Combine levels: export RUST_LOG=polars_view=debug,polars=info
      • Run directly: RUST_LOG=debug polars-view data.parquet
    • Examples:

      polars-view sales_data.parquet
      polars-view --delimiter="|" transactions.csv --null-values="N/A,-"
      polars-view data.csv -q "SELECT category, SUM(value) AS total FROM AllData WHERE date > '2023-01-01' GROUP BY category"
      # Using backticks or double quotes for column names with spaces:
      polars-view items.csv -q "SELECT \`Item Name\`, Price FROM AllData WHERE Price > 100.0"
      polars-view logs.ndjson -q 'SELECT timestamp, level, message FROM AllData WHERE level = "ERROR"'
      RUST_LOG=info polars-view big_dataset.parquet --remove-null-cols
      

Usage Guide

  • Opening Files:
    • Provide the file path as a command-line argument.
    • Use the "File" > "Open File..." menu (Ctrl+O).
    • Drag and drop a supported file onto the application window.
  • Viewing Data:
    • The main panel displays the data in a table. Use horizontal and vertical scrollbars if needed.
    • Column headers show the column names. Click the sort icon to sort.
    • Adjust column widths by dragging the separators between headers.
  • Configuring View & Data:
    • Expand the panels on the left ("Metadata", "Schema", "Format", "Query") to view information and change settings.
    • Format Panel: Adjust alignment per data type, set float precision, toggle column width (Auto Col Width), toggle header style (Enhanced Header), and adjust header padding. Changes trigger an efficient asynchronous update.
    • Query Panel: Set CSV options (delimiter, nulls, schema inference), toggle Remove Null Cols, define and apply SQL queries. Applying SQL or changing most query settings triggers an asynchronous data reload/requery.
  • Applying SQL:
    • Enter your query in the "SQL Query" text area (using AllData as the default table name unless changed via CLI or config).
    • Click "Apply SQL Commands". The table will update after the query executes asynchronously. Refer to Polars SQL documentation.
    • Check the dynamically generated "SQL Command Examples" for syntax relevant to your data.
  • Saving Data:
    • Save (Ctrl+S): Attempts to overwrite the original file (if applicable) with the currently displayed data (after filtering/sorting). This happens asynchronously. Use with caution.
    • Save As... (Ctrl+A): Opens a dialog to save the currently displayed data to a new file. You can choose the output format (CSV, JSON, NDJSON, Parquet) and location. This happens asynchronously.
  • Exiting: Use "File" > "Exit" or close the window.

Core Dependencies

  • Polars: High-performance DataFrame library (CSV, JSON, Parquet, SQL features enabled).
  • eframe / egui: Immediate-mode GUI framework.
  • egui_extras: Additional widgets for egui (TableBuilder).
  • tokio: Asynchronous runtime for background tasks.
  • clap: Command-line argument parsing.
  • tracing / tracing-subscriber: Application logging.
  • thiserror: Error handling boilerplate.
  • rfd: Native file dialogs.
  • cfg-if: Conditional compilation helpers.
  • anstyle: Terminal styling for clap.

License

This project is licensed under the MIT License.