PolarsView

A fast and interactive viewer for CSV, JSON (including Newline-Delimited JSON - NDJSON), and Apache Parquet files, built with Polars and egui.
This project is inspired by and initially forked from the parqbench project.
Features
- Fast Data Handling: Leverages the high-performance Polars DataFrame library for efficient loading, processing, and querying.
- Multiple File Format Support:
- Load data from: CSV, JSON, NDJSON, Parquet.
- Save data as: CSV, JSON, NDJSON, Parquet (via "Save As...").
- Interactive Table View:
- Displays data in a scrollable and resizable table using egui_extras::TableBuilder.
- Sorting: Click column headers to sort the entire DataFrame (cycles through Not Sorted → Descending ↔ Ascending).
- Column Sizing: Choose between automatically sizing columns to content (Format > Expand Cols = true) or using faster initial calculated widths (Expand Cols = false).
- SQL Querying: Filter and transform data using Polars' SQL interface. Specify the query in the "Query" panel and click "Apply SQL Commands".
- Configuration Panels:
- Metadata: Displays file information (row count, column count).
- Schema: Shows column names and their Polars data types (right-click column name to copy).
- Format:
- Alignment: Customize text alignment (Left, Center, Right) for different data types.
- Decimals: Control the number of decimal places displayed for float columns.
- Expand Cols: Toggle column auto-sizing behavior.
- Query:
- SQL Query: Enter SQL commands (default table name: AllData).
- Remove Null Cols: Option to automatically drop columns containing only null values upon loading or applying SQL.
- Schema Inference Length: (CSV/JSON/NDJSON) Control the number of rows used for schema detection.
- CSV Delimiter: Specify the delimiter (auto-detection attempted).
- Null Values (CSV): Define custom strings (comma-separated) to be interpreted as nulls (e.g., "", "NA", <N/A>).
- SQL Examples: Provides context-aware SQL command suggestions based on the loaded data schema.
- Drag and Drop: Load files by dragging and dropping them onto the application window.
- Asynchronous Operations: Uses Tokio for non-blocking file loading, saving, sorting, and SQL execution, keeping the UI responsive. Data state updates (load, sort, format) happen asynchronously, and their results are integrated back into the UI once they complete.
- Robust Error Handling: Utilizes a custom PolarsViewError enum and displays errors clearly in a non-blocking notification window.
- Theming: Switch between Light and Dark themes.
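The header-click sort cycle listed under Interactive Table View (Not Sorted → Descending ↔ Ascending) can be pictured as a small state machine. This is only a sketch: the `SortState` enum and `on_click` method are hypothetical names chosen for illustration and are not taken from the PolarsView source.

```rust
/// Hypothetical sort state for a single column header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SortState {
    NotSorted,
    Descending,
    Ascending,
}

impl SortState {
    /// Returns the state that follows one click on the header.
    fn on_click(self) -> SortState {
        match self {
            // The first click leaves the unsorted state.
            SortState::NotSorted => SortState::Descending,
            // Every later click toggles between the two directions.
            SortState::Descending => SortState::Ascending,
            SortState::Ascending => SortState::Descending,
        }
    }
}
```

In this model a column never returns to Not Sorted by clicking, which is what the one-way arrow in the cycle suggests.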
Architecture Overview
PolarsView uses eframe for the application framework and egui for the immediate-mode GUI. Core application state (PolarsViewApp) manages UI layout, event handling, and orchestrates background tasks via a shared tokio runtime.
Data and its associated configuration (filters, format) are held within DataFrameContainer, primarily using Arc to allow cheap cloning and sharing between the main UI thread and asynchronous tasks spawned by tokio. State updates (loading new data, applying sorts, changing format) typically result in creating a new Arc<DataFrameContainer> instance. Communication between the UI thread and completed async tasks uses tokio::sync::oneshot channels.
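The sharing pattern in the paragraph above can be sketched with the standard library alone. To keep the example dependency-free, `std::thread` and `std::sync::mpsc` stand in for the tokio task and the `tokio::sync::oneshot` channel, and the `Container` struct is a hypothetical stand-in for DataFrameContainer that holds plain integers rather than a Polars DataFrame.

```rust
use std::sync::{mpsc, Arc};
use std::thread;

/// Hypothetical stand-in for DataFrameContainer.
#[derive(Debug)]
struct Container {
    rows: Vec<i64>,
}

/// Sorts in the background and returns a receiver for the new container.
/// Assumption: a thread plus an mpsc channel replace the tokio task and
/// the oneshot channel used by the application.
fn sort_in_background(current: Arc<Container>) -> mpsc::Receiver<Arc<Container>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Build a new container; the one the UI still holds is left untouched.
        let mut rows = current.rows.clone();
        rows.sort_unstable_by(|a, b| b.cmp(a)); // descending order
        // Sending fails only if the receiver was dropped, which is safe to ignore.
        let _ = tx.send(Arc::new(Container { rows }));
    });
    rx
}
```

Cloning the `Arc` only bumps a reference count, so the UI can keep displaying the old container while the task works. An immediate-mode UI would typically poll the receiver with `try_recv` once per frame rather than block on `recv`.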
Change detection for UI settings (in Format and Query panels) relies on comparing the state of DataFormat or DataFilters before and after rendering the UI controls within a single frame. If a difference is detected, the corresponding async update function is triggered.
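A minimal sketch of this before/after comparison follows. The two fields on `DataFormat` and the `render_and_detect` helper are assumptions made for illustration; the real struct and update functions may look different.

```rust
/// Hypothetical subset of the format settings.
#[derive(Debug, Clone, PartialEq)]
struct DataFormat {
    decimals: usize,
    expand_cols: bool,
}

/// Runs the UI controls for one frame and returns the new settings
/// only if the controls changed them.
fn render_and_detect<F>(format: &mut DataFormat, draw_controls: F) -> Option<DataFormat>
where
    F: FnOnce(&mut DataFormat),
{
    // Snapshot the settings before the widgets run.
    let before = format.clone();
    // Widgets may mutate the settings in place.
    draw_controls(format);
    if *format != before {
        // The caller would start the async update with this value.
        Some(format.clone())
    } else {
        None
    }
}
```

Deriving `PartialEq` keeps the comparison to a single expression, at the cost of cloning the settings once per frame, which is cheap for a small struct like this.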
The main table rendering relies heavily on egui_extras::TableBuilder for performance, and formatting/alignment logic is delegated based on DataFormat settings.
For a visual representation of module relationships, see chart.txt.
Building and Running
- Prerequisites:
- Rust and Cargo (latest stable version recommended; minimum version 1.85, as defined in Cargo.toml).
- Clone the Repository:
- Build and Install (Release Mode):
    cargo build --release && cargo install --path=.
    # Or to install with the 'special' formatting feature (see decimal_and_layout_v2.rs):
    # cargo build --release --features special && cargo install --path=. --features special
This compiles the application in release mode (optimized) and installs the binary (polars-view) into your Cargo bin directory (~/.cargo/bin/ by default), making it available in your PATH.
- Run:
- If [path_to_file] is provided, the application will attempt to load it on startup. Supported formats: .csv, .json, .ndjson, .parquet.
- Use polars-view --help for a detailed list of available command-line options (e.g., --delimiter, --query, --table-name, --null-values).
- Logging/Tracing: Control log output using the RUST_LOG environment variable:
- export RUST_LOG=info (General information)
- export RUST_LOG=debug (Detailed information for debugging)
- export RUST_LOG=trace (Very detailed, for granular debugging)
- Combine levels: export RUST_LOG=polars_view=debug,polars=info
- Run directly: RUST_LOG=debug polars-view data.parquet
- Examples:
    # Using backticks or double quotes for column names with spaces:
    RUST_LOG=info
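Since the file to load is identified by its extension (.csv, .json, .ndjson, .parquet), the dispatch step can be sketched as below. The `FileFormat` enum and `detect_format` function are hypothetical, and the case-insensitive matching is an assumption rather than documented behavior.

```rust
use std::path::Path;

/// Hypothetical enum covering the four supported formats.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileFormat {
    Csv,
    Json,
    NdJson,
    Parquet,
}

/// Maps a path to a format by its extension.
/// Returns None for a missing or unsupported extension.
fn detect_format(path: &Path) -> Option<FileFormat> {
    // Assumption: extensions are compared case-insensitively.
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "csv" => Some(FileFormat::Csv),
        "json" => Some(FileFormat::Json),
        "ndjson" => Some(FileFormat::NdJson),
        "parquet" => Some(FileFormat::Parquet),
        _ => None,
    }
}
```

Returning `Option` leaves the caller free to turn an unsupported extension into whatever error it reports to the user.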
Usage Guide
- Opening Files:
- Provide the file path as a command-line argument.
- Use the "File" > "Open File..." menu (Ctrl+O).
- Drag and drop a supported file onto the application window.
- Viewing Data:
- The main panel displays the data in a table. Use horizontal and vertical scrollbars if needed.
- Column headers show the column names. Click to sort.
- Adjust column widths by dragging the separators between headers.
- Configuring View & Data:
- Expand the panels on the left ("Metadata", "Schema", "Format", "Query") to view information and change settings.
- Format Panel: Adjust alignment per data type, set float precision, and toggle column expansion (Expand Cols). Changes trigger an asynchronous update.
- Query Panel: Set CSV options (delimiter, nulls, schema inference), toggle Remove Null Cols, and define and apply SQL queries. Applying SQL or changing most query settings triggers an asynchronous data reload/requery.
- Applying SQL:
- Enter your query in the "SQL Query" text area (using AllData as the default table name unless changed via CLI or config).
- Click "Apply SQL Commands". The table will update after the query executes asynchronously. Refer to the Polars SQL documentation for supported syntax.
- Check the dynamically generated "SQL Command Examples" for syntax relevant to your data.
- Saving Data:
- Save (Ctrl+S): Overwrites the original file (if applicable) with the currently displayed data (after filtering/sorting). Use with caution.
- Save As... (Ctrl+A): Opens a dialog to save the currently displayed data to a new file. You can choose the output format (CSV, JSON, NDJSON, Parquet) and location.
- Exiting: Use "File" > "Exit" or close the window.
Core Dependencies
- Polars: High-performance DataFrame library (CSV, JSON, Parquet, SQL features enabled).
- eframe / egui: Immediate-mode GUI framework.
- egui_extras: Additional widgets for egui (TableBuilder).
- tokio: Asynchronous runtime for background tasks.
- clap: Command-line argument parsing.
- tracing / tracing-subscriber: Application logging.
- thiserror: Error handling boilerplate.
- rfd: Native file dialogs.
- cfg-if: Conditional compilation helpers.
License
This project is licensed under the MIT License.