Polars View

A fast, interactive viewer for CSV, JSON and Parquet data, built with Polars and egui.

This project is inspired by and forked from the parqbench project.

Features

Fast Loading: Leverages Polars for efficient data loading and processing.
CSV, JSON and Parquet Support: Reads CSV, JSON (including newline-delimited JSON) and Parquet files.
Save As: Writes the current data to a file in CSV, JSON, NDJSON or Parquet format.
Metadata Display: Shows file metadata (column count, row count) and schema information.
Sorting Capabilities: Interactive column sorting (ascending/descending).
Drag and Drop: Supports dragging and dropping files directly onto the application window.
Asynchronous Operations: Uses Tokio for non-blocking file loading, saving, and query execution.
Error Handling: A custom error enum (PolarsViewError) produces informative error messages, which are displayed in a dedicated notification window.
Query Panel Configuration Options:
- CSV Delimiter: Specify the delimiter character for CSV files (automatic detection is also available).
- Schema Inference Length: Control the number of rows used to infer the data schema.
- Decimal Places: Set the desired decimal precision for floating-point numbers.
- Remove All-Null Columns: Optionally drop every column in which all values are null.
- Null Values (CSV): Specify custom strings, as a comma-separated list, that should be interpreted as null (missing) values when loading CSV files. This is useful when the data uses non-standard placeholders for missing values (e.g., "N/A", "NULL", "--").
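
As a rough illustration of the Null Values option, the sketch below splits a comma-separated list into individual null markers. It is not taken from the application's source: `parse_null_values` is a hypothetical name, and stripping surrounding double quotes (so that `""` or `" "` can be written explicitly) is an assumption about how such input might be handled.

```rust
/// Hypothetical helper: turns a comma-separated "Null Values" string
/// into the list of strings that should be read as null.
fn parse_null_values(input: &str) -> Vec<String> {
    input
        .split(',')
        .map(|item| {
            let trimmed = item.trim();
            // Assumption: surrounding double quotes are optional and are removed,
            // which lets the user spell out an empty string or a single space.
            match trimmed.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
                Some(inner) => inner.to_string(),
                None => trimmed.to_string(),
            }
        })
        .collect()
}
```

With this sketch, the input `"N/A", NULL, --` yields the three markers `N/A`, `NULL` and `--`.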

Building and Running

Prerequisites:
- Rust and Cargo (latest stable version recommended, minimum version 1.85).

Clone the Repository:

git clone https://github.com/claudiofsr/polars-view.git
cd polars-view

Build and Install:
```
cargo b -r && cargo install --path=.
```

Run:

polars-view [path_to_file] [options]

Replace [path_to_file] with the path to your CSV, JSON or Parquet file.
Use polars-view --help for a list of available options (delimiter, query, table name).

Tracing Options (for logging):

To enable detailed logging, use the RUST_LOG environment variable:

RUST_LOG=info polars-view [path_to_file] [options]  # General info logs
RUST_LOG=debug polars-view [path_to_file] [options] # More detailed logs
RUST_LOG=trace polars-view [path_to_file] [options] # Very detailed logs (for debugging)

You can also specify the log level for specific modules:

RUST_LOG=polars_view=debug,polars=info polars-view [path_to_file] [options]
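
The per-module syntax above is a comma-separated list of `target=level` pairs. The following sketch only illustrates that shape; it is a simplification, not the parser used by the tracing ecosystem. `parse_directives` is a hypothetical name, and a bare word without `=` is assumed to be a global level, as in the earlier `RUST_LOG=debug` examples.

```rust
/// Simplified model of a RUST_LOG value: each directive is an optional
/// target (module path) plus a level name.
fn parse_directives(spec: &str) -> Vec<(Option<String>, String)> {
    spec.split(',')
        .map(str::trim)
        .filter(|directive| !directive.is_empty())
        .map(|directive| match directive.split_once('=') {
            // "polars_view=debug" -> target "polars_view", level "debug"
            Some((target, level)) => (Some(target.to_string()), level.to_string()),
            // Assumption: a bare word such as "info" is a global level.
            None => (None, directive.to_string()),
        })
        .collect()
}
```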

Persistent Setting (for the current shell session):

In your shell (bash, zsh, etc.), use export:

export RUST_LOG=debug # Set the desired log level
polars-view [path_to_file] [options]

Examples:

polars-view data.parquet
polars-view --delimiter ';' data.csv --query "SELECT * FROM AllData WHERE Col_With_Integers > 10"
polars-view data.csv -q "Select Col_A, \`Col B\` From AllData Where Col_A < 30"
polars-view data.csv -q "Select Col_A, \"Col B\" From AllData Where Col_A < 30 And \`Col C\` > 5"
polars-view data.csv -q "Select Linhas, \"Valor Total do Item\" From AllData Where Linhas < 30"
RUST_LOG=info polars-view data.parquet

Usage

Open a File:
- Run the application with a file path as an argument.
- Use the "File" -> "Open" menu option.
- Drag and drop a CSV, JSON or Parquet file onto the application window.
Filtering:
- Use the "Query" panel to set CSV delimiter, schema inference length, and decimal places. The application will automatically attempt to detect the CSV delimiter.
- Enter SQL queries in the "SQL Query" field. The default table name is "AllData", but you can change it with the --table-name argument or in the "Query" panel.
- Click "Apply SQL Commands" to apply the filters and execute the SQL query (if provided).
- Check "Drop Null Cols" to remove any columns that contain only null values.
- Null Values (CSV): In the "Query" panel, enter a comma-separated list of strings you want to treat as null values in the "Null Values" field (e.g., "", " ", NA, NULL, --).
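
The effect of "Drop Null Cols" can be pictured with a small standalone sketch. It does not use Polars and is not the application's implementation: the `Column` type and `drop_all_null_columns` are hypothetical, with `None` standing in for a null cell.

```rust
/// Hypothetical column representation: a name plus its cells,
/// where `None` stands for a null value.
type Column = (String, Vec<Option<f64>>);

/// Keeps only the columns that contain at least one non-null value.
/// A column with zero rows has no non-null value and is dropped as well.
fn drop_all_null_columns(columns: Vec<Column>) -> Vec<Column> {
    columns
        .into_iter()
        .filter(|(_, values)| values.iter().any(|v| v.is_some()))
        .collect()
}
```

A column that mixes nulls and values is kept unchanged; only columns made up entirely of nulls are removed.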
Sorting:
- Click a column header in the table to sort by that column. The sort state cycles through Not Sorted ⇒ Descending ⇔ Ascending: the first click sorts in descending order, and each later click toggles between descending and ascending. The entire DataFrame is sorted, not just the visible rows.
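
The sort cycle described above amounts to a small state machine. A minimal sketch, assuming a hypothetical `SortState` enum (the names are illustrative and not taken from the application's source):

```rust
/// Hypothetical per-column sort state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SortState {
    NotSorted,
    Descending,
    Ascending,
}

impl SortState {
    /// State after one click on the column header:
    /// Not Sorted => Descending <=> Ascending.
    fn next(self) -> Self {
        match self {
            SortState::NotSorted => SortState::Descending,
            SortState::Descending => SortState::Ascending,
            SortState::Ascending => SortState::Descending,
        }
    }
}

/// Sorts a whole column of values according to the state,
/// leaving it untouched when the column is not sorted.
fn apply_sort(values: &mut Vec<i64>, state: SortState) {
    match state {
        SortState::NotSorted => {}
        SortState::Descending => values.sort_by(|a, b| b.cmp(a)),
        SortState::Ascending => values.sort(),
    }
}
```

Note that, in this model, a column never returns to Not Sorted once it has been clicked.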
Metadata:
- The "Metadata" panel shows file metadata.
- The "Schema" panel displays the data schema.
SQL Examples:

The interface includes a "SQL Command Examples" section with pre-defined SQL commands covering common filtering and aggregation operations. The examples are generated dynamically from the loaded file's schema, so they reflect its actual column names and data types. See the Polars SQL documentation for the supported syntax.
```
SELECT * FROM AllData;
SELECT * FROM AllData WHERE column_name > value;
SELECT column1, COUNT(*) FROM AllData GROUP BY column1;
```
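
To give an idea of how such examples could be derived from a schema, here is a small sketch. It is an assumption about the approach, not the application's code: `ColumnKind` and `example_queries` are hypothetical, and the generated queries simply mirror the three patterns shown above.

```rust
/// Hypothetical, coarse classification of a column's data type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnKind {
    Numeric,
    Text,
}

/// Builds example SQL strings for a table from its (name, kind) schema.
fn example_queries(table: &str, schema: &[(&str, ColumnKind)]) -> Vec<String> {
    let mut queries = vec![format!("SELECT * FROM {table};")];
    for (name, kind) in schema {
        // Double-quote names so that columns containing spaces stay valid.
        let column = format!("\"{name}\"");
        match kind {
            ColumnKind::Numeric => {
                queries.push(format!("SELECT * FROM {table} WHERE {column} > 0;"));
            }
            ColumnKind::Text => {
                queries.push(format!(
                    "SELECT {column}, COUNT(*) FROM {table} GROUP BY {column};"
                ));
            }
        }
    }
    queries
}
```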
Save As:
- Use the "File" -> "Save As..." menu option.
- Select one of the supported formats: CSV, JSON, NDJSON or Parquet.

Dependencies

Polars: High-performance DataFrame library.
eframe: Framework for running egui applications.
egui: Immediate-mode GUI library.
egui_extras: Additional widgets and helpers that complement egui.
clap: Command-line argument parser.
tokio: Asynchronous runtime.
tracing: Application-level tracing framework.
thiserror: Library for deriving the Error trait.
rfd: Native file dialogs.
anstyle: Print styled output to terminal.

License