dsq-formats: File format support for dsq
This crate provides comprehensive support for reading and writing various structured data formats including CSV, Parquet, JSON, and more.
§Features
- Format Detection: Automatic format detection from file extensions and content
- Unified Interface: Consistent reader/writer traits across all formats
- Performance: Optimized implementations using Polars DataFrames
- Extensibility: Easy to add new formats with macro-based boilerplate reduction
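As an illustration of the format detection feature, the sketch below tries the file extension first and falls back to the leading bytes of the content. It does not use the crate's API: `SketchFormat`, `from_extension`, `from_content`, and `detect` are hypothetical names, and the real `DataFormat` and `detect_format_from_content` may look and behave differently. The magic bytes are the published file signatures for Parquet (`PAR1`), Arrow IPC files (`ARROW1`), and Avro object container files (`Obj` followed by `0x01`).

```rust
use std::path::Path;

/// Hypothetical stand-in for the crate's `DataFormat` enum; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchFormat {
    Csv,
    Tsv,
    Parquet,
    Json,
    JsonLines,
    Arrow,
    Avro,
}

/// Map a file extension to a format, ignoring ASCII case.
pub fn from_extension(path: &str) -> Option<SketchFormat> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "csv" => Some(SketchFormat::Csv),
        "tsv" => Some(SketchFormat::Tsv),
        "parquet" => Some(SketchFormat::Parquet),
        "json" => Some(SketchFormat::Json),
        "jsonl" | "ndjson" => Some(SketchFormat::JsonLines),
        "arrow" => Some(SketchFormat::Arrow),
        "avro" => Some(SketchFormat::Avro),
        _ => None,
    }
}

/// Guess a format from the first bytes of the content.
pub fn from_content(bytes: &[u8]) -> Option<SketchFormat> {
    // Binary formats are identified by their well-known magic bytes.
    if bytes.starts_with(b"PAR1") {
        return Some(SketchFormat::Parquet);
    }
    if bytes.starts_with(b"ARROW1") {
        return Some(SketchFormat::Arrow);
    }
    if bytes.starts_with(b"Obj\x01") {
        return Some(SketchFormat::Avro);
    }
    // Text heuristic: a leading '[' or '{' is treated as JSON. JSON Lines
    // cannot be told apart from JSON by the first byte, so it is not guessed.
    let first = bytes.iter().copied().find(|b| !b.is_ascii_whitespace())?;
    match first {
        b'[' | b'{' => Some(SketchFormat::Json),
        _ => None,
    }
}

/// Prefer the extension; fall back to content sniffing when it is unknown.
pub fn detect(path: &str, head: &[u8]) -> Option<SketchFormat> {
    from_extension(path).or_else(|| from_content(head))
}
```

Checking the extension first keeps the common case cheap, since content only needs to be inspected when the file name is missing or unrecognized. Delimited text such as CSV has no signature, so this sketch leaves it undetected when the extension is absent.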
§Supported Formats
§Input Formats
- CSV (.csv) - Comma-separated values with customizable options
- TSV (.tsv) - Tab-separated values
- Parquet (.parquet) - Columnar storage with compression
- JSON (.json) - Standard JSON arrays and objects
- JSON Lines (.jsonl, .ndjson) - Newline-delimited JSON
- Arrow (.arrow) - Apache Arrow IPC format
- Avro (.avro) - Apache Avro serialization
§Output Formats
All input formats plus:
- Excel (.xlsx) - Microsoft Excel format
- ORC (.orc) - Optimized Row Columnar format
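The split between input and output formats can be modeled as capability checks on a format enum. The following is a minimal sketch under that assumption: `Format`, `extensions`, `can_read`, `can_write`, and `readable_extensions` are hypothetical names chosen for illustration and are not taken from the crate.

```rust
/// Hypothetical enum covering the formats listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Csv,
    Tsv,
    Parquet,
    Json,
    JsonLines,
    Arrow,
    Avro,
    Excel,
    Orc,
}

impl Format {
    /// Every format in the sketch, in the order listed in the documentation.
    pub const ALL: [Format; 9] = [
        Format::Csv,
        Format::Tsv,
        Format::Parquet,
        Format::Json,
        Format::JsonLines,
        Format::Arrow,
        Format::Avro,
        Format::Excel,
        Format::Orc,
    ];

    /// File extensions associated with the format, without the leading dot.
    pub fn extensions(self) -> &'static [&'static str] {
        match self {
            Format::Csv => &["csv"],
            Format::Tsv => &["tsv"],
            Format::Parquet => &["parquet"],
            Format::Json => &["json"],
            Format::JsonLines => &["jsonl", "ndjson"],
            Format::Arrow => &["arrow"],
            Format::Avro => &["avro"],
            Format::Excel => &["xlsx"],
            Format::Orc => &["orc"],
        }
    }

    /// Excel and ORC are listed as output-only, so they cannot be read.
    pub fn can_read(self) -> bool {
        !matches!(self, Format::Excel | Format::Orc)
    }

    /// Output covers all input formats plus Excel and ORC, so every format is writable.
    pub fn can_write(self) -> bool {
        true
    }
}

/// Collect the extensions of all formats that can be used as input.
pub fn readable_extensions() -> Vec<&'static str> {
    Format::ALL
        .iter()
        .filter(|f| f.can_read())
        .flat_map(|f| f.extensions().iter().copied())
        .collect()
}
```

A caller could use such checks to reject an `.xlsx` or `.orc` input path early with a clear error instead of failing inside a reader.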
§Architecture
The format system is built around:
- DataFormat - Enum representing all supported formats
- DataReader / DataWriter - Traits for reading/writing data
- Format-specific implementations with consistent option structs
- Macros to reduce boilerplate for new format implementations
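The pattern described above, a pair of reader/writer traits plus a macro that stamps out per-format implementations, might look roughly like the sketch below. Everything in it is an assumption made for illustration: `SketchReader`, `SketchWriter`, `Rows`, and `delimited_format!` are hypothetical names, the real `DataReader` and `DataWriter` traits are not reproduced here, and plain string rows stand in for Polars DataFrames. Quoting and escaping are deliberately ignored.

```rust
use std::io::{self, BufRead, Write};

/// Rows of string cells; a simplified stand-in for a Polars DataFrame.
pub type Rows = Vec<Vec<String>>;

/// Hypothetical reader trait shared by every format.
pub trait SketchReader {
    fn read_rows<R: BufRead>(&self, input: R) -> io::Result<Rows>;
}

/// Hypothetical writer trait shared by every format.
pub trait SketchWriter {
    fn write_rows<W: Write>(&self, rows: &Rows, output: W) -> io::Result<()>;
}

/// Generate a unit struct plus both trait impls for a delimited text format.
/// Adding a new delimiter-based format then takes a single macro call.
macro_rules! delimited_format {
    ($name:ident, $delim:expr) => {
        #[derive(Debug, Default, Clone, Copy)]
        pub struct $name;

        impl SketchReader for $name {
            fn read_rows<R: BufRead>(&self, input: R) -> io::Result<Rows> {
                let mut rows = Rows::new();
                for line in input.lines() {
                    // Propagate I/O errors, then split the line on the delimiter.
                    let line = line?;
                    rows.push(line.split($delim).map(String::from).collect());
                }
                Ok(rows)
            }
        }

        impl SketchWriter for $name {
            fn write_rows<W: Write>(&self, rows: &Rows, mut output: W) -> io::Result<()> {
                let sep = $delim.to_string();
                for row in rows {
                    writeln!(output, "{}", row.join(sep.as_str()))?;
                }
                Ok(())
            }
        }
    };
}

// Two formats defined with one line each.
delimited_format!(CsvSketch, ',');
delimited_format!(TsvSketch, '\t');
```

With these definitions, converting between formats is a read through one type followed by a write through another, for example `CsvSketch.read_rows(input)` followed by `TsvSketch.write_rows(&rows, output)`. Because both sides share the same traits, calling code does not need to know which concrete format it is handling.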
Re-exports§
pub use error::Error;
pub use error::FormatError;
pub use error::Result;
pub use format::detect_format_from_content;
pub use format::DataFormat;
pub use format::FormatOptions;
pub use reader::FormatReadOptions;
pub use reader::ReadOptions;
pub use writer::AvroCompression;
pub use writer::CompressionLevel;
pub use writer::CsvEncoding;
pub use writer::FormatWriteOptions;
pub use writer::OrcCompression;
pub use writer::WriteOptions;
Modules§
- adt
- ADT (ASCII Delimited Text) format reading and writing
- csv
- CSV format reading and writing
- error
- Error types and result handling
- format
- File format detection and metadata
- json
- JSON format reading and writing
- parquet
- Parquet format reading and writing
- reader
- Generic data reader interface
- writer
- Generic data writer interface
Structs§
- BuildInfo
- Build information structure
Constants§
- BUILD_INFO
- Build information for dsq-formats
- VERSION
- Version information