Csv Lib
A high-performance, zero-copy CSV reader for Rust, optimized for extremely fast parsing using:
- Memory-mapped files (`memmap2`)
- SIMD acceleration (AVX2 on x86_64, NEON on aarch64)
- memchr3 fallback for broad CPU compatibility
- Configurable line breaks, delimiters, and string escaping
- Low memory overhead, even on massive datasets
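As a rough illustration of the memchr3-style fallback mentioned in the list above, the sketch below scans a byte buffer for the next occurrence of any of three bytes (delimiter, quote, line break) and uses that to split rows without copying. It uses only the standard library; `find_any3` and `split_rows` are hypothetical names for this example and are not part of this crate or of `memchr`.

```rust
/// Scalar stand-in for a memchr3-style search: index of the first byte
/// in `haystack` equal to any of the three needles.
/// (Hypothetical helper for illustration; not this crate's API.)
fn find_any3(a: u8, b: u8, c: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&x| x == a || x == b || x == c)
}

/// Splits `data` into rows, ignoring line breaks that appear inside
/// quoted fields. Every returned row borrows from `data` (zero-copy).
fn split_rows(data: &[u8], delimiter: u8, quote: u8, line_break: u8) -> Vec<&[u8]> {
    let mut rows = Vec::new();
    let mut row_start = 0;
    let mut pos = 0;
    let mut in_quotes = false;

    // Jump from one interesting byte to the next instead of
    // inspecting every byte in the main loop.
    while let Some(offset) = find_any3(delimiter, quote, line_break, &data[pos..]) {
        let i = pos + offset;
        let byte = data[i];
        if byte == quote {
            // An escaped quote ("") toggles twice, so the state is preserved.
            in_quotes = !in_quotes;
        } else if byte == line_break && !in_quotes {
            rows.push(&data[row_start..i]);
            row_start = i + 1;
        }
        // Delimiter hits are skipped here; a field parser would record them.
        pos = i + 1;
    }

    // Last row without a trailing line break.
    if row_start < data.len() {
        rows.push(&data[row_start..]);
    }
    rows
}
```

A SIMD implementation would replace `find_any3` with a vectorized comparison over 16 or 32 bytes at a time; the surrounding state machine would stay the same in this sketch.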
Features
- Memory-mapped CSV reading (no `BufReader` overhead)
- SIMD acceleration (AVX2, NEON) if available
- Fallback to `memchr3` for full CPU compatibility
- Per-row zero-copy parsing
- Per-field parsing using efficient iterators
- Support for custom delimiters and string escaping
- Support for column type mapping or auto-detection
- Optional FFI export for C, C++, Python, C#, and Java, among others
- Safe cursor management
- UTF-8, Windows-1252, and custom encoding support (`encoding_rs`)
Installation
Add this to your `Cargo.toml`:
[]
= "0.1"
or you can use cargo directly:
If you also want FFI support:
[]
= { = "0.1", = ["ffi"] }
or you can use cargo directly:
Basic Usage
Reading rows and fields from a CSV. I strongly recommend checking the Advanced Usage section of this guide.
use ;
CsvConfig
The `CsvConfig` structure allows full customization of CSV parsing.
let config = new;
Configurable options:
| Field | Description |
|---|---|
| `delimiter` | Field separator character (e.g., `b','`) |
| `string_separator` | Field quoting character (e.g., `b'"'`) |
| `line_break` | Line terminator character (e.g., `b'\n'`) |
| `encoder` | Character encoding (`encoding_rs`) |
| `type_map` | Optional mapping of columns to `DataType` |
| `force_memcach3` | Force fallback to `memchr3` |
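To make the byte-valued options in the table concrete, here is a minimal sketch of how such settings can be grouped and sanity-checked. `SketchConfig` is a hypothetical stand-in written for this example, not the crate's `CsvConfig`; the default values and the "three distinct bytes" rule are assumptions, and the encoder and type map are left out.

```rust
/// Hypothetical stand-in for illustration only; this is NOT the crate's `CsvConfig`.
#[derive(Debug, Clone, PartialEq)]
struct SketchConfig {
    delimiter: u8,
    string_separator: u8,
    line_break: u8,
    /// Plays the role of a "force the scalar fallback" switch.
    force_scalar_scan: bool,
}

impl Default for SketchConfig {
    // Assumed defaults: comma, double quote, newline, SIMD allowed.
    fn default() -> Self {
        Self {
            delimiter: b',',
            string_separator: b'"',
            line_break: b'\n',
            force_scalar_scan: false,
        }
    }
}

impl SketchConfig {
    /// Convenience constructor for semicolon-separated files.
    fn semicolon_separated() -> Self {
        Self { delimiter: b';', ..Self::default() }
    }

    /// Assumed rule: the three structural bytes must differ,
    /// otherwise the parser could not tell them apart.
    fn validate(&self) -> Result<(), String> {
        if self.delimiter == self.string_separator
            || self.delimiter == self.line_break
            || self.string_separator == self.line_break
        {
            return Err(
                "delimiter, string_separator and line_break must be distinct bytes".to_string(),
            );
        }
        Ok(())
    }
}
```

Keeping each option as a single `u8` is what lets a scanner compare raw bytes directly, with no decoding step in the hot loop.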
Advanced Usage:
Field Parsing with AutoDetect and DataTypes
InRowIter
Overview
`InRowIter` is a zero-copy iterator over fields inside a row:
let row = b"field1;field2;\"field;with;delimiter\";field4";
let mut iter = new;
while let Some = iter.next
Features:
| Feature | Description |
|---|---|
| Field retrieval by index | Access any field directly using its column index. Raw extraction performs no allocation. |
| String separator handling | Correctly processes fields enclosed in string separators. |
| Escaped quote support | Parses embedded quotes inside quoted fields (`""` → `"`). |
| Efficient field counting | Counts the number of fields in a row without allocation. |
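The behaviours in the table can be sketched with a small standard-library iterator. This is an illustration of the general technique, not the crate's `InRowIter`: `FieldIter`, `strip_quotes`, and `unescape` are hypothetical names, and the real type may differ in signature and behaviour.

```rust
use std::borrow::Cow;

/// Hypothetical zero-copy field iterator (not the crate's `InRowIter`).
struct FieldIter<'a> {
    row: &'a [u8],
    pos: usize,
    delimiter: u8,
    quote: u8,
    done: bool,
}

impl<'a> FieldIter<'a> {
    fn new(row: &'a [u8], delimiter: u8, quote: u8) -> Self {
        Self { row, pos: 0, delimiter, quote, done: false }
    }
}

impl<'a> Iterator for FieldIter<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        if self.done {
            return None;
        }
        let row = self.row; // copy of the shared reference, keeps lifetime 'a
        let start = self.pos;
        let mut in_quotes = false;
        for i in start..row.len() {
            let b = row[i];
            if b == self.quote {
                in_quotes = !in_quotes; // "" toggles twice, state is preserved
            } else if b == self.delimiter && !in_quotes {
                self.pos = i + 1;
                return Some(strip_quotes(&row[start..i], self.quote));
            }
        }
        self.done = true;
        Some(strip_quotes(&row[start..], self.quote))
    }
}

/// Drops one pair of enclosing quotes, still without copying.
fn strip_quotes(field: &[u8], quote: u8) -> &[u8] {
    if field.len() >= 2 && field[0] == quote && field[field.len() - 1] == quote {
        &field[1..field.len() - 1]
    } else {
        field
    }
}

/// Collapses doubled quotes; allocates only when the field contains one.
fn unescape(field: &[u8], quote: u8) -> Cow<'_, [u8]> {
    if !field.windows(2).any(|w| w[0] == quote && w[1] == quote) {
        return Cow::Borrowed(field);
    }
    let mut out = Vec::with_capacity(field.len());
    let mut i = 0;
    while i < field.len() {
        out.push(field[i]);
        let doubled = field[i] == quote && i + 1 < field.len() && field[i + 1] == quote;
        i += if doubled { 2 } else { 1 };
    }
    Cow::Owned(out)
}
```

With this sketch, `FieldIter::new(row, b';', b'"').nth(2)` retrieves a field by index and `.count()` counts fields, both without allocating; only `unescape` may allocate, and only when a doubled quote is present.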
Performance Tips
If you benchmark the library, do it in `release` mode. The difference is huge: the debug profile produces unoptimized code, and processing times are far worse.
- Use `force_memcach3 = false` to take advantage of SIMD (AVX2 or NEON).
- Match your `delimiter`, `line_break`, and `string_separator` to the file format.
- Prefer UTF-8 / Windows-1252 encodings for maximum parsing speed.
- Process fields immediately, without copying them if possible (`&[u8]` slices).
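The last tip can be illustrated with a short standard-library sketch that consumes fields as `&[u8]` slices and never builds a `String`. `parse_u64` and `sum_column` are hypothetical helpers for this example; the sketch assumes unquoted ASCII-digit fields and does not use this crate's API.

```rust
/// Parses an unsigned integer directly from ASCII digits.
/// Returns `None` on empty input, non-digit bytes, or overflow.
fn parse_u64(field: &[u8]) -> Option<u64> {
    if field.is_empty() {
        return None;
    }
    let mut value: u64 = 0;
    for &b in field {
        if !b.is_ascii_digit() {
            return None;
        }
        value = value.checked_mul(10)?.checked_add(u64::from(b - b'0'))?;
    }
    Some(value)
}

/// Sums one numeric column. Rows and fields are borrowed slices of `data`;
/// fields that are not plain numbers (such as a header) are skipped.
/// Assumes the sum fits in a `u64` and that fields are not quoted.
fn sum_column(data: &[u8], column: usize, delimiter: u8, line_break: u8) -> u64 {
    data.split(|&b| b == line_break)
        .filter_map(|row| row.split(|&b| b == delimiter).nth(column))
        .filter_map(parse_u64)
        .sum()
}
```

Because every intermediate value is a slice into the original buffer, the only memory touched is the input itself, which fits well with memory-mapped files.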
Next Version
- Working to implement AVX-512 support, which can handle larger vectors. It will be an opt-in feature, because it is not compatible with common targets and is unstable on some Alder Lake (12th Gen Intel) processors.
- Parallel processing is planned.
- An async feature is planned.
Useful Links
The performance achieved was made possible by these 3 crates
Author
Made with passion by Ignacio Pérez Panizza 🇺🇾