SSV means Space-Separated Values, and is an alternative to CSV.
It is meant to be a cleaner format for human-written data, without the hassle of numbers containing commas, particularly in languages that use the comma as the decimal separator.
Rules
- Values are separated by a sequence of at least one spacing element.
- A spacing element is either a SPACE (byte value/codepoint 32) or a TAB (byte value/codepoint 9).
- The first value in a row can be preceded by spacing, which is ignored.
- The last value in a row can be succeeded by spacing, which is ignored.
- Rows of values are separated by line-breaks. A line-break is an LF (byte value/codepoint 10) optionally preceded by CR (byte value/codepoint 13).
- Values may be enclosed in quotes (`"`).
- Values must be enclosed in quotes in the following cases:
  - the value is empty;
  - the value contains any spacing element;
  - the value contains a line-break;
  - the value contains only quotes;
  - the value is the first thing in a row and starts with a HASH sign (`#`).
- Values containing quotes are encoded by duplicating the quotes.
- A line starting with the HASH sign (`#`) is ignored until the next line-break (or the end of the content). Such a line is considered a comment line.
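The rules above can be sketched as a small character-based parser. This is a minimal illustration, not this crate's implementation: `parse_ssv` and `read_quoted` are hypothetical names, and the sketch assumes that empty lines produce no row, that a `#` preceded only by spacing still starts a comment, and that quotes inside an unquoted value are kept verbatim.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Parses SSV text into rows of values (hypothetical helper, not the crate API).
fn parse_ssv(input: &str) -> Result<Vec<Vec<String>>, String> {
    let mut rows = Vec::new();
    let mut row: Vec<String> = Vec::new();
    let mut chars = input.chars().peekable();

    while let Some(&c) = chars.peek() {
        match c {
            // Spacing before, between and after values is skipped.
            ' ' | '\t' => {
                chars.next();
            }
            // CR is only accepted as the first half of a CR LF line-break.
            '\r' => {
                chars.next();
                if chars.peek() != Some(&'\n') {
                    return Err("CR must be followed by LF".to_string());
                }
            }
            // LF ends the current row; lines without values yield no row (assumption).
            '\n' => {
                chars.next();
                if !row.is_empty() {
                    rows.push(std::mem::take(&mut row));
                }
            }
            // A HASH before any value in the row starts a comment up to the line-break.
            '#' if row.is_empty() => {
                while let Some(&next) = chars.peek() {
                    if next == '\n' {
                        break;
                    }
                    chars.next();
                }
            }
            // Quoted value: consume the opening quote, then read up to the closing one.
            '"' => {
                chars.next();
                row.push(read_quoted(&mut chars)?);
            }
            // Unquoted value: runs until spacing, a line-break or the end of input.
            _ => {
                let mut value = String::new();
                while let Some(&next) = chars.peek() {
                    if matches!(next, ' ' | '\t' | '\r' | '\n') {
                        break;
                    }
                    value.push(next);
                    chars.next();
                }
                row.push(value);
            }
        }
    }
    if !row.is_empty() {
        rows.push(row);
    }
    Ok(rows)
}

/// Reads the body of a quoted value; a doubled quote decodes to a single quote.
fn read_quoted(chars: &mut Peekable<Chars<'_>>) -> Result<String, String> {
    let mut value = String::new();
    loop {
        match chars.next() {
            None => return Err("unterminated quoted value".to_string()),
            Some('"') => {
                if chars.peek() == Some(&'"') {
                    chars.next();
                    value.push('"');
                } else {
                    return Ok(value);
                }
            }
            Some(other) => value.push(other),
        }
    }
}
```

Calling `parse_ssv` on text that mixes comment lines, CR LF line-breaks and quoted values returns one `Vec<String>` per non-empty row, or an error message for an unterminated quote or a stray CR.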
Example
The following content:
| # | Name | Age | Note |
|---|---|---|---|
| 1 | John Doe | 53 | a.k.a. "Joe" |
| 77 | Mary | 23 | |
can be encoded as:
"#" Name Age Note
1 "John Doe" 53 "a.k.a. ""Joe"""
77 Mary 23 ""
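The encoding used in this example can be sketched as follows. The function names `needs_quotes`, `encode_value` and `encode_row` are hypothetical and not part of this crate. The sketch makes one conservative assumption: it quotes every value that contains a quote or a lone CR, which the rules allow because any value may be quoted.

```rust
/// Decides whether a value has to be written inside quotes.
/// Conservative assumption: any value containing a quote or a CR is quoted.
fn needs_quotes(value: &str, first_in_row: bool) -> bool {
    value.is_empty()
        || value
            .chars()
            .any(|c| matches!(c, ' ' | '\t' | '\r' | '\n' | '"'))
        || (first_in_row && value.starts_with('#'))
}

/// Encodes a single value, doubling any quotes it contains.
fn encode_value(value: &str, first_in_row: bool) -> String {
    if needs_quotes(value, first_in_row) {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Encodes a row, separating the values with a single SPACE.
fn encode_row(values: &[&str]) -> String {
    values
        .iter()
        .enumerate()
        .map(|(index, value)| encode_value(value, index == 0))
        .collect::<Vec<_>>()
        .join(" ")
}
```

With these helpers, `encode_row(&["77", "Mary", "23", ""])` produces the last line of the example, where the empty note becomes `""`.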
Bytes and Chars - modules, imports
This SSV library has a single generic implementation that is specialized for two "domains":
| domain | Element | String | StringSlice |
|---|---|---|---|
| bytes | [u8] | [Vec<u8>] | &[u8] |
| chars | [char] | [String] | &str |
The generic implementation is in the [engine] module.
The modules [bytes] and [chars] provide specializations that are aliases
for types in the [engine] module. Code using this crate should not refer
to the [engine] module directly, only to the specialization modules.
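To illustrate the idea of one generic implementation specialized per domain, here is a minimal sketch. It does not reproduce this crate's actual types: `Domain`, `ByteDomain`, `CharDomain`, `ValueBuilder` and the two aliases are hypothetical names, and the sketch is kept in a single scope for brevity, whereas in the crate the aliases live in the `bytes` and `chars` modules.

```rust
/// Hypothetical trait describing a domain: its element type and owned string type.
trait Domain {
    type Element: Copy + PartialEq;
    type String: Default;
    const SPACE: Self::Element;
    const TAB: Self::Element;
    /// Appends one element to an owned string of this domain.
    fn push(target: &mut Self::String, element: Self::Element);
}

struct ByteDomain;
impl Domain for ByteDomain {
    type Element = u8;
    type String = Vec<u8>;
    const SPACE: u8 = b' ';
    const TAB: u8 = b'\t';
    fn push(target: &mut Vec<u8>, element: u8) {
        target.push(element);
    }
}

struct CharDomain;
impl Domain for CharDomain {
    type Element = char;
    type String = String;
    const SPACE: char = ' ';
    const TAB: char = '\t';
    fn push(target: &mut String, element: char) {
        target.push(element);
    }
}

/// Generic code written once: collects the elements of an unquoted value.
struct ValueBuilder<D: Domain> {
    buffer: D::String,
}

impl<D: Domain> ValueBuilder<D> {
    fn new() -> Self {
        Self { buffer: Default::default() }
    }

    /// Appends an element; returns false (and stores nothing) for spacing.
    fn push(&mut self, element: D::Element) -> bool {
        if element == D::SPACE || element == D::TAB {
            return false;
        }
        D::push(&mut self.buffer, element);
        true
    }

    fn finish(self) -> D::String {
        self.buffer
    }
}

// Specializations are plain type aliases of the generic type.
type BytesValueBuilder = ValueBuilder<ByteDomain>;
type CharsValueBuilder = ValueBuilder<CharDomain>;
```

User code would then name only the aliases, for example `BytesValueBuilder::new()`, and never the generic type, which mirrors the recommendation to avoid direct references to the generic module.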
Reading SSV
Given a byte reader (a value implementing the [std::io::Read] trait),
SSV can be read with:
- `Tokenizer` - an iterator that validates and returns tokens, including spacing, line-breaks and comments.
- `Reader` - an iterator that returns rows. Each row is a `Vec` of values.
- `read` - a utility function that creates a `Reader` object.

There is also the `read_file` function that reads
from a file given its path.
Writing SSV
Given a byte writer (a value implementing the [std::io::Write] trait),
SSV can be written with:
- `FluentWriter` - an object that writes items with a fluent interface. Delimiters such as spacing and line-breaks are automatically written when required.
- `Writer` - an object that writes in a row-oriented way.
- `write` - a utility function that uses a `Writer` object to write SSV content.

There is also the `write_file` function that
writes to a file given its path.
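As an illustration of what a fluent interface over a [std::io::Write] value can look like, here is a minimal sketch. It is not this crate's `FluentWriter`: the type `FluentSketch` and its methods `value`, `end_row` and `finish` are hypothetical. In this sketch only the spacing between values is written automatically, the line-break is requested explicitly with `end_row`, and quoting is conservative (any value containing a quote is quoted).

```rust
use std::io::{self, Write};

/// Hypothetical fluent writer that inserts the spacing between values itself.
struct FluentSketch<W: Write> {
    inner: W,
    row_started: bool,
}

impl<W: Write> FluentSketch<W> {
    fn new(inner: W) -> Self {
        Self { inner, row_started: false }
    }

    /// Writes one value, preceded by a SPACE unless it is the first in its row.
    fn value(mut self, value: &str) -> io::Result<Self> {
        if self.row_started {
            self.inner.write_all(b" ")?;
        }
        // A leading HASH only needs quoting when the value opens the row.
        let comment_like = !self.row_started && value.starts_with('#');
        let must_quote = value.is_empty()
            || comment_like
            || value.contains(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n' | '"'));
        if must_quote {
            write!(self.inner, "\"{}\"", value.replace('"', "\"\""))?;
        } else {
            self.inner.write_all(value.as_bytes())?;
        }
        self.row_started = true;
        Ok(self)
    }

    /// Ends the current row with an LF line-break.
    fn end_row(mut self) -> io::Result<Self> {
        self.inner.write_all(b"\n")?;
        self.row_started = false;
        Ok(self)
    }

    /// Gives back the underlying writer.
    fn finish(self) -> W {
        self.inner
    }
}
```

Inside a function that returns `io::Result`, a chain such as `FluentSketch::new(Vec::new()).value("77")?.value("Mary")?.end_row()?.finish()` yields the written bytes. Any type implementing `Write`, such as a `File`, can take the place of the `Vec<u8>`.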