usv-0.16.1 has been yanked.
Unicode Separated Values (USV)
Unicode separated values (USV) is a data format that uses Unicode symbol characters between data parts.
The USV repo is https://github.com/sixarm/usv.
USV characters
Separators:
-
Unit Separator (US) is U+001F or U+241F ␟
-
Record Separator (RS) is U+001E or U+241E ␞
-
Group Separator (GS) is U+001D or U+241D ␝
-
File Separator (FS) is U+001C or U+241C ␜
Modifiers:
-
Escape (ESC) is U+001B or U+241B ␛
-
End of Transmission (EOT) is U+0004 or U+2404 ␄
Liners:
-
Carriage Return (CR) is U+000D
-
Line Feed (LF) is U+000A
Units
use *;
let input = "a␟b";
let units: Units = input.units.collect;
assert_eq!;
Records
use *;
let input = "a␟b␞c␟d";
let records: Records = input.records.collect;
assert_eq!;
Groups
use *;
let input = "a␟b␞c␟d␝e␟f␞g␟h";
let groups: Groups = input.groups.collect;
assert_eq!;
Files
use *;
let input = "a␟b␞c␟d␝e␟f␞g␟h␜i␟j␞k␟l␝m␟n␞o␟p";
let files: Files = input.files.collect;
assert_eq!;
Token
A token is the underlying USV enumeration for parsing a string to output:
pub enum Token {
Unit(String),
UnitSeparator,
RecordSeparator,
GroupSeparator,
FileSeparator,
EndOfTransmission,
}
Type aliases
-
Token = described above
-
Tokens = Vec
-
Unit = String
-
Units = Vec
-
Record = Units
-
Records = Vec
-
Group = Records
-
Groups = Vec
-
File = Groups
-
Files = Vec