usv 0.17.1

Unicode Separated Values (USV) is a data markup format for spreadsheets, databases, content management, and more. USV builds on ASCII Separated Values (ASV) and contrasts with Comma Separated Values (CSV).

Unicode Separated Values (USV)

Unicode Separated Values (USV) is a data format that uses Unicode symbol characters to separate units, records, groups, and files.

The USV repo is https://github.com/sixarm/usv.

USV characters

Separators:

  • File Separator (FS) is U+001C or U+241C ␜

  • Group Separator (GS) is U+001D or U+241D ␝

  • Record Separator (RS) is U+001E or U+241E ␞

  • Unit Separator (US) is U+001F or U+241F ␟

Modifiers:

  • Escape (ESC) is U+001B or U+241B ␛

  • End of Transmission (EOT) is U+0004 or U+2404 ␄

Liners:

  • Carriage Return (CR) is U+000D

  • Line Feed (LF) is U+000A
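
Each pair of code points listed above differs by a fixed offset: the visible symbol sits in the Unicode Control Pictures block, exactly 0x2400 above its control character. The following is a minimal std-only sketch of that relationship. The constant and function names are hypothetical and are not part of the usv crate.

```rust
// Hypothetical names for illustration; these are not the usv crate's API.

/// Visible symbol forms of the four separators.
pub const UNIT_SEPARATOR_SYMBOL: char = '\u{241F}'; // ␟
pub const RECORD_SEPARATOR_SYMBOL: char = '\u{241E}'; // ␞
pub const GROUP_SEPARATOR_SYMBOL: char = '\u{241D}'; // ␝
pub const FILE_SEPARATOR_SYMBOL: char = '\u{241C}'; // ␜

/// Map a C0 control character (U+0000..=U+001F) to its visible symbol
/// in the Control Pictures block by adding 0x2400.
/// Returns `None` for any character outside that range.
pub fn control_to_symbol(c: char) -> Option<char> {
    let code = c as u32;
    if code <= 0x1F {
        char::from_u32(code + 0x2400)
    } else {
        None
    }
}
```

For example, `control_to_symbol('\u{001F}')` yields the ␟ symbol, which is handy when displaying data that uses the invisible control forms.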

Units

use usv::*;
let input = "a␟b␟";
let units: Units = input.units().collect();
assert_eq!(units, ["a", "b"]);
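
To make the unit rule concrete, here is a std-only sketch (not the crate's implementation) that reads each unit as the text ending at a ␟ symbol. The function name `split_units` is hypothetical, and the sketch ignores escapes, liners, and the control-character form of the separator.

```rust
/// Split a USV string into units, treating each ␟ (U+241F) as a terminator.
/// Hypothetical helper for illustration; not part of the usv crate.
pub fn split_units(input: &str) -> Vec<String> {
    input
        // `split_terminator` does not yield a trailing empty string
        // after the final separator, which matches "a␟b␟" -> ["a", "b"].
        .split_terminator('\u{241F}')
        .map(String::from)
        .collect()
}
```

Because ␟ acts as a terminator rather than a delimiter, two adjacent separators produce an empty unit instead of being collapsed.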

Records

use usv::*;
let input = "a␟b␟␞c␟d␟␞";
let records: Records = input.records().collect();
assert_eq!(records, [["a", "b"],["c", "d"]]);

Groups

use usv::*;
let input = "a␟b␟␞c␟d␟␞␝e␟f␟␞g␟h␟␞␝";
let groups: Groups = input.groups().collect();
assert_eq!(groups, [[["a", "b"],["c", "d"]],[["e", "f"],["g", "h"]]]);

Files

use usv::*;
let input = "a␟b␟␞c␟d␟␞␝e␟f␟␞g␟h␟␞␝␜i␟j␟␞k␟l␟␞␝m␟n␟␞o␟p␟␞␝␜";
let files: Files = input.files().collect();
assert_eq!(files, [
    [[["a", "b"], ["c", "d"]], [["e", "f"], ["g", "h"]]],
    [[["i", "j"], ["k", "l"]], [["m", "n"], ["o", "p"]]],
]);
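
The four levels nest in the same way: a file contains groups, a group contains records, and a record contains units. The sketch below shows that nesting with std-only code, splitting on one separator symbol per level. It is an illustration under simplifying assumptions (symbol forms only, no escapes or liners), and the `parse_*` function names and local type aliases are hypothetical rather than the crate's API.

```rust
// Local aliases for this sketch only; hypothetical, not the usv crate's types.
type Unit = String;
type Record = Vec<Unit>;
type Group = Vec<Record>;
type File = Vec<Group>;

/// A record is a run of units, each terminated by ␟.
fn parse_record(s: &str) -> Record {
    s.split_terminator('\u{241F}').map(String::from).collect()
}

/// A group is a run of records, each terminated by ␞.
fn parse_group(s: &str) -> Group {
    s.split_terminator('\u{241E}').map(parse_record).collect()
}

/// A file is a run of groups, each terminated by ␝.
fn parse_file(s: &str) -> File {
    s.split_terminator('\u{241D}').map(parse_group).collect()
}

/// The top level is a run of files, each terminated by ␜.
fn parse_files(s: &str) -> Vec<File> {
    s.split_terminator('\u{241C}').map(parse_file).collect()
}
```

Each level only knows about its own separator and delegates the rest to the level below, so the depth of the resulting nested vectors mirrors the separator hierarchy.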

Token

A token is the underlying USV enumeration: parsing a string produces a sequence of these tokens as output.

pub enum Token {
    Unit(String),
    UnitSeparator,
    RecordSeparator,
    GroupSeparator,
    FileSeparator,
    EndOfTransmission,
}
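
As an illustration of how a string could map onto these variants, here is a std-only tokenizer sketch. It is not the crate's parser: the `tokenize` function is hypothetical, the enum is redeclared locally with added derives so the block stands alone, and the sketch assumes symbol-form separators only, with no escape or liner handling. It also assumes that parsing stops at End of Transmission.

```rust
// Local copy of the enum with derives added for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Unit(String),
    UnitSeparator,
    RecordSeparator,
    GroupSeparator,
    FileSeparator,
    EndOfTransmission,
}

/// Hypothetical tokenizer; an assumption-laden sketch, not the usv crate's API.
pub fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut text = String::new();
    for c in input.chars() {
        let separator = match c {
            '\u{241F}' => Token::UnitSeparator,
            '\u{241E}' => Token::RecordSeparator,
            '\u{241D}' => Token::GroupSeparator,
            '\u{241C}' => Token::FileSeparator,
            '\u{2404}' => Token::EndOfTransmission,
            _ => {
                // Ordinary character: accumulate it into the current unit.
                text.push(c);
                continue;
            }
        };
        // A unit separator always closes a unit, even an empty one;
        // other separators only flush text that is still pending.
        if separator == Token::UnitSeparator || !text.is_empty() {
            tokens.push(Token::Unit(std::mem::take(&mut text)));
        }
        let done = separator == Token::EndOfTransmission;
        tokens.push(separator);
        if done {
            // Assumption: everything after End of Transmission is ignored.
            return tokens;
        }
    }
    // Flush any trailing text that was never terminated.
    if !text.is_empty() {
        tokens.push(Token::Unit(text));
    }
    tokens
}
```

Under these assumptions, the input "a␟b␟␞" becomes a unit "a", a unit separator, a unit "b", a unit separator, and a record separator.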

Type aliases

  • Token = described above

  • Tokens = Vec<Token>

  • Unit = String

  • Units = Vec<Unit>

  • Record = Units

  • Records = Vec<Record>

  • Group = Records

  • Groups = Vec<Group>

  • File = Groups

  • Files = Vec<File>
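
Because the aliases nest, a records value is a vector of vectors of strings. The sketch below redeclares the first few aliases locally, with the element types written out as an assumption (`Vec<Unit>`, `Vec<Record>`), and shows the reverse of parsing: writing records back out as a USV string. The `records_to_usv` function is hypothetical and is not part of the usv crate.

```rust
// Local aliases for this sketch; the element types are assumptions.
pub type Unit = String;
pub type Units = Vec<Unit>;
pub type Record = Units;
pub type Records = Vec<Record>;

/// Hypothetical writer: terminate every unit with ␟ and every record with ␞.
pub fn records_to_usv(records: &[Record]) -> String {
    let mut out = String::new();
    for record in records {
        for unit in record {
            out.push_str(unit);
            out.push('\u{241F}'); // unit separator symbol
        }
        out.push('\u{241E}'); // record separator symbol
    }
    out
}
```

Writing a terminator after every unit and record, including the last, produces the same layout as the example inputs in the sections above.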