This module contains a UniCode Mapping (.ucm) file format parser and all of the data files
available in the Unicode Consortium’s icu-data repository. For a list, see KNOWN_CHARSETS.
Most uses of this library should look like this:
use icu_data::ucm::{request_mapping_file, parser::parse as parse_ucm};
let f = request_mapping_file("java-EUC_JP-1.3_P").unwrap(); // holds the .ucm file as a String
let enc = parse_ucm(&f).unwrap(); // holds an `Encoding`
/* ... */
If you only want a single encoding, they are all in the module named mappings. Each one is a
lazy_static value, so it is only evaluated on first use. That evaluation can panic,
because it runs the same code as above, but they all work on my machine, and will only ever panic if
Brotli decompression or tar metadata parsing fails.
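To make the "only evaluated when used" behaviour concrete, here is a minimal sketch using only the standard library's `OnceLock` rather than `lazy_static`. The `Encoding` struct, `load_encoding`, and `example_encoding` below are hypothetical stand-ins, not this crate's items; the point is only that the initializer runs once, on first access, which is also where a panic in it would surface.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the crate's `Encoding` (only one field shown).
struct Encoding {
    codepoints: Vec<u32>,
}

/// Counts how often the initializer runs; for demonstration only.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the expensive "fetch the .ucm text, then parse it" step.
/// In the real crate, this is where a failure would turn into a panic.
fn load_encoding() -> Encoding {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    Encoding {
        codepoints: (0x41..=0x5A).collect(), // placeholder data: 'A'..='Z'
    }
}

/// The cell starts empty; nothing is loaded at program start.
static EXAMPLE_ENCODING: OnceLock<Encoding> = OnceLock::new();

/// The first call runs `load_encoding`; later calls reuse the stored value.
fn example_encoding() -> &'static Encoding {
    EXAMPLE_ENCODING.get_or_init(load_encoding)
}
```

Calling `example_encoding()` twice leaves `INIT_CALLS` at 1, so a program that never touches a given mapping never pays for loading it.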
Example:
use icu_data::ucm::mappings;
assert_eq!(mappings::JAVA_EUC_JP_1_3_P.codepoints.len(), 13139);
Modules§
- mappings: Lazily evaluated `static`s, holding an `Encoding` for each encoding
- parser: A `.ucm` file format (UniCode Mapping) Pest parser
Structs§
- Codepoint: This represents a `CHARMAP` row in a `.ucm` (UniCode Mapping) file.
- Encoding: This represents a single `.ucm` (UniCode Mapping) file.
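For readers unfamiliar with what a `CHARMAP` row holds, the following is a rough, hand-written sketch that splits one row of the common form `<UXXXX> \xNN... |n` into its parts. It is not the crate's Pest grammar, and the `Row` struct with its field names is hypothetical; the real `Codepoint` may be shaped differently. The trailing `|n` flag is assumed here to be what the equivalence type is derived from.

```rust
/// Hypothetical, simplified view of one CHARMAP row.
#[derive(Debug, PartialEq)]
struct Row {
    unicode: u32,   // the code point inside <U....>
    bytes: Vec<u8>, // the encoded byte sequence
    kind: u8,       // the numeric flag after '|'
}

/// Parses a single row such as `<U3000> \xA1\xA1 |0`.
/// Returns `None` for anything that does not match this simple shape
/// (comments, headers, multi-code-point rows, and so on).
fn parse_charmap_row(line: &str) -> Option<Row> {
    let mut parts = line.split_whitespace();

    // First field: <UXXXX>, hexadecimal code point.
    let hex = parts.next()?.strip_prefix("<U")?.strip_suffix('>')?;
    let unicode = u32::from_str_radix(hex, 16).ok()?;

    // Second field: one or more \xNN byte escapes, written back to back.
    let byte_field = parts.next()?;
    if !byte_field.starts_with("\\x") {
        return None;
    }
    let mut bytes = Vec::new();
    for piece in byte_field.split("\\x").skip(1) {
        bytes.push(u8::from_str_radix(piece, 16).ok()?);
    }

    // Third field: |n
    let kind = parts.next()?.strip_prefix('|')?.parse::<u8>().ok()?;

    Some(Row { unicode, bytes, kind })
}
```

A line such as `# comment` yields `None`, while `<U0041> \x41 |0` yields a `Row` with a single byte `0x41`.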
Enums§
- EquivalenceType: The “equivalence type” of the Unicode codepoint to the bytestring in the `Encoding`. The equivalence types are those defined by the Unicode Consortium.
- IcuDataError: Error type. You should only ever expect to see `UnknownMappingRequested` unless you’re doing development on the library.
Statics§
- KNOWN_CHARSETS: This is a list of all of the encodings known to `request_mapping_file`, without the `.ucm` extension.
Traits§
- PestParser: A trait with a single method that parses strings.
Functions§
- request_mapping_file: Given the name of an encoding known to ICU, return its raw UCM data as a String.
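Because the known names are listed without `.ucm`, a caller holding a file name may want to strip the extension before asking for a mapping. The sketch below shows one way to do that with stand-in types: `KNOWN`, `LookupError`, and `normalize_charset_name` are all hypothetical, and whether the crate's real `UnknownMappingRequested` variant carries the name as a payload is an assumption.

```rust
/// Hypothetical stand-in for the crate's `KNOWN_CHARSETS` (the real list is longer).
const KNOWN: &[&str] = &["java-EUC_JP-1.3_P"];

/// Hypothetical error type mirroring the one variant callers are told to expect.
#[derive(Debug, PartialEq)]
enum LookupError {
    UnknownMappingRequested(String),
}

/// Strips an optional `.ucm` suffix and checks the result against the known names.
fn normalize_charset_name(name: &str) -> Result<&'static str, LookupError> {
    let stem = name.strip_suffix(".ucm").unwrap_or(name);
    KNOWN
        .iter()
        .copied()
        .find(|known| *known == stem)
        .ok_or_else(|| LookupError::UnknownMappingRequested(stem.to_string()))
}
```

With the real crate, the normalized name would then be passed to `request_mapping_file`, and an unknown name would be handled by matching on the returned error instead of calling `unwrap()`.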