This package contains an implementation of the IANA CHARSET specification.
These are the official names for character sets that may be used in the Internet and may be referred to in Internet documentation. These names are expressed in ANSI_X3.4-1968, which is commonly called US-ASCII or simply ASCII. The character set most commonly used in the Internet, and used especially in protocol standards, is US-ASCII; its use is strongly encouraged. The use of the name US-ASCII is also encouraged.
The character set names may be up to 40 characters taken from the printable characters of US-ASCII. However, no distinction is made between use of upper and lower case letters.
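The naming rule above can be illustrated with a small standalone sketch. It does not use this package's API; the function names are hypothetical, and "printable" is assumed here to mean the graphic characters 0x21 to 0x7E (space excluded). The registry itself contains at least one longer name (Extended_UNIX_Code_Packed_Format_for_Japanese is 45 characters), so a strict length check like this one is illustrative rather than something to apply to registry data.

```rust
/// Maximum length of a charset name, per the rule quoted above.
const MAX_NAME_LEN: usize = 40;

/// Returns true if `name` is 1 to 40 printable US-ASCII characters.
/// Assumption: "printable" means `is_ascii_graphic` (0x21..=0x7E),
/// so the space character is rejected.
fn is_valid_charset_name(name: &str) -> bool {
    !name.is_empty()
        && name.len() <= MAX_NAME_LEN
        && name.bytes().all(|b| b.is_ascii_graphic())
}

/// Compares two charset names without regard to ASCII letter case.
fn same_charset_name(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}
```

With these helpers, `same_charset_name("utf-8", "UTF-8")` holds, while a name containing a space or a non-ASCII character fails validation.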
The MIBenum value is a unique value for use in MIBs to identify coded character sets.
The value space for MIBenum values has been divided into three regions. The first region (3-999) consists of coded character sets that have been standardized by some standard setting organization. This region is intended for standards that do not have subset implementations. The second region (1000-1999) is for the Unicode and ISO/IEC 10646 coded character sets together with a specification of a (set of) sub-repertoires that may occur. The third region (>1999) is intended for vendor specific coded character sets.
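As a rough illustration of the three regions, the following standalone sketch classifies a MIBenum value by range. The `MibRegion` enum and `mib_region` function are hypothetical names introduced for this example and are not part of this package; values below 3 are treated as falling outside every region.

```rust
/// The three MIBenum regions described above (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MibRegion {
    /// 3-999: sets standardized by a standards organization.
    Standardized,
    /// 1000-1999: Unicode and ISO/IEC 10646 sets and sub-repertoires.
    Unicode,
    /// >1999: vendor-specific sets.
    VendorSpecific,
}

/// Maps a MIBenum value to its region, or `None` for values below 3.
fn mib_region(mib_enum: u32) -> Option<MibRegion> {
    match mib_enum {
        3..=999 => Some(MibRegion::Standardized),
        1000..=1999 => Some(MibRegion::Unicode),
        n if n >= 2000 => Some(MibRegion::VendorSpecific),
        _ => None,
    }
}
```

For instance, 4 (ISO_8859-1:1987) falls in the first region, 1015 (UTF-16) in the second, and 2252 (windows-1252) in the third.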
§Example
use codes_iana_charset as charset;
let latin_1 = charset::CHARSET_4;
assert_eq!(latin_1.id(), 4);
assert_eq!(latin_1.name(), "ISO_8859-1:1987");
assert_eq!(
    latin_1.source(),
    "[ISO-IR: International Register of Escape Sequences] Note: The current registration authority is IPSJ/ITSCJ, Japan.",
);
assert_eq!(latin_1.preferred_alias(), Some("ISO-8859-1"));
assert_eq!(latin_1.aliases(), &[
    "iso-ir-100",
    "ISO_8859-1",
    "ISO-8859-1",
    "latin1",
    "l1",
    "IBM819",
    "CP819",
    "csISOLatin1"
]);
assert_eq!(latin_1.reference(), Some("[RFC1345][Keld_Simonsen]"));
Note that the implementation of FromStr takes all aliases into account.
use codes_iana_charset as charset;
use std::str::FromStr;
let latin_1 = charset::CHARSET_4;
let iso_8859_1 = charset::CharacterSetCode::from_str("ISO_8859-1").unwrap();
assert_eq!(latin_1, iso_8859_1);
let some_charset = charset::CharacterSetCode::try_from(4).unwrap();
assert_eq!(some_charset, iso_8859_1);
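To make the idea of alias-aware lookup concrete, here is a minimal standalone sketch of resolving a string against both primary names and aliases. It is not this package's implementation: the `Entry` struct, the `TABLE` constant, and the `lookup` function are hypothetical, the table holds only two entries, and the case-insensitive comparison is an assumption based on the registry's naming rule rather than documented behavior of `FromStr` here.

```rust
/// One registry row: MIBenum, primary name, and aliases (hypothetical type).
struct Entry {
    id: u32,
    name: &'static str,
    aliases: &'static [&'static str],
}

/// A deliberately tiny, illustrative table; the real registry is far larger.
const TABLE: &[Entry] = &[
    Entry {
        id: 4,
        name: "ISO_8859-1:1987",
        aliases: &[
            "iso-ir-100", "ISO_8859-1", "ISO-8859-1", "latin1",
            "l1", "IBM819", "CP819", "csISOLatin1",
        ],
    },
    Entry {
        id: 106,
        name: "UTF-8",
        aliases: &["csUTF8"],
    },
];

/// Returns the MIBenum of the first entry whose name or any alias
/// matches `s`, ignoring ASCII case (an assumption of this sketch).
fn lookup(s: &str) -> Option<u32> {
    TABLE
        .iter()
        .find(|e| {
            e.name.eq_ignore_ascii_case(s)
                || e.aliases.iter().any(|a| a.eq_ignore_ascii_case(s))
        })
        .map(|e| e.id)
}
```

A linear scan is enough for a table of this size; a larger table might instead build a map keyed by lowercased names and aliases.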
§Features
Structs§
- CharacterSetCode - This type is used to encapsulate the numeric MIB enum for IANA-defined Character Sets.
Enums§
- CharacterSetCodeError - Common Error type, mainly used for FromStr failures.
Constants§
- ALL_CODES - Provides an array of all defined CharacterSetCode codes, useful for queries.
- CHARSET_3 - US-ASCII
- CHARSET_4 - ISO_8859-1:1987
- CHARSET_5 - ISO_8859-2:1987
- CHARSET_6 - ISO_8859-3:1988
- CHARSET_7 - ISO_8859-4:1988
- CHARSET_8 - ISO_8859-5:1988
- CHARSET_9 - ISO_8859-6:1987
- CHARSET_10 - ISO_8859-7:1987
- CHARSET_11 - ISO_8859-8:1988
- CHARSET_12 - ISO_8859-9:1989
- CHARSET_13 - ISO-8859-10
- CHARSET_14 - ISO_6937-2-add
- CHARSET_15 - JIS_X0201
- CHARSET_16 - JIS_Encoding
- CHARSET_17 - Shift_JIS
- CHARSET_18 - Extended_UNIX_Code_Packed_Format_for_Japanese
- CHARSET_19 - Extended_UNIX_Code_Fixed_Width_for_Japanese
- CHARSET_20 - BS_4730
- CHARSET_21 - SEN_850200_C
- CHARSET_22 - IT
- CHARSET_23 - ES
- CHARSET_24 - DIN_66003
- CHARSET_25 - NS_4551-1
- CHARSET_26 - NF_Z_62-010
- CHARSET_27 - ISO-10646-UTF-1
- CHARSET_28 - ISO_646.basic:1983
- CHARSET_29 - INVARIANT
- CHARSET_30 - ISO_646.irv:1983
- CHARSET_31 - NATS-SEFI
- CHARSET_32 - NATS-SEFI-ADD
- CHARSET_33 - NATS-DANO
- CHARSET_34 - NATS-DANO-ADD
- CHARSET_35 - SEN_850200_B
- CHARSET_36 - KS_C_5601-1987
- CHARSET_37 - ISO-2022-KR
- CHARSET_38 - EUC-KR
- CHARSET_39 - ISO-2022-JP
- CHARSET_40 - ISO-2022-JP-2
- CHARSET_41 - JIS_C6220-1969-jp
- CHARSET_42 - JIS_C6220-1969-ro
- CHARSET_43 - PT
- CHARSET_44 - greek7-old
- CHARSET_45 - latin-greek
- CHARSET_46 - NF_Z_62-010_(1973)
- CHARSET_47 - Latin-greek-1
- CHARSET_48 - ISO_5427
- CHARSET_49 - JIS_C6226-1978
- CHARSET_50 - BS_viewdata
- CHARSET_51 - INIS
- CHARSET_52 - INIS-8
- CHARSET_53 - INIS-cyrillic
- CHARSET_54 - ISO_5427:1981
- CHARSET_55 - ISO_5428:1980
- CHARSET_56 - GB_1988-80
- CHARSET_57 - GB_2312-80
- CHARSET_58 - NS_4551-2
- CHARSET_59 - videotex-suppl
- CHARSET_60 - PT2
- CHARSET_61 - ES2
- CHARSET_62 - MSZ_7795.3
- CHARSET_63 - JIS_C6226-1983
- CHARSET_64 - greek7
- CHARSET_65 - ASMO_449
- CHARSET_66 - iso-ir-90
- CHARSET_67 - JIS_C6229-1984-a
- CHARSET_68 - JIS_C6229-1984-b
- CHARSET_69 - JIS_C6229-1984-b-add
- CHARSET_70 - JIS_C6229-1984-hand
- CHARSET_71 - JIS_C6229-1984-hand-add
- CHARSET_72 - JIS_C6229-1984-kana
- CHARSET_73 - ISO_2033-1983
- CHARSET_74 - ANSI_X3.110-1983
- CHARSET_75 - T.61-7bit
- CHARSET_76 - T.61-8bit
- CHARSET_77 - ECMA-cyrillic
- CHARSET_78 - CSA_Z243.4-1985-1
- CHARSET_79 - CSA_Z243.4-1985-2
- CHARSET_80 - CSA_Z243.4-1985-gr
- CHARSET_81 - ISO_8859-6-E
- CHARSET_82 - ISO_8859-6-I
- CHARSET_83 - T.101-G2
- CHARSET_84 - ISO_8859-8-E
- CHARSET_85 - ISO_8859-8-I
- CHARSET_86 - CSN_369103
- CHARSET_87 - JUS_I.B1.002
- CHARSET_88 - IEC_P27-1
- CHARSET_89 - JUS_I.B1.003-serb
- CHARSET_90 - JUS_I.B1.003-mac
- CHARSET_91 - greek-ccitt
- CHARSET_92 - NC_NC00-10:81
- CHARSET_93 - ISO_6937-2-25
- CHARSET_94 - GOST_19768-74
- CHARSET_95 - ISO_8859-supp
- CHARSET_96 - ISO_10367-box
- CHARSET_97 - latin-lap
- CHARSET_98 - JIS_X0212-1990
- CHARSET_99 - DS_2089
- CHARSET_100 - us-dk
- CHARSET_101 - dk-us
- CHARSET_102 - KSC5636
- CHARSET_103 - UNICODE-1-1-UTF-7
- CHARSET_104 - ISO-2022-CN
- CHARSET_105 - ISO-2022-CN-EXT
- CHARSET_106 - UTF-8
- CHARSET_109 - ISO-8859-13
- CHARSET_110 - ISO-8859-14
- CHARSET_111 - ISO-8859-15
- CHARSET_112 - ISO-8859-16
- CHARSET_113 - GBK
- CHARSET_114 - GB18030
- CHARSET_115 - OSD_EBCDIC_DF04_15
- CHARSET_116 - OSD_EBCDIC_DF03_IRV
- CHARSET_117 - OSD_EBCDIC_DF04_1
- CHARSET_118 - ISO-11548-1
- CHARSET_119 - KZ-1048
- CHARSET_1000 - ISO-10646-UCS-2
- CHARSET_1001 - ISO-10646-UCS-4
- CHARSET_1002 - ISO-10646-UCS-Basic
- CHARSET_1003 - ISO-10646-Unicode-Latin1
- CHARSET_1004 - ISO-10646-J-1
- CHARSET_1005 - ISO-Unicode-IBM-1261
- CHARSET_1006 - ISO-Unicode-IBM-1268
- CHARSET_1007 - ISO-Unicode-IBM-1276
- CHARSET_1008 - ISO-Unicode-IBM-1264
- CHARSET_1009 - ISO-Unicode-IBM-1265
- CHARSET_1010 - UNICODE-1-1
- CHARSET_1011 - SCSU
- CHARSET_1012 - UTF-7
- CHARSET_1013 - UTF-16BE
- CHARSET_1014 - UTF-16LE
- CHARSET_1015 - UTF-16
- CHARSET_1016 - CESU-8
- CHARSET_1017 - UTF-32
- CHARSET_1018 - UTF-32BE
- CHARSET_1019 - UTF-32LE
- CHARSET_1020 - BOCU-1
- CHARSET_1021 - UTF-7-IMAP
- CHARSET_2000 - ISO-8859-1-Windows-3.0-Latin-1
- CHARSET_2001 - ISO-8859-1-Windows-3.1-Latin-1
- CHARSET_2002 - ISO-8859-2-Windows-Latin-2
- CHARSET_2003 - ISO-8859-9-Windows-Latin-5
- CHARSET_2004 - hp-roman8
- CHARSET_2005 - Adobe-Standard-Encoding
- CHARSET_2006 - Ventura-US
- CHARSET_2007 - Ventura-International
- CHARSET_2008 - DEC-MCS
- CHARSET_2009 - IBM850
- CHARSET_2010 - IBM852
- CHARSET_2011 - IBM437
- CHARSET_2012 - PC8-Danish-Norwegian
- CHARSET_2013 - IBM862
- CHARSET_2014 - PC8-Turkish
- CHARSET_2015 - IBM-Symbols
- CHARSET_2016 - IBM-Thai
- CHARSET_2017 - HP-Legal
- CHARSET_2018 - HP-Pi-font
- CHARSET_2019 - HP-Math8
- CHARSET_2020 - Adobe-Symbol-Encoding
- CHARSET_2021 - HP-DeskTop
- CHARSET_2022 - Ventura-Math
- CHARSET_2023 - Microsoft-Publishing
- CHARSET_2024 - Windows-31J
- CHARSET_2025 - GB2312
- CHARSET_2026 - Big5
- CHARSET_2027 - macintosh
- CHARSET_2028 - IBM037
- CHARSET_2029 - IBM038
- CHARSET_2030 - IBM273
- CHARSET_2031 - IBM274
- CHARSET_2032 - IBM275
- CHARSET_2033 - IBM277
- CHARSET_2034 - IBM278
- CHARSET_2035 - IBM280
- CHARSET_2036 - IBM281
- CHARSET_2037 - IBM284
- CHARSET_2038 - IBM285
- CHARSET_2039 - IBM290
- CHARSET_2040 - IBM297
- CHARSET_2041 - IBM420
- CHARSET_2042 - IBM423
- CHARSET_2043 - IBM424
- CHARSET_2044 - IBM500
- CHARSET_2045 - IBM851
- CHARSET_2046 - IBM855
- CHARSET_2047 - IBM857
- CHARSET_2048 - IBM860
- CHARSET_2049 - IBM861
- CHARSET_2050 - IBM863
- CHARSET_2051 - IBM864
- CHARSET_2052 - IBM865
- CHARSET_2053 - IBM868
- CHARSET_2054 - IBM869
- CHARSET_2055 - IBM870
- CHARSET_2056 - IBM871
- CHARSET_2057 - IBM880
- CHARSET_2058 - IBM891
- CHARSET_2059 - IBM903
- CHARSET_2060 - IBM904
- CHARSET_2061 - IBM905
- CHARSET_2062 - IBM918
- CHARSET_2063 - IBM1026
- CHARSET_2064 - EBCDIC-AT-DE
- CHARSET_2065 - EBCDIC-AT-DE-A
- CHARSET_2066 - EBCDIC-CA-FR
- CHARSET_2067 - EBCDIC-DK-NO
- CHARSET_2068 - EBCDIC-DK-NO-A
- CHARSET_2069 - EBCDIC-FI-SE
- CHARSET_2070 - EBCDIC-FI-SE-A
- CHARSET_2071 - EBCDIC-FR
- CHARSET_2072 - EBCDIC-IT
- CHARSET_2073 - EBCDIC-PT
- CHARSET_2074 - EBCDIC-ES
- CHARSET_2075 - EBCDIC-ES-A
- CHARSET_2076 - EBCDIC-ES-S
- CHARSET_2077 - EBCDIC-UK
- CHARSET_2078 - EBCDIC-US
- CHARSET_2079 - UNKNOWN-8BIT
- CHARSET_2080 - MNEMONIC
- CHARSET_2081 - MNEM
- CHARSET_2082 - VISCII
- CHARSET_2083 - VIQR
- CHARSET_2084 - KOI8-R
- CHARSET_2085 - HZ-GB-2312
- CHARSET_2086 - IBM866
- CHARSET_2087 - IBM775
- CHARSET_2088 - KOI8-U
- CHARSET_2089 - IBM00858
- CHARSET_2090 - IBM00924
- CHARSET_2091 - IBM01140
- CHARSET_2092 - IBM01141
- CHARSET_2093 - IBM01142
- CHARSET_2094 - IBM01143
- CHARSET_2095 - IBM01144
- CHARSET_2096 - IBM01145
- CHARSET_2097 - IBM01146
- CHARSET_2098 - IBM01147
- CHARSET_2099 - IBM01148
- CHARSET_2100 - IBM01149
- CHARSET_2101 - Big5-HKSCS
- CHARSET_2102 - IBM1047
- CHARSET_2103 - PTCP154
- CHARSET_2104 - Amiga-1251
- CHARSET_2105 - KOI7-switched
- CHARSET_2106 - BRF
- CHARSET_2107 - TSCII
- CHARSET_2108 - CP51932
- CHARSET_2109 - windows-874
- CHARSET_2250 - windows-1250
- CHARSET_2251 - windows-1251
- CHARSET_2252 - windows-1252
- CHARSET_2253 - windows-1253
- CHARSET_2254 - windows-1254
- CHARSET_2255 - windows-1255
- CHARSET_2256 - windows-1256
- CHARSET_2257 - windows-1257
- CHARSET_2258 - windows-1258
- CHARSET_2259 - TIS-620
- CHARSET_2260 - CP50220
- IANA_CHARSET - An instance of the Standard struct defined in the codes_agency package that describes the IANA CHARSET specification.