Crate codes_iana_charset

This package contains an implementation of the IANA CHARSET specification.

These are the official names for character sets that may be used in the Internet and may be referred to in Internet documentation. These names are expressed in ANSI_X3.4-1968, which is commonly called US-ASCII or simply ASCII. The character set most commonly used in the Internet, and used especially in protocol standards, is US-ASCII; its use is strongly encouraged, as is the use of the name US-ASCII.

Character set names may be up to 40 characters long, taken from the printable characters of US-ASCII. No distinction is made between upper-case and lower-case letters.
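
The naming rules above can be sketched with the standard library alone. This is an illustration of the stated rules, not code from this crate: the helper names `is_valid_charset_name` and `same_charset_name` are hypothetical, and treating the space character as not printable is an assumption.

```rust
/// Maximum length of a character set name, per the description above.
const MAX_NAME_LEN: usize = 40;

/// Hypothetical helper: returns true if `name` is non-empty, at most 40
/// characters long, and consists only of printable US-ASCII characters.
/// Assumption: "printable" means 0x21..=0x7E, so a space is rejected.
fn is_valid_charset_name(name: &str) -> bool {
    // For an all-ASCII string the byte length equals the character count;
    // any non-ASCII input is rejected by the `all` check below anyway.
    !name.is_empty()
        && name.len() <= MAX_NAME_LEN
        && name.bytes().all(|b| b.is_ascii_graphic())
}

/// Hypothetical helper: compares two names without regard to the case of
/// ASCII letters. Punctuation such as `-` and `_` is still significant.
fn same_charset_name(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}
```

With these helpers, `same_charset_name("ISO-8859-1", "iso-8859-1")` holds, while `"ISO-8859-1"` and `"ISO_8859-1"` remain distinct strings that are related only through the registry's alias list.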

The MIBenum value is a unique value for use in MIBs to identify coded character sets.

The value space for MIBenum values has been divided into three regions. The first region (3-999) consists of coded character sets that have been standardized by some standard setting organization. This region is intended for standards that do not have subset implementations. The second region (1000-1999) is for the Unicode and ISO/IEC 10646 coded character sets together with a specification of a (set of) sub-repertoires that may occur. The third region (>1999) is intended for vendor specific coded character sets.
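
As a rough sketch of the three regions described above, a MIBenum value can be classified with a single `match`. The `MibEnumRegion` type and the `mib_enum_region` function are hypothetical names used for illustration and are not part of this crate; returning `None` for values below 3 is an assumption, since the text does not assign them to any region.

```rust
/// Hypothetical classification of the MIBenum value space.
#[derive(Debug, PartialEq, Eq)]
enum MibEnumRegion {
    /// 3-999: coded character sets standardized by a standards body.
    Standardized,
    /// 1000-1999: Unicode and ISO/IEC 10646 coded character sets.
    Unicode,
    /// 2000 and above: vendor-specific coded character sets.
    VendorSpecific,
}

/// Hypothetical helper: maps a MIBenum value to its region.
/// Assumption: values 0-2 fall outside all three regions, so they
/// yield `None`.
fn mib_enum_region(mib_enum: u32) -> Option<MibEnumRegion> {
    match mib_enum {
        3..=999 => Some(MibEnumRegion::Standardized),
        1000..=1999 => Some(MibEnumRegion::Unicode),
        2000..=u32::MAX => Some(MibEnumRegion::VendorSpecific),
        _ => None,
    }
}
```

For example, 4 (ISO_8859-1:1987) lands in the first region, 1015 (UTF-16) in the second, and 2252 (windows-1252) in the third.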

§Example

use codes_iana_charset as charset;

let latin_1 = charset::CHARSET_4;
assert_eq!(latin_1.id(), 4);
assert_eq!(latin_1.name(), "ISO_8859-1:1987");
assert_eq!(
    latin_1.source(),
    "[ISO-IR: International Register of Escape Sequences] Note: The current registration authority is IPSJ/ITSCJ, Japan.",
);
assert_eq!(latin_1.preferred_alias(), Some("ISO-8859-1"));
assert_eq!(latin_1.aliases(), &[
    "iso-ir-100",
    "ISO_8859-1",
    "ISO-8859-1",
    "latin1",
    "l1",
    "IBM819",
    "CP819",
    "csISOLatin1"
]);
assert_eq!(latin_1.reference(), Some("[RFC1345][Keld_Simonsen]"));

Note that the implementation of FromStr takes into account all aliases.

use codes_iana_charset as charset;
use std::str::FromStr;

let latin_1 = charset::CHARSET_4;

let iso_8859_1 = charset::CharacterSetCode::from_str("ISO_8859-1").unwrap();

assert_eq!(latin_1, iso_8859_1);

let some_charset = charset::CharacterSetCode::try_from(4).unwrap();

assert_eq!(some_charset, iso_8859_1);

§Features

Structs§

CharacterSetCode
This type is used to encapsulate the numeric MIB enum for IANA-defined Character Sets.

Enums§

CharacterSetCodeError
Common Error type, mainly used for FromStr failures.

Constants§

ALL_CODES
Provides an array of all defined CharacterSetCode codes, useful for queries.
CHARSET_3
US-ASCII
CHARSET_4
ISO_8859-1:1987
CHARSET_5
ISO_8859-2:1987
CHARSET_6
ISO_8859-3:1988
CHARSET_7
ISO_8859-4:1988
CHARSET_8
ISO_8859-5:1988
CHARSET_9
ISO_8859-6:1987
CHARSET_10
ISO_8859-7:1987
CHARSET_11
ISO_8859-8:1988
CHARSET_12
ISO_8859-9:1989
CHARSET_13
ISO-8859-10
CHARSET_14
ISO_6937-2-add
CHARSET_15
JIS_X0201
CHARSET_16
JIS_Encoding
CHARSET_17
Shift_JIS
CHARSET_18
Extended_UNIX_Code_Packed_Format_for_Japanese
CHARSET_19
Extended_UNIX_Code_Fixed_Width_for_Japanese
CHARSET_20
BS_4730
CHARSET_21
SEN_850200_C
CHARSET_22
IT
CHARSET_23
ES
CHARSET_24
DIN_66003
CHARSET_25
NS_4551-1
CHARSET_26
NF_Z_62-010
CHARSET_27
ISO-10646-UTF-1
CHARSET_28
ISO_646.basic:1983
CHARSET_29
INVARIANT
CHARSET_30
ISO_646.irv:1983
CHARSET_31
NATS-SEFI
CHARSET_32
NATS-SEFI-ADD
CHARSET_33
NATS-DANO
CHARSET_34
NATS-DANO-ADD
CHARSET_35
SEN_850200_B
CHARSET_36
KS_C_5601-1987
CHARSET_37
ISO-2022-KR
CHARSET_38
EUC-KR
CHARSET_39
ISO-2022-JP
CHARSET_40
ISO-2022-JP-2
CHARSET_41
JIS_C6220-1969-jp
CHARSET_42
JIS_C6220-1969-ro
CHARSET_43
PT
CHARSET_44
greek7-old
CHARSET_45
latin-greek
CHARSET_46
NF_Z_62-010_(1973)
CHARSET_47
Latin-greek-1
CHARSET_48
ISO_5427
CHARSET_49
JIS_C6226-1978
CHARSET_50
BS_viewdata
CHARSET_51
INIS
CHARSET_52
INIS-8
CHARSET_53
INIS-cyrillic
CHARSET_54
ISO_5427:1981
CHARSET_55
ISO_5428:1980
CHARSET_56
GB_1988-80
CHARSET_57
GB_2312-80
CHARSET_58
NS_4551-2
CHARSET_59
videotex-suppl
CHARSET_60
PT2
CHARSET_61
ES2
CHARSET_62
MSZ_7795.3
CHARSET_63
JIS_C6226-1983
CHARSET_64
greek7
CHARSET_65
ASMO_449
CHARSET_66
iso-ir-90
CHARSET_67
JIS_C6229-1984-a
CHARSET_68
JIS_C6229-1984-b
CHARSET_69
JIS_C6229-1984-b-add
CHARSET_70
JIS_C6229-1984-hand
CHARSET_71
JIS_C6229-1984-hand-add
CHARSET_72
JIS_C6229-1984-kana
CHARSET_73
ISO_2033-1983
CHARSET_74
ANSI_X3.110-1983
CHARSET_75
T.61-7bit
CHARSET_76
T.61-8bit
CHARSET_77
ECMA-cyrillic
CHARSET_78
CSA_Z243.4-1985-1
CHARSET_79
CSA_Z243.4-1985-2
CHARSET_80
CSA_Z243.4-1985-gr
CHARSET_81
ISO_8859-6-E
CHARSET_82
ISO_8859-6-I
CHARSET_83
T.101-G2
CHARSET_84
ISO_8859-8-E
CHARSET_85
ISO_8859-8-I
CHARSET_86
CSN_369103
CHARSET_87
JUS_I.B1.002
CHARSET_88
IEC_P27-1
CHARSET_89
JUS_I.B1.003-serb
CHARSET_90
JUS_I.B1.003-mac
CHARSET_91
greek-ccitt
CHARSET_92
NC_NC00-10:81
CHARSET_93
ISO_6937-2-25
CHARSET_94
GOST_19768-74
CHARSET_95
ISO_8859-supp
CHARSET_96
ISO_10367-box
CHARSET_97
latin-lap
CHARSET_98
JIS_X0212-1990
CHARSET_99
DS_2089
CHARSET_100
us-dk
CHARSET_101
dk-us
CHARSET_102
KSC5636
CHARSET_103
UNICODE-1-1-UTF-7
CHARSET_104
ISO-2022-CN
CHARSET_105
ISO-2022-CN-EXT
CHARSET_106
UTF-8
CHARSET_109
ISO-8859-13
CHARSET_110
ISO-8859-14
CHARSET_111
ISO-8859-15
CHARSET_112
ISO-8859-16
CHARSET_113
GBK
CHARSET_114
GB18030
CHARSET_115
OSD_EBCDIC_DF04_15
CHARSET_116
OSD_EBCDIC_DF03_IRV
CHARSET_117
OSD_EBCDIC_DF04_1
CHARSET_118
ISO-11548-1
CHARSET_119
KZ-1048
CHARSET_1000
ISO-10646-UCS-2
CHARSET_1001
ISO-10646-UCS-4
CHARSET_1002
ISO-10646-UCS-Basic
CHARSET_1003
ISO-10646-Unicode-Latin1
CHARSET_1004
ISO-10646-J-1
CHARSET_1005
ISO-Unicode-IBM-1261
CHARSET_1006
ISO-Unicode-IBM-1268
CHARSET_1007
ISO-Unicode-IBM-1276
CHARSET_1008
ISO-Unicode-IBM-1264
CHARSET_1009
ISO-Unicode-IBM-1265
CHARSET_1010
UNICODE-1-1
CHARSET_1011
SCSU
CHARSET_1012
UTF-7
CHARSET_1013
UTF-16BE
CHARSET_1014
UTF-16LE
CHARSET_1015
UTF-16
CHARSET_1016
CESU-8
CHARSET_1017
UTF-32
CHARSET_1018
UTF-32BE
CHARSET_1019
UTF-32LE
CHARSET_1020
BOCU-1
CHARSET_1021
UTF-7-IMAP
CHARSET_2000
ISO-8859-1-Windows-3.0-Latin-1
CHARSET_2001
ISO-8859-1-Windows-3.1-Latin-1
CHARSET_2002
ISO-8859-2-Windows-Latin-2
CHARSET_2003
ISO-8859-9-Windows-Latin-5
CHARSET_2004
hp-roman8
CHARSET_2005
Adobe-Standard-Encoding
CHARSET_2006
Ventura-US
CHARSET_2007
Ventura-International
CHARSET_2008
DEC-MCS
CHARSET_2009
IBM850
CHARSET_2010
IBM852
CHARSET_2011
IBM437
CHARSET_2012
PC8-Danish-Norwegian
CHARSET_2013
IBM862
CHARSET_2014
PC8-Turkish
CHARSET_2015
IBM-Symbols
CHARSET_2016
IBM-Thai
CHARSET_2017
HP-Legal
CHARSET_2018
HP-Pi-font
CHARSET_2019
HP-Math8
CHARSET_2020
Adobe-Symbol-Encoding
CHARSET_2021
HP-DeskTop
CHARSET_2022
Ventura-Math
CHARSET_2023
Microsoft-Publishing
CHARSET_2024
Windows-31J
CHARSET_2025
GB2312
CHARSET_2026
Big5
CHARSET_2027
macintosh
CHARSET_2028
IBM037
CHARSET_2029
IBM038
CHARSET_2030
IBM273
CHARSET_2031
IBM274
CHARSET_2032
IBM275
CHARSET_2033
IBM277
CHARSET_2034
IBM278
CHARSET_2035
IBM280
CHARSET_2036
IBM281
CHARSET_2037
IBM284
CHARSET_2038
IBM285
CHARSET_2039
IBM290
CHARSET_2040
IBM297
CHARSET_2041
IBM420
CHARSET_2042
IBM423
CHARSET_2043
IBM424
CHARSET_2044
IBM500
CHARSET_2045
IBM851
CHARSET_2046
IBM855
CHARSET_2047
IBM857
CHARSET_2048
IBM860
CHARSET_2049
IBM861
CHARSET_2050
IBM863
CHARSET_2051
IBM864
CHARSET_2052
IBM865
CHARSET_2053
IBM868
CHARSET_2054
IBM869
CHARSET_2055
IBM870
CHARSET_2056
IBM871
CHARSET_2057
IBM880
CHARSET_2058
IBM891
CHARSET_2059
IBM903
CHARSET_2060
IBM904
CHARSET_2061
IBM905
CHARSET_2062
IBM918
CHARSET_2063
IBM1026
CHARSET_2064
EBCDIC-AT-DE
CHARSET_2065
EBCDIC-AT-DE-A
CHARSET_2066
EBCDIC-CA-FR
CHARSET_2067
EBCDIC-DK-NO
CHARSET_2068
EBCDIC-DK-NO-A
CHARSET_2069
EBCDIC-FI-SE
CHARSET_2070
EBCDIC-FI-SE-A
CHARSET_2071
EBCDIC-FR
CHARSET_2072
EBCDIC-IT
CHARSET_2073
EBCDIC-PT
CHARSET_2074
EBCDIC-ES
CHARSET_2075
EBCDIC-ES-A
CHARSET_2076
EBCDIC-ES-S
CHARSET_2077
EBCDIC-UK
CHARSET_2078
EBCDIC-US
CHARSET_2079
UNKNOWN-8BIT
CHARSET_2080
MNEMONIC
CHARSET_2081
MNEM
CHARSET_2082
VISCII
CHARSET_2083
VIQR
CHARSET_2084
KOI8-R
CHARSET_2085
HZ-GB-2312
CHARSET_2086
IBM866
CHARSET_2087
IBM775
CHARSET_2088
KOI8-U
CHARSET_2089
IBM00858
CHARSET_2090
IBM00924
CHARSET_2091
IBM01140
CHARSET_2092
IBM01141
CHARSET_2093
IBM01142
CHARSET_2094
IBM01143
CHARSET_2095
IBM01144
CHARSET_2096
IBM01145
CHARSET_2097
IBM01146
CHARSET_2098
IBM01147
CHARSET_2099
IBM01148
CHARSET_2100
IBM01149
CHARSET_2101
Big5-HKSCS
CHARSET_2102
IBM1047
CHARSET_2103
PTCP154
CHARSET_2104
Amiga-1251
CHARSET_2105
KOI7-switched
CHARSET_2106
BRF
CHARSET_2107
TSCII
CHARSET_2108
CP51932
CHARSET_2109
windows-874
CHARSET_2250
windows-1250
CHARSET_2251
windows-1251
CHARSET_2252
windows-1252
CHARSET_2253
windows-1253
CHARSET_2254
windows-1254
CHARSET_2255
windows-1255
CHARSET_2256
windows-1256
CHARSET_2257
windows-1257
CHARSET_2258
windows-1258
CHARSET_2259
TIS-620
CHARSET_2260
CP50220
IANA_CHARSET
An instance of the Standard struct defined in the codes_agency package that describes the IANA Character Sets specification.