Crate encoding_rs
encoding_rs is a Gecko-oriented Free Software / Open Source implementation of the Encoding Standard in Rust. Gecko-oriented means that converting to and from UTF-16 is supported in addition to converting to and from UTF-8, that the performance and streamability goals are browser-oriented, and that FFI-friendliness is a goal.
Availability
The code is available under the Apache License, Version 2.0 or the MIT license, at your option. See the COPYRIGHT file for details. The repository is on GitHub. The crate is available on crates.io.
Examples
Example programs:
Decode using the non-streaming API:
```rust
use encoding_rs::*;

let expectation = "\u{30CF}\u{30ED}\u{30FC}\u{30FB}\u{30EF}\u{30FC}\u{30EB}\u{30C9}";
let bytes = b"\x83n\x83\x8D\x81[\x81E\x83\x8F\x81[\x83\x8B\x83h";
let (cow, encoding_used, had_errors) = SHIFT_JIS.decode(bytes);
assert_eq!(&cow[..], expectation);
assert_eq!(encoding_used, SHIFT_JIS);
assert!(!had_errors);
```
Decode using the streaming API with minimal `unsafe`:
```rust
use encoding_rs::*;

let expectation = "\u{30CF}\u{30ED}\u{30FC}\u{30FB}\u{30EF}\u{30FC}\u{30EB}\u{30C9}";

// Use an array of byte slices to demonstrate content arriving piece by
// piece from the network.
let bytes: [&'static [u8]; 4] = [b"\x83",
                                 b"n\x83\x8D\x81",
                                 b"[\x81E\x83\x8F\x81[\x83",
                                 b"\x8B\x83h"];

// Very short output buffer to demonstrate the output buffer getting full.
// Normally, you'd use something like `[0u8; 2048]`.
let mut buffer_bytes = [0u8; 8];

// Rust doesn't allow us to stack-allocate a `mut str` without `unsafe`.
let mut buffer: &mut str = unsafe { std::mem::transmute(&mut buffer_bytes[..]) };

// How many bytes in the buffer currently hold significant data.
let mut bytes_in_buffer = 0usize;

// Collect the output to a string for demonstration purposes.
let mut output = String::new();

// The `Decoder`
let mut decoder = SHIFT_JIS.new_decoder();

// Track whether we see errors.
let mut total_had_errors = false;

// Decode using a fixed-size intermediate buffer (for demonstrating the
// use of a fixed-size buffer; normally when the output of an incremental
// decode goes to a `String` one would use `Decoder.decode_to_string()` to
// avoid the intermediate buffer).
for input in &bytes[..] {
    // The number of bytes already read from current `input` in total.
    let mut total_read_from_current_input = 0usize;

    loop {
        let (result, read, written, had_errors) =
            decoder.decode_to_str(&input[total_read_from_current_input..],
                                  &mut buffer[bytes_in_buffer..],
                                  false);
        total_read_from_current_input += read;
        bytes_in_buffer += written;
        if had_errors {
            total_had_errors = true;
        }
        match result {
            CoderResult::InputEmpty => {
                // We have consumed the current input buffer. Break out of
                // the inner loop to get the next input buffer from the
                // outer loop.
                break;
            },
            CoderResult::OutputFull => {
                // Write the current buffer out and consider the buffer
                // empty.
                output.push_str(&buffer[..bytes_in_buffer]);
                bytes_in_buffer = 0usize;
                continue;
            }
        }
    }
}

// Process EOF
loop {
    let (result, _, written, had_errors) =
        decoder.decode_to_str(b"",
                              &mut buffer[bytes_in_buffer..],
                              true);
    bytes_in_buffer += written;
    if had_errors {
        total_had_errors = true;
    }
    // Write the current buffer out and consider the buffer empty.
    // Need to do this here for both `match` arms, because we exit the
    // loop on `CoderResult::InputEmpty`.
    output.push_str(&buffer[..bytes_in_buffer]);
    bytes_in_buffer = 0usize;
    match result {
        CoderResult::InputEmpty => {
            // Done!
            break;
        },
        CoderResult::OutputFull => {
            continue;
        }
    }
}

assert_eq!(&output[..], expectation);
assert!(!total_had_errors);
```
Web / Browser Focus
Both in terms of scope and performance, the focus is on the Web. For scope, this means that encoding_rs implements the Encoding Standard fully and doesn't implement encodings that are not specified in the Encoding Standard. For performance, this means that decoding performance is important as well as performance for encoding into UTF-8 or encoding the Basic Latin range (ASCII) into legacy encodings. Non-Basic Latin needs to be encoded into legacy encodings in only two places in the Web platform: in the query part of URLs, in which case it's a matter of relatively rare error handling, and in form submission, in which case the user action and networking tend to hide the performance of the encoder.
Deemphasizing performance of encoding non-Basic Latin text into legacy encodings enables smaller code size thanks to the encoder side using the decode-optimized data tables without having encode-optimized data tables at all. Even in decoders, smaller lookup table size is preferred over avoiding multiplication operations.
Additionally, performance is a non-goal for the ASCII-incompatible ISO-2022-JP and UTF-16 encodings, which are rarely used on the Web. For clarity, this means that performance is a non-goal for UTF-16 as used on the wire as an interchange encoding (UTF-16 on the `[u8]` side of the API). Good performance for UTF-16 used as an in-RAM Unicode representation (UTF-16 on the `[u16]` side of the API) is a goal.
Despite the focus on the Web, encoding_rs may well be useful for decoding email, although you'll need to implement UTF-7 decoding and label handling by other means. (Due to the Web focus, patches to add UTF-7 are unwelcome in encoding_rs itself.) Also, despite the browser focus, the hope is that non-browser applications that wish to consume Web content or submit Web forms in a Web-compatible way will find encoding_rs useful.
Streaming & Non-Streaming; Rust & C/C++
The API in Rust has two modes of operation: streaming and non-streaming. The streaming API is the foundation of the implementation and should be used when processing data that arrives piecemeal from an I/O stream. The streaming API has an FFI wrapper (as a separate crate) that exposes it to C callers. The non-streaming part of the API is for Rust callers only and is smart about borrowing instead of copying when possible. When streamability is not needed, the non-streaming API should be preferred in order to avoid copying data when a borrow suffices.
There is no analogous C API exposed via FFI, mainly because C doesn't have standard types for growable byte buffers and Unicode strings that know their length.
The C API (header file generated at `target/include/encoding_rs.h` when building encoding_rs) can, in turn, be wrapped for use from C++. Such a C++ wrapper could re-create the non-streaming API in C++ for C++ callers. Currently, the C binding comes with a C++ wrapper that uses STL+GSL types, but this wrapper doesn't provide non-streaming convenience methods at this time. A C++ wrapper with XPCOM/MFBT types is planned but does not exist yet.
The `Encoding` type is common to both the streaming and non-streaming modes. In the streaming mode, decoding operations are performed with a `Decoder` and encoding operations with an `Encoder` object obtained via `Encoding`. In the non-streaming mode, decoding and encoding operations are performed using methods on `Encoding` objects themselves, so the `Decoder` and `Encoder` objects are not used at all.
Memory management
The non-streaming mode never performs heap allocations (even the methods that write into a `Vec<u8>` or a `String` by taking them as arguments do not reallocate the backing buffer of the `Vec<u8>` or the `String`). That is, the non-streaming mode uses caller-allocated buffers exclusively.

The methods of the streaming mode that return a `Vec<u8>` or a `String` perform heap allocations but only to allocate the backing buffer of the `Vec<u8>` or the `String`.

`Encoding` is always statically allocated. `Decoder` and `Encoder` need no `Drop` cleanup.
Buffer reading and writing behavior
Based on experience gained with the `java.nio.charset` encoding converter API and with the Gecko uconv encoding converter API, the buffer reading and writing behaviors of encoding_rs are asymmetric: input buffers are fully drained but output buffers are not always fully filled.

When reading from an input buffer, encoding_rs always consumes all input up to the next error or to the end of the buffer. In particular, when decoding, even if the input buffer ends in the middle of a byte sequence for a character, the decoder consumes all input. This has the benefit that the caller of the API can always fill the next buffer from the start from whatever source the bytes come from and never has to first copy the last bytes of the previous buffer to the start of the next buffer. However, when encoding, the UTF-8 input buffers have to end at a character boundary, which is a requirement for the Rust `str` type anyway, and UTF-16 input buffer boundaries falling in the middle of a surrogate pair result in both surrogates being treated individually as unpaired surrogates.
Additionally, decoders guarantee that they can be fed even one byte at a time and encoders guarantee that they can be fed even one code point at a time. This has the benefit of not placing restrictions on the size of the chunks in which content arrives, e.g., from the network.
When writing into an output buffer, encoding_rs makes sure that the code unit sequence for a character is never split across output buffer boundaries. This may result in wasted space at the end of an output buffer, but the advantages are that the output side of both decoders and encoders is greatly simplified compared to designs that attempt to fill output buffers exactly even when that entails splitting a code unit sequence, and that when encoding_rs methods return to the caller, the output produced thus far is always valid taken as a whole. (In the case of encoding to ISO-2022-JP, the output needs to be considered as a whole, because the latest output buffer taken alone might not be valid if the transition away from the ASCII state occurred in an earlier output buffer. However, since the ISO-2022-JP decoder doesn't treat streams that don't end in the ASCII state as being in error, despite the encoder generating a transition to the ASCII state at the end, the claim about the partial output taken as a whole being valid is true even for ISO-2022-JP.)
Error Reporting
Based on experience gained with the `java.nio.charset` encoding converter API and with the Gecko uconv encoding converter API, the error reporting behaviors of encoding_rs are asymmetric: decoder errors include offsets that leave it up to the caller to extract the erroneous bytes from the input stream if the caller wishes to do so, but encoder errors provide the code point associated with the error without requiring the caller to extract it from the input on its own.
On the encoder side, an error is always triggered by the most recently pushed Unicode scalar, which makes it simple to pass the `char` to the caller. Also, it's very typical for the caller to wish to do something with this data: generate a numeric escape for the character. Additionally, the ISO-2022-JP encoder reports U+FFFD instead of the actual input character in certain cases, so requiring the caller to extract the character from the input buffer would require the caller to handle ISO-2022-JP details. Furthermore, requiring the caller to extract the character from the input buffer would require the caller to implement UTF-8 or UTF-16 math, which is the job of an encoding conversion library.
On the decoder side, errors are triggered in more complex ways. For example, when decoding the sequence ESC, '$', buffer boundary, 'A' as ISO-2022-JP, the ESC byte is in error, but this is discovered only after the buffer boundary when processing 'A'. Thus, the bytes in error might not be the ones most recently pushed to the decoder and the error might not even be in the current buffer.
Some encoding conversion APIs address the problem by not acknowledging trailing bytes of an input buffer as consumed if it's still possible for future bytes to cause the trailing bytes to be in error. This way, error reporting can always refer to the most recently pushed buffer. This has the problem that the caller of the API has to copy the unconsumed trailing bytes to the start of the next buffer before being able to fill the rest of the next buffer. This is annoying, error-prone and inefficient.
A possible solution would be making the decoder remember recently consumed bytes in order to be able to include a copy of the erroneous bytes when reporting an error. This has two problems: First, callers are rarely interested in the erroneous bytes, so attempts to identify them are most often just overhead. Second, the rare applications that are interested typically care about the location of the error in the input stream.
To keep the API convenient for common uses and the overhead low while making it possible to develop applications, such as HTML validators, that care about which bytes were in error, encoding_rs reports the length of the erroneous sequence and the number of bytes consumed after the erroneous sequence. As long as the caller doesn't discard the 6 most recent bytes, this makes it possible for callers that care about the erroneous bytes to locate them.
No Convenience API for Custom Replacements
The Web Platform and, therefore, the Encoding Standard supports only one error recovery mode for decoders and only one error recovery mode for encoders. The supported error recovery mode for decoders is emitting the REPLACEMENT CHARACTER on error. The supported error recovery mode for encoders is emitting an HTML decimal numeric character reference for unmappable characters.
Since encoding_rs is Web-focused, these are the only error recovery modes for which convenient support is provided. Moreover, on the decoder side, there aren't really good alternatives for emitting the REPLACEMENT CHARACTER on error (other than treating errors as fatal). In particular, simply ignoring errors is a security problem, so it would be a bad idea for encoding_rs to provide a mode that encouraged callers to ignore errors.
On the encoder side, there are plausible alternatives for HTML decimal numeric character references. For example, when outputting CSS, CSS-style escapes would seem to make sense. However, instead of facilitating the output of CSS, JS, etc. in non-UTF-8 encodings, encoding_rs takes the design position that you shouldn't generate output in encodings other than UTF-8, except where backward compatibility with interacting with the legacy Web requires it. The legacy Web requires it only when parsing the query strings of URLs and when submitting forms, and those two both use HTML decimal numeric character references.
While encoding_rs doesn't make encoder replacements other than HTML decimal numeric character references easy, it does make them possible. `encode_from_utf8()`, which emits HTML decimal numeric character references for unmappable characters, is implemented on top of `encode_from_utf8_without_replacement()`. Applications that really, really want other replacement schemes for unmappable characters can likewise implement them on top of `encode_from_utf8_without_replacement()`.
No Extensibility by Design
The set of encodings supported by encoding_rs is not extensible by design. That is, `Encoding`, `Decoder` and `Encoder` are intentionally `struct`s rather than `trait`s. encoding_rs takes the design position that all future text interchange should be done using UTF-8, which can represent all of Unicode. (It is, in fact, the only encoding supported by the Encoding Standard and encoding_rs that can represent all of Unicode and that has encoder support. UTF-16LE and UTF-16BE don't have encoder support, and gb18030 cannot encode U+E5E5.) The other encodings are supported merely for legacy compatibility and not due to non-UTF-8 encodings having benefits other than being able to consume legacy content.
Considering that UTF-8 can represent all of Unicode and is already supported by all Web browsers, introducing a new encoding wouldn't add to the expressiveness but would add to compatibility problems. In that sense, adding new encodings to the Web Platform doesn't make sense, and, in fact, post-UTF-8 attempts at encodings, such as BOCU-1, have been rejected from the Web Platform. On the other hand, the set of legacy encodings that must be supported for a Web browser to be successful is not going to expand. Empirically, the set of encodings specified in the Encoding Standard is already sufficient, and the set of legacy encodings won't grow retroactively.
Since extensibility doesn't make sense considering the Web focus of encoding_rs and adding encodings to Web clients would be actively harmful, it makes sense to make the set of encodings that encoding_rs supports non-extensible and to take the (admittedly small) benefits arising from that, such as the size of `Decoder` and `Encoder` objects being known ahead of time, which enables stack allocation thereof.
This does have downsides for applications that might want to put encoding_rs to non-Web uses if those non-Web uses involve legacy encodings that aren't needed for Web uses. The needs of such applications should not complicate encoding_rs itself, though. It is up to those applications to provide a framework that delegates the operations with encodings that encoding_rs supports to encoding_rs and operations with other encodings to something else (as opposed to encoding_rs itself providing an extensibility framework).
Panics
Methods in encoding_rs can panic if the API is used against the requirements stated in the documentation, if a state that's supposed to be impossible is reached due to an internal bug or on integer overflow. When used according to documentation with buffer sizes that stay below integer overflow, in the absence of internal bugs, encoding_rs does not panic.
Panics aren't documented beyond this on individual methods.
The FFI code does not deal with unwinding across the FFI boundary. Therefore, when using FFI, encoding_rs must be compiled with panics aborting in order to avoid Undefined Behavior.
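For example, a binary crate using the FFI layer would typically configure abort-on-panic in its `Cargo.toml` (a sketch using standard Cargo profile settings):

```toml
[profile.release]
panic = "abort"

[profile.dev]
panic = "abort"
```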
At-Risk Parts of the API
The foreseeable source of partially backward-incompatible API change is the way the instances of `Encoding` are made available.

If Rust changes to allow the entries of `[&'static Encoding; N]` to be initialized with `static`s of type `&'static Encoding`, the non-reference `FOO_INIT` public `Encoding` instances will be removed from the public API.

If Rust changes to make the referent of `pub const FOO: &'static Encoding` unique when the constant is used in different crates, the reference-typed `static`s for the encoding instances will be changed from `static` to `const` and the non-reference-typed `_INIT` instances will be removed.
Mapping Spec Concepts onto the API
Spec Concept | Streaming | Non-Streaming |
---|---|---|
encoding | `&'static Encoding` | `&'static Encoding` |
UTF-8 encoding | `UTF_8` | `UTF_8` |
get an encoding | `Encoding::for_label(label)` | `Encoding::for_label(label)` |
name | `encoding.name()` | `encoding.name()` |
get an output encoding | `encoding.output_encoding()` | `encoding.output_encoding()` |
decode | `let d = encoding.new_decoder();` | `encoding.decode(src)` |
UTF-8 decode | `let d = UTF_8.new_decoder_with_bom_removal();` | `UTF_8.decode_with_bom_removal(src)` |
UTF-8 decode without BOM | `let d = UTF_8.new_decoder_without_bom_handling();` | `UTF_8.decode_without_bom_handling(src)` |
UTF-8 decode without BOM or fail | `let d = UTF_8.new_decoder_without_bom_handling();` | `UTF_8.decode_without_bom_handling_and_without_replacement(src)` |
encode | `let e = encoding.new_encoder();` | `encoding.encode(src)` |
UTF-8 encode | Use the UTF-8 nature of Rust strings directly: `write(src.as_bytes());` | Use the UTF-8 nature of Rust strings directly: `src.as_bytes()` |
Compatibility with the rust-encoding API
The crate encoding_rs_compat is a drop-in replacement for rust-encoding 0.2.32 that implements (most of) the API of rust-encoding 0.2.32 on top of encoding_rs.
Mapping rust-encoding concepts to encoding_rs concepts
The following table provides a mapping from rust-encoding constructs to encoding_rs ones.
rust-encoding | encoding_rs |
---|---|
encoding::EncodingRef | &'static encoding_rs::Encoding |
encoding::all::WINDOWS_31J (not based on the WHATWG name for some encodings) | encoding_rs::SHIFT_JIS (always the WHATWG name uppercased and hyphens replaced with underscores) |
encoding::all::ERROR | Not available because not in the Encoding Standard |
encoding::all::ASCII | Not available because not in the Encoding Standard |
encoding::all::ISO_8859_1 | Not available because not in the Encoding Standard |
encoding::all::HZ | Not available because not in the Encoding Standard |
encoding::label::encoding_from_whatwg_label(string) | encoding_rs::Encoding::for_label(string) |
enc.whatwg_name() (always lower case) | enc.name() (potentially mixed case) |
enc.name() | Not available because not in the Encoding Standard |
encoding::decode(bytes, encoding::DecoderTrap::Replace, enc) | enc.decode(bytes) |
enc.decode(bytes, encoding::DecoderTrap::Replace) | enc.decode_without_bom_handling(bytes) |
enc.encode(string, encoding::EncoderTrap::NcrEscape) | enc.encode(string) |
enc.raw_decoder() | enc.new_decoder_without_bom_handling() |
enc.raw_encoder() | enc.new_encoder() |
encoding::RawDecoder | encoding_rs::Decoder |
encoding::RawEncoder | encoding_rs::Encoder |
raw_decoder.raw_feed(src, dst_string) | `dst_string.reserve(decoder.max_utf8_buffer_length_without_replacement(src.len())); decoder.decode_to_string_without_replacement(src, dst_string, false)` |
raw_encoder.raw_feed(src, dst_vec) | `dst_vec.reserve(encoder.max_buffer_length_from_utf8_without_replacement(src.len())); encoder.encode_from_utf8_to_vec_without_replacement(src, dst_vec, false)` |
raw_decoder.raw_finish(dst) | `dst_string.reserve(decoder.max_utf8_buffer_length_without_replacement(0)); decoder.decode_to_string_without_replacement(b"", dst_string, true)` |
raw_encoder.raw_finish(dst) | `dst_vec.reserve(encoder.max_buffer_length_from_utf8_without_replacement(0)); encoder.encode_from_utf8_to_vec_without_replacement("", dst_vec, true)` |
encoding::DecoderTrap::Strict | decode* methods that have _without_replacement in their name (and treating the `Malformed` result as fatal). |
encoding::DecoderTrap::Replace | decode* methods that do not have _without_replacement in their name. |
encoding::DecoderTrap::Ignore | It is a bad idea to ignore errors due to security issues, but this could be implemented using decode* methods that have _without_replacement in their name. |
encoding::DecoderTrap::Call(DecoderTrapFunc) | Can be implemented using decode* methods that have _without_replacement in their name. |
encoding::EncoderTrap::Strict | encode* methods that have _without_replacement in their name (and treating the `Unmappable` result as fatal). |
encoding::EncoderTrap::Replace | Can be implemented using encode* methods that have _without_replacement in their name. |
encoding::EncoderTrap::Ignore | It is a bad idea to ignore errors due to security issues, but this could be implemented using encode* methods that have _without_replacement in their name. |
encoding::EncoderTrap::NcrEscape | encode* methods that do not have _without_replacement in their name. |
encoding::EncoderTrap::Call(EncoderTrapFunc) | Can be implemented using encode* methods that have _without_replacement in their name. |
Structs
Decoder | A converter that decodes a byte stream into Unicode according to a character encoding in a streaming (incremental) manner. |
Encoder | A converter that encodes a Unicode stream into bytes according to a character encoding in a streaming (incremental) manner. |
Encoding | An encoding as defined in the Encoding Standard. |
Enums
CoderResult | Result of a (potentially partial) decode or encode operation with replacement. |
DecoderResult | Result of a (potentially partial) decode operation without replacement. |
EncoderResult | Result of a (potentially partial) encode operation without replacement. |
Statics
BIG5 | The Big5 encoding. |
BIG5_INIT | The initializer for the Big5 encoding. |
EUC_JP | The EUC-JP encoding. |
EUC_JP_INIT | The initializer for the EUC-JP encoding. |
EUC_KR | The EUC-KR encoding. |
EUC_KR_INIT | The initializer for the EUC-KR encoding. |
GB18030 | The gb18030 encoding. |
GB18030_INIT | The initializer for the gb18030 encoding. |
GBK | The GBK encoding. |
GBK_INIT | The initializer for the GBK encoding. |
IBM866 | The IBM866 encoding. |
IBM866_INIT | The initializer for the IBM866 encoding. |
ISO_2022_JP | The ISO-2022-JP encoding. |
ISO_2022_JP_INIT | The initializer for the ISO-2022-JP encoding. |
ISO_8859_10 | The ISO-8859-10 encoding. |
ISO_8859_10_INIT | The initializer for the ISO-8859-10 encoding. |
ISO_8859_13 | The ISO-8859-13 encoding. |
ISO_8859_13_INIT | The initializer for the ISO-8859-13 encoding. |
ISO_8859_14 | The ISO-8859-14 encoding. |
ISO_8859_14_INIT | The initializer for the ISO-8859-14 encoding. |
ISO_8859_15 | The ISO-8859-15 encoding. |
ISO_8859_15_INIT | The initializer for the ISO-8859-15 encoding. |
ISO_8859_16 | The ISO-8859-16 encoding. |
ISO_8859_16_INIT | The initializer for the ISO-8859-16 encoding. |
ISO_8859_2 | The ISO-8859-2 encoding. |
ISO_8859_2_INIT | The initializer for the ISO-8859-2 encoding. |
ISO_8859_3 | The ISO-8859-3 encoding. |
ISO_8859_3_INIT | The initializer for the ISO-8859-3 encoding. |
ISO_8859_4 | The ISO-8859-4 encoding. |
ISO_8859_4_INIT | The initializer for the ISO-8859-4 encoding. |
ISO_8859_5 | The ISO-8859-5 encoding. |
ISO_8859_5_INIT | The initializer for the ISO-8859-5 encoding. |
ISO_8859_6 | The ISO-8859-6 encoding. |
ISO_8859_6_INIT | The initializer for the ISO-8859-6 encoding. |
ISO_8859_7 | The ISO-8859-7 encoding. |
ISO_8859_7_INIT | The initializer for the ISO-8859-7 encoding. |
ISO_8859_8 | The ISO-8859-8 encoding. |
ISO_8859_8_I | The ISO-8859-8-I encoding. |
ISO_8859_8_INIT | The initializer for the ISO-8859-8 encoding. |
ISO_8859_8_I_INIT | The initializer for the ISO-8859-8-I encoding. |
KOI8_R | The KOI8-R encoding. |
KOI8_R_INIT | The initializer for the KOI8-R encoding. |
KOI8_U | The KOI8-U encoding. |
KOI8_U_INIT | The initializer for the KOI8-U encoding. |
MACINTOSH | The macintosh encoding. |
MACINTOSH_INIT | The initializer for the macintosh encoding. |
REPLACEMENT | The replacement encoding. |
REPLACEMENT_INIT | The initializer for the replacement encoding. |
SHIFT_JIS | The Shift_JIS encoding. |
SHIFT_JIS_INIT | The initializer for the Shift_JIS encoding. |
UTF_16BE | The UTF-16BE encoding. |
UTF_16BE_INIT | The initializer for the UTF-16BE encoding. |
UTF_16LE | The UTF-16LE encoding. |
UTF_16LE_INIT | The initializer for the UTF-16LE encoding. |
UTF_8 | The UTF-8 encoding. |
UTF_8_INIT | The initializer for the UTF-8 encoding. |
WINDOWS_1250 | The windows-1250 encoding. |
WINDOWS_1250_INIT | The initializer for the windows-1250 encoding. |
WINDOWS_1251 | The windows-1251 encoding. |
WINDOWS_1251_INIT | The initializer for the windows-1251 encoding. |
WINDOWS_1252 | The windows-1252 encoding. |
WINDOWS_1252_INIT | The initializer for the windows-1252 encoding. |
WINDOWS_1253 | The windows-1253 encoding. |
WINDOWS_1253_INIT | The initializer for the windows-1253 encoding. |
WINDOWS_1254 | The windows-1254 encoding. |
WINDOWS_1254_INIT | The initializer for the windows-1254 encoding. |
WINDOWS_1255 | The windows-1255 encoding. |
WINDOWS_1255_INIT | The initializer for the windows-1255 encoding. |
WINDOWS_1256 | The windows-1256 encoding. |
WINDOWS_1256_INIT | The initializer for the windows-1256 encoding. |
WINDOWS_1257 | The windows-1257 encoding. |
WINDOWS_1257_INIT | The initializer for the windows-1257 encoding. |
WINDOWS_1258 | The windows-1258 encoding. |
WINDOWS_1258_INIT | The initializer for the windows-1258 encoding. |
WINDOWS_874 | The windows-874 encoding. |
WINDOWS_874_INIT | The initializer for the windows-874 encoding. |
X_MAC_CYRILLIC | The x-mac-cyrillic encoding. |
X_MAC_CYRILLIC_INIT | The initializer for the x-mac-cyrillic encoding. |
X_USER_DEFINED | The x-user-defined encoding. |
X_USER_DEFINED_INIT | The initializer for the x-user-defined encoding. |