Crate encoding_c

Expand description

The C API for encoding_rs.

§Mapping from Rust

§Naming convention

The wrapper function for each method has a name that starts with the name of the struct lower-cased, followed by an underscore and ends with the name of the method.

For example, Encoding::for_label() is wrapped as encoding_for_label().

§Arguments

Functions that wrap non-static methods take the self object as their first argument.

Slice argument foo is decomposed into a pointer foo and a length foo_len.

§Return values

Multiple return values become out-params. When an out-param is length-related, foo_len for a slice becomes a pointer in order to become an in/out-param.

DecoderResult, EncoderResult and CoderResult become uint32_t. InputEmpty becomes INPUT_EMPTY. OutputFull becomes OUTPUT_FULL. Unmappable becomes the scalar value of the unmappable character. Malformed becomes a number whose lowest 8 bits, which can have the decimal value 0, 1, 2 or 3, indicate the number of bytes that were consumed after the malformed sequence and whose next-lowest 8 bits, when shifted right by 8 indicate the length of the malformed byte sequence (possible decimal values 1, 2, 3 or 4). The maximum possible sum of the two is 6.

Structs§

ConstEncoding: Newtype for *const Encoding in order to be able to implement Sync for it.

Constants§

ENCODING_NAME_MAX_LENGTH: The minimum length of buffers that may be passed to encoding_name().
INPUT_EMPTY: Return value for *_decode_* and *_encode_* functions that indicates that the input has been exhausted.
OUTPUT_FULL: Return value for *_decode_* and *_encode_* functions that indicates that the output space has been exhausted.

Statics§

BIG5_ENCODING: The Big5 encoding.
EUC_JP_ENCODING: The EUC-JP encoding.
EUC_KR_ENCODING: The EUC-KR encoding.
GB18030_ENCODING: The gb18030 encoding.
GBK_ENCODING: The GBK encoding.
IBM866_ENCODING: The IBM866 encoding.
ISO_2022_JP_ENCODING: The ISO-2022-JP encoding.
ISO_8859_2_ENCODING: The ISO-8859-2 encoding.
ISO_8859_3_ENCODING: The ISO-8859-3 encoding.
ISO_8859_4_ENCODING: The ISO-8859-4 encoding.
ISO_8859_5_ENCODING: The ISO-8859-5 encoding.
ISO_8859_6_ENCODING: The ISO-8859-6 encoding.
ISO_8859_7_ENCODING: The ISO-8859-7 encoding.
ISO_8859_8_ENCODING: The ISO-8859-8 encoding.
ISO_8859_8_I_ENCODING: The ISO-8859-8-I encoding.
ISO_8859_10_ENCODING: The ISO-8859-10 encoding.
ISO_8859_13_ENCODING: The ISO-8859-13 encoding.
ISO_8859_14_ENCODING: The ISO-8859-14 encoding.
ISO_8859_15_ENCODING: The ISO-8859-15 encoding.
ISO_8859_16_ENCODING: The ISO-8859-16 encoding.
KOI8_R_ENCODING: The KOI8-R encoding.
KOI8_U_ENCODING: The KOI8-U encoding.
MACINTOSH_ENCODING: The macintosh encoding.
REPLACEMENT_ENCODING: The replacement encoding.
SHIFT_JIS_ENCODING: The Shift_JIS encoding.
UTF_8_ENCODING: The UTF-8 encoding.
UTF_16BE_ENCODING: The UTF-16BE encoding.
UTF_16LE_ENCODING: The UTF-16LE encoding.
WINDOWS_874_ENCODING: The windows-874 encoding.
WINDOWS_1250_ENCODING: The windows-1250 encoding.
WINDOWS_1251_ENCODING: The windows-1251 encoding.
WINDOWS_1252_ENCODING: The windows-1252 encoding.
WINDOWS_1253_ENCODING: The windows-1253 encoding.
WINDOWS_1254_ENCODING: The windows-1254 encoding.
WINDOWS_1255_ENCODING: The windows-1255 encoding.
WINDOWS_1256_ENCODING: The windows-1256 encoding.
WINDOWS_1257_ENCODING: The windows-1257 encoding.
WINDOWS_1258_ENCODING: The windows-1258 encoding.
X_MAC_CYRILLIC_ENCODING: The x-mac-cyrillic encoding.
X_USER_DEFINED_ENCODING: The x-user-defined encoding.

Functions§

decoder_decode_to_utf8^⚠: Incrementally decode a byte stream into UTF-8 with malformed sequences replaced with the REPLACEMENT CHARACTER.
decoder_decode_to_utf8_without_replacement^⚠: Incrementally decode a byte stream into UTF-8 without replacement.
decoder_decode_to_utf16^⚠: Incrementally decode a byte stream into UTF-16 with malformed sequences replaced with the REPLACEMENT CHARACTER.
decoder_decode_to_utf16_without_replacement^⚠: Incrementally decode a byte stream into UTF-16 without replacement.
decoder_encoding^⚠: The Encoding this Decoder is for.
decoder_free^⚠: Deallocates a Decoder previously allocated by encoding_new_decoder().
decoder_latin1_byte_compatible_up_to^⚠: Checks for compatibility with storing Unicode scalar values as unsigned bytes taking into account the state of the decoder.
decoder_max_utf8_buffer_length^⚠: Query the worst-case UTF-8 output size with replacement.
decoder_max_utf8_buffer_length_without_replacement^⚠: Query the worst-case UTF-8 output size without replacement.
decoder_max_utf16_buffer_length^⚠: Query the worst-case UTF-16 output size (with or without replacement).
encoder_encode_from_utf8^⚠: Incrementally encode into byte stream from UTF-8 with unmappable characters replaced with HTML (decimal) numeric character references.
encoder_encode_from_utf8_without_replacement^⚠: Incrementally encode into byte stream from UTF-8 without replacement.
encoder_encode_from_utf16^⚠: Incrementally encode into byte stream from UTF-16 with unmappable characters replaced with HTML (decimal) numeric character references.
encoder_encode_from_utf16_without_replacement^⚠: Incrementally encode into byte stream from UTF-16 without replacement.
encoder_encoding^⚠: The Encoding this Encoder is for.
encoder_free^⚠: Deallocates an Encoder previously allocated by encoding_new_encoder().
encoder_has_pending_state^⚠: Returns true if this is an ISO-2022-JP encoder that’s not in the ASCII state and false otherwise.
encoder_max_buffer_length_from_utf8_if_no_unmappables^⚠: Query the worst-case output size when encoding from UTF-8 with replacement.
encoder_max_buffer_length_from_utf8_without_replacement^⚠: Query the worst-case output size when encoding from UTF-8 without replacement.
encoder_max_buffer_length_from_utf16_if_no_unmappables^⚠: Query the worst-case output size when encoding from UTF-16 with replacement.
encoder_max_buffer_length_from_utf16_without_replacement^⚠: Query the worst-case output size when encoding from UTF-16 without replacement.
encoding_ascii_valid_up_to^⚠: Validates ASCII.
encoding_can_encode_everything^⚠: Checks whether the output encoding of this encoding can encode every Unicode scalar. (Only true if the output encoding is UTF-8.)
encoding_for_bom^⚠: Performs non-incremental BOM sniffing.
encoding_for_label^⚠: Implements the get an encoding algorithm.
encoding_for_label_no_replacement^⚠: This function behaves the same as encoding_for_label(), except when encoding_for_label() would return REPLACEMENT_ENCODING, this method returns NULL instead.
encoding_is_ascii_compatible^⚠: Checks whether the bytes 0x00…0x7F map exclusively to the characters U+0000…U+007F and vice versa.
encoding_is_single_byte^⚠: Checks whether this encoding maps one byte to one Basic Multilingual Plane code point (i.e. byte length equals decoded UTF-16 length) and vice versa (for mappable characters).
encoding_iso_2022_jp_ascii_valid_up_to^⚠: Validates ISO-2022-JP ASCII-state data.
encoding_name^⚠: Writes the name of the given Encoding to a caller-supplied buffer as ASCII and returns the number of bytes / ASCII characters written.
encoding_new_decoder^⚠: Allocates a new Decoder for the given Encoding on the heap with BOM sniffing enabled and returns a pointer to the newly-allocated Decoder.
encoding_new_decoder_into^⚠: Allocates a new Decoder for the given Encoding into memory provided by the caller with BOM sniffing enabled. (In practice, the target should likely be a pointer previously returned by encoding_new_decoder().)
encoding_new_decoder_with_bom_removal^⚠: Allocates a new Decoder for the given Encoding on the heap with BOM removal and returns a pointer to the newly-allocated Decoder.
encoding_new_decoder_with_bom_removal_into^⚠: Allocates a new Decoder for the given Encoding into memory provided by the caller with BOM removal.
encoding_new_decoder_without_bom_handling^⚠: Allocates a new Decoder for the given Encoding on the heap with BOM handling disabled and returns a pointer to the newly-allocated Decoder.
encoding_new_decoder_without_bom_handling_into^⚠: Allocates a new Decoder for the given Encoding into memory provided by the caller with BOM handling disabled.
encoding_new_encoder^⚠: Allocates a new Encoder for the given Encoding on the heap and returns a pointer to the newly-allocated Encoder. (Exception, if the Encoding is replacement, a new Decoder for UTF-8 is instantiated (and that Decoder reports UTF_8 as its Encoding).
encoding_new_encoder_into^⚠: Allocates a new Encoder for the given Encoding into memory provided by the caller. (In practice, the target should likely be a pointer previously returned by encoding_new_encoder().)
encoding_output_encoding^⚠: Returns the output encoding of this encoding. This is UTF-8 for UTF-16BE, UTF-16LE and replacement and the encoding itself otherwise.
encoding_utf8_valid_up_to^⚠: Validates UTF-8.

Crate encoding_cCopy item path