pub struct DecodeReaderBytesBuilder { /* private fields */ }
A builder for constructing a byte-oriented transcoder to UTF-8.
Implementations§
impl DecodeReaderBytesBuilder
pub fn new() -> DecodeReaderBytesBuilder
Create a new decoder builder with a default configuration.
By default, no explicit encoding is used, but if a UTF-8 or UTF-16 BOM is detected, then an appropriate encoding is automatically detected and transcoding is performed (where invalid sequences map to the Unicode replacement codepoint).
pub fn build<R: Read>(&self, rdr: R) -> DecodeReaderBytes<R, Vec<u8>>
Build a new decoder that wraps the given reader.
pub fn build_with_buffer<R: Read, B: AsMut<[u8]>>(
    &self,
    rdr: R,
    buffer: B,
) -> Result<DecodeReaderBytes<R, B>>
Build a new decoder that wraps the given reader and uses the given buffer internally for transcoding.
This is useful for cases where it is advantageous to amortize allocation. Namely, this method permits reusing a buffer for subsequent decoders.
This returns an error if the buffer is smaller than 4 bytes (which is too small to hold the maximum size of a single UTF-8 encoded codepoint).
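The 4-byte minimum follows from UTF-8 itself: no Unicode scalar value needs more than four bytes when encoded. The sketch below checks this using only the standard library. The helper name `max_utf8_len` is hypothetical, introduced here for illustration, and is not part of this crate.

```rust
/// Hypothetical helper (not part of encoding_rs_io): the encoded length in
/// bytes of the largest Unicode scalar value, U+10FFFF.
fn max_utf8_len() -> usize {
    char::MAX.len_utf8()
}

fn example() {
    // The largest scalar value needs four bytes in UTF-8.
    assert_eq!(max_utf8_len(), 4);

    // The replacement codepoint U+FFFD needs three bytes.
    assert_eq!('\u{FFFD}'.len_utf8(), 3);

    // A 4-byte buffer is therefore enough to encode any single `char`.
    let mut buf = [0u8; 4];
    let encoded = char::MAX.encode_utf8(&mut buf);
    assert_eq!(encoded.as_bytes(), &b"\xF4\x8F\xBF\xBF"[..]);
}
```

A buffer of fewer than 4 bytes could not be guaranteed to hold even one transcoded codepoint, which is consistent with the error described above.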
pub fn encoding(
    &mut self,
    encoding: Option<&'static Encoding>,
) -> &mut DecodeReaderBytesBuilder
Set an explicit encoding to be used by this decoder.
When an explicit encoding is set, BOM sniffing is disabled and the encoding provided will be used unconditionally. Errors in the encoded bytes are replaced by the Unicode replacement codepoint.
By default, no explicit encoding is set.
pub fn utf8_passthru(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder
Enable UTF-8 passthru, even when a UTF-8 BOM is observed.
When an explicit encoding is not set (thereby invoking automatic encoding detection via BOM sniffing), then a UTF-8 BOM will cause UTF-8 transcoding to occur. In particular, if the source contains invalid UTF-8 sequences, then they are replaced with the Unicode replacement codepoint.
This transcoding may not be desirable. For example, the caller may already have its own UTF-8 handling where invalid UTF-8 is appropriately handled, in which case the extra transcoding step is unnecessary work. Enabling this option prevents that step from occurring. In this case, the bytes emitted by the reader are passed through unchanged (including the BOM) and the caller is responsible for handling any invalid UTF-8.
§Example
This example demonstrates the effect of enabling this option on data that includes a UTF-8 BOM but also, interestingly enough, subsequently includes invalid UTF-8.
extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    // A UTF-8 BOM, then "foo", an invalid byte (0xFF), then "bar".
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // Without the passthru option, you'd get "foo\u{FFFD}bar".
    assert_eq!(dest, b"\xEF\xBB\xBFfoo\xFFbar");
    Ok(())
}
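For comparison, the replacement that passthru avoids resembles what the standard library's `String::from_utf8_lossy` does to invalid UTF-8. This is an analogy built on std only, not this crate's decoder, and the helper name `lossy` is hypothetical.

```rust
/// Hypothetical helper (std only, not part of encoding_rs_io): decode bytes
/// as UTF-8, replacing invalid sequences with U+FFFD.
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn example() {
    // The lone 0xFF byte is not valid UTF-8, so it becomes U+FFFD.
    assert_eq!(lossy(b"foo\xFFbar"), "foo\u{FFFD}bar");

    // Valid input comes back unchanged.
    assert_eq!(lossy(b"foobar"), "foobar");
}
```

A caller that enables `utf8_passthru` takes on this kind of handling itself, and can choose to reject invalid input instead of replacing it.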
pub fn strip_bom(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder
Whether or not to always strip a BOM if one is found.
When this is enabled, if a BOM is found at the beginning of a stream, then it is ignored. This applies even when utf8_passthru is enabled or if bom_sniffing is disabled.
This is disabled by default.
§Example
This example shows how to remove the BOM if it’s present even when utf8_passthru is enabled.
extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    // A UTF-8 BOM, then "foo", an invalid byte (0xFF), then "bar".
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .strip_bom(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // If `strip_bom` wasn't enabled, then this would include the BOM.
    assert_eq!(dest, b"foo\xFFbar");
    Ok(())
}
pub fn bom_override(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder
Give the highest precedence to the BOM, if one is found.
When this is enabled, and if a BOM is found, then the encoding indicated by that BOM is used even if an explicit encoding has been set via the encoding method.
This does not override utf8_passthru.
This is disabled by default.
pub fn bom_sniffing(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder
Enable BOM sniffing.
When this is enabled and an explicit encoding is not set, the decoder will try to detect the encoding from a BOM at the start of the input.
When this is disabled and an explicit encoding is not set, the decoder will treat the input as raw bytes. The bytes will be passed through unchanged, including any BOM that may be present.
This is enabled by default.
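To make "BOM sniffing" concrete, here is a minimal sketch of detecting a UTF-8 or UTF-16 BOM at the start of a byte slice, using only the standard library. It is not this crate's implementation; the function name `sniff_bom` and its return shape (an encoding label plus the BOM length) are assumptions made for illustration.

```rust
/// Hypothetical helper (not the crate's implementation): report the encoding
/// implied by a leading BOM together with the BOM's length in bytes.
fn sniff_bom(bytes: &[u8]) -> Option<(&'static str, usize)> {
    if bytes.starts_with(b"\xEF\xBB\xBF") {
        Some(("UTF-8", 3))
    } else if bytes.starts_with(b"\xFF\xFE") {
        Some(("UTF-16LE", 2))
    } else if bytes.starts_with(b"\xFE\xFF") {
        Some(("UTF-16BE", 2))
    } else {
        // No recognized BOM: the input is left for the caller to interpret.
        None
    }
}

fn example() {
    assert_eq!(sniff_bom(b"\xEF\xBB\xBFfoo"), Some(("UTF-8", 3)));
    assert_eq!(sniff_bom(b"\xFF\xFEf\x00"), Some(("UTF-16LE", 2)));
    assert_eq!(sniff_bom(b"\xFE\xFF\x00f"), Some(("UTF-16BE", 2)));
    assert_eq!(sniff_bom(b"foo"), None);
}
```

The returned length is what a caller would skip to drop the BOM, which is the effect `strip_bom` describes.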
Trait Implementations§
impl Clone for DecodeReaderBytesBuilder
fn clone(&self) -> DecodeReaderBytesBuilder
fn clone_from(&mut self, source: &Self)
impl Debug for DecodeReaderBytesBuilder
impl Default for DecodeReaderBytesBuilder
fn default() -> DecodeReaderBytesBuilder
Auto Trait Implementations§
impl Freeze for DecodeReaderBytesBuilder
impl RefUnwindSafe for DecodeReaderBytesBuilder
impl Send for DecodeReaderBytesBuilder
impl Sync for DecodeReaderBytesBuilder
impl Unpin for DecodeReaderBytesBuilder
impl UnwindSafe for DecodeReaderBytesBuilder
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
unsafe fn clone_to_uninit(&self, dst: *mut T)