chisel-decoders
Overview
This crate contains a very simple, lean implementation of a decoder that consumes u8 bytes from a given Read implementation and decodes them into Rust's native char type using UTF-8. This is an offshoot lib from an ongoing toy parser project, and is used as the first stage of the scanning/lexing phase of the parser in order to avoid unnecessary allocations during the u8 sequence -> char conversion.
Note that the implementation is pretty fast and loose, and under the covers utilises some bit-twiddlin' in conjunction with the unsafe transmute function to do the conversions. No string allocations are used during conversion. There is minimal checking (other than bit-masking) of the inbound bytes: it is not intended to be a full-blown UTF-8 validation library, although improved/feature-flagged validation may be added at a later date.
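To make the bit-masking idea concrete, here is a minimal sketch of a streaming decoder over any Read source. It is not this crate's code: the SketchDecoder name is hypothetical, and it swaps the unsafe transmute for the safe char::from_u32 so that invalid code points simply end iteration. It also stops at the first malformed byte rather than reporting an error.

```rust
use std::io::Read;

/// Hypothetical stand-in for a streaming UTF-8 decoder (not the crate's type).
pub struct SketchDecoder<R: Read> {
    input: R,
}

impl<R: Read> SketchDecoder<R> {
    pub fn new(input: R) -> Self {
        SketchDecoder { input }
    }

    /// Pulls one byte from the source; None at end of input or on I/O error.
    fn next_byte(&mut self) -> Option<u8> {
        let mut buf = [0u8; 1];
        self.input.read_exact(&mut buf).ok().map(|_| buf[0])
    }
}

impl<R: Read> Iterator for SketchDecoder<R> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        let lead = self.next_byte()?;
        // The high bits of the leading byte give the number of
        // continuation bytes; the remaining bits start the code point.
        let (extra, mut code) = if lead & 0x80 == 0x00 {
            (0, lead as u32) // 0xxxxxxx: plain ASCII
        } else if lead & 0xE0 == 0xC0 {
            (1, (lead & 0x1F) as u32) // 110xxxxx
        } else if lead & 0xF0 == 0xE0 {
            (2, (lead & 0x0F) as u32) // 1110xxxx
        } else if lead & 0xF8 == 0xF0 {
            (3, (lead & 0x07) as u32) // 11110xxx
        } else {
            return None; // stray continuation byte or invalid lead
        };
        for _ in 0..extra {
            let b = self.next_byte()?;
            if b & 0xC0 != 0x80 {
                return None; // continuation bytes must look like 10xxxxxx
            }
            // Each continuation byte contributes its low six bits.
            code = (code << 6) | (b & 0x3F) as u32;
        }
        // Safe conversion: rejects surrogates and out-of-range values.
        char::from_u32(code)
    }
}
```

No String is built during decoding; each call to next touches only a one-byte stack buffer and a u32 accumulator.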
Usage
Usage is very simple, provided you have something that implements Read in order to source some bytes:
Create from a slice
Just wrap your array in a mut reader, and then plug it into a new instance of Utf8Decoder:
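use std::io::BufReader;
use chisel_decoders::utf8::Utf8Decoder; // module path assumed

// Placeholder bytes, for illustration only
let buffer: &[u8] = b"hello";
let mut reader = BufReader::new(buffer);
let _decoder = Utf8Decoder::new(&mut reader);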
Create from a file
Just crack open your file, wrap it in a Read instance and then plug it into a new instance of Utf8Decoder:
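use std::fs::File;
use std::io::BufReader;
use std::path::PathBuf;
use chisel_decoders::utf8::Utf8Decoder; // module path assumed

// Placeholder path, for illustration only
let path = PathBuf::from("path/to/input.txt");
let f = File::open(path);
let mut reader = BufReader::new(f.unwrap());
let _decoder = Utf8Decoder::new(&mut reader);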
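If you would rather not unwrap the result of opening the file, a small helper along these lines can propagate the error instead. This is only a sketch: open_buffered is a hypothetical name and is not part of this crate.

```rust
use std::fs::File;
use std::io::{self, BufReader};
use std::path::Path;

/// Hypothetical helper: opens `path` and wraps the file in a BufReader,
/// returning the I/O error to the caller rather than panicking.
pub fn open_buffered<P: AsRef<Path>>(path: P) -> io::Result<BufReader<File>> {
    let file = File::open(path)?;
    Ok(BufReader::new(file))
}
```

The returned reader implements Read, so it can be handed to a decoder in the same way as the reader above.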
Consuming Decoded chars
Once you've created an instance of a specific decoder, you simply iterate over the chars in order to pull out the decoded characters (a decoder implements Iterator<Item=char>):
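use std::fs::File;
use std::io::BufReader;
use std::path::PathBuf;
use chisel_decoders::utf8::Utf8Decoder; // module path assumed

// Placeholder path, for illustration only
let path = PathBuf::from("path/to/input.txt");
let f = File::open(path);
let mut reader = BufReader::new(f.unwrap());
let decoder = Utf8Decoder::new(&mut reader);
for c in decoder {
    // Illustrative loop body: print each decoded character
    println!("char: {}", c);
}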
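Because a decoder implements Iterator<Item=char>, the standard iterator adapters should apply to it as well as a plain for loop. The sketch below shows two hypothetical helpers (first_line and count_non_whitespace are illustrative names, not part of this crate) that accept any char iterator; when trying them out, str::chars makes a convenient stand-in for a decoder.

```rust
/// Collects characters up to, but not including, the first newline.
pub fn first_line<I: Iterator<Item = char>>(chars: I) -> String {
    chars.take_while(|c| *c != '\n').collect()
}

/// Counts the characters that are not whitespace.
pub fn count_non_whitespace<I: Iterator<Item = char>>(chars: I) -> usize {
    chars.filter(|c| !c.is_whitespace()).count()
}
```

Both helpers consume the iterator lazily, so nothing is buffered beyond what the adapter itself collects.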
Building and Testing
As you would expect, just run cargo build in order to build the crate, and cargo test to run its tests.
Suggestions and Requests
If you have any suggestions, requests or even just comments relating to this crate, then please just add an issue and I'll try to take a look when I get the chance. Please feel free to fork this repo if you want to utilise/modify this code in any of your own work.