- Finish unit testing odds and ends:
- Squeeze a few more percent coverage out of `pipe.rs`
- `Parser`
- Update `README`:
- De-emphasize "Architecture" (and maybe remove)
  - Add a Features section, perhaps after Performance, that emphasizes:
- Fast performance
    - Streaming and what that means for users: you don't need to see the whole input at once and
      can handle JSON split across input buffers
- Precise error positions (line, column, and offset)
- Low memory pressure and low allocator pressure
- Simple, idiomatic API that is easy to work with
- Incremental parsing
- Add a Comparison section that lists other crates and hyperlinks out to separate short
  comparison docs, so they don't clutter the main doc but remain available.
- Add a Use Cases section, but have it link out to a separate doc.
- Go over the various key `Content`/`Literal` methods like `len()` and `into_buf()` to make sure the
  appropriate ones are inlined.
- Run `cargo bench` as part of the GitHub Actions.
- Add number parse methods into `Content`, with provided implementations.
  - Basic algorithm: if the content is a single chunk, use `str::parse`-ish functions directly. If
    it spans multiple chunks but would fit in a reasonable stack buffer, copy it there and
    `str::parse`; otherwise copy into a heap buffer and `str::parse`. (`str::parse` is a
    placeholder for whatever the real function name is.)
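A minimal sketch of that three-path algorithm. Everything here is illustrative: `chunks` stands in for whatever chunk access `Content` actually exposes, and the 32-byte threshold is a guess at a "reasonable stack buffer".

```rust
use std::str::FromStr;

// Sketch only: `chunks` models multi-chunk content; names and the
// stack threshold are assumptions, not the crate's real API.
fn parse_number<T: FromStr>(chunks: &[&str]) -> Result<T, T::Err> {
    // Fast path: a single chunk parses in place, no copy at all.
    if let [only] = chunks {
        return only.parse();
    }
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    const STACK: usize = 32; // plenty for any sane JSON number
    if total <= STACK {
        // Medium path: gather small multi-chunk numbers on the stack.
        let mut buf = [0u8; STACK];
        let mut at = 0;
        for c in chunks {
            buf[at..at + c.len()].copy_from_slice(c.as_bytes());
            at += c.len();
        }
        // The bytes came from `&str`s, so this cannot fail.
        return std::str::from_utf8(&buf[..at]).unwrap().parse();
    }
    // Slow path: large multi-chunk number, fall back to a heap buffer.
    chunks.concat().parse()
}

fn main() {
    let n: f64 = parse_number(&["3.", "14"]).unwrap();
    assert_eq!(n, 3.14);
}
```

The generic bound keeps this a single provided method; each concrete wrapper (`parse_f64`, `parse_u64`, etc.) would just fix `T`.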
- Add `Content::cmp_unescaped -> Ordering` so callers can compare content to other strings
  without allocating to unescape. This should be a provided method on the trait.
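The idea can be sketched as a free function over a contiguous raw slice (chunked content and `\uXXXX` escapes are ignored here for brevity; the signature and escape coverage are assumptions, not the real trait method):

```rust
use std::cmp::Ordering;

// Sketch only: compares raw (still-escaped) string content against a
// plain needle, unescaping on the fly with no allocation.
fn cmp_unescaped(raw: &str, other: &str) -> Ordering {
    let mut ours = raw.bytes();
    let mut theirs = other.bytes();
    loop {
        // Decode one logical byte from the raw side.
        let mine = match ours.next() {
            Some(b'\\') => match ours.next() {
                Some(b'n') => Some(b'\n'),
                Some(b't') => Some(b'\t'),
                Some(b'r') => Some(b'\r'),
                Some(b'b') => Some(0x08),
                Some(b'f') => Some(0x0C),
                esc => esc, // `\"`, `\\`, and `\/` unescape to themselves
            },
            b => b,
        };
        match (mine, theirs.next()) {
            (None, None) => return Ordering::Equal,
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(a), Some(b)) if a != b => return a.cmp(&b),
            _ => continue, // equal so far, keep going
        }
    }
}

fn main() {
    assert_eq!(cmp_unescaped(r"a\nb", "a\nb"), Ordering::Equal);
}
```

Byte-wise comparison matches `str` ordering because UTF-8 byte order agrees with code-point order, so no decoding beyond escapes is needed.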
- Add overall crate documentation (`lib.rs`).
- Once at least one streaming type is available, update module-level documentation for mod `lexical`
with an *Examples* heading as the first section and give an example of using each lexer type.
  Right now, with only `FixedAnalyzer`, that exercise seems a bit pointless.
- Re-export the following into the root: `Token`, `FixedAnalyzer`, `Parser`.
- Replace `#[inline(always)]` with `#[inline]`, except for methods that are just a reference return
  or a single method call.
- Put `#[must_use]` directives in appropriate places.
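The last two items can be sketched together; `Span` is a made-up type, not a crate API, and serves only to show where each attribute lands under the rule above:

```rust
struct Span {
    bytes: Vec<u8>,
}

impl Span {
    // A bare reference return: one of the few spots where
    // `#[inline(always)]` stays.
    #[inline(always)]
    fn bytes(&self) -> &[u8] {
        &self.bytes
    }

    // Anything more substantial drops to plain `#[inline]`, which keeps
    // the body available for cross-crate inlining without forcing it.
    // `#[must_use]` flags callers who silently drop the answer.
    #[inline]
    #[must_use]
    fn is_empty(&self) -> bool {
        self.bytes().is_empty()
    }
}

fn main() {
    let s = Span { bytes: b"abc".to_vec() };
    assert!(!s.is_empty());
}
```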
PERFORMANCE
===========
Performance is now "pretty good" and is mainly limited by the API design.
That being said, I have a hunch that maybe 10%-15% could be wrung out of the lexical analyzer by
improving string handling. The main ideas are:
1. Refactor so the slow path (`lexical::state::Machine::str_slow()`) can return to the fast path
   without increasing the parameter count or the complexity of the existing fast-path entry.
2. Rewrite the fast path (`lexical::state::Machine::str()`) to use SIMD.
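A cheap first step toward idea 2 is SWAR (SIMD within a register) before reaching for platform intrinsics. This is a sketch only, not the crate's actual `str()` code; `find_special` and the 8-byte word width are illustrative. It locates the first byte that ends a JSON string fast path: `"`, `\`, or a control byte.

```rust
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

/// Sets the high bit of each byte of `x` that equals `b` (classic
/// SWAR "has zero byte" trick applied to `x ^ splat(b)`).
fn eq_mask(x: u64, b: u8) -> u64 {
    let v = x ^ LO.wrapping_mul(b as u64);
    v.wrapping_sub(LO) & !v & HI
}

/// Index of the first `"`, `\`, or control byte (< 0x20), if any.
fn find_special(haystack: &[u8]) -> Option<usize> {
    let mut i = 0;
    while i + 8 <= haystack.len() {
        let w = u64::from_le_bytes(haystack[i..i + 8].try_into().unwrap());
        // "Byte less than 0x20" trick. Borrow-induced false positives
        // can only appear *above* a genuine hit, so taking the lowest
        // set bit below still yields a correct position.
        let ctrl = w.wrapping_sub(LO.wrapping_mul(0x20)) & !w & HI;
        let hits = eq_mask(w, b'"') | eq_mask(w, b'\\') | ctrl;
        if hits != 0 {
            return Some(i + (hits.trailing_zeros() / 8) as usize);
        }
        i += 8;
    }
    // Byte-at-a-time tail for the final partial word.
    haystack[i..]
        .iter()
        .position(|&b| b == b'"' || b == b'\\' || b < 0x20)
        .map(|p| i + p)
}

fn main() {
    assert_eq!(find_special(b"hello\"world"), Some(5));
    assert_eq!(find_special(b"plain text"), None);
}
```

The same structure ports directly to 16- or 32-byte vectors with `std::arch` intrinsics once the scalar version benchmarks well.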