High-performance numeric conversion routines for use in a no_std environment. This does not depend on any standard library features, nor a system allocator.
If you want a minimal, stable, and compile-time friendly version of lexical's float-parsing algorithm, see minimal-lexical. If you want a minimal, performant float parser, recent versions of the Rust standard library should be sufficient.
Table of Contents
- Getting Started
- Partial/Complete Parsers
- Platform Support
- Versioning and Version Support
Add lexical to your Cargo.toml:

    [dependencies]
    lexical = "^6.0"
And get started using lexical:
    // Number to string
    let float_str = lexical::to_string(3.0);  // "3.0", always has a fraction suffix.
    let int_str = lexical::to_string(3);      // "3"

    // String to number.
    let i: i32 = lexical::parse("3")?;        // Ok(3), auto-type deduction.
    let f: f32 = lexical::parse("3.5")?;      // Ok(3.5)
    let d = lexical::parse::<f64, _>("3.5");  // Ok(3.5), error checking parse.
    let d = lexical::parse::<f64, _>("3a");   // Err(Error(_)), failed to parse.
In order to use lexical in generic code, the trait bounds FromLexical (for parse) and ToLexical (for to_string) are provided.
/// Multiply a value in a string by multiplier, and serialize to string.
Lexical has both partial and complete parsers: the complete parsers ensure the entire buffer is used while parsing, without ignoring trailing characters, while the partial parsers parse as many characters as possible, returning both the parsed value and the number of parsed digits. Upon encountering an error, lexical will return an error indicating both the error type and the index at which the error occurred inside the buffer.
    // This will return Err(Error::InvalidDigit(3)), indicating
    // the first invalid character occurred at the index 3 in the input
    // string (the space character).
    let x: i32 = lexical::parse("123 456")?;
    // This will return Ok((123, 3)), indicating that 3 digits were successfully
    // parsed, and that the returned value is `123`.
    let (x, count): (i32, usize) = lexical::parse_partial("123 456")?;
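The difference between the two parser kinds can be sketched without lexical itself. The following is a minimal illustration using only core Rust, assuming a hypothetical unsigned-decimal parser. The names `parse_partial_u32` and `parse_complete_u32`, and the bare `usize` error index, are illustrative assumptions and are not lexical's API or implementation.

```rust
/// Parse as many leading ASCII digits as possible.
/// Returns the value and the number of bytes consumed, or the index at which
/// parsing failed (no digits at all, or overflow at that digit).
fn parse_partial_u32(bytes: &[u8]) -> Result<(u32, usize), usize> {
    let mut value: u32 = 0;
    let mut count = 0;
    for &b in bytes {
        if !b.is_ascii_digit() {
            // Partial parsers stop at the first non-digit instead of failing.
            break;
        }
        let digit = (b - b'0') as u32;
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(digit))
            .ok_or(count)?;
        count += 1;
    }
    if count == 0 {
        return Err(0);
    }
    Ok((value, count))
}

/// A complete parser is a partial parser that also requires the whole
/// buffer to be consumed; trailing bytes become an error at their index.
fn parse_complete_u32(bytes: &[u8]) -> Result<u32, usize> {
    let (value, count) = parse_partial_u32(bytes)?;
    if count == bytes.len() {
        Ok(value)
    } else {
        Err(count)
    }
}
```

With the input `b"123 456"`, the partial sketch yields the value 123 after consuming 3 bytes, while the complete sketch reports an error at index 3, mirroring the behavior described above.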
lexical-core does not depend on a standard library, nor a system allocator. To use lexical-core in a no_std environment, add the following to Cargo.toml:

    [dependencies.lexical-core]
    version = "0.8.5"
    default-features = false
    # Can select only desired parsing/writing features.
    features = ["write-integers", "write-floats", "parse-integers", "parse-floats"]
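And get started using lexical-core:

    // A constant for the maximum number of bytes a formatter will write.
    use lexical_core::BUFFER_SIZE;
    let mut buffer = [b'0'; BUFFER_SIZE];

    // Number to string. The underlying buffer must be a slice of bytes.
    // `write` returns the subslice of the buffer that was written to.
    let bytes = lexical_core::write(3.0, &mut buffer);
    assert_eq!(bytes, b"3.0");
    let bytes = lexical_core::write(3i32, &mut buffer);
    assert_eq!(bytes, b"3");

    // String to number. The input must be a slice of bytes.
    let i: i32 = lexical_core::parse(b"3")?;      // Ok(3), auto-type deduction.
    let f: f32 = lexical_core::parse(b"3.5")?;    // Ok(3.5)
    let d = lexical_core::parse::<f64>(b"3.5");   // Ok(3.5), error checking parse.
    let d = lexical_core::parse::<f64>(b"3a");    // Err(Error(_)), failed to parse.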
Lexical feature-gates each numeric conversion routine, resulting in faster compile times if certain numeric conversions are not needed. These features can be enabled/disabled for both lexical-core (which does not require a system allocator) and lexical. By default, all conversions are enabled.
- parse-floats: Enable string-to-float conversions.
- parse-integers: Enable string-to-integer conversions.
- write-floats: Enable float-to-string conversions.
- write-integers: Enable integer-to-string conversions.
Lexical is highly customizable, and contains numerous other optional features:
- std: Enable use of the Rust standard library (enabled by default).
- power-of-two: Enable conversions to and from non-decimal strings with a power-of-two radix (2, 4, 8, 16, or 32).
- radix: Enable conversions to and from non-decimal strings with any radix from 2 to 36.
- format: Customize acceptable number formats for number parsing and writing.
- compact: Optimize for binary size at the expense of performance.
- safe: Require all array indexing to be bounds-checked.
- f16: Add support for numeric conversions to-and-from 16-bit floats.
To ensure safety when bounds checking is disabled, we extensively fuzz all the numeric conversion routines. See the Safety section below for more information.
Lexical also places a heavy focus on limiting code bloat, with algorithms optimized for both performance and size. By default, it favors performance; with the compact feature, you can instead opt in to reduced code size at the cost of performance. The compact algorithms minimize the use of pre-computed tables and other optimizations.
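The table-versus-size trade-off can be pictured with a small sketch. The code below is an illustration only, assuming a hypothetical powers-of-ten helper; `POW10`, `pow10_table`, and `pow10_compact` are invented names and do not correspond to lexical's actual tables or algorithms.

```rust
/// Pre-computed powers of ten: a single lookup per call, at the cost of
/// 160 bytes of read-only data in the binary.
const POW10: [u64; 20] = build_pow10_table();

/// Build the table at compile time (10^0 through 10^19 all fit in a u64).
const fn build_pow10_table() -> [u64; 20] {
    let mut table = [1u64; 20];
    let mut i = 1;
    while i < 20 {
        table[i] = table[i - 1] * 10;
        i += 1;
    }
    table
}

/// Table-based variant: favors speed.
fn pow10_table(exponent: usize) -> Option<u64> {
    POW10.get(exponent).copied()
}

/// Compact variant: no table, but one multiplication per step.
fn pow10_compact(exponent: usize) -> Option<u64> {
    let mut value: u64 = 1;
    for _ in 0..exponent {
        value = value.checked_mul(10)?;
    }
    Some(value)
}
```

Both functions return the same results, including `None` once the power no longer fits in a `u64`; they differ only in how much static data and how much work per call they need.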
⚠ WARNING: If changing the number of significant digits written, disabling the use of exponent notation, or changing exponent notation thresholds, BUFFER_SIZE may be insufficient to hold the resulting output. WriteOptions::buffer_size will provide a correct upper bound on the number of bytes written. If a buffer of insufficient length is provided, lexical-core will panic.
Every language has competing specifications for valid numerical input, meaning a number parser for Rust will incorrectly accept or reject input for different programming or data languages. For example:
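    // Valid in Rust strings.
    // Not valid in JSON.
    let f: f64 = lexical_core::parse(b"3.e7")?;  // 3e7

    // Let's only accept JSON floats.
    const JSON: u128 = lexical_core::format::JSON;
    let options = lexical_core::ParseFloatOptions::new();
    let f: f64 = lexical_core::parse_with_options::<_, JSON>(b"3.0e7", &options)?;  // 3e7
    let f: f64 = lexical_core::parse_with_options::<_, JSON>(b"3.e7", &options)?;   // Errors!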
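To make the syntax mismatch concrete without depending on lexical, here is a minimal sketch of a validator for the JSON number grammar (as given in RFC 8259), written with only the standard library. The function name `is_json_number` is hypothetical, and this is not how lexical's format API is implemented; it only shows why an input such as `3.e7` is acceptable to a Rust float parser but not to a JSON one.

```rust
/// Returns true if `s` matches the JSON number grammar:
/// `-? (0 | [1-9][0-9]*) ('.' [0-9]+)? ([eE] [+-]? [0-9]+)?`
fn is_json_number(s: &str) -> bool {
    let b = s.as_bytes();
    let mut i = 0;

    // Count consecutive ASCII digits starting at `start`.
    let digits = |start: usize| b[start..].iter().take_while(|c| c.is_ascii_digit()).count();

    // Optional minus sign (a leading `+` is not allowed).
    if b.first() == Some(&b'-') {
        i += 1;
    }
    // Integer part: a single `0`, or a non-zero digit followed by digits.
    let int_len = digits(i);
    if int_len == 0 || (int_len > 1 && b[i] == b'0') {
        return false;
    }
    i += int_len;
    // Optional fraction: `.` must be followed by at least one digit.
    if b.get(i).copied() == Some(b'.') {
        let frac_len = digits(i + 1);
        if frac_len == 0 {
            return false;
        }
        i += 1 + frac_len;
    }
    // Optional exponent: `e` or `E`, optional sign, at least one digit.
    if matches!(b.get(i).copied(), Some(b'e') | Some(b'E')) {
        i += 1;
        if matches!(b.get(i).copied(), Some(b'+') | Some(b'-')) {
            i += 1;
        }
        let exp_len = digits(i);
        if exp_len == 0 {
            return false;
        }
        i += exp_len;
    }
    // Complete parse: no trailing bytes may remain.
    i == b.len()
}
```

The standard library's `"3.e7".parse::<f64>()` succeeds, whereas `is_json_number("3.e7")` is false because JSON requires at least one digit after the decimal point.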
Due to the high variability in the syntax of numbers in different programming and data languages, we provide 2 different APIs to simplify converting numbers with different syntax requirements.
- Number Format API (feature-gated via format or power-of-two).
- Options API.
A limited subset of functionality is documented in the examples below; the complete specification can be found in the API reference documentation.
Number Format API
The number format class provides numerous flags to specify number syntax when parsing or writing. When the power-of-two feature is enabled, additional flags are added:
- The radix for the significant digits (default 10).
- The radix for the exponent base (default 10).
- The radix for the exponent digits (default 10).
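The three radixes listed above play different roles, which a C-style hexadecimal float such as `1.8p3` illustrates: the significant digits are base 16, the exponent base is 2, and the exponent digits are base 10. The sketch below is a naive, standard-library-only illustration of that decomposition. `eval_float` and its parameters are hypothetical names, it is not correctly rounded in general, and it is not lexical's algorithm.

```rust
/// Interpret `mantissa` (digits with an optional '.') in `mantissa_radix`,
/// interpret `exponent` (optional sign plus digits) in `exponent_radix`, and
/// combine them as `mantissa * exponent_base^exponent`.
///
/// Radixes must be in 2..=36, otherwise the std helpers used here panic.
fn eval_float(
    mantissa: &str,
    exponent: &str,
    mantissa_radix: u32,
    exponent_base: u32,
    exponent_radix: u32,
) -> Option<f64> {
    let mut value = 0.0f64;
    let mut scale = 1.0f64;
    let mut after_point = false;
    for c in mantissa.chars() {
        if c == '.' {
            if after_point {
                return None; // A second decimal point is invalid.
            }
            after_point = true;
            continue;
        }
        // Radix for the significant digits.
        let digit = c.to_digit(mantissa_radix)? as f64;
        if after_point {
            scale /= mantissa_radix as f64;
            value += digit * scale;
        } else {
            value = value * mantissa_radix as f64 + digit;
        }
    }
    // Radix for the exponent digits.
    let exp = i32::from_str_radix(exponent, exponent_radix).ok()?;
    // Radix for the exponent base.
    Some(value * (exponent_base as f64).powi(exp))
}
```

For example, `eval_float("1.8", "3", 16, 2, 10)` evaluates hexadecimal `1.8` (decimal 1.5) times 2 to the power 3.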
When the format feature is enabled, numerous other syntax and digit separator flags are enabled, including:
- A digit separator character, to group digits for increased legibility.
- Whether leading, trailing, internal, and consecutive digit separators are allowed.
- Toggling required float components, such as digits before the decimal point.
- Toggling whether special floats are allowed or are case-sensitive.
Many pre-defined constants therefore exist to simplify common use-cases, including:
- JSON, XML, TOML, YAML, SQLite, and many more.
- Rust, Python, C#, FORTRAN, COBOL literals and strings, and many more.
An example of building a custom number format is as follows:
    const FORMAT: u128 = lexical_core::NumberFormatBuilder::new()
        // Disable exponent notation.
        .no_exponent_notation(true)
        // Disable all special numbers, such as Nan and Inf.
        .no_special(true)
        .build();

    // Due to use in a `const fn`, we can't panic or expect users to unwrap invalid
    // formats, so it's up to the caller to verify the format. If an invalid format
    // is provided to a parser or writer, the function will error or panic, respectively.
    debug_assert!(lexical_core::format_is_valid::<FORMAT>());
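The digit-separator flags mentioned earlier (leading, trailing, and consecutive separators) can be pictured with a small standalone sketch. Everything here is an assumption made for illustration: `SeparatorRules` and `parse_with_separators` are hypothetical names, internal separators are always accepted, and the logic is far simpler than lexical's format-driven parsers.

```rust
/// Hypothetical flags controlling where a digit separator may appear.
#[derive(Clone, Copy)]
struct SeparatorRules {
    separator: u8,
    allow_leading: bool,
    allow_trailing: bool,
    allow_consecutive: bool,
}

/// Parse an unsigned decimal integer, skipping permitted separators.
/// Returns None on an invalid byte, a disallowed separator, overflow,
/// or input that contains no digits.
fn parse_with_separators(bytes: &[u8], rules: SeparatorRules) -> Option<u64> {
    let mut value: u64 = 0;
    let mut seen_digit = false;
    let mut prev_was_separator = false;
    for &b in bytes {
        if b == rules.separator {
            if !seen_digit && !rules.allow_leading {
                return None;
            }
            if prev_was_separator && !rules.allow_consecutive {
                return None;
            }
            prev_was_separator = true;
        } else if b.is_ascii_digit() {
            value = value.checked_mul(10)?.checked_add((b - b'0') as u64)?;
            seen_digit = true;
            prev_was_separator = false;
        } else {
            return None;
        }
    }
    if prev_was_separator && !rules.allow_trailing {
        return None;
    }
    if seen_digit {
        Some(value)
    } else {
        None
    }
}
```

With `_` as the separator and every flag disabled, `1_000_000` is accepted while `_1`, `1_`, and `1__0` are rejected; enabling `allow_consecutive` makes `1__0` parse as 10.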
The options API allows customizing number parsing and writing at run-time, such as specifying the maximum number of significant digits, exponent characters, and more.
An example of building a custom options struct is as follows:
    use core::num;

    let options = lexical_core::WriteFloatOptions::builder()
        // Only write up to 5 significant digits, IE, `1.23456` becomes `1.2345`.
        .max_significant_digits(num::NonZeroUsize::new(5))
        // Never write less than 5 significant digits, `1.1` becomes `1.1000`.
        .min_significant_digits(num::NonZeroUsize::new(5))
        // Trim the trailing `.0` from integral float strings.
        .trim_floats(true)
        // Use a European-style decimal point.
        .decimal_point(b',')
        // Panic if we try to write NaN as a string.
        .nan_string(None)
        // Write infinity as "Infinity".
        .inf_string(Some(b"Infinity"))
        .build()
        .unwrap();
Float parsing is difficult to do correctly, and major bugs have been found in implementations from libstdc++'s strtod to Python. In order to validate the accuracy of lexical, we employ the following external tests:
- Hrvoje Abraham's strtod test cases.
- Rust's test-float-parse unit tests.
- Testbase's stress tests for converting from decimal to binary.
- Nigel Tao's tests extracted from test suites for Freetype, Google's double-conversion library, IBM's IEEE-754R compliance test, as well as numerous other curated examples.
- Various difficult cases reported on blogs.
Although lexical may contain bugs leading to rounding error, it is tested against a comprehensive suite of random-data and near-halfway representations, and should be fast and correct for the vast majority of use-cases.
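As a brief illustration of what a halfway case is, the sketch below uses the Rust standard library parser (not lexical) on decimal strings that sit exactly on, or just above, the midpoint between two adjacent `f32` values. The helper names `parse_f32` and `halfway_examples` are hypothetical. A correct parser must round exact ties to the value with an even significand, and must notice digits far to the right that push the input past the tie.

```rust
/// Parse a decimal string as `f32` using the Rust standard library parser.
fn parse_f32(s: &str) -> f32 {
    s.parse::<f32>().expect("valid float literal")
}

/// Adjacent `f32` values at or above 2^24 are spaced 2 apart, so odd integers
/// in that range are exact halfway cases between two representable floats.
/// Each pair is (parsed value, expected correctly rounded value).
fn halfway_examples() -> [(f32, f32); 3] {
    [
        // 2^24 + 1 is a tie between 16777216 and 16777218;
        // the tie goes to the even significand, 16777216.
        (parse_f32("16777217"), 16777216.0),
        // Slightly above the tie, so the result must round up.
        (parse_f32("16777217.000000000000000000001"), 16777218.0),
        // 2^24 + 3 is a tie between 16777218 and 16777220;
        // the even significand is 16777220.
        (parse_f32("16777219"), 16777220.0),
    ]
}
```

The second case shows why truncating the input after a fixed number of digits is not sufficient: the deciding digit can appear arbitrarily late in the string.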
Various benchmarks, binary sizes, and compile times are shown here:
The compile-times when building with all numeric conversions enabled. For a more fine-tuned breakdown, see build timings.
The binary sizes of stripped binaries compiled at optimization level "2". For a more fine-tuned breakdown, see binary sizes.
Benchmarks -- Parse Integer
A benchmark on randomly-generated integers uniformly distributed over the entire range. For a more fine-tuned breakdown, see benchmarks.
Benchmarks -- Parse Float
A benchmark on parsing floats from various real-world data sets. For a more fine-tuned breakdown, see benchmarks.
Benchmarks -- Write Integer
A benchmark on writing random integers uniformly distributed over the entire range. For a more fine-tuned breakdown, see benchmarks.
Benchmarks -- Write Float
A benchmark on writing floats generated via a random-number generator and parsed from a JSON document. For a more fine-tuned breakdown, see benchmarks.
Due to the use of memory unsafe code in the integer and float writers, we extensively fuzz our float writers and parsers. The fuzz harnesses may be found under fuzz, and are run continuously. So far, we've parsed and written over 72 billion floats.
Due to the simple logic of the integer writers, and the absence of memory-unsafe code in the integer parsers, we minimally fuzz both, and test them with edge cases, which has shown no memory safety issues to date.
lexical-core is tested on a wide variety of platforms, including big- and little-endian systems, to ensure portable code. Supported architectures include:
- x86_64 Linux, Windows, macOS, Android, iOS, FreeBSD, and NetBSD.
- x86 Linux, macOS, Android, iOS, and FreeBSD.
- aarch64 (ARMv8-A) Linux, Android, and iOS.
- armv7 (ARMv7-A) Linux, Android, and iOS.
- arm (ARMv6) Linux and Android.
- mips (MIPS) Linux.
- mipsel (MIPS LE) Linux.
- mips64 (MIPS64 BE) Linux.
- mips64el (MIPS64 LE) Linux.
- powerpc (PowerPC) Linux.
- powerpc64 (PPC64) Linux.
- powerpc64le (PPC64LE) Linux.
- s390x (IBM Z) Linux.
lexical-core should also work on a wide variety of other architectures and ISAs. If you have any issue compiling lexical-core on any architecture, please file a bug report.
Versioning and Version Support
The currently supported versions are:
- v0.7.x (Maintenance)
- v0.6.x (Maintenance)
- v0.8.x supports Rustc 1.51+, including stable, beta, and nightly.
- v0.7.x supports Rustc 1.37+, including stable, beta, and nightly.
- v0.6.x supports Rustc 1.24+, including stable, beta, and nightly.
Please report any errors compiling a supported lexical-core version on a compatible Rustc version.
lexical uses semantic versioning. Removing support for Rustc versions newer than the latest stable Debian or Ubuntu version is considered an incompatible API change, requiring a major version change.
All changes are documented in CHANGELOG.
Lexical is dual licensed under the Apache 2.0 license as well as the MIT license. See the LICENSE.md file for full license details.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in lexical by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions. Contributing to the repository means abiding by the code of conduct.
For the process on how to contribute to lexical, see the development quick-start guide.