§🧵 Stringlet
A fast, cheap, compile-time constructible, `Copy`-able, kinda primitive inline string type. When storing these on the
stack, you probably want to use smaller sizes, hence the name. No dependencies are planned, except for optional SerDe
support, etc. The intention is to be no-std and no-alloc, which still requires feature-gating `String` interop.
`str` functionality should mostly work. The design is prepared for
`String` functionality, but that still needs to be implemented.
In my casual benchmarking it beats the other string kinds and crates I compared it with, on some tests by a wide margin. There are
four flavors of mostly the same code. They differ in length handling, which shows only in some operations, like
`len()`, `as_ref()`, and `as_str()`:
- `Stringlet`, `stringlet!(…)`: This is fixed size, i.e. bounds for array access are compiled in, hence fast.
- `VarStringlet`, `stringlet!(var …)`, `stringlet!(v …)`: This adds one byte for the length, and is still pretty fast. Speed differs for some content processing, where SIMD gives an advantage for multiples of some power of 2, e.g. `VarStringlet<32>`. For copying, the advantage can be at one less, e.g. `VarStringlet<31>`. Length must be `0..=255`.
- `TrimStringlet`, `stringlet!(trim …)`, `stringlet!(t …)`: This can optionally trim one last byte, useful for codes with minimal length variation like ISO 639. This is achieved by tagging an unused last byte with a UTF-8 niche. The length gets calculated branchlessly with very few ops.
- `SlimStringlet`, `stringlet!(slim …)`, `stringlet!(s …)`: This uses the same UTF-8 niche, but fully: It projects the length into 6 bits of the last byte, when content is less than full size. Length must be `0..=64`. Though it is done branchlessly, there are a few more ops for length calculation. Hence this is the slowest, albeit by a small margin. Any bit hackers who know how to do it with fewer ops are welcome on board!
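The UTF-8 niche rests on one fact: the last byte of valid UTF-8 is always ASCII (`0x00..=0x7F`) or a continuation byte (`0x80..=0xBF`), so a final byte of `0xC0` or above can never be content. Its low 6 bits are free, which is where the `0..=64` limit comes from. The following is a minimal sketch of that idea, not the crate's actual code: the names `TAG`, `slim_encode` and `slim_len`, and the choice to store "unused bytes minus one" in the tag, are assumptions made for illustration.

```rust
/// Hypothetical tag: both top bits set. The last byte of valid UTF-8 is
/// never >= 0xC0, so such a byte cannot be mistaken for content.
const TAG: u8 = 0b1100_0000;

/// Copy `s` into a SIZE-byte buffer. If it is shorter than SIZE, tag the last
/// byte with the number of unused bytes minus one (1..=64 fits into 6 bits).
fn slim_encode<const SIZE: usize>(s: &str) -> Option<[u8; SIZE]> {
    if SIZE > 64 || s.len() > SIZE {
        return None;
    }
    let mut buf = [0u8; SIZE];
    buf[..s.len()].copy_from_slice(s.as_bytes());
    let unused = SIZE - s.len();
    if unused > 0 {
        if let Some(last) = buf.last_mut() {
            *last = TAG | (unused - 1) as u8;
        }
    }
    Some(buf)
}

/// Recover the length. The only early return depends on SIZE, a constant;
/// the content itself is handled with a compare, a mask and a multiply.
fn slim_len<const SIZE: usize>(buf: &[u8; SIZE]) -> usize {
    let Some(&last) = buf.last() else { return 0 };
    let tagged = (last >= TAG) as usize; // 1 for a niche byte, 0 for content
    let unused = ((last & 0b0011_1111) as usize + 1) * tagged;
    SIZE - unused
}
```

A trimmed variant in the spirit of `TrimStringlet` would be the special case where at most one byte is unused, so a single tag value suffices.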
N.B.: The variable size `VarStringlet` seems to be a competitor to `fixedstr::str`,
`arrayvec::ArrayString`, and the semi-official
`heapless::String`. They lack a `heapless::Str` to
match the faster fixed size `Stringlet`. That would be given by
`fixedstr::zstr`, but its equality checks are not
optimized. I hope it can be independently confirmed (or debunked, if I mismeasured) that for tasks like `== Self` or `== &str` all variants in this crate are faster than competitors by some factor.
use stringlet::{Stringlet, VarStringlet, TrimStringlet, SlimStringlet, stringlet};
let a: VarStringlet<10> = "shorter".into(); // override default stringlet size of 16 and don't use all of it
let b = a;
println!("{a} == {b}? {}", a == b); // No "value borrowed here after move" error
let nothing = Stringlet::<0>::new(); // Empty and zero size
let nil = VarStringlet::<5>::new(); // Empty and size 5, which would be impossible for fixed size Stringlet
let nada = TrimStringlet::<1>::new(); // Empty and size 1, the biggest an empty TrimStringlet can be
let x = stringlet!("Hello Rust!"); // Stringlet<11>
let y = stringlet!(v 14: "Hello Rust!"); // abbreviated VarStringlet<14>, more than length
let z = stringlet!(slim: "Hello Rust!"); // SlimStringlet<11>
let ψ = stringlet!(v: ["abcd", "abc", "ab"]); // VarStringlet<4> for each
let ω = stringlet!(["abc", "def", "ghj"]); // Stringlet<3> for each
const HELLO: Stringlet<11> = stringlet!("Hello Rust!"); // Input length must match type
const PET: [Stringlet<3>; 4] = stringlet!(["cat", "dog", "ham", "pig"]); // size of 1st element
const PETS: [VarStringlet<8>; 4] = stringlet!(_: ["cat", "dog", "hamster", "piglet"]); // _: derive type
But an oversized `SlimStringlet` is rejected at compile time:
error[E0599]: `SlimStringlet<99>` has excessive SIZE
--> src/main.rs:99:16
|
99 | let balloons = stringlet!(s 99: "Luftballons, auf ihrem…");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ SIZE must be `0..=64`
…
= note: the following trait bounds were not satisfied:
`stringlet::StringletBase<stringlet::Slim, 99>: stringlet::SlimConfig<99>`
which is required by `stringlet::StringletBase<stringlet::Slim, 99>: ConfigBase<stringlet::Slim, 99>`
= note: `SlimStringlet` cannot be longer than 64 bytes. Consider using `VarStringlet`!
This is not your classical short string optimization (SSO) insofar as there is no overflow into an alternate
bigger storage. This is by design, as addressing two different data routes requires branching. On your hot path,
branch misprediction can be costly. Crate stringlet tries hard to be maximally branchless. The few `if`s and `||`s
refer to constants, so should be optimized away.
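To illustrate the design choice, here is a hedged sketch contrasting a classical SSO layout with an inline-only one. Both types, `Sso` and `Inline`, are hypothetical and exist only for this comparison; neither is taken from this crate or from any other.

```rust
/// Hypothetical SSO string: short content inline, longer content on the heap.
enum Sso {
    Inline { len: u8, buf: [u8; 15] },
    Heap(String),
}

impl Sso {
    /// Every access first has to find out which storage is live:
    /// a branch that depends on runtime data.
    fn as_bytes(&self) -> &[u8] {
        match self {
            Sso::Inline { len, buf } => &buf[..*len as usize],
            Sso::Heap(s) => s.as_bytes(),
        }
    }
}

/// Hypothetical inline-only string: one storage route, size known at compile time.
#[derive(Clone, Copy, PartialEq, Eq)]
struct Inline<const SIZE: usize>([u8; SIZE]);

impl<const SIZE: usize> Inline<SIZE> {
    /// No variant to inspect, and the bounds are a compile-time constant.
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}
```

The price of the second layout is that content which does not fit is simply not representable, which is the trade-off described above.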
There is no `Option` or `Result` niche optimization yet. That should likewise be feasible for all stringlets with a
stored length. I only need to understand how to tell the compiler.
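One possible direction on stable Rust, offered as an assumption rather than a plan for this crate: build the stored length out of a type that already has forbidden bit patterns, such as `core::num::NonZeroU8`. In the sketch below, `NicheVar` is a hypothetical type that stores `len + 1` so that zero is never a valid value. That costs one subtraction in `len()` and caps the length at 254.

```rust
use core::mem::size_of;
use core::num::NonZeroU8;

/// Hypothetical variable-length inline string whose length byte can never be 0.
#[derive(Clone, Copy)]
struct NicheVar<const SIZE: usize> {
    len_plus_one: NonZeroU8,
    buf: [u8; SIZE],
}

impl<const SIZE: usize> NicheVar<SIZE> {
    fn new(s: &str) -> Option<Self> {
        if s.len() > SIZE || s.len() > 254 {
            return None;
        }
        let mut buf = [0u8; SIZE];
        buf[..s.len()].copy_from_slice(s.as_bytes());
        let len_plus_one = NonZeroU8::new(s.len() as u8 + 1)?;
        Some(Self { len_plus_one, buf })
    }

    fn len(&self) -> usize {
        self.len_plus_one.get() as usize - 1
    }

    fn as_str(&self) -> &str {
        // The prefix was copied from a valid &str, so this cannot fail.
        core::str::from_utf8(&self.buf[..self.len()]).unwrap_or("")
    }
}

/// True if wrapping in `Option` needs no extra byte. Current rustc picks up
/// the niche of the field, but only `Option<NonZeroU8>` itself is documented
/// as guaranteed.
fn option_is_free<const SIZE: usize>() -> bool {
    size_of::<Option<NicheVar<SIZE>>>() == size_of::<NicheVar<SIZE>>()
}
```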
`VarStringlet` and `SlimStringlet` are configured so they can only be instantiated with valid sizes. For normal use
that's all there is to it. However, when forwarding generic arguments to them, you too have to bound by
`VarConfig<SIZE>` or `SlimConfig<SIZE>`. I wish I could just hide it all behind `<const SIZE: usize<0..=64>>`!
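The mechanism can be mimicked in a few self-contained lines. Everything in this sketch is a stand-in: `Base`, `SmallConfig`, `impl_small_config!` and `make_pair` are hypothetical names, and only sizes `0..=4` are enabled. With the real crate the bound would go in the same position; the compiler error shown earlier suggests a form like `StringletBase<Slim, SIZE>: SlimConfig<SIZE>`, but that is best checked against the crate's own docs.

```rust
/// Stand-in for a size-checked inline string type.
struct Base<const SIZE: usize>([u8; SIZE]);

/// Marker trait, implemented only for the sizes that are allowed.
trait SmallConfig<const SIZE: usize> {}

macro_rules! impl_small_config {
    ($($size:literal)*) => { $(impl SmallConfig<$size> for Base<$size> {})* };
}
impl_small_config!(0 1 2 3 4);

impl<const SIZE: usize> Base<SIZE>
where
    Self: SmallConfig<SIZE>,
{
    fn new() -> Self {
        Base([0; SIZE])
    }
}

/// Generic code that forwards SIZE must repeat the bound, or it will not compile.
fn make_pair<const SIZE: usize>() -> (Base<SIZE>, Base<SIZE>)
where
    Base<SIZE>: SmallConfig<SIZE>,
{
    (Base::new(), Base::new())
}
```

Calling `make_pair::<5>()` here fails to compile, because no `SmallConfig<5>` impl exists for `Base<5>`.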
§Todo
- `StringletError` & `stringlet::Result`
- Run Miri on various architectures. Who's willing to support with exotic stuff?
- Run `cargo mutants`
- Implement mutability, `+=`, `write!()`.
- Document!
- How to implement `Cow` / `Borrow` with `String` as owned type?
- Or rather a `Cow`-like storage-constrained/limitless pair that will transparently switch on overflow.
- Implement more traits.
- `format!()` equivalent `stringlet!(format …)` or `format_stringlet!()`
- Integrate into string-rosetta-rs
- Implement for popular 3rd party crates.
- Why does this not pick up the default SIZE of 16: `let fail = Stringlet::new();`
- Is there a downside to `Copy` by default?
- What's our minimal rust-version?
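On the default `SIZE` question: this looks like general Rust behavior rather than anything specific to this crate. Generic parameter defaults are applied where a type is written in type position. In an expression path such as `Stringlet::new()` the omitted parameter is left to inference, which here has nothing to go on. A self-contained sketch with a hypothetical stand-in type `Fixed`:

```rust
/// Hypothetical stand-in with a defaulted const parameter.
struct Fixed<const SIZE: usize = 16>([u8; SIZE]);

impl<const SIZE: usize> Fixed<SIZE> {
    fn new() -> Self {
        Fixed([0; SIZE])
    }
}

fn default_by_annotation() -> usize {
    // let fail = Fixed::new(); // does not compile: type annotations needed
    let ok: Fixed = Fixed::new(); // `Fixed` in type position means `Fixed<16>`
    ok.0.len()
}

fn default_by_qualified_path() -> usize {
    let ok = <Fixed>::new(); // `<Fixed>` is a type, so the default applies
    ok.0.len()
}
```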
| Platt (Low German) | Semi-literal Translation |
|---|---|
| Op de Straat löppt'n Jung mit'n Tüddelband in'ne anner Hand'n Bodderbrood mit Kees, wenn he blots ni mit de Been in'n Tüddel keem un dor liggt he ok all lang op de Nees un he rasselt mit'n Dassel op'n Kantsteen un he bitt sick ganz geheurig op de Tung, as he opsteiht, seggt he: Hett ni weeh doon, dat's'n Klacks för so'n Kieler Jung | Upon the street runs a boy with a Twiddle-String in another hand a buttered bread with cheese, if he only not with the legs into the twiddle came and there lies he already long upon the nose and he rattles with the noggin upon a kerbstone and he bites himself greatly upon the tongue, as he up stands, says he: Has not hurt, that's a trifle for such a boy from Kiel |
Dedicated to my father, who taught me this iconic North German String song. The many similarities nicely illustrate how English (Anglo-Saxon) comes from Northern Germany. There are even more cognates that have somewhat shifted in usage: löppt: elopes (runs); Jung: young (boy); Been: bones (legs); weeh doon: woe done (has hurt); and maybe Kant: cant (tilt on edge).
Macros§
- `stringlet`: Turn a `str` expression into the smallest `Stringlet` that can contain it. Or turn `[str]` into an array of the smallest `Stringlet` that can contain them. You can explicitly ask for other kinds of stringlet. By default `SIZE` is the length of the 1st `str` parameter, in which case that parameter must be `const`. You can also give the size explicitly, or have it inferred from context along with the kind.
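The size derivation can be pictured with plain `macro_rules!` and const evaluation. This is a rough sketch of the idea only, not the crate's macro: `fixed_bytes!` is a hypothetical name, and it yields a bare `[u8; N]` instead of a `Stringlet`. Because the length feeds an array type, the argument has to be a constant expression, which mirrors the `const` requirement.

```rust
/// Hypothetical macro: turn a constant `&str` into a byte array of exactly its length.
macro_rules! fixed_bytes {
    ($s:expr) => {{
        const S: &str = $s;
        const N: usize = S.len(); // the size is derived at compile time
        const BYTES: [u8; N] = {
            let src = S.as_bytes();
            let mut out = [0u8; N];
            let mut i = 0;
            // `for` loops are not available in const context, hence `while`.
            while i < N {
                out[i] = src[i];
                i += 1;
            }
            out
        };
        BYTES
    }};
}

// Usable in const position; the annotated length must match the input.
const HELLO: [u8; 11] = fixed_bytes!("Hello Rust!");
```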
Structs§
- `StringletBase`: An inline String of varying size bounds, which can be handled like a primitive type. This is the underlying type, which you would not use directly. Instead use one of the type aliases: `Stringlet`, `VarStringlet`, `TrimStringlet`, or `SlimStringlet`.
Enums§
Traits§
Type Aliases§
- `SlimStringlet`: Slim variable length kind of stringlet, uses a UTF-8 niche: It projects the length into 6 bits of the last byte, when content is less than full size. Length must be `0..=64`. Though it is done branchlessly, there are a few more ops for length calculation. Hence this is the slowest, albeit by a small margin.
- `Stringlet`: Fixed length kind of stringlet, i.e. bounds for array access are compiled in, hence it is fast.
- `TrimStringlet`: Trimmed length kind of stringlet, which optionally trims one last byte, useful for codes with minimal length variation like ISO 639. This is achieved by tagging an unused last byte with a UTF-8 niche. The length gets calculated branchlessly with very few ops.
- `VarStringlet`: Variable length kind of stringlet, with one extra byte for the length. Speed differs for some content processing, where SIMD gives an advantage for multiples of some power of 2, e.g. `VarStringlet<32>`. For copying, the advantage can be at one less, e.g. `VarStringlet<31>`. Size must be `0..=255`.