Crate smartstring
Smart String
SmartString is a wrapper around String which offers automatic inlining of small strings. It comes in two flavours: LazyCompact, which takes up exactly as much space as a String and is generally a little faster, and Compact, which is the same as LazyCompact except it will aggressively re-inline any expanded Strings which become short enough to do so. LazyCompact is the default.
What Is It For?
The intended use for SmartString is as a key type for a B-tree (such as std::collections::BTreeMap) or any kind of array operation where cache locality is critical.
In general, it's a nice data type for reducing your heap allocations and increasing the locality of string data. If you use SmartString as a drop-in replacement for String, you're almost certain to see a slight performance boost, as well as slightly reduced memory usage.
How To Use It?
SmartString has the exact same API as String; all the clever bits happen automatically behind the scenes, so you could just:
use smartstring::alias::String;
use std::fmt::Write;

let mut string = String::new();
string.push_str("This is just a string!");
string.clear();
write!(string, "Hello Joe!");
assert_eq!("Hello Joe!", string);
Give Me The Details
SmartString is the same size as String and relies on pointer alignment to be able to store a discriminant bit in its inline form that will never be present in its String form, thus giving us 24 bytes (on 64-bit architectures) minus one bit to encode our inline string. It uses 23 bytes to store the string data and the remaining 7 bits to encode the string's length. When the available space is exceeded, it swaps itself out with a String containing its previous contents. Likewise, if the string's length should drop below its inline capacity again, it deallocates the string and moves its contents inline.
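To make the byte budget above concrete, here is a minimal sketch of an inline buffer that spends one bit on a marker, 7 bits on the length and 23 bytes on the data. It is a toy model and not smartstring's actual representation: the names InlineBuf and INLINE_CAPACITY, the use of the low bit as the marker, and the field order are all assumptions made for illustration.

```rust
use std::mem::size_of;

/// Hypothetical inline capacity in bytes, matching the 23 bytes described above.
const INLINE_CAPACITY: usize = 23;

/// Toy inline buffer (an assumption for illustration, not smartstring's type).
struct InlineBuf {
    /// Low bit: always 1, marking "inline". Upper 7 bits: length in bytes.
    marker: u8,
    /// String bytes; only the first `len()` bytes are meaningful.
    data: [u8; INLINE_CAPACITY],
}

impl InlineBuf {
    /// Returns `None` when `s` does not fit in the inline capacity.
    fn new(s: &str) -> Option<Self> {
        if s.len() > INLINE_CAPACITY {
            return None;
        }
        let mut data = [0u8; INLINE_CAPACITY];
        data[..s.len()].copy_from_slice(s.as_bytes());
        Some(InlineBuf {
            // The length is at most 23, so it fits in 7 bits.
            marker: ((s.len() as u8) << 1) | 1,
            data,
        })
    }

    /// Length in bytes, decoded from the upper 7 bits of the marker byte.
    fn len(&self) -> usize {
        (self.marker >> 1) as usize
    }

    fn as_str(&self) -> &str {
        std::str::from_utf8(&self.data[..self.len()]).expect("built from a valid &str")
    }
}

fn main() {
    let small = InlineBuf::new("Hello Joe!").expect("10 bytes fit inline");
    println!("{} ({} bytes inline)", small.as_str(), small.len());
    // One marker/length byte plus 23 data bytes.
    println!("size_of::<InlineBuf>() = {}", size_of::<InlineBuf>());
    println!("size_of::<String>()    = {}", size_of::<String>());
}
```

Anything longer than 23 bytes makes InlineBuf::new return None, which is the point at which a real implementation would fall back to a heap-allocated String.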
Given that we use the knowledge that a certain bit in the memory layout of String will always be unset as a discriminant, you would be able to call std::mem::transmute::<String>() on a boxed smart string and start using it as a normal String immediately; there's no pointer tagging or similar trickery going on here. (But please don't do that: there's an efficient Into<String> implementation that does the exact same thing with no need to go unsafe in your own code.)
Compact is aggressive about inlining strings, meaning that if you modify a heap-allocated string such that it becomes short enough for inlining, it will be inlined immediately and the allocated String will be dropped. This may cause multiple unintended allocations if you repeatedly adjust your string's length across the inline capacity threshold, so if your string's construction can get complicated and you're relying on performance during construction, it might be better to construct it as a String and convert it once construction is done.
LazyCompact looks the same as Compact, except it never re-inlines a string that's already been heap allocated, instead keeping the allocation around in case it needs it. This makes for less cache-local strings, but is the best choice if you're more worried about time spent on unnecessary allocations than cache locality.
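The difference between the two policies can be sketched with a small stand-alone model. This is not smartstring's code: ToyString, Repr, Policy, the promotions counter and the churn helper are hypothetical names introduced here, and the model only counts how often a string has to move from its inline form to the heap.

```rust
const INLINE_CAPACITY: usize = 23;

#[derive(Clone, Copy, PartialEq)]
enum Policy {
    Compact,
    LazyCompact,
}

enum Repr {
    Inline { len: u8, data: [u8; INLINE_CAPACITY] },
    Boxed(String),
}

/// Toy model of a two-state string (hypothetical, for illustration only).
struct ToyString {
    repr: Repr,
    policy: Policy,
    /// Counts inline -> heap moves, each of which allocates.
    promotions: usize,
}

impl ToyString {
    fn new(policy: Policy) -> Self {
        let repr = Repr::Inline { len: 0, data: [0; INLINE_CAPACITY] };
        ToyString { repr, policy, promotions: 0 }
    }

    fn is_inline(&self) -> bool {
        matches!(self.repr, Repr::Inline { .. })
    }

    fn as_str(&self) -> &str {
        match &self.repr {
            Repr::Inline { len, data } => {
                std::str::from_utf8(&data[..*len as usize]).expect("valid UTF-8")
            }
            Repr::Boxed(s) => s.as_str(),
        }
    }

    fn push_str(&mut self, extra: &str) {
        match &mut self.repr {
            Repr::Boxed(s) => s.push_str(extra),
            Repr::Inline { len, data } => {
                let old = *len as usize;
                let new_len = old + extra.len();
                if new_len <= INLINE_CAPACITY {
                    data[old..new_len].copy_from_slice(extra.as_bytes());
                    *len = new_len as u8;
                } else {
                    // Inline capacity exceeded: move everything to the heap.
                    let mut s = String::with_capacity(new_len);
                    s.push_str(std::str::from_utf8(&data[..old]).expect("valid UTF-8"));
                    s.push_str(extra);
                    self.repr = Repr::Boxed(s);
                    self.promotions += 1;
                }
            }
        }
    }

    /// Shortens the string; `new_len` must lie on a char boundary.
    fn truncate(&mut self, new_len: usize) {
        match &mut self.repr {
            Repr::Inline { len, data } => {
                let cur = std::str::from_utf8(&data[..*len as usize]).expect("valid UTF-8");
                if new_len < cur.len() {
                    assert!(cur.is_char_boundary(new_len));
                    *len = new_len as u8;
                }
            }
            Repr::Boxed(s) => {
                s.truncate(new_len);
                if self.policy == Policy::Compact && s.len() <= INLINE_CAPACITY {
                    // Compact: copy back inline and drop the heap allocation.
                    let mut data = [0u8; INLINE_CAPACITY];
                    data[..s.len()].copy_from_slice(s.as_bytes());
                    let len = s.len() as u8;
                    self.repr = Repr::Inline { len, data };
                }
                // LazyCompact: keep the allocation around for later growth.
            }
        }
    }
}

/// Grows past the inline capacity and shrinks back, `rounds` times.
fn churn(policy: Policy, rounds: usize) -> ToyString {
    let mut s = ToyString::new(policy);
    for _ in 0..rounds {
        s.push_str(&"x".repeat(30)); // 30 bytes > 23: needs the heap
        s.truncate(5); // 5 bytes would fit inline again
    }
    s
}

fn main() {
    let compact = churn(Policy::Compact, 3);
    let lazy = churn(Policy::LazyCompact, 3);
    println!("Compact: promotions = {}, inline = {}", compact.promotions, compact.is_inline());
    println!("LazyCompact: promotions = {}, inline = {}", lazy.promotions, lazy.is_inline());
}
```

In this model, three rounds of growing and shrinking cost the Compact policy three moves to the heap and leave the string inline, while the LazyCompact policy moves once and stays boxed. That is the trade-off between allocation churn and cache locality described above.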
Performance
SmartString doesn't aim to be more performant than String in the general case, except that it doesn't trigger heap allocations for anything shorter than its inline capacity and so can be reasonably expected to exceed String's performance perceptibly on shorter strings, as well as being more memory efficient in these cases. There will always be a slight overhead on all operations on boxed strings, compared to String.
Caveat
The way smartstring gets by without a discriminant is dependent on the memory layout of the std::string::String struct, which isn't something the Rust compiler and standard library make any guarantees about. smartstring makes an assumption about how it's been laid out, which has held basically since rustc came into existence, but is nonetheless not a safe assumption to make, and if the layout ever changes, smartstring will stop working properly (at least on little-endian architectures; the assumptions made on big-endian archs will hold regardless of the actual memory layout). Its test suite does comprehensive validation of these assumptions, and as long as the CI build is passing for any given rustc version, you can be sure it will do its job properly on all tested architectures. You can also check out the smartstring source tree yourself and run cargo test to validate it for your particular configuration.
As an extra precaution, some runtime checks are made as well, so that if the memory layout assumption no longer holds, smartstring will not work correctly, but there should be no security implications and it should crash early.
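An early check of this kind might look like the following. This is only a sketch of the idea, not the checks smartstring actually performs: the function name and the properties tested (String being three machine words in size and word-aligned) are assumptions chosen for illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical sanity check on the size and alignment of `String`.
/// It does not inspect field order, only the overall shape.
fn string_layout_looks_familiar() -> bool {
    size_of::<String>() == 3 * size_of::<usize>()
        && align_of::<String>() == align_of::<usize>()
}

fn main() {
    // Fail fast rather than continue with a layout this code was not written for.
    assert!(
        string_layout_looks_familiar(),
        "unexpected String layout; refusing to continue"
    );
    println!("String layout matches the assumed three-word shape");
}
```

Panicking up front is the "crash early" behaviour: code that depends on an unverified layout never gets to run.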
Feature Flags
smartstring comes with optional support for the following crates through Cargo feature flags. You can enable them in your Cargo.toml file like this:
[dependencies]
smartstring = { version = "*", features = ["proptest", "serde"] }
Feature | Description |
---|---|
arbitrary | Arbitrary implementation for SmartString. |
proptest | A strategy for generating SmartStrings from a regular expression. |
serde | Serialize and Deserialize implementations for SmartString. |
Modules
alias | Convenient type aliases. |
proptest | |
Structs
Compact | A compact string representation equal to |
Drain | A draining iterator for a |
LazyCompact | A representation similar to |
SmartString | A smart string. |
Traits
SmartStringMode | Marker trait for |