Module measureme::stringtable
A string table implementation with a tree-like encoding.
Each entry in the table represents a string and is encoded as a list of components where each component can either be
- a string value that contains actual UTF-8 string content,
- a string ID that contains a reference to another entry, or
- a terminator tag which marks the end of a component list.
The string content of an entry is defined as the concatenation of the content of its components. The content of a string value is its actual UTF-8 bytes. The content of a string ID is the contents of the entry it references.
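The concatenation rule can be sketched in a few lines of Rust. This is a minimal illustration and not measureme's implementation: the Component enum, the HashMap-backed table, and the resolve function are hypothetical names, entries are keyed by plain u32 IDs, and references are assumed to be acyclic.

```rust
use std::collections::HashMap;

/// One component of a table entry. The terminator is implicit here:
/// it corresponds to the end of the Vec holding the components.
enum Component {
    /// Actual UTF-8 string content.
    Value(String),
    /// Reference to another entry, by ID.
    Ref(u32),
}

/// Returns the string content of entry `id`, or `None` if `id` or any
/// entry it references is missing. Assumes references form no cycles.
fn resolve(table: &HashMap<u32, Vec<Component>>, id: u32) -> Option<String> {
    let mut out = String::new();
    for component in table.get(&id)? {
        match component {
            // The content of a value is its own bytes.
            Component::Value(s) => out.push_str(s),
            // The content of an ID is the content of the referenced entry.
            Component::Ref(other) => out.push_str(&resolve(table, *other)?),
        }
    }
    Some(out)
}
```

If entry 42 holds the single value "XYZ" and entry 7 holds ["abc", Ref(42), "def"], then resolving entry 7 yields "abcXYZdef".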
The byte-level encoding of component lists uses the structure of UTF-8 in order to save space:
- A valid UTF-8 codepoint never starts with the bits 0b10, as this bit prefix is reserved for bytes in the middle of a UTF-8 codepoint byte sequence. We make use of this fact by letting all string ID components start with this 0b10 prefix. Thus, when we parse the contents of a value, we know to stop if the start byte of the next codepoint has this prefix.
- A valid UTF-8 string cannot contain the 0xFF byte, and since string IDs start with 0b10 as described above, they also cannot start with a 0xFF byte. Thus we can safely use 0xFF as our component list terminator.
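The stop rule for string values can be sketched as a small scanner that walks the bytes one codepoint at a time. This is an illustration under assumptions, not the crate's parser: codepoint_len and value_len are hypothetical helper names, and the input is assumed to be well-formed.

```rust
/// Length in bytes of the UTF-8 sequence introduced by `lead`, or `None`
/// if `lead` cannot start a codepoint. That covers bytes with the 0b10
/// prefix (string IDs) as well as 0xFF (the terminator).
fn codepoint_len(lead: u8) -> Option<usize> {
    match lead {
        0x00..=0x7F => Some(1),
        0xC2..=0xDF => Some(2),
        0xE0..=0xEF => Some(3),
        0xF0..=0xF4 => Some(4),
        _ => None,
    }
}

/// Returns how many bytes at the start of `bytes` belong to a string value.
/// Scanning stops at the first codepoint boundary whose byte cannot start
/// a codepoint, i.e. at a string ID or at the terminator.
fn value_len(bytes: &[u8]) -> usize {
    let mut pos = 0;
    while let Some(&lead) = bytes.get(pos) {
        match codepoint_len(lead) {
            Some(len) => pos += len,
            None => break,
        }
    }
    // Guard against a final codepoint that is cut off by the end of input.
    pos.min(bytes.len())
}
```

On the bytes ['a', 'b', 'c', 128, 0, 0, 42], value_len returns 3, because 128 carries the 0b10 prefix and therefore cannot start a codepoint. Stepping by whole codepoints matters: continuation bytes inside a multi-byte codepoint also carry the 0b10 prefix, so only the byte at a codepoint boundary is checked.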
The sample composite string ["abc", ID(42), "def", TERMINATOR] would thus be encoded as:
    ['a', 'b', 'c', 128, 0, 0, 42, 'd', 'e', 'f', 255]
                    ^^^^^^^^^^^^^                 ^^^
                    string ID 42 with 0b10 prefix terminator (0xFF)
As you can see, string IDs are encoded in big endian format so that the highest order bits show up in the first byte we encounter.
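The encoding of the sample above can be reproduced with a short sketch. The names Component and encode are hypothetical and are not the crate's API. The 4-byte width of a string ID is taken from the example, and IDs are assumed to fit in 30 bits so that the top two bits are free for the 0b10 prefix.

```rust
/// Tag byte ending every component list; it cannot occur in valid UTF-8.
const TERMINATOR: u8 = 0xFF;

/// One component of a table entry (hypothetical type for this sketch).
enum Component<'a> {
    /// Actual UTF-8 string content.
    Value(&'a str),
    /// Reference to another entry, by ID.
    Ref(u32),
}

/// Encodes a component list and appends the terminator.
fn encode(components: &[Component<'_>]) -> Vec<u8> {
    let mut bytes = Vec::new();
    for component in components {
        match component {
            // String values are written as their raw UTF-8 bytes.
            Component::Value(s) => bytes.extend_from_slice(s.as_bytes()),
            Component::Ref(id) => {
                assert!(*id < (1u32 << 30), "ID does not fit in 30 bits");
                // Put the 0b10 prefix into the top two bits, then write the
                // ID big endian so the prefix lands in the first byte.
                let tagged = *id | (0b10u32 << 30);
                bytes.extend_from_slice(&tagged.to_be_bytes());
            }
        }
    }
    bytes.push(TERMINATOR);
    bytes
}
```

Encoding ["abc", Ref(42), "def"] with this sketch produces the same eleven bytes as the sample: the three bytes of "abc", then 128, 0, 0, 42, then the three bytes of "def", then 255.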
Each string in the table is referred to via a StringId. StringIds may be generated in two ways:
- Calling StringTableBuilder::alloc(), which returns the StringId for the allocated string.
- Calling StringId::new_virtual() to create a "virtual" StringId that can later be mapped to an actual string via StringTableBuilder::map_virtual_to_concrete_string().
String IDs allow you to deduplicate strings by allocating a string once and then referring to it by id over and over. This is a useful trick for strings which are recorded many times and it can significantly reduce the size of profile trace files.
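The deduplication idea, and the mapping from virtual to concrete IDs, can be illustrated with a toy model. This is deliberately not the measureme API: ToyTable and its methods are hypothetical, the constant values are assumed for illustration, and regular IDs here are simple indices into a Vec.

```rust
use std::collections::HashMap;

// Assumed values for illustration only; the crate's constants may differ.
const MAX_VIRTUAL_ID: u32 = 100_000_000;
const FIRST_REGULAR_ID: u32 = MAX_VIRTUAL_ID + 2;

/// Toy stand-in for a string table builder.
#[derive(Default)]
struct ToyTable {
    /// Concrete strings; index i belongs to ID FIRST_REGULAR_ID + i.
    strings: Vec<String>,
    /// Virtual ID -> regular ID of the string it was mapped to.
    virtual_to_concrete: HashMap<u32, u32>,
}

impl ToyTable {
    /// Stores `s` once and returns the regular ID that refers to it.
    fn alloc(&mut self, s: &str) -> u32 {
        self.strings.push(s.to_string());
        FIRST_REGULAR_ID + (self.strings.len() as u32 - 1)
    }

    /// Declares that `virtual_id` stands for the string behind `concrete_id`.
    fn map_virtual_to_concrete(&mut self, virtual_id: u32, concrete_id: u32) {
        assert!(virtual_id <= MAX_VIRTUAL_ID, "not a virtual ID");
        self.virtual_to_concrete.insert(virtual_id, concrete_id);
    }

    /// Looks up the string for a regular or mapped virtual ID.
    fn lookup(&self, id: u32) -> Option<&str> {
        let concrete = self.virtual_to_concrete.get(&id).copied().unwrap_or(id);
        let index = concrete.checked_sub(FIRST_REGULAR_ID)? as usize;
        self.strings.get(index).map(String::as_str)
    }
}
```

Allocating "my_function" once and recording the returned ID in a thousand events stores the string bytes a single time; each event then carries only the fixed-size ID. A virtual ID such as 5 can be handed out before the string exists and mapped to the concrete ID afterwards.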
StringIds are partitioned according to type:
[0 .. MAX_VIRTUAL_STRING_ID, METADATA_STRING_ID, .. ]
From 0 to MAX_VIRTUAL_STRING_ID are the allowed values for virtual strings. After MAX_VIRTUAL_STRING_ID, there is one string ID (METADATA_STRING_ID) which is used internally by measureme to record additional metadata about the profiling session. After METADATA_STRING_ID are all other StringId values.
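The partitioning amounts to two comparisons, sketched below. The numeric values of the constants are assumptions chosen for illustration and may not match the crate; IdKind and classify are hypothetical names.

```rust
// Assumed values for illustration only; the crate's constants may differ.
const MAX_VIRTUAL_STRING_ID: u32 = 100_000_000;
const METADATA_STRING_ID: u32 = MAX_VIRTUAL_STRING_ID + 1;

/// The three partitions of the ID space described above.
#[derive(Debug, PartialEq)]
enum IdKind {
    /// 0 ..= MAX_VIRTUAL_STRING_ID
    Virtual,
    /// The single ID reserved for profile metadata.
    Metadata,
    /// Everything after METADATA_STRING_ID.
    Regular,
}

/// Classifies a raw ID value by the partition it falls into.
fn classify(id: u32) -> IdKind {
    if id <= MAX_VIRTUAL_STRING_ID {
        IdKind::Virtual
    } else if id == METADATA_STRING_ID {
        IdKind::Metadata
    } else {
        IdKind::Regular
    }
}
```

Under these assumed values, classify(0) and classify(MAX_VIRTUAL_STRING_ID) are both Virtual, classify(METADATA_STRING_ID) is Metadata, and any larger value is Regular.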
Structs
StringId | A |
StringTableBuilder | Write-only version of the string table |
Enums
StringComponent | A single component of a string. Used for building composite table entries. |
Constants
FIRST_REGULAR_STRING_ID | |
MAX_STRING_ID | |
METADATA_STRING_ID | The id of the profile metadata string entry. |
STRING_ID_MASK | |
TERMINATOR | |
Traits
SerializableString | Anything that implements |