A string table implementation with a tree-like encoding.
Each entry in the table represents a string and is encoded as a list of components where each component can either be
- a string value that contains actual UTF-8 string content,
- a string ID that contains a reference to another entry, or
- a terminator tag which marks the end of a component list.
The string content of an entry is defined as the concatenation of the content of its components. The content of a string value is its actual UTF-8 bytes. The content of a string ID is the contents of the entry it references.
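The recursive definition of content can be illustrated with a small in-memory model. This is only a sketch: the `Component` enum, the `HashMap`-backed table, and the `resolve` function are hypothetical names introduced here, not part of the crate's API, and the sketch assumes entries do not reference each other in a cycle.

```rust
use std::collections::HashMap;

/// One component of a table entry. The terminator is implicit here:
/// it corresponds to the end of the `Vec` holding the components.
enum Component {
    /// Literal UTF-8 content.
    Value(String),
    /// Reference to another entry, by a (hypothetical) numeric id.
    Ref(u32),
}

/// Appends the content of entry `id` to `out`, following references
/// recursively. Returns `None` if an id is missing from the table.
/// Assumption: the table contains no reference cycles.
fn resolve(table: &HashMap<u32, Vec<Component>>, id: u32, out: &mut String) -> Option<()> {
    for component in table.get(&id)? {
        match component {
            // The content of a value is its own bytes.
            Component::Value(s) => out.push_str(s),
            // The content of an ID is the content of the referenced entry.
            Component::Ref(other) => resolve(table, *other, out)?,
        }
    }
    Some(())
}
```

With an entry 42 holding `"XYZ"` and an entry 7 holding `["abc", Ref(42), "def"]`, resolving entry 7 concatenates the three pieces in order.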
The byte-level encoding of component lists uses the structure of UTF-8 in order to save space:
- A valid UTF-8 codepoint never starts with the bits `10`, as this bit prefix is reserved for bytes in the middle of a UTF-8 codepoint byte sequence. We make use of this fact by letting all string ID components start with this `10` prefix. Thus, when we parse the contents of a value, we know to stop if the start byte of the next codepoint has this prefix.
- A valid UTF-8 string cannot contain the `0xFF` byte, and since string IDs start with `10` as described above, they also cannot start with a `0xFF` byte. Thus we can safely use `0xFF` as our component list terminator.
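The two rules above amount to a simple test on the byte found at a codepoint boundary. The following is a minimal sketch of that test; `ByteKind`, `classify`, and `TERMINATOR` are names made up for this illustration. Note that it is only meaningful when applied to the first byte of a codepoint or component, because continuation bytes inside a multi-byte codepoint also carry the `10` prefix.

```rust
/// Byte that ends a component list (name chosen for this sketch).
const TERMINATOR: u8 = 0xFF;

#[derive(Debug, PartialEq)]
enum ByteKind {
    /// First byte of a string ID component: top two bits are `10`.
    StringIdStart,
    /// End of the component list.
    Terminator,
    /// Anything else: the first byte of a UTF-8 codepoint in a value.
    ValueByte,
}

/// Classifies a byte found at a codepoint/component boundary.
fn classify(byte: u8) -> ByteKind {
    if byte == TERMINATOR {
        // 0xFF has top bits `11`, so it can never look like an ID start.
        ByteKind::Terminator
    } else if (byte & 0b1100_0000) == 0b1000_0000 {
        ByteKind::StringIdStart
    } else {
        ByteKind::ValueByte
    }
}
```

For example, `classify(b'a')` yields `ValueByte`, `classify(0x80)` yields `StringIdStart`, and `classify(0xFF)` yields `Terminator`.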
The sample composite string ["abc", ID(42), "def", TERMINATOR] would thus be encoded as:
    ['a', 'b', 'c', 128, 0, 0, 42, 'd', 'e', 'f', 255]
                    ^^^^^^^^^^^^^                 ^^^
           string ID 42 with 0b10 prefix    terminator (0xFF)
As you can see, string IDs are encoded in big-endian format so that the highest-order bits show up in the first byte we encounter.
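The sample encoding can be reproduced with a few lines of code. This is a sketch under stated assumptions, not the crate's serializer: `Part`, `encode`, `ID_TAG`, and `END` are hypothetical names, and the sketch assumes a string ID occupies four bytes with a 30-bit payload below the `10` tag, which is what the sample bytes `128, 0, 0, 42` suggest.

```rust
/// Hypothetical component type for this sketch.
enum Part<'a> {
    Value(&'a str),
    Id(u32),
}

/// The `10` bit prefix placed in the two highest bits of a 4-byte ID.
const ID_TAG: u32 = 0b10 << 30;
/// Component list terminator.
const END: u8 = 0xFF;

/// Encodes a component list followed by the terminator byte.
fn encode(parts: &[Part<'_>]) -> Vec<u8> {
    let mut bytes = Vec::new();
    for part in parts {
        match part {
            // Values are written as their raw UTF-8 bytes.
            Part::Value(s) => bytes.extend_from_slice(s.as_bytes()),
            Part::Id(id) => {
                // Assumption: ids fit in 30 bits so the tag bits stay intact.
                assert!(*id < (1 << 30), "id does not fit in 30 bits");
                // Big-endian puts the tagged high bits in the first byte.
                bytes.extend_from_slice(&(ID_TAG | *id).to_be_bytes());
            }
        }
    }
    bytes.push(END);
    bytes
}
```

Calling `encode(&[Part::Value("abc"), Part::Id(42), Part::Value("def")])` produces the eleven bytes shown in the sample, with `'a'` through `'f'` as their ASCII values.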
Each string in the table is referred to via a `StringId`. `StringId`s may be generated in two ways:
- Calling `StringTable::alloc()`, which returns the `StringId` for the allocated string.
String IDs allow you to deduplicate strings by allocating a string once and then referring to it by id over and over. This is a useful trick for strings which are recorded many times and it can significantly reduce the size of profile trace files.
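One way to picture the deduplication pattern is a caller-side cache that allocates each distinct string once and hands back the same id afterwards. The sketch below is a toy model only: `IdCache` and its methods are hypothetical, ids are plain `u32` counters, and no real string table is involved.

```rust
use std::collections::HashMap;

/// Toy caller-side cache: remembers which id was handed out for a string.
struct IdCache {
    ids: HashMap<String, u32>,
    next_id: u32,
}

impl IdCache {
    fn new() -> Self {
        IdCache { ids: HashMap::new(), next_id: 0 }
    }

    /// Returns the id for `s`, "allocating" a fresh one only on first use.
    fn id_for(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.ids.get(s) {
            return id; // Already allocated: reuse the id.
        }
        let id = self.next_id;
        self.next_id += 1;
        self.ids.insert(s.to_string(), id);
        id
    }

    /// Number of distinct strings allocated so far.
    fn allocated(&self) -> usize {
        self.ids.len()
    }
}
```

Recording the same string a thousand times through `id_for` stores its bytes once; every later occurrence costs only the size of an id.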
StringIds are partitioned according to type:
[0 .. MAX_PRE_RESERVED_STRING_ID, METADATA_STRING_ID, .. ]
From `0` to `MAX_PRE_RESERVED_STRING_ID` are the allowed values for reserved strings.
After `MAX_PRE_RESERVED_STRING_ID`, there is one string id (`METADATA_STRING_ID`) which is used by `measureme` to record additional metadata about the profiling session.
After `METADATA_STRING_ID` are all other `StringId` values.
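The partitioning can be expressed as a range check. In this sketch the constant values are placeholders picked for illustration, not the crate's real values, and the reserved range is assumed to include `MAX_PRE_RESERVED_STRING_ID` itself; `IdKind` and `id_kind` are hypothetical names.

```rust
// Placeholder values for this sketch only; the real constants may differ.
const MAX_PRE_RESERVED_STRING_ID: u32 = 1_000;
const METADATA_STRING_ID: u32 = MAX_PRE_RESERVED_STRING_ID + 1;

#[derive(Debug, PartialEq)]
enum IdKind {
    /// Id in the range set aside for reserved strings.
    Reserved,
    /// The single id used for the profiling session metadata.
    Metadata,
    /// Every other id.
    Regular,
}

/// Maps a raw id to the partition it falls into.
/// Assumption: the reserved range is inclusive at its upper end.
fn id_kind(id: u32) -> IdKind {
    if id <= MAX_PRE_RESERVED_STRING_ID {
        IdKind::Reserved
    } else if id == METADATA_STRING_ID {
        IdKind::Metadata
    } else {
        IdKind::Regular
    }
}
```

With these placeholder values, ids `0` through `1_000` are reserved, `1_001` is the metadata id, and everything above it is a regular id.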
Write-only version of the string table
A single component of a string. Used for building composite table entries.
The id of the profile metadata string entry.
Anything that implements