Function mangling::mangle
pub fn mangle<T>(name: impl IntoIterator<Item = T>) -> String where
T: Borrow<u8>,
Takes an iterator over bytes and produces a String
whose contents obey the rules for an
identifier in the C language.
The length N of the output in bytes, relative to the input length K, follows these rules, which are considered to be requirements on future implementations:
- N > K
- N ≤ 4 * K + 2
- N ≤ ceil(3.5 * K) + 2 when K > 1
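The three requirements can be folded into a single check on an (input length, output length) pair. The following is a minimal sketch, not part of the crate: the helper names `max_mangled_len` and `satisfies_length_rules` are hypothetical, and `ceil(3.5 * K)` is computed in integer arithmetic as `(7 * K + 1) / 2`.

```rust
/// Hypothetical helper: the largest output length N allowed for an input
/// of K bytes under the documented requirements.
fn max_mangled_len(k: usize) -> usize {
    // N <= 4 * K + 2 holds for every K.
    let general = 4 * k + 2;
    if k > 1 {
        // N <= ceil(3.5 * K) + 2 applies only when K > 1.
        // ceil(7K / 2) is (7K + 1) / 2 in integer division.
        let tighter = (7 * k + 1) / 2 + 2;
        general.min(tighter)
    } else {
        general
    }
}

/// Hypothetical helper: true when N > K and N stays within the upper bounds.
fn satisfies_length_rules(k: usize, n: usize) -> bool {
    n > k && n <= max_mangled_len(k)
}
```

For instance, a three-byte input may produce at most 13 output bytes, while a one-byte input may produce at most 6.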
The current implementation also satisfies these tighter constraints:
- N = 1 + ceil(log10(K + 1)) + K when the input matches ^[A-Za-z_]*$
- N = 3 + ceil(log10(K + 1)) + 2 * K when the input matches ^[^A-Za-z_]+$
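As a rough illustration of these exact lengths, the term `ceil(log10(K + 1))` is simply the number of decimal digits needed to write K (zero digits when K = 0), so it can be computed without floating point. The helper names below are hypothetical and not part of the crate. The non-printable count is taken directly from the `0N_xxxxxx` layout: one leading underscore, the literal `0`, the decimal digits of K, the literal `_`, and two hexadecimal digits per byte.

```rust
/// Hypothetical helper: ceil(log10(K + 1)), i.e. the number of decimal
/// digits of K, with zero digits for K = 0.
fn decimal_digits(mut k: usize) -> usize {
    let mut digits = 0;
    while k > 0 {
        digits += 1;
        k /= 10;
    }
    digits
}

/// Output length when the whole input forms one printable group:
/// leading '_' + decimal length + the K literal bytes.
fn len_all_printable(k: usize) -> usize {
    1 + decimal_digits(k) + k
}

/// Output length when the whole input (K >= 1) forms one non-printable group:
/// leading '_' + literal '0' + decimal length + literal '_' + 2 hex digits per byte.
fn len_all_non_printable(k: usize) -> usize {
    3 + decimal_digits(k) + 2 * k
}
```

Checked against the documented examples, `"GCD"` (K = 3) maps to the 5-byte `"_3GCD"`, and `"123"` (K = 3) maps to the 10-byte `"_03_313233"`.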
Examples
use mangling::mangle;

// Pairs of (input, expected mangled output).
let mangle_list = &[
    (""                , "_"                           ),
    ("_123"            , "_4_123"                      ),
    ("123"             , "_03_313233"                  ),
    ("(II)I"           , "_01_282II01_291I"            ),
    ("<init>"          , "_01_3c4init01_3e"            ),
    ("<init>:()V"      , "_01_3c4init04_3e3a28291V"    ),
    ("GCD"             , "_3GCD"                       ),
    ("StackMapTable"   , "_13StackMapTable"            ),
    ("java/lang/Object", "_4java01_2f4lang01_2f6Object"),
];
for &(before, after) in mangle_list {
    assert_eq!(after, mangle(before.bytes()));
}
Implementation details
The resulting symbol begins with an underscore character _, and is followed by zero or more groups of two types: printables and non-printables. The content of the input byte stream determines which type of group comes first, after which the two types alternate strictly.
- A printable group corresponds to the longest substring of the input that can be consumed while matching the (case-insensitive) regular expression [a-z_][a-z0-9_]*. The mangled form is Naaa, where N is the unbounded decimal length of the substring in the original input, and aaa is the literal substring.
- A non-printable group represents the shortest substring in the input that can be consumed before a printable substring begins to match. The mangled form is 0N_xxxxxx, where 0 and _ are literal, N is the unbounded decimal length of the substring in the original input, and xxxxxx is the lowercase hexadecimal expansion of the original bytes (two hexadecimal digits per input byte, most significant nybble first).
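The grouping rules above can be sketched as a simple byte scan. This is an illustrative re-implementation under stated assumptions, not the crate's actual code: the name `mangle_sketch` is hypothetical, it takes a byte slice rather than a generic iterator, and it assumes (as the `_123` example suggests) that an underscore may start a printable group just as an ASCII letter can.

```rust
/// Hypothetical sketch of the mangling scheme; not the crate's implementation.
fn mangle_sketch(input: &[u8]) -> String {
    // Assumption: a printable group starts with an ASCII letter or '_'
    // and continues with ASCII letters, digits, or '_'.
    let begins = |b: u8| b.is_ascii_alphabetic() || b == b'_';
    let continues = |b: u8| begins(b) || b.is_ascii_digit();

    // Every mangled symbol starts with a single underscore.
    let mut out = String::from("_");
    let mut i = 0;
    while i < input.len() {
        let start = i;
        if begins(input[i]) {
            // Printable group: take the longest run that keeps matching.
            i += 1;
            while i < input.len() && continues(input[i]) {
                i += 1;
            }
            // Emit "Naaa": decimal length, then the bytes verbatim.
            out.push_str(&(i - start).to_string());
            // The group is pure ASCII, so each byte maps to one char.
            out.extend(input[start..i].iter().map(|&b| b as char));
        } else {
            // Non-printable group: stop as soon as a printable group could begin.
            while i < input.len() && !begins(input[i]) {
                i += 1;
            }
            // Emit "0N_xxxxxx": literal '0', decimal length, literal '_',
            // then two lowercase hex digits per byte.
            out.push_str(&format!("0{}_", i - start));
            for &b in &input[start..i] {
                out.push_str(&format!("{:02x}", b));
            }
        }
    }
    out
}
```

Because the two branches are mutually exclusive on the first byte of each group, the groups alternate strictly, and the first byte of the input decides which type comes first.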
Note that despite the description above, the current implementation does not actually use regular expressions for matching.