Crate zalgo_codec
This is a crate implementing the zalgo encoding and decoding functions originally written in Python by Scott Conner.
Using the functions defined in this crate you can transform an ASCII string into a Unicode string that is a single “character” wide. The encoding is reversible, but the encoded string is larger than the original in terms of bytes. The crate also provides functions that encode Python code and wrap the result in a decoder that decodes and executes the encoded string. The resulting file looks very different but executes the same way as before, which lets you do the mother of all refactoring by converting your entire Python program into a single line of code. Carriage returns cannot be encoded, so files written on non-Unix operating systems might not work. The file encoding functions will attempt to encode such files anyway by ignoring carriage returns.
Explanation:
Characters U+0300–U+036F are the combining diacritical marks used with Latin letters in Unicode.
The fun thing about combining characters is that you can add as many of them
as you like to a base character without creating any new symbols;
each one only adds a mark on top of that character. They are meant to be used to
create characters such as á by taking a normal a and adding another character
to give it the mark (U+0301, in this case). Fun fact: Unicode doesn’t specify
any limit on the number of these characters.
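To make the stacking behaviour concrete, here is a small Rust sketch using only the standard library. The helper name `stack_accents` is hypothetical and is not part of this crate; it simply appends U+0301 to a base character a given number of times.

```rust
/// Stacks `n` copies of COMBINING ACUTE ACCENT (U+0301) on top of `base`.
/// Hypothetical helper written only for this illustration.
fn stack_accents(base: char, n: usize) -> String {
    let mut s = String::from(base);
    for _ in 0..n {
        // Each push adds one more combining mark to the same base character.
        s.push('\u{0301}');
    }
    s
}
```

Calling `stack_accents('a', 1)` gives a string of two `char`s (three UTF-8 bytes) that renders as a single á. Larger values of `n` keep stacking marks on the same visible character.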
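Conveniently, this gives us 112 different characters we can map to,
which comfortably covers the ASCII character range 0x20 -> 0x7E, aka all the non-control characters.
The only issue is that new lines (0x0A, LF) fall outside that range, so to fix that,
the mapping wraps around and gives LF a slot of its own (the original post describes this as mapping 0x7F (DEL) to 0x0A (LF)).
This can be represented as (CHARACTER - 11) % 133 - 21, and decoded with (CHARACTER + 22) % 133 + 10,
where % is Python’s modulo, which never returns a negative result for a positive divisor.
Under these formulas 0x20–0x7E land on offsets 0–94 into the combining range and LF lands on offset 111, i.e. U+036F.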
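The two formulas can be sketched in Rust as follows. This is an illustration under stated assumptions, not the crate's actual implementation: the names `COMBINING_BASE`, `encode_byte`, `decode_char`, `encode_sketch` and `decode_sketch` are hypothetical, the sketch emits only the combining marks (it does not deal with any base character they might be attached to), and it uses `rem_euclid` because Rust's `%` operator is a remainder that can be negative, unlike Python's `%`.

```rust
/// First code point of the combining range U+0300–U+036F.
const COMBINING_BASE: u32 = 0x0300;

/// Encodes one byte with (CHARACTER - 11) % 133 - 21.
/// Returns `None` for anything other than 0x20..=0x7E or LF.
fn encode_byte(byte: u8) -> Option<char> {
    if byte != b'\n' && !(0x20..=0x7E).contains(&byte) {
        return None;
    }
    // `rem_euclid` mirrors Python's `%`; plain `%` would give -1 for LF.
    let offset = (i32::from(byte) - 11).rem_euclid(133) - 21;
    char::from_u32(COMBINING_BASE + offset as u32)
}

/// Decodes one combining character with (CHARACTER + 22) % 133 + 10.
/// Returns `None` for characters that no valid input byte maps to.
fn decode_char(c: char) -> Option<u8> {
    let offset = (c as u32).checked_sub(COMBINING_BASE)?;
    if offset > 0x6F {
        return None;
    }
    let byte = ((offset + 22) % 133 + 10) as u8;
    if byte == b'\n' || (0x20..=0x7E).contains(&byte) {
        Some(byte)
    } else {
        None
    }
}

/// Encodes a whole string, failing on the first unencodable byte.
fn encode_sketch(text: &str) -> Option<String> {
    text.bytes().map(encode_byte).collect()
}

/// Reverses `encode_sketch`.
fn decode_sketch(encoded: &str) -> Option<String> {
    let bytes: Option<Vec<u8>> = encoded.chars().map(decode_char).collect();
    String::from_utf8(bytes?).ok()
}
```

In this sketch a carriage return makes `encode_sketch` return `None`, and every encoded byte becomes a two-byte UTF-8 sequence, which is why the encoded string takes more bytes than the original.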
Original post.
Structs
Functions
… zalgo_encode and decompresses it to an ASCII string.
… zalgo_decode.