Module wow_srp::normalized_string
Functionality for keeping strings in a format the client expects.
Background
The client uppercases both the username and password before hashing them. The username sent to
the server is also an uppercased version. This means that to the client, there’s no difference
between logging in as alice, ALICE, or anything in between. This is no problem for ASCII
characters as they have well defined upper- and lowercase letters.
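As a minimal sketch of the uppercasing described above, assuming pure ASCII input: the helper name client_form is hypothetical and only illustrates why alice and ALICE are indistinguishable once uppercased.

```rust
/// Hypothetical helper: mimics the client's uppercasing for ASCII-only input.
fn client_form(input: &str) -> String {
    // to_ascii_uppercase only changes a-z; every other byte is left as-is.
    input.to_ascii_uppercase()
}

fn main() {
    // Every casing of the same ASCII name collapses to one form.
    assert_eq!(client_form("alice"), client_form("ALICE"));
    assert_eq!(client_form("aLiCe"), "ALICE");
    println!("{}", client_form("alice"));
}
```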
Unicode characters, however, act differently and without any real pattern.
The letter ń, Unicode code point U+0144, name LATIN SMALL LETTER N WITH ACUTE, for example,
appears as a capital N in the client and is sent as the byte 0x4E, which is ASCII N. This is
despite the letter Ń, Unicode code point U+0143, name LATIN CAPITAL LETTER N WITH ACUTE,
existing.
The letter ž, Unicode code point U+017E, name LATIN SMALL LETTER Z WITH CARON, appears as
the literal letter ž and gets sent over the network as the bytes 0xC5 0xBE, which is UTF-8
for that same letter.
The letter Ž, Unicode code point U+017D, name LATIN CAPITAL LETTER Z WITH CARON, appears as
the literal letter Ž in the client and gets sent over the network as the bytes 0xC5 0xBD,
which is UTF-8 for that same letter.
The letter ƒ, Unicode code point U+0192, name LATIN SMALL LETTER F WITH HOOK, appears as
the literal letter ƒ and gets sent over the network as the bytes 0xC6 0x92, which is UTF-8
for that same letter.
The letter Ƒ, Unicode code point U+0191, name LATIN CAPITAL LETTER F WITH HOOK, appears as
the lowercase ƒ in the client and gets sent over the network as that lowercase letter.
None of the Cyrillic letters show in the client; each is transmitted as a question mark (byte 0x3F).
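The code points and UTF-8 encodings quoted above can be checked with the Rust standard library. The sketch below also shows that standard Unicode case mapping disagrees with the client behavior described here, which is part of why the client's rules cannot simply be reproduced with to_uppercase. The helper utf8_bytes is a hypothetical name used only for this illustration.

```rust
/// Returns the UTF-8 encoding of a single character as a byte vector.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    // UTF-8 encodings quoted in the text.
    assert_eq!(utf8_bytes('\u{017E}'), [0xC5, 0xBE]); // ž
    assert_eq!(utf8_bytes('\u{017D}'), [0xC5, 0xBD]); // Ž
    assert_eq!(utf8_bytes('\u{0192}'), [0xC6, 0x92]); // ƒ
    assert_eq!(utf8_bytes('N'), [0x4E]);

    // Unicode uppercasing maps ń (U+0144) to Ń (U+0143), not to ASCII N
    // as the client is described as doing.
    let upper_n: String = '\u{0144}'.to_uppercase().collect();
    assert_eq!(upper_n, "\u{0143}");

    // Unicode uppercasing maps ƒ (U+0192) to Ƒ (U+0191), whereas the client
    // is described as keeping ƒ and even lowercasing Ƒ.
    let upper_f: String = '\u{0192}'.to_uppercase().collect();
    assert_eq!(upper_f, "\u{0191}");
}
```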
These wildly varying rules for transforming the username and password mean that the only way to really be sure how a specific character is represented on the client and sent over the network is to test every single Unicode character. The behavior is also not guaranteed to be the same across different versions, or even different localizations of the same version.
The user is able to enter up to 16 characters in the client, each of which will be sent over the network as one or more UTF-8 bytes.
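To illustrate why a 16 character limit does not translate into a fixed byte count once non-ASCII input is allowed, here is a small sketch. The function char_and_byte_len is a hypothetical helper, not part of this crate.

```rust
/// Hypothetical helper: returns (number of characters, number of UTF-8 bytes).
fn char_and_byte_len(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}

fn main() {
    // Sixteen ASCII characters occupy sixteen bytes.
    let ascii = "A".repeat(16);
    assert_eq!(char_and_byte_len(&ascii), (16, 16));

    // Sixteen copies of ž (U+017E) occupy two bytes each.
    let non_ascii = "\u{017E}".repeat(16);
    assert_eq!(char_and_byte_len(&non_ascii), (16, 32));
}
```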
Problems
The user will need to register their account outside of the client. They might name their account
Ƒast and get through registration because the web service does not know that the letter Ƒ cannot
be represented in the client and is instead shown and sent as ƒ. The user is unable to log in, instead
getting an “Account does not exist” message.
Another user creates an account named ńacho and gets through registration. Since the letter ń is
represented as the letter N in the client, the signup service makes this transformation in order
to stay in sync with the client. This might allow the user to log into the account named Nacho,
depending on which verifier/salt pair is fetched from the database.
Authentication relies on the signup service, server and client having exactly the same behavior; otherwise vulnerabilities will appear or users might be unable to log in.
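The two scenarios above can be sketched with a toy model of the client. The function client_transform is hypothetical and encodes only the handful of mappings listed in the Background section (the Cyrillic block is approximated as U+0400 to U+04FF); the real client's behavior for other characters is unknown and may differ between versions.

```rust
/// Hypothetical model of the client, covering only characters from the text.
fn client_transform(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            '\u{0144}' => 'N',               // ń is shown and sent as ASCII N
            '\u{0191}' => '\u{0192}',        // Ƒ is shown and sent as ƒ
            '\u{0400}'..='\u{04FF}' => '?',  // Cyrillic becomes a question mark
            _ => c.to_ascii_uppercase(),     // ASCII is uppercased, rest untouched
        })
        .collect()
}

fn main() {
    // Two distinct registrations collapse to the same name on the client.
    assert_eq!(client_transform("\u{0144}acho"), client_transform("nacho"));

    // A name registered verbatim as Ƒast never matches what the client sends.
    assert_ne!(client_transform("\u{0191}ast"), "\u{0191}ast");
    println!("{}", client_transform("\u{0144}acho"));
}
```

If the signup service stores names without applying exactly the same mapping, the first case produces a collision and the second produces an account that can never be logged into.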
Solution
The only manageable solution is to stick to only the ASCII character set and reject all other characters. This greatly reduces the complexity of every link in the chain and decreases possible vulnerabilities.
This also provides the benefit of knowing exactly how large an account name can be.
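A minimal sketch of such an ASCII-only check is shown below. It is not this crate's actual API: the names normalize, NormalizeError and MAX_LEN are assumptions made for the example. Because every ASCII character is exactly one byte, the byte length check is also an exact character count.

```rust
/// Hypothetical limit, matching the 16 characters the client allows.
const MAX_LEN: usize = 16;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum NormalizeError {
    NonAscii,
    TooLong,
}

/// Rejects anything outside ASCII or over the limit, then uppercases.
fn normalize(input: &str) -> Result<String, NormalizeError> {
    if !input.is_ascii() {
        return Err(NormalizeError::NonAscii);
    }
    // For ASCII input, bytes and characters are the same count.
    if input.len() > MAX_LEN {
        return Err(NormalizeError::TooLong);
    }
    Ok(input.to_ascii_uppercase())
}

fn main() {
    assert_eq!(normalize("alice"), Ok("ALICE".to_string()));
    assert_eq!(normalize("\u{0144}acho"), Err(NormalizeError::NonAscii));
    assert_eq!(normalize("a".repeat(17).as_str()), Err(NormalizeError::TooLong));
}
```

Running the same check in the signup service and the server keeps both in agreement with what the client sends for ASCII input.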
Structs
Represents usernames and passwords containing only allowed characters.
Constants
The maximum number of characters that the client will allow in both the username and password fields. Always 16.