LazUTF8 accepts overlong UTF-8 characters
Original Reporter info from Mantis: szali
- Reporter name: szali
Description:
UTF8CharacterToUnicode, for instance, accepts overlong sequences, i.e. sequences that encode a character using more bytes than necessary. The UTF-8 standard prohibits this, because accepting such sequences can enable injection attacks (XSS, SQL injection and the like). A possible scenario:
- an attacker enters a malicious SQL query
- the naive program checks whether the input contains any SQL control characters using Pos('"', s) and similar byte-wise searches (no UTF-8 aware string functions), and escapes the ones it finds
- then it runs the SQL query
Now if '"' and other control characters are encoded with more than one byte, the program will not notice them, because it only looks for the single-byte ASCII form of the '"' character. Later, when the query is decoded by a routine that accepts overlong sequences, those bytes turn back into the control characters, and once the query is parsed and executed the attack succeeds.
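
The scenario above can be sketched in a few lines of Free Pascal. This is only an illustration under stated assumptions: DecodeTwoByte is a hypothetical helper written for this report, not code taken from LazUTF8, and it handles 2-byte sequences only. It shows that the overlong form of '"' (bytes $C0 $A2) is invisible to Pos, that a lenient decoder turns it back into U+0022, and that a strict decoder could reject it.

```pascal
program OverlongDemo;
{$mode objfpc}{$H+}

// Hypothetical helper (NOT part of LazUTF8): decodes a 2-byte UTF-8
// sequence at the start of S and returns the code point, or -1 on error.
// With RejectOverlong = True, code points below $80 encoded in two bytes
// are treated as invalid.
function DecodeTwoByte(const S: AnsiString; RejectOverlong: Boolean): Integer;
var
  B1, B2: Byte;
begin
  Result := -1;
  if Length(S) < 2 then
    Exit;
  B1 := Ord(S[1]);
  B2 := Ord(S[2]);
  // Lead byte must be 110xxxxx, continuation byte must be 10xxxxxx.
  if ((B1 and $E0) <> $C0) or ((B2 and $C0) <> $80) then
    Exit;
  Result := ((B1 and $1F) shl 6) or (B2 and $3F);
  // A 2-byte sequence must encode at least U+0080; anything lower is overlong.
  if RejectOverlong and (Result < $80) then
    Result := -1;
end;

var
  Payload: AnsiString;
begin
  // Overlong 2-byte encoding of '"' (U+0022): bytes $C0 $A2.
  SetLength(Payload, 2);
  Payload[1] := Chr($C0);
  Payload[2] := Chr($A2);

  // The byte-wise check does not see a quote character.
  WriteLn(Pos('"', Payload));              // prints 0
  // A lenient decoder turns the bytes back into U+0022.
  WriteLn(DecodeTwoByte(Payload, False));  // prints 34
  // A strict decoder rejects the sequence.
  WriteLn(DecodeTwoByte(Payload, True));   // prints -1
end.
```

The same minimum-value check applies to longer sequences: a 3-byte sequence must encode at least U+0800 and a 4-byte sequence at least U+10000.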