Wrong conversion of unicode code points in fpjsonscanner
Original Reporter info from Mantis: luizamerico
- Reporter name: Luiz Americo
Description:
Hi, converting Unicode code points outside Basic Latin, such as CJK characters, produces a wrongly encoded value.
See the attached example. For U+4E01 it should output the UTF-8 bytes below (see http://www.fileformat.info/info/unicode/char/4E01/index.htm):
E4
B8
81
The problem was pointed out by Reinier Olislagers. His suggested fix solves the problem. See below:
Then, adapted jsonscannerutf8.pp line 244 (FPC 2.7.1 jsonscanner.pp
line 238/239), function DoFetchToken,
to convert incoming JSON string data to UTF-8 instead of the system codepage:
e.g.
from:
// Takes care of conversion...
S:=WideChar(StrToInt('$'+S));
to:
// Convert from Unicode codepoint in hex to UTF8... via UTF16:
S:=Utf8Encode(WideString(WideChar(StrToInt('$'+S))));
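The effect of the changed line can be checked in isolation with a small program. This is only a sketch, not the attached example: the program name, the variable names and the hard-coded hex string '4E01' are assumptions made for illustration, and it exercises just the conversion expression from the fix rather than the scanner itself.

```pascal
program utf8codepoint;  // hypothetical name, not the attached example
{$mode objfpc}{$H+}

uses
  SysUtils;  // StrToInt, IntToHex

var
  Hex: string;
  S: UTF8String;  // keeps the bytes as UTF-8, no system codepage conversion
  I: Integer;
begin
  // Hex digits as they would appear after \u in the JSON input (assumed value).
  Hex := '4E01';
  // Same expression as the suggested fix: code point -> UTF-16 -> UTF-8.
  S := Utf8Encode(WideString(WideChar(StrToInt('$' + Hex))));
  // Print each encoded byte in hex, one per line.
  for I := 1 to Length(S) do
    WriteLn(IntToHex(Ord(S[I]), 2));
end.
```

With the fix, this should print the three bytes listed in the description (E4, B8, 81). Note that a single WideChar only covers code points in the Basic Multilingual Plane; handling of surrogate pairs is outside what this sketch shows.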
Mantis conversion info:
- Mantis ID: 22310
- Fixed in version: 3.0.0
- Fixed in revision: 21837 (#a7d55bc9)
- Target version: 2.6.1