View Issue Details

ID: 0038618
Project: FPC
Category: FCL
View Status: public
Last Update: 2021-03-14 16:48
Reporter: Shen Min
Assigned To: Michael Van Canneyt
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Fixed in Version: 3.3.1
Summary: 0038618: Not processing "\u" characters in JSON String correctly.
Description: The fpjson unit has a bug for processing "\u" characters in JSON String.
Steps To Reproduce: I have made a demo project to show the bug at:
https://github.com/shenmin/fpjson-bug-demo

The fpjson unit processes the "\u" escape sequences in a JSON string incorrectly.

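uses fpjson, jsonparser; // jsonparser registers the parser that GetJSON relies on

procedure TForm1.Button1Click(Sender : TObject);
var
    str : String;
    json : TJsonObject;
begin
    // Every character in "name" is written as a \uXXXX escape.
    str := '{"name":"\u95e8\u88ab\u8111\u5b50\u6324\u574f\u4e86"}';
    json := GetJson(str) as TJsonObject;
    try
        // Expected: the decoded text; the bug shows up as a wrong string here.
        ShowMessage(json.Get('name', ''));
    finally
        json.Free;
    end;
end;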
Tags: No tags attached.
Fixed in Revision: 48965
FPCOldBugId:
FPCTarget: 4.0.0
Attached Files

Relationships

related to 0038622 (closed, Michael Van Canneyt): JSONStringToString error when input contains consecutive \uxxxx blocks

Activities

Shen Min

2021-03-13 04:36

reporter  

fpjson-bug-demo-main.zip (29,206 bytes)

Do-wan Kim

2021-03-14 15:37

reporter   ~0129656

The JSON reader converts two Unicode escapes at a time, which can cause trouble.

The new patch works with the following JSON string, whose first character is a 4-byte Unicode character (a surrogate pair).

{"name":"\ud867\ude3d\u95e8\u88ab\u8111\u5b50\u6324\u574f\u4e86"}
38618_jsonscanner.pp-2.patch (2,317 bytes)   
Index: packages/fcl-json/src/jsonscanner.pp
===================================================================
--- packages/fcl-json/src/jsonscanner.pp	(revision 48948)
+++ packages/fcl-json/src/jsonscanner.pp	(working copy)
@@ -353,20 +353,29 @@
                         Error(SErrInvalidCharacter, [CurRow,CurColumn,FTokenStr[0]]);
                       end;
                       end;
-                    // ToDo: 4-bytes UTF16
                     if u1<>0 then
                       begin
-                      if (joUTF8 in Options) or (DefaultSystemCodePage=CP_UTF8) then
-                        S:=Utf8Encode(WideString(WideChar(u1)+WideChar(u2))) // ToDo: use faster function
-                      else
-                        S:=String(WideChar(u1)+WideChar(u2)); // WideChar converts the encoding. Should it warn on loss?
-                      u1:=0;
-                      end
-                    else
-                      begin
-                      S:='';
-                      u1:=u2;
-                      end
+                        if ((u1>=$D800) and (u1<=$DBFF)) and 
+						   ((u2>=$DC00) and (u2<=$DFFF)) then
+                          begin
+                            // 4bytes
+                            if (joUTF8 in Options) or (DefaultSystemCodePage=CP_UTF8) then
+                              S:=Utf8Encode(WideString(WideChar(u1)+WideChar(u2))) // ToDo: use faster function
+                            else
+                              S:=String(WideChar(u1)+WideChar(u2)); // WideChar converts the encoding. Should it warn on loss?
+                            u2:=0;
+                          end
+                        else
+                          begin
+                          // 2bytes
+                          if (joUTF8 in Options) or (DefaultSystemCodePage=CP_UTF8) then
+                            S:=Utf8Encode(WideString(WideChar(u1))) // ToDo: use faster function
+                          else
+                            S:=String(WideChar(u1)); // WideChar converts the encoding. Should it warn on loss?
+                          end;
+                      end else
+                        S:='';
+                    u1:=u2;
                     end;
               #0  : Error(SErrOpenString,[FCurRow]);
             else
38618_jsonscanner.pp-2.patch (2,317 bytes)   
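
The range checks in the patch ($D800..$DBFF followed by $DC00..$DFFF) can be illustrated outside the scanner. The following is only a minimal standalone sketch, not the fcl-json code: the helper names IsHighSurrogate, IsLowSurrogate and CombineSurrogates are hypothetical, and the input array stands in for the values parsed from the \uXXXX escapes.

```pascal
program SurrogateSketch;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helpers for illustration; they are not part of fcl-json.
function IsHighSurrogate(U: Word): Boolean;
begin
  Result := (U >= $D800) and (U <= $DBFF);
end;

function IsLowSurrogate(U: Word): Boolean;
begin
  Result := (U >= $DC00) and (U <= $DFFF);
end;

// Combines a high and a low surrogate into one code point above U+FFFF.
function CombineSurrogates(Hi, Lo: Word): LongInt;
begin
  Result := $10000 + ((LongInt(Hi) - $D800) shl 10) + (LongInt(Lo) - $DC00);
end;

const
  // Code units as they would come from "\u95e8\ud867\ude3d".
  CodeUnits: array[0..2] of Word = ($95E8, $D867, $DE3D);
var
  I: Integer;
begin
  I := 0;
  while I <= High(CodeUnits) do
  begin
    if (I < High(CodeUnits)) and IsHighSurrogate(CodeUnits[I])
       and IsLowSurrogate(CodeUnits[I + 1]) then
    begin
      // A valid pair: consume both units and emit one code point.
      WriteLn('U+', IntToHex(CombineSurrogates(CodeUnits[I], CodeUnits[I + 1]), 5));
      Inc(I, 2);
    end
    else
    begin
      // A lone unit: emit it on its own and advance by one.
      WriteLn('U+', IntToHex(LongInt(CodeUnits[I]), 4));
      Inc(I);
    end;
  end;
end.
```

Run as is, this prints U+95E8 and then U+29E3D. The point of the sketch is that a unit is only paired with its successor when both fall in the surrogate ranges; every other unit is converted by itself.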

Michael Van Canneyt

2021-03-14 16:09

administrator   ~0129657

I have fixed the problem. Thank you!

Michael Van Canneyt

2021-03-14 16:11

administrator   ~0129658

@Kim, the problem was that the buffer for the UTF-8 string was only 4 bytes, whereas UTF-8 can take up to 4 bytes per character.
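
As a rough illustration of the byte counts involved, here is a minimal sketch; Utf8Len is a hypothetical helper, not an RTL or fcl-json function. It assumes, per note ~0129656, that the scanner could hand two escapes to a single conversion, which is one way a 4-byte buffer could come up short.

```pascal
program Utf8LengthSketch;

{$mode objfpc}{$H+}

// Hypothetical helper: number of bytes UTF-8 needs for one code point.
function Utf8Len(CodePoint: LongInt): Integer;
begin
  if CodePoint < $80 then
    Result := 1
  else if CodePoint < $800 then
    Result := 2
  else if CodePoint < $10000 then
    Result := 3
  else
    Result := 4;
end;

begin
  WriteLn(Utf8Len($95E8));                   // one BMP character: 3 bytes
  WriteLn(Utf8Len($29E3D));                  // one character above U+FFFF: 4 bytes
  WriteLn(Utf8Len($95E8) + Utf8Len($88AB));  // two BMP characters together: 6 bytes
end.
```

The program prints 3, 4 and 6. Under the stated assumption, two BMP characters converted in one step already need more than 4 bytes.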

Do-wan Kim

2021-03-14 16:23

reporter   ~0129659

Increasing the buffer size doesn't work with this JSON string.

{"name":"\u95e8\ud867\ude3d\u88ab\u8111\u5b50\u6324\u574f\u4e86"}

Michael Van Canneyt

2021-03-14 16:48

administrator   ~0129662

@Kim, please file a separate bug report for that.
If you make a patch from the current source, please consider adding your case to the test suite (JSONReader tests, TestString).

Issue History

Date Modified Username Field Change
2021-03-13 04:36 Shen Min New Issue
2021-03-13 04:36 Shen Min File Added: fpjson-bug-demo-main.zip
2021-03-13 17:35 Bart Broersma Project Packages => FPC
2021-03-14 15:18 Michael Van Canneyt Assigned To => Michael Van Canneyt
2021-03-14 15:18 Michael Van Canneyt Status new => assigned
2021-03-14 15:31 Michael Van Canneyt Relationship added related to 0038622
2021-03-14 15:37 Do-wan Kim Note Added: 0129656
2021-03-14 15:37 Do-wan Kim File Added: 38618_jsonscanner.pp-2.patch
2021-03-14 16:09 Michael Van Canneyt Status assigned => resolved
2021-03-14 16:09 Michael Van Canneyt Resolution open => fixed
2021-03-14 16:09 Michael Van Canneyt Fixed in Version => 3.3.1
2021-03-14 16:09 Michael Van Canneyt Fixed in Revision => 48965
2021-03-14 16:09 Michael Van Canneyt FPCTarget => 4.0.0
2021-03-14 16:09 Michael Van Canneyt Note Added: 0129657
2021-03-14 16:11 Michael Van Canneyt Note Added: 0129658
2021-03-14 16:23 Do-wan Kim Note Added: 0129659
2021-03-14 16:48 Michael Van Canneyt Note Added: 0129662