View Issue Details

IDProjectCategoryView StatusLast Update
0037837FPCFCLpublic2020-10-13 08:58
ReporterBenito van der Zander Assigned ToMichael Van Canneyt  
PrioritynormalSeverityminorReproducibilityhave not tried
Status assignedResolutionopen 
Platformamd64OSlinux 
Product Version3.3.1 
Summary0037837: json parser allows garbage after single values
DescriptionDoParse stops after parsing the first value, so afterwards anything can be in the string:


  writeln(GetJSON('"abc"xxxxxxx').AsString);
  writeln(GetJSON('1234...xxxxxxx').AsFloat:4:4);
  writeln(GetJSON('[1,2,3]xxxxxxx').Items[0].AsFloat:4:4);
TagsNo tags attached.
Fixed in Revision
FPCOldBugId
FPCTarget
Attached Files

Activities

Michael Van Canneyt

2020-09-29 15:40

administrator   ~0125957

I'm not sure I consider this a bug.

A pascal program can also contain garbage after the final end.

I will think about it.

Benito van der Zander

2020-09-29 17:36

reporter   ~0125963

I am trying to make it pass all these tests: https://github.com/nst/JSONTestSuite (or rather another test suite that includes all those json tests as well)

They do not allow this (I do not care. I do not even use the parser, only the scanner)

Another failing case is a garbage 0, i.e. '123'#0 where the scanner thinks the number ends at #0




It could check it with the joStrict option

Or, return a pchar, so we know where it stopped parsing. Then it could also take a pchar as input, i.e. parse pchar instead of string/stream, and then continue parsing from there, so you could parse '123 456 789' without array and without copying the string .

Michael Van Canneyt

2020-09-29 17:56

administrator   ~0125965

No, no pchar.

Agreed on the check with joStrict. This should be as simple as adding a

if joStrict in Options then
  begin
  Repeat
     GetNextToken ;
  Until CurrentToken<>tkWhiteSpace;
  If CurrentToken<>tkEOF then
    Raise Exception.CreateFmt('Expected EOF, but got %s',[CurrentTokenString]);
 end;

after the code in TBaseJSONReader.DoExecute

Benito van der Zander

2020-09-29 18:24

reporter   ~0125966

> No, no pchar.

pchars can be very useful, I have had always plans to make a pchar only version of the scanner. pchars in and pchars out, and zero allocations. That has a good chance to be twice as fast (and decoding \u... would be moved elsewhere)

> Until CurrentToken<>tkWhiteSpace;
> If CurrentToken<>tkEOF then
> Raise Exception.CreateFmt('Expected EOF, but got %s',[CurrentTokenString]);

That is basically what I have in my parser

It is passing all these garbage tests, except for '123'#0

Michael Van Canneyt

2020-09-29 19:13

administrator   ~0125968

You don't need to convince me of the usefulness of pchar :-)

But in this case I don't want pchar because at some point the code must run in the browser, which does not know pchar.

Benito van der Zander

2020-10-12 23:36

reporter   ~0126274

> But in this case I don't want pchar because at some point the code must run in the browser, which does not know pchar.

No pchar? But the json scanner uses pchar everywhere in fetchtoken

But it could also be an offset in the string. Return parsed x chars, or take a start at char y...

Why must it run in the browser anyways? All the browsers can already parse json themselves

Michael Van Canneyt

2020-10-13 08:58

administrator   ~0126275

It's one thing to use pchar internally, it's another to build the API around it.
The pas2js parser also runs in the browser, this was possible because the pchar usage is limited.

The browser can indeed parse JSON, but that is one-shot.
(that is how it works currently)

By allowing the scanner to run in the browser the other classes (reader, evented reader) and options that the browser does not offer can also be reused.

Issue History

Date Modified Username Field Change
2020-09-29 14:17 Benito van der Zander New Issue
2020-09-29 15:40 Michael Van Canneyt Note Added: 0125957
2020-09-29 15:40 Michael Van Canneyt Assigned To => Michael Van Canneyt
2020-09-29 15:40 Michael Van Canneyt Status new => assigned
2020-09-29 17:36 Benito van der Zander Note Added: 0125963
2020-09-29 17:56 Michael Van Canneyt Note Added: 0125965
2020-09-29 18:24 Benito van der Zander Note Added: 0125966
2020-09-29 19:13 Michael Van Canneyt Note Added: 0125968
2020-10-12 23:36 Benito van der Zander Note Added: 0126274
2020-10-13 08:58 Michael Van Canneyt Note Added: 0126275