View Issue Details

IDProjectCategoryView StatusLast Update
0032389FPCPackagespublic2018-05-03 19:28
Reporterd.l.i.wAssigned ToMichael Van Canneyt 
PrioritynormalSeverityminorReproducibilityalways
Status resolvedResolutionfixed 
Product Version3.0.2Product Build 
Target Version3.2.0Fixed in Version3.1.1 
Summary0032389: Latest changes to sax_html.pp break script parsing
Description(efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) is False for script tags, therefore there are immediately closed and the scScript context is never reached.

The correct fix IMO would be to also check for efPCDATAContent.
Steps To ReproduceTry to parse the attached html.
Additional InformationRelated to 0032322
TagsNo tags attached.
Fixed in Revision37863
FPCOldBugId
FPCTarget
Attached Files
  • test.html (93 bytes)
  • 0032389_sax.patch (922 bytes)
    diff --git a/sax_html.pp b/sax_html.pp
    index 75a8158..1afdf35 100644
    --- a/sax_html.pp.ori
    +++ b/sax_html.pp
    @@ -553,16 +553,16 @@ begin
               AutoClose(TagName);
               namePush(TagName);
               DoStartElement('', TagName, '', Attr);
    -          if not (efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) then begin
    -            DoEndElement('', TagName, '');
    -            NamePop;
    -          end;
               if FStack[FNesting-1] in [etScript,etStyle] then
               begin
                 NewContext := scScript;
                 FScriptEndTag := '</' + HTMLElementProps[FStack[FNesting-1]].Name;
                 FScriptEndMatchPos := 1;
               end;
    +          if not (efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) then begin
    +            DoEndElement('', TagName, '');
    +            NamePop;
    +          end;
             end;
             if Assigned(Attr) then
               Attr.Free;
    
    0032389_sax.patch (922 bytes)
  • 0032389_sax_v2.patch (1,315 bytes)
    diff --git a/sax_html.pp b/sax_html.pp
    index 75a8158..2a7b386 100644
    --- a/sax_html.pp.ori
    +++ b/sax_html.pp
    @@ -553,16 +553,17 @@ begin
               AutoClose(TagName);
               namePush(TagName);
               DoStartElement('', TagName, '', Attr);
    -          if not (efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) then begin
    -            DoEndElement('', TagName, '');
    -            NamePop;
    -          end;
               if FStack[FNesting-1] in [etScript,etStyle] then
               begin
                 NewContext := scScript;
                 FScriptEndTag := '</' + HTMLElementProps[FStack[FNesting-1]].Name;
                 FScriptEndMatchPos := 1;
               end;
    +          if (efSubcontent*HTMLElementProps[FStack[FNesting-1]].Flags=[]) then begin
    +            // do not push empty elements, don't wait for AutoClose
    +            DoEndElement('', TagName, '');
    +            NamePop;
    +          end;
             end;
             if Assigned(Attr) then
               Attr.Free;
    @@ -574,8 +575,8 @@ begin
         scScript:
           begin
             DoCharacters(PSAXChar(TokenText), 0, Length(TokenText));
    -        DoEndElement('', HTMLElementProps[FStack[FNesting-1]].Name, '');
    -        namePop;
    +        DoEndElement('', Copy(FScriptEndTag, 3), '');
    +        NamePop;
             FScriptEndTag := '';
           end;
       end;
    
    0032389_sax_v2.patch (1,315 bytes)

Activities

d.l.i.w

2017-09-08 23:11

reporter  

test.html (93 bytes)

d.l.i.w

2017-09-08 23:26

reporter   ~0102727

if [efSubelementContent, efPCDATAContent] * HTMLElementProps[FStack[FNesting-1]].Flags = []

seems to work, but I'm not sure, if this breaks other things.

Martok

2017-09-09 11:40

reporter   ~0102732

Last edited: 2017-09-09 12:07

View 3 revisions

Adding PCData there is provably wrong...

Script parsing should not need the script element on the tag stack. If it does, *that* needs to be fixed because formally, [script] can only contain PCData, not other elements.

I'll look into it.

Edit: that was more obvious than I thought, should have seen this the first time around. Sorry! Patch attached.

There is actually another problem in EnterNewScannerContext, but this is never called with NewContext=scScript. We probably should remove that unreachable case label, but I haven't done so in the patch.

Martok

2017-09-09 11:58

reporter  

0032389_sax.patch (922 bytes)
diff --git a/sax_html.pp b/sax_html.pp
index 75a8158..1afdf35 100644
--- a/sax_html.pp.ori
+++ b/sax_html.pp
@@ -553,16 +553,16 @@ begin
           AutoClose(TagName);
           namePush(TagName);
           DoStartElement('', TagName, '', Attr);
-          if not (efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) then begin
-            DoEndElement('', TagName, '');
-            NamePop;
-          end;
           if FStack[FNesting-1] in [etScript,etStyle] then
           begin
             NewContext := scScript;
             FScriptEndTag := '</' + HTMLElementProps[FStack[FNesting-1]].Name;
             FScriptEndMatchPos := 1;
           end;
+          if not (efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) then begin
+            DoEndElement('', TagName, '');
+            NamePop;
+          end;
         end;
         if Assigned(Attr) then
           Attr.Free;
0032389_sax.patch (922 bytes)

d.l.i.w

2017-09-09 13:01

reporter   ~0102733

>Script parsing should not need the script element on the tag stack.
Yes, that is true.

The patch works for me. It would be good to get it into the next release.

Martok

2017-09-10 00:40

reporter   ~0102743

Hm, no, still not quite right. The TextNode gets inserted in the wrong place in the DOM.

I will have another look tomorrow, with more test files...

Martok

2017-09-10 16:49

reporter  

0032389_sax_v2.patch (1,315 bytes)
diff --git a/sax_html.pp b/sax_html.pp
index 75a8158..2a7b386 100644
--- a/sax_html.pp.ori
+++ b/sax_html.pp
@@ -553,16 +553,17 @@ begin
           AutoClose(TagName);
           namePush(TagName);
           DoStartElement('', TagName, '', Attr);
-          if not (efSubelementContent in HTMLElementProps[FStack[FNesting-1]].Flags) then begin
-            DoEndElement('', TagName, '');
-            NamePop;
-          end;
           if FStack[FNesting-1] in [etScript,etStyle] then
           begin
             NewContext := scScript;
             FScriptEndTag := '</' + HTMLElementProps[FStack[FNesting-1]].Name;
             FScriptEndMatchPos := 1;
           end;
+          if (efSubcontent*HTMLElementProps[FStack[FNesting-1]].Flags=[]) then begin
+            // do not push empty elements, don't wait for AutoClose
+            DoEndElement('', TagName, '');
+            NamePop;
+          end;
         end;
         if Assigned(Attr) then
           Attr.Free;
@@ -574,8 +575,8 @@ begin
     scScript:
       begin
         DoCharacters(PSAXChar(TokenText), 0, Length(TokenText));
-        DoEndElement('', HTMLElementProps[FStack[FNesting-1]].Name, '');
-        namePop;
+        DoEndElement('', Copy(FScriptEndTag, 3), '');
+        NamePop;
         FScriptEndTag := '';
       end;
   end;
0032389_sax_v2.patch (1,315 bytes)

Michael Van Canneyt

2017-12-29 14:26

administrator   ~0105109

Applied patch, added demo test program (with test.html added).

Thanks for the patch.

Issue History

Date Modified Username Field Change
2017-09-08 23:11 d.l.i.w New Issue
2017-09-08 23:11 d.l.i.w File Added: test.html
2017-09-08 23:26 d.l.i.w Note Added: 0102727
2017-09-09 11:40 Martok Note Added: 0102732
2017-09-09 11:58 Martok File Added: 0032389_sax.patch
2017-09-09 11:58 Martok Note Edited: 0102732 View Revisions
2017-09-09 12:07 Martok Note Edited: 0102732 View Revisions
2017-09-09 13:01 d.l.i.w Note Added: 0102733
2017-09-10 00:40 Martok Note Added: 0102743
2017-09-10 16:49 Martok File Added: 0032389_sax_v2.patch
2017-09-12 10:01 Michael Van Canneyt Assigned To => Michael Van Canneyt
2017-09-12 10:01 Michael Van Canneyt Status new => assigned
2017-12-29 14:26 Michael Van Canneyt Fixed in Revision => 37863
2017-12-29 14:26 Michael Van Canneyt Note Added: 0105109
2017-12-29 14:26 Michael Van Canneyt Status assigned => resolved
2017-12-29 14:26 Michael Van Canneyt Fixed in Version => 3.1.1
2017-12-29 14:26 Michael Van Canneyt Resolution open => fixed
2017-12-29 14:26 Michael Van Canneyt Target Version => 3.2.0