View Issue Details

ID: 0036523
Project: Lazarus
Category: Utilities
View Status: public
Last Update: 2020-01-08 01:49
Reporter: devEric69
Assigned To: Juha Manninen
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: Linux
OS: Ubuntu
Product Version: 2.0.4
Summary: 0036523: LHelp bug: titles with a '&' alone, cause a "range check error".
Description:
- Compile lHelp.
- Open user.chm with lHelp.
==> lHelp aborts with a "Range check error" because a title is called "Keybord & Mouse".
Steps To Reproduce:
In ChmSpecialParser.pas, line 67, there is this function: function FixEscapedHTML(AText: string): string;

Below, in the "Additional Information" field, I have pasted a *corrected* version of this function (a patch), so that it now distinguishes a lone '&' (as in the user.chm title "6.4.2 Keybord & Mouse") from the start of an HTML entity such as '&nbsp;' or '&lt;'.
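To make the distinction between a lone '&' and an HTML entity concrete, here is a minimal, self-contained sketch. It is not the function from this report and not the code committed to Lazarus. The names FindEntityEnd, ResolveDemoEntity and FixEscapedSketch are hypothetical, the small lookup table is only a stand-in for ResolveHTMLEntityReference, and stopping the entity scan at a space or at another '&' is an assumption of this sketch.

```pascal
program LoneAmpersandSketch;
{$mode objfpc}{$H+}{$R+}

// Hypothetical stand-in for ResolveHTMLEntityReference: knows only four names.
function ResolveDemoEntity(const AName: string; out AChar: string): Boolean;
begin
  AChar := '';
  Result := True;
  if AName = 'lt' then AChar := '<'
  else if AName = 'gt' then AChar := '>'
  else if AName = 'amp' then AChar := '&'
  else if AName = 'quot' then AChar := '"'
  else Result := False;
end;

// Hypothetical helper: returns the index of the ';' that closes the entity
// starting at AStart, or 0 when the '&' at AStart stands alone.
// The index is always checked against Length(AText) before it is used.
function FindEntityEnd(const AText: string; AStart: Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := AStart + 1;
  while i <= Length(AText) do begin
    case AText[i] of
      ';': Exit(i);       // closing semicolon found
      ' ', '&': Exit;     // assumption: an entity name never contains these
    end;
    Inc(i);
  end;
end;

function FixEscapedSketch(const AText: string): string;
var
  i, EndPos: Integer;
  Ch: string;
begin
  Result := '';
  i := 1;
  while i <= Length(AText) do begin
    if AText[i] = '&' then begin
      EndPos := FindEntityEnd(AText, i);
      if EndPos = 0 then
        Result := Result + '&'   // lone ampersand: copy it unchanged
      else begin
        if ResolveDemoEntity(Copy(AText, i + 1, EndPos - i - 1), Ch) then
          Result := Result + Ch
        else
          Result := Result + '?';   // unknown entity name
        i := EndPos;                // continue after the ';'
      end;
    end else
      Result := Result + AText[i];
    Inc(i);
  end;
end;

begin
  WriteLn(FixEscapedSketch('6.4.2 Keybord & Mouse'));
  WriteLn(FixEscapedSketch('a &lt; b & c'));
  WriteLn(FixEscapedSketch('trailing &'));
end.
```

Because the scan gives up at the first space, a lone '&' followed later in the same title by a real entity is still copied through unchanged, while the entity itself is resolved separately.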
Additional Information:
function FixEscapedHTML(AText: string): string;
var
  i, iPosAmpersand, iLenAText: Integer;
  bFoundClosureSemiColon: Boolean;
  ampstr: string; sTemp: string;
  ws: widestring;
  entity: widechar;
begin
  Result := '';
  i := 1;
  iLenAText:= Length(AText);
  while i <= iLenAText do begin
    if AText[i]='&' then begin
      iPosAmpersand:= i;
      ampStr := '';
      inc(i);
      while (i <= iLenAText) and (AText[i] <> ';') do begin
        ampStr := ampStr + AText[i];
        inc(i);
      end;
      //is there a Char ';', closing a possible HTML entity like '&{#x}~~~{~~};'?
      bFoundClosureSemiColon:= False;
      if (i > iLenAText) then begin
        if (AText[i-1] = ';') then
          bFoundClosureSemiColon := True
      end
      else begin
        if (AText[i] = ';') then
          bFoundClosureSemiColon := True;
      end;
      if bFoundClosureSemiColon then begin //so, if it's a possible HTML encoded character like "&xxx;" ...
        ws := UTF8Encode(ampStr);
        if ResolveHTMLEntityReference(ws, entity) then
          Result := Result + UnicodeToUTF8(cardinal(entity))
        else
          Result := Result + '?';
      end
      else begin //so, if it's not an HTML entity ie it's only an ampersand alone
        sTemp := RightStr(AText, iLenAText - (iPosAmpersand-1));
        Result := Result + sTemp;
      end
    end
    else
      Result := Result + AText[i];
    inc(i);
  end;
end;
Tags: No tags attached.
Fixed in Revision: r62510
LazTarget: -
Widgetset: GTK 2
Attached Files

Activities

devEric69

2020-01-06 14:41

reporter  

user.chm (652,588 bytes)

devEric69

2020-01-06 14:50

reporter   ~0120232

Last edited: 2020-01-06 14:51

Sorry for the typos in "Steps To Reproduce". It should read: "Below, in the "Additional Information" field, I have pasted a *corrected* version of this function (a patch), so that it now distinguishes a lone '&' (as in the user.chm title '6.4.2 Keybord & Mouse') from the start of an HTML entity such as '&nbsp;' or '&lt;'."

Cyrax

2020-01-06 14:58

reporter   ~0120233

Can you make a proper patch, please?
https://wiki.freepascal.org/Creating_A_Patch

devEric69

2020-01-06 17:53

reporter   ~0120238

Last edited: 2020-01-07 09:30

There it is.

patch.diff (2,015 bytes)   
Index: chmspecialparser.pas
===================================================================
--- chmspecialparser.pas	(revision 62502)
+++ chmspecialparser.pas	(working copy)
@@ -66,27 +66,47 @@
 
 function FixEscapedHTML(AText: string): string;
 var
-  i: Integer;
-  ampstr: string;
+  i, iPosAmpersand, iLenAText: Integer;
+  bFoundClosureSemiColon: Boolean;
+  ampstr: string; sTemp: string;
   ws: widestring;
   entity: widechar;
 begin
   Result := '';
   i := 1;
-  while i <= Length(AText) do begin
+  iLenAText:= Length(AText);
+  while i <= iLenAText do begin
     if AText[i]='&' then begin
+      iPosAmpersand:= i;
       ampStr := '';
       inc(i);
-      while AText[i] <> ';' do begin
-        ampStr := ampStr + AText[i];
+      while (i <= iLenAText) and (AText[i] <> ';') do begin
+				ampStr := ampStr + AText[i];
         inc(i);
       end;
-      ws := UTF8Encode(ampStr);
-      if ResolveHTMLEntityReference(ws, entity) then
-        Result := Result + UnicodeToUTF8(cardinal(entity))
-      else
-        Result := Result + '?';
-    end else
+      //is there a Char ';', closing a possible HTML entity like '&{#x}~~~{~~};'?
+      bFoundClosureSemiColon:= False;
+      if (i > iLenAText) then begin
+        if (AText[i-1] = ';') then
+          bFoundClosureSemiColon := True
+      end
+			else begin
+        if (AText[i] = ';') then
+        	bFoundClosureSemiColon := True;
+      end;
+      if bFoundClosureSemiColon then begin	//so, if it's a possible HTML encoded character like "&xxx;" ...
+        ws := UTF8Encode(ampStr);
+        if ResolveHTMLEntityReference(ws, entity) then
+          Result := Result + UnicodeToUTF8(cardinal(entity))
+        else
+          Result := Result + '?';
+      end
+      else begin	//so, if it's not an HTML entity ie only an ampersand by itself
+        sTemp := RightStr(AText, iLenAText - (iPosAmpersand-1));
+				Result := Result + sTemp;
+			end
+    end
+    else
       Result := Result + AText[i];
     inc(i);
   end;

Juha Manninen

2020-01-08 01:49

developer   ~0120262

I applied an optimized version of the patch.
Thanks, this was an important fix. I believe the bug has caused weird problems for a long time.

Issue History

Date Modified | Username | Field | Change
2020-01-06 14:41 | devEric69 | New Issue |
2020-01-06 14:41 | devEric69 | File Added: user.chm |
2020-01-06 14:50 | devEric69 | Note Added: 0120232 |
2020-01-06 14:51 | devEric69 | Note Edited: 0120232 |
2020-01-06 14:58 | Cyrax | Note Added: 0120233 |
2020-01-06 17:53 | devEric69 | File Added: patch.diff |
2020-01-06 17:53 | devEric69 | Note Added: 0120238 |
2020-01-07 09:30 | devEric69 | Note Edited: 0120238 |
2020-01-08 01:47 | Juha Manninen | Assigned To | => Juha Manninen
2020-01-08 01:47 | Juha Manninen | Status | new => assigned
2020-01-08 01:49 | Juha Manninen | Status | assigned => resolved
2020-01-08 01:49 | Juha Manninen | Resolution | open => fixed
2020-01-08 01:49 | Juha Manninen | Fixed in Revision | => r62510
2020-01-08 01:49 | Juha Manninen | LazTarget | => -
2020-01-08 01:49 | Juha Manninen | Widgetset | GTK 2 => GTK 2
2020-01-08 01:49 | Juha Manninen | Note Added: 0120262 |