View Issue Details

IDProjectCategoryView StatusLast Update
0038985FPCCompilerpublic2021-06-11 02:18
ReporterJ. Gareth Moreton Assigned To 
PrioritynormalSeveritymajorReproducibilityrandom
Status newResolutionopen 
PlatformCross-platform (x86 especially)OSMicrosoft Windows 
Product Version3.3.1 
Summary0038985: [Patch] Faulty conditional jump logic
DescriptionThis patch fixes some faulty jump logic in the "Dominated conditional jump" and the "condition_in" function. The following are fixed:

- The "Dominated conditional jump" had the subset check between the two jumps back-to-front.
- The "jmp<cond> before jmp<inv_cond>" had the subset check between the two jumps back-to-front, and wasn't quite correct in cases where <inv_cond> was a subset of the 2nd jump's conditions (rather than being exactly the same).
- "RemoveInstruction" and "RemoveCurrentP" inserted in the global jump optimisations where possible to reduce code maintenance.

x86-specific:

- "condition_in" incorrectly said "NE" was a subset of "L", "G", "A" and "B" (for example, if %reg = 0 and "cmp $1,%reg" is called, jne will branch, but jg will not. Treat "not equal" as equivalent to "less than OR greater than" or "above OR below").
- "condition_in" didn't consider "A", "B", "L" and "G" to be subsets of "NE".
- "condition_in" didn't consider "A" and "AE" to be subsets of "NC" (in regards to checking flags, B and C (and NB and NC) are interchangeable).
- "condition_in" incorrectly said "E" was a subset of "NB" (though logically sound when it comes to comparing numerical values, the conditions don't share flags. That is, E checks ZF = 0, while NB/NC check CF = 0).
Steps To ReproduceApply patch and confirm correct compilation on all platforms.
Additional InformationI'm amazed this issue hasn't caused problems before. It only came about when I was experimenting with a new peephole optimisation for x86.

If present in 3.2.0 or 3.2.2 (I can't remember as this optimisation of mine is quite old now), it may need to be backported.
Tagsbugfix, compiler, i386, optimization, optimizations, patch, x86, x86_64
Fixed in Revision
FPCOldBugId
FPCTarget-
Attached Files

Relationships

child of 0036271 resolvedFlorian [Patch] Jump optimisations in code generator 

Activities

J. Gareth Moreton

2021-06-10 22:58

developer   ~0131232

J. Gareth Moreton

2021-06-11 02:18

developer   ~0131237

Sorry for all the edits - it's proving a little tricky to get all the conditions logically correct when no errors appear.
jump-subset-fixes.patch (8,982 bytes)   
Index: compiler/aoptobj.pas
===================================================================
--- compiler/aoptobj.pas	(revision 49494)
+++ compiler/aoptobj.pas	(working copy)
@@ -2002,10 +2002,7 @@
 {$ifdef cpudelayslot}
                         RemoveDelaySlot(p);
 {$endif cpudelayslot}
-                        UpdateUsedRegs(tai(p.Next));
-                        AsmL.Remove(p);
-                        p.Free;
-                        p := hp1;
+                        RemoveCurrentP(p, hp1);
 
                         Result := True;
                         Exit;
@@ -2043,8 +2040,7 @@
 {$ifdef cpudelayslot}
                             RemoveDelaySlot(hp1);
 {$endif cpudelayslot}
-                            asml.remove(hp1);
-                            hp1.free;
+                            RemoveInstruction(hp1);
 
                             stoploop := False;
 
@@ -2081,7 +2077,7 @@
                 else
                   begin
                     { Do not try to optimize if the test generating the condition
-                      is the same instruction, like 'bne	$v0,$zero,.Lj3' for MIPS }
+                      is the same instruction, like 'bne $v0,$zero,.Lj3' for MIPS }
                     if (taicpu(p).ops>1) or (taicpu(hp1).ops>1) then
                       exit;
 
@@ -2089,7 +2085,8 @@
                         jmp<cond1>    @Lbl1
                         jmp<cond2>    @Lbl2
 
-                        Remove 2nd jump if conditions are equal or cond2 is fully covered by cond1
+                        Remove 2nd jump if conditions are equal or cond2 is a subset of cond1
+                        (as if the first jump didn't branch, then neither will the 2nd)
                     }
 
                     if condition_in(taicpu(hp1).condition, taicpu(p).condition) then
@@ -2098,10 +2095,11 @@
 
                         NCJLabel.decrefs;
                         GetNextInstruction(hp1, hp2);
+{$ifdef cpudelayslot}
+                        RemoveDelaySlot(hp1);
+{$endif cpudelayslot}
+                        RemoveInstruction(hp1);
 
-                        AsmL.Remove(hp1);
-                        hp1.Free;
-
                         hp1 := hp2;
 
                         { Flag another pass in case @Lbl2 appeared earlier in the procedure and is now a dead label }
@@ -2114,9 +2112,9 @@
                         jmp<cond1>  @Lbl1
                         jmp<cond2>  @Lbl2
 
-                      And inv(cond2) is a subset of cond1 (e.g. je followed by jne, or jae followed by jbe) )
+                      And inv(cond1) is a subset of cond2 (e.g. je followed by jne, or jae followed by jbe) )
                     }
-                    if condition_in(inverse_cond(taicpu(hp1).condition), taicpu(p).condition) then
+                    if condition_in(inverse_cond(taicpu(p).condition), taicpu(hp1).condition) then
                       begin
                         GetNextInstruction(hp1, hp2);
 
@@ -2124,6 +2122,9 @@
                           the first jump completely }
                         if FindLabel(CJLabel, hp2) then
                           begin
+                            { However, to be absolutely correct, cond2 must be changed to inv(cond1) }
+                            taicpu(hp1).condition := inverse_cond(taicpu(p).condition);
+
                             DebugMsg(SPeepholeOptimization+'jmp<cond> before jmp<inv_cond> - removed first jump',p);
 
                             CJLabel.decrefs;
@@ -2130,10 +2131,7 @@
 {$ifdef cpudelayslot}
                             RemoveDelaySlot(p);
 {$endif cpudelayslot}
-                            UpdateUsedRegs(tai(p.Next));
-                            AsmL.Remove(p);
-                            p.Free;
-                            p := hp1;
+                            RemoveCurrentP(p, hp1);
 
                             Result := True;
                             Exit;
@@ -2146,7 +2144,7 @@
                             improve this particular optimisation to work on AVR,
                             please do. [Kit] }
                           begin
-                            { Since cond1 is a subset of inv(cond2), jmp<cond2> will always branch if
+                            { Since inv(cond1) is a subset of cond2, jmp<cond2> will always branch if
                               jmp<cond1> does not, so change jmp<cond2> to an unconditional jump. }
 
                             DebugMsg(SPeepholeOptimization+'jmp<cond> before jmp<inv_cond> - made second jump unconditional',p);
Index: compiler/x86/aoptx86.pas
===================================================================
--- compiler/x86/aoptx86.pas	(revision 49494)
+++ compiler/x86/aoptx86.pas	(working copy)
@@ -3406,7 +3406,7 @@
             if (taicpu(p).oper[0]^.typ = top_reg) and
               (taicpu(p).oper[0]^.reg = taicpu(p).oper[1]^.reg) and
               MatchInstruction(hp1, A_JCC, []) and
-              (taicpu(hp1).oper[0]^.typ = top_ref) then
+              IsJumpToLabel(taicpu(hp1)) then
               begin
                 JumpLabel := TAsmLabel(taicpu(hp1).oper[0]^.ref^.symbol);
                 p_label := nil;
@@ -3422,7 +3422,7 @@
                   SuperRegistersEqual(taicpu(p_dist).oper[0]^.reg, taicpu(p).oper[0]^.reg) and
                   SuperRegistersEqual(taicpu(p_dist).oper[1]^.reg, taicpu(p).oper[1]^.reg) and
                   GetNextInstruction(p_dist, hp1_dist) and
-                  MatchInstruction(hp1_dist, A_JCC, []) then
+                  MatchInstruction(hp1_dist, A_JCC, []) then { This doesn't have to be an explicit label }
                   begin
                     JumpLabel_dist := TAsmLabel(taicpu(hp1_dist).oper[0]^.ref^.symbol);
 
@@ -3430,9 +3430,10 @@
                       { This is an infinite loop }
                       Exit;
 
-                    { Best optimisation when the second condition is a subset (or equal) to the first }
-                    if condition_in(taicpu(hp1_dist).condition, taicpu(hp1).condition) then
+                    { Best optimisation when the first condition is a subset (or equal) of the second }
+                    if condition_in(taicpu(hp1).condition, taicpu(hp1_dist).condition) then
                       begin
+                        { Any registers used here will already be allocated }
                         if Assigned(JumpLabel_dist) then
                           JumpLabel_dist.IncRefs;
 
Index: compiler/x86/cpubase.pas
===================================================================
--- compiler/x86/cpubase.pas	(revision 49494)
+++ compiler/x86/cpubase.pas	(working copy)
@@ -697,33 +697,32 @@
         if not Result then
           case Subset of
             C_A,  C_NBE:
-              Result := (c in [C_A,  C_AE, C_NB, C_NBE]);
-            C_AE, C_NB:
-              Result := (c in [C_AE, C_NB]);
-            C_B,  C_NAE:
+              Result := (c in [C_A,  C_AE, C_NB, C_NC, C_NBE,C_NE, C_NZ]);
+            C_AE, C_NB, C_NC:
+              { C_A  / C_NBE: CF = 0 and ZF = 0; not a subset because ZF has to be zero as well
+                C_AE / C_NB:  CF = 0 }
+              Result := (c in [C_AE, C_NB, C_NC]);
+            C_B,  C_C,  C_NAE:
+              { C_B  / C_NAE: CF = 1
+                C_BE / C_NA:  CF = 1 or ZF = 1 }
               Result := (c in [C_B,  C_BE, C_C,  C_NA, C_NAE]);
             C_BE, C_NA:
               Result := (c in [C_BE, C_NA]);
-            C_C:
-              { C_B  / C_NAE: CF = 1
-                C_BE / C_NA:  CF = 1 or ZF = 1 }
-              Result := (c in [C_B,  C_BE, C_NA, C_NAE]);
             C_E,  C_Z:
-              Result := (c in [C_AE, C_BE, C_E,  C_NA, C_NB, C_NG, C_NL]);
+              Result := (c in [C_AE, C_BE, C_E,  C_NA, C_NG, C_Z]);
             C_G,  C_NLE:
-              Result := (c in [C_G,  C_GE, C_NL, C_NLE]);
+              { Not-equal can be considered equivalent to less than or greater than }
+              Result := (c in [C_G,  C_GE, C_NE, C_NL, C_NLE,C_NZ]);
             C_GE, C_NL:
               Result := (c in [C_GE, C_NL]);
             C_L,  C_NGE:
-              Result := (c in [C_L,  C_LE, C_NG, C_NGE]);
+              Result := (c in [C_L,  C_LE, C_NE, C_NG, C_NGE,C_NZ]);
             C_LE, C_NG:
               Result := (c in [C_LE, C_NG]);
-            C_NC:
-              { C_A  / C_NBE: CF = 0 and ZF = 0; not a subset because ZF has to be zero as well
-                C_AE / C_NB:  CF = 0 }
-              Result := (c in [C_AE, C_NB]);
             C_NE, C_NZ:
-              Result := (c in [C_NE, C_NZ, C_A,  C_B,  C_NAE,C_NBE,C_L,  C_G,  C_NLE,C_NGE]);
+              { Note that not equal is NOT a subset of greater/less than because
+                not equal is less than OR greater than. Same with above and below }
+              Result := (c in [C_NE, C_NZ]);
             C_NP, C_PO:
               Result := (c in [C_NP, C_PO]);
             C_P,  C_PE:
jump-subset-fixes.patch (8,982 bytes)   

Issue History

Date Modified Username Field Change
2021-06-10 22:49 J. Gareth Moreton New Issue
2021-06-10 22:49 J. Gareth Moreton File Added: jump-subset-fixes.patch
2021-06-10 22:50 J. Gareth Moreton Tag Attached: patch.compiler
2021-06-10 22:50 J. Gareth Moreton Tag Attached: optimizations
2021-06-10 22:50 J. Gareth Moreton Tag Attached: bugfix
2021-06-10 22:50 J. Gareth Moreton Tag Detached: patch.compiler
2021-06-10 22:50 J. Gareth Moreton Tag Attached: patch
2021-06-10 22:50 J. Gareth Moreton Tag Attached: compiler
2021-06-10 22:50 J. Gareth Moreton Tag Attached: optimization
2021-06-10 22:50 J. Gareth Moreton Tag Attached: i386
2021-06-10 22:50 J. Gareth Moreton Tag Attached: x86
2021-06-10 22:50 J. Gareth Moreton Tag Attached: x86_64
2021-06-10 22:52 J. Gareth Moreton Relationship added child of 0036271
2021-06-10 22:54 J. Gareth Moreton Description Updated View Revisions
2021-06-10 22:54 J. Gareth Moreton FPCTarget => -
2021-06-10 22:58 J. Gareth Moreton File Deleted: jump-subset-fixes.patch
2021-06-10 22:58 J. Gareth Moreton Note Added: 0131232
2021-06-10 22:58 J. Gareth Moreton File Added: jump-subset-fixes.patch
2021-06-10 23:01 J. Gareth Moreton Description Updated View Revisions
2021-06-10 23:02 J. Gareth Moreton Description Updated View Revisions
2021-06-10 23:17 J. Gareth Moreton Severity minor => major
2021-06-11 02:17 J. Gareth Moreton File Deleted: jump-subset-fixes.patch
2021-06-11 02:18 J. Gareth Moreton Note Added: 0131237
2021-06-11 02:18 J. Gareth Moreton File Added: jump-subset-fixes.patch