View Issue Details

IDProjectCategoryView StatusLast Update
0037959FPCCompilerpublic2020-10-23 17:05
ReporterJ. Gareth Moreton Assigned ToFlorian  
PrioritynormalSeverityminorReproducibilityN/A
Status resolvedResolutionfixed 
PlatformCross-platformOSMicrosoft Windows 
Product Version3.3.1 
Fixed in Version3.3.1 
Summary0037959: [Patch] Peephole Optimizer pass reduction
DescriptionThis patch aims to reduce the number of passes performed by the Peephole Optimizer without harming the quality of the code under -O2 and above. The modified routines are platform-agnostic and should apply to all platforms where the routines have not been overridden.

Most specifically, the mandatory second execution of pass 1 has been removed.

See Additional Information for specific notes.
Steps To ReproduceApply patch and confirm successful compilation and identical binary output under -O2 and above.
Additional Information- The mandatory second run of pass 1 has been removed.
- -O1 will now perform slightly worse as only a single iteration of pass 1 is performed - this is deemed acceptable from past conversations and the fact that -O1 is generally only done for quick debugging.
- -O2 will perform up to 2 iterations of pass 1, but will drop out early if no changes were made in the first pass (i.e. none of the individual optimisations setting Result to True). Output should be the same as before.
- -O3 will perform up to 5 iterations of pass 1, and -O4 will perform up to 8, although it should never take that many... the finite limit is mostly a safeguard against a faulty optimisation that would otherwise cause an infinite loop. These numbers were chosen because, originally, Pass 1 has an upper limit of 4 iterations to protect against said faulty optimisations, and the mandatory second run would increase this to at least 5, or 8 if there was indeed a faulty optimisation (two runs of 4 iterations each).

Successful test criteria:
- Successful building of the compiler, RTL and no regressions in tests.
- -O2 shows no difference in the disassembly. Any differences should be indictive of an individual optimisation not setting Result to True when it should have done.
- Compile times should be generally slightly less than before, and almost equal when done under -O2 or with optimisations turned off.
TagsNo tags attached.
Fixed in Revision47146
FPCOldBugId
FPCTarget-
Attached Files

Relationships

parent of 0037972 resolvedFlorian [Patch] Minor refactor of reduced iterations of pass 1 

Activities

J. Gareth Moreton

2020-10-19 14:45

developer  

pass-reduction.patch (2,773 bytes)   
Index: compiler/aopt.pas
===================================================================
--- compiler/aopt.pas	(revision 47137)
+++ compiler/aopt.pas	(working copy)
@@ -273,9 +273,7 @@
     Procedure TAsmOptimizer.Optimize;
       Var
         HP: tai;
-        pass: longint;
       Begin
-        pass:=0;
         BlockStart := tai(AsmL.First);
         pass_1;
         While Assigned(BlockStart) Do
@@ -282,19 +280,9 @@
           Begin
             if (cs_opt_peephole in current_settings.optimizerswitches) then
               begin
-                if pass = 0 then
-                  PrePeepHoleOpts;
-                { Peephole optimizations }
+                PrePeepHoleOpts;
                 PeepHoleOptPass1;
-                { Only perform them twice in the first pass }
-                if pass = 0 then
-                  PeepHoleOptPass1;
-              end;
-            { more peephole optimizations }
-            if (cs_opt_peephole in current_settings.optimizerswitches) then
-              begin
                 PeepHoleOptPass2;
-                { if pass = last_pass then }
                 PostPeepHoleOpts;
               end;
             { free memory }
Index: compiler/aoptobj.pas
===================================================================
--- compiler/aoptobj.pas	(revision 47137)
+++ compiler/aoptobj.pas	(working copy)
@@ -2464,14 +2464,29 @@
 
 
     procedure TAOptObj.PeepHoleOptPass1;
+      const
+        MaxPasses: array[1..4] of Cardinal = (1, 2, 5, 8);
       var
         p : tai;
         stoploop, FirstInstruction, JumpOptsAvailable: boolean;
+        PassCount, MaxCount: Cardinal;
       begin
         JumpOptsAvailable := CanDoJumpOpts();
 
         StartPoint := BlockStart;
+        PassCount := 0;
 
+        { Determine the maximum number of passes allowed based on the compiler switches }
+        if (cs_opt_level4 in current_settings.optimizerswitches) then
+          { it should never take more than 8 passes, but the limit is finite to protect against faulty optimisations }
+          MaxCount := MaxPasses[4]
+        else if (cs_opt_level3 in current_settings.optimizerswitches) then
+          MaxCount := MaxPasses[3]
+        else if (cs_opt_level2 in current_settings.optimizerswitches) then
+          MaxCount := MaxPasses[2] { The original double run of Pass 1 }
+        else
+          MaxCount := MaxPasses[1];
+
         repeat
           stoploop:=true;
           p := StartPoint;
@@ -2523,7 +2538,10 @@
                 p := tai(UpdateUsedRegsAndOptimize(p).Next);
 
             end;
-        until stoploop or not(cs_opt_level3 in current_settings.optimizerswitches);
+
+          Inc(PassCount);
+
+        until stoploop or (PassCount >= MaxCount);
       end;
 
 
pass-reduction.patch (2,773 bytes)   

Florian

2020-10-20 22:05

administrator   ~0126432

Thanks, I committed it though I keep the passed for -O3 and -O4 the same: -O4 just means the same as -O3 but including unsafe optimizations like fast math.

Issue History

Date Modified Username Field Change
2020-10-19 14:45 J. Gareth Moreton New Issue
2020-10-19 14:45 J. Gareth Moreton File Added: pass-reduction.patch
2020-10-20 22:05 Florian Assigned To => Florian
2020-10-20 22:05 Florian Status new => resolved
2020-10-20 22:05 Florian Resolution open => fixed
2020-10-20 22:05 Florian Fixed in Version => 3.3.1
2020-10-20 22:05 Florian Fixed in Revision => 47146
2020-10-20 22:05 Florian FPCTarget => -
2020-10-20 22:05 Florian Note Added: 0126432
2020-10-23 17:05 J. Gareth Moreton Relationship added parent of 0037972