View Issue Details

ID: 0036583
Project: FPC
Category: Compiler
View Status: public
Last Update: 2020-01-19 21:11
Reporter: J. Gareth Moreton
Assigned To: Florian
Priority: low
Severity: tweak
Reproducibility: N/A
Status: resolved
Resolution: fixed
Platform: i8086, i386 and x86_64
OS: Microsoft Windows
OS Version: 10 Professional
Product Version: 3.3.1
Product Build: r43920
Target Version:
Fixed in Version: 3.3.1
Summary: 0036583: [Patch / Refactor] x86: Merging of Post-Peephole and Reference Optimization stages
Description: This patch seeks to reduce the number of passes required to compile a Pascal program on x86 platforms by merging the Post-Peephole Optimization stage with the Reference Optimization stage.

The justification is that the Post-Peephole Optimization stage only converts individual instructions into more efficient forms (or removes unnecessary CMP operations) after passes 1 and 2 have done their heavy lifting. At the same time, the Reference Optimization pass only stops on instructions, checks whether their operands are references and, if they are, tidies them up so that they are more consistent and can be stored more efficiently. When it comes to maintenance, this Reference Optimization pass is easy to overlook because it's hidden in an overridden "PostPeepHoleOpts" routine that otherwise just calls the inherited version.

By removing this pass and adding its per-instruction code to the end of the "PostPeepHoleOptsCpu" routine (where it is only called if the optimisation routines return False, i.e. the instruction was not changed), the two passes can be efficiently and cleanly merged for a time saving of about 10% on large projects.
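The merged control flow described above can be modelled with a small standalone program. This is only a sketch under stated assumptions: TInstr, PostPeepholeStep, TidyRefs and the 'cmp_zero' opcode are hypothetical stand-ins, not the compiler's taicpu, PostPeepHoleOptsCpu or optimize_ref, and the driver loop is assumed to revisit an instruction whenever the per-instruction routine reports a change.

```pascal
program MergedPassSketch;

{$mode objfpc}

type
  { Hypothetical stand-in for an instruction; NOT the compiler's taicpu }
  TInstr = record
    Opcode: string;
    HasRef: Boolean;     { does any operand hold a memory reference? }
    RefTidied: Boolean;  { set once the reference tidy-up has run }
  end;

var
  Code: array[0..2] of TInstr;
  Visits: Integer = 0;

{ Hypothetical peephole step: returns True if it changed the instruction }
function PostPeepholeStep(var p: TInstr): Boolean;
begin
  Result := False;
  if p.Opcode = 'cmp_zero' then
    begin
      p.Opcode := 'test';
      Result := True;
    end;
end;

{ Hypothetical per-instruction reference tidy-up }
procedure TidyRefs(var p: TInstr);
begin
  if p.HasRef then
    p.RefTidied := True;
end;

{ One walk over the code: the tidy-up only runs when the peephole step
  reports no change; a changed instruction is revisited first }
procedure MergedPass;
var
  i: Integer;
begin
  i := Low(Code);
  while i <= High(Code) do
    begin
      Inc(Visits);
      if PostPeepholeStep(Code[i]) then
        Continue;
      TidyRefs(Code[i]);
      Inc(i);
    end;
end;

{ Fills one slot of the hypothetical instruction list }
procedure SetInstr(Idx: Integer; const AOpcode: string; AHasRef: Boolean);
begin
  Code[Idx].Opcode := AOpcode;
  Code[Idx].HasRef := AHasRef;
  Code[Idx].RefTidied := False;
end;

var
  i: Integer;
begin
  SetInstr(0, 'mov', True);
  SetInstr(1, 'cmp_zero', True);
  SetInstr(2, 'add', False);
  MergedPass;
  for i := Low(Code) to High(Code) do
    WriteLn(Code[i].Opcode, ' tidied=', Code[i].RefTidied);
  WriteLn('visits=', Visits);
end.
```

In this model the rewritten instruction is visited twice, so its reference is still tidied on the second visit, while the list as a whole is walked once rather than twice.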
Steps To Reproduce: Apply patches and confirm correct compilation and cross-compilation of compiler, and no change to binaries built with the compiler.
Additional Information: Personal timing metrics when compiling Lazarus on i386-win32 and x86_64-win64:

O3 Trunk (win64):

[103.164] 1308396 lines compiled, 103.2 sec
[104.633] 1308396 lines compiled, 104.6 sec
[97.766] 1308396 lines compiled, 97.8 sec
[99.023] 1308396 lines compiled, 99.0 sec
[99.352] 1308396 lines compiled, 99.4 sec

O3 New (win64):

[87.594] 1308396 lines compiled, 87.6 sec
[86.039] 1308396 lines compiled, 86.0 sec
[86.906] 1308396 lines compiled, 86.9 sec
[87.492] 1308396 lines compiled, 87.5 sec
[87.688] 1308396 lines compiled, 87.7 sec

----

O3 Trunk (win32):

[94.837] 1312002 lines compiled, 94.8 sec
[95.554] 1312002 lines compiled, 95.6 sec
[94.838] 1312002 lines compiled, 94.8 sec
[94.806] 1312002 lines compiled, 94.8 sec
[94.025] 1312002 lines compiled, 94.0 sec

O3 New (win32):

[83.923] 1312002 lines compiled, 83.9 sec
[84.648] 1312002 lines compiled, 84.6 sec
[83.618] 1312002 lines compiled, 83.6 sec
[84.051] 1312002 lines compiled, 84.1 sec
[88.813] 1312002 lines compiled, 88.8 sec
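
The runs above can be summarised as a mean per configuration and a relative saving. The following is a minimal sketch that only does arithmetic on the figures listed; the program, Mean, Report and the array names are illustrative and not part of the patch.

```pascal
program TimingSummary;

{$mode objfpc}

const
  { Wall-clock seconds copied from the runs listed above }
  Trunk64: array[0..4] of Double = (103.164, 104.633, 97.766, 99.023, 99.352);
  New64:   array[0..4] of Double = (87.594, 86.039, 86.906, 87.492, 87.688);
  Trunk32: array[0..4] of Double = (94.837, 95.554, 94.838, 94.806, 94.025);
  New32:   array[0..4] of Double = (83.923, 84.648, 83.618, 84.051, 88.813);

{ Arithmetic mean of a non-empty list of timings }
function Mean(const Runs: array of Double): Double;
var
  i: Integer;
begin
  Result := 0.0;
  for i := Low(Runs) to High(Runs) do
    Result := Result + Runs[i];
  Result := Result / Length(Runs);
end;

{ Prints both means and the relative saving of the patched build }
procedure Report(const Name: string; const Trunk, Patched: array of Double);
var
  MeanTrunk, MeanPatched, Saving: Double;
begin
  MeanTrunk := Mean(Trunk);
  MeanPatched := Mean(Patched);
  Saving := (1.0 - MeanPatched / MeanTrunk) * 100.0;
  WriteLn(Name, ': trunk ', MeanTrunk:0:3, ' s, patched ', MeanPatched:0:3,
          ' s, saving ', Saving:0:1, '%');
end;

begin
  Report('win64', Trunk64, New64);
  Report('win32', Trunk32, New32);
end.
```

Worked by hand, the means come out at roughly 100.8 s against 87.1 s on win64 and 94.8 s against 85.0 s on win32, i.e. savings of about 13.5% and 10.3%, in line with the "about 10%" quoted in the description.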
Tags: compiler, i386, i8086, patch, refactor, x86_64
Fixed in Revision: 44001
FPCOldBugId:
FPCTarget: -
Attached Files
  • post-ref-merge.patch (5,507 bytes)
    Index: compiler/x86/aoptx86.pas
    ===================================================================
    --- compiler/x86/aoptx86.pas	(revision 43920)
    +++ compiler/x86/aoptx86.pas	(working copy)
    @@ -94,9 +94,10 @@
             function PostPeepholeOptCall(var p : tai) : Boolean;
             function PostPeepholeOptLea(var p : tai) : Boolean;
     
    -        procedure OptReferences;
    -
             procedure ConvertJumpToRET(const p: tai; const ret_p: tai);
    +
    +        { Processor-dependent reference optimisation }
    +        class procedure OptimizeRefs(var p: taicpu); static;
           end;
     
         function MatchInstruction(const instr: tai; const op: TAsmOp; const opsize: topsizes): boolean;
    @@ -5310,22 +5311,13 @@
     {$endif}
     
     
    -    procedure TX86AsmOptimizer.OptReferences;
    +    class procedure TX86AsmOptimizer.OptimizeRefs(var p: taicpu);
           var
    -        p: tai;
    -        i: Integer;
    +        OperIdx: Integer;
           begin
    -        p := BlockStart;
    -        while (p <> BlockEnd) Do
    -          begin
    -            if p.typ=ait_instruction then
    -              begin
    -                for i:=0 to taicpu(p).ops-1 do
    -                  if taicpu(p).oper[i]^.typ=top_ref then
    -                    optimize_ref(taicpu(p).oper[i]^.ref^,false);
    -              end;
    -            p:=tai(p.next);
    -          end;
    +        for OperIdx := 0 to p.ops - 1 do
    +          if p.oper[OperIdx]^.typ = top_ref then
    +            optimize_ref(p.oper[OperIdx]^.ref^, False);
           end;
     
     end.
    Index: compiler/i386/aoptcpu.pas
    ===================================================================
    --- compiler/i386/aoptcpu.pas	(revision 43920)
    +++ compiler/i386/aoptcpu.pas	(working copy)
    @@ -39,8 +39,6 @@
             function PeepHoleOptPass1Cpu(var p: tai): boolean; override;
             function PeepHoleOptPass2Cpu(var p: tai): boolean; override;
             function PostPeepHoleOptsCpu(var p : tai) : boolean; override;
    -
    -        procedure PostPeepHoleOpts; override;
           end;
     
         Var
    @@ -299,6 +297,7 @@
                                           taicpu(p).opcode := A_MOV;
                                           taicpu(p).changeopsize(S_B);
                                           setsubreg(taicpu(p).oper[1]^.reg,R_SUBL);
    +                                      Result := True;
                                         end;
                                     end;
                                   else
    @@ -320,6 +319,7 @@
                                   taicpu(p).changeopsize(S_B);
                                   setsubreg(taicpu(p).oper[1]^.reg,R_SUBL);
                                   InsertLLItem(p.previous, p, hp1);
    +                              Result := True;
                                 end;
                        end;
                     A_TEST, A_OR:
    @@ -329,6 +329,11 @@
                     else
                       ;
                   end;
    +
    +              { Optimise any reference-type operands (if Result is True, the
    +                instruction will be checked on the next iteration) }
    +              if not Result then
    +                OptimizeRefs(taicpu(p));
                 end;
               else
                 ;
    @@ -336,13 +341,6 @@
           end;
     
     
    -    procedure TCpuAsmOptimizer.PostPeepHoleOpts;
    -      begin
    -        inherited;
    -        OptReferences;
    -      end;
    -
    -
     begin
       casmoptimizer:=TCpuAsmOptimizer;
     end.
    Index: compiler/i8086/aoptcpu.pas
    ===================================================================
    --- compiler/i8086/aoptcpu.pas	(revision 43920)
    +++ compiler/i8086/aoptcpu.pas	(working copy)
    @@ -37,7 +37,6 @@
           TCpuAsmOptimizer = class(TX86AsmOptimizer)
             function PeepHoleOptPass1Cpu(var p : tai) : boolean; override;
             function PostPeepHoleOptsCpu(var p : tai) : boolean; override;
    -        procedure PostPeepHoleOpts; override;
           End;
     
       Implementation
    @@ -166,6 +165,11 @@
                     else
                       ;
                   end;
    +
    +              { Optimise any reference-type operands (if Result is True, the
    +                instruction will be checked on the next iteration) }
    +              if not Result then
    +                OptimizeRefs(taicpu(p));
                 end;
               else
                 ;
    @@ -172,13 +176,6 @@
             end;
           end;
     
    -
    -    procedure TCpuAsmOptimizer.PostPeepHoleOpts;
    -      begin
    -        inherited;
    -        OptReferences;
    -      end;
    -
     begin
       casmoptimizer:=TCpuAsmOptimizer;
     end.
    Index: compiler/x86_64/aoptcpu.pas
    ===================================================================
    --- compiler/x86_64/aoptcpu.pas	(revision 43920)
    +++ compiler/x86_64/aoptcpu.pas	(working copy)
    @@ -35,7 +35,6 @@
         function PeepHoleOptPass1Cpu(var p: tai): boolean; override;
         function PeepHoleOptPass2Cpu(var p: tai): boolean; override;
         function PostPeepHoleOptsCpu(var p : tai) : boolean; override;
    -    procedure PostPeepHoleOpts; override;
       end;
     
     implementation
    @@ -192,6 +191,12 @@
                     else
                       ;
                   end;
    +
    +              { Optimise any reference-type operands (if Result is True, the
    +                instruction will be checked on the next iteration) }
    +              if not Result then
    +                OptimizeRefs(taicpu(p));
    +
                 end;
               else
                 ;
    @@ -199,12 +204,6 @@
           end;
     
     
    -    procedure TCpuAsmOptimizer.PostPeepHoleOpts;
    -      begin
    -        inherited;
    -        OptReferences;
    -      end;
    -
     begin
       casmoptimizer := TCpuAsmOptimizer;
     end.
    

Activities

J. Gareth Moreton

2020-01-15 00:33

developer  

post-ref-merge.patch (5,507 bytes)

Florian

2020-01-19 21:11

administrator   ~0120560

Thanks, applied.

Issue History

Date Modified     Username           Field                               Change
2020-01-15 00:25  J. Gareth Moreton  New Issue
2020-01-15 00:25  J. Gareth Moreton  File Added: post-ref-merge.patch
2020-01-15 00:26  J. Gareth Moreton  Priority                            normal => low
2020-01-15 00:26  J. Gareth Moreton  Severity                            minor => tweak
2020-01-15 00:26  J. Gareth Moreton  FPCTarget                           => -
2020-01-15 00:26  J. Gareth Moreton  Tag Attached: patch
2020-01-15 00:26  J. Gareth Moreton  Tag Attached: compiler
2020-01-15 00:26  J. Gareth Moreton  Tag Attached: refactor
2020-01-15 00:26  J. Gareth Moreton  Tag Attached: i386
2020-01-15 00:26  J. Gareth Moreton  Tag Attached: x86_64
2020-01-15 00:26  J. Gareth Moreton  Tag Attached: i8086
2020-01-15 00:33  J. Gareth Moreton  File Deleted: post-ref-merge.patch
2020-01-15 00:33  J. Gareth Moreton  File Added: post-ref-merge.patch
2020-01-19 21:11  Florian            Assigned To                         => Florian
2020-01-19 21:11  Florian            Status                              new => resolved
2020-01-19 21:11  Florian            Resolution                          open => fixed
2020-01-19 21:11  Florian            Fixed in Version                    => 3.3.1
2020-01-19 21:11  Florian            Fixed in Revision                   => 44001
2020-01-19 21:11  Florian            Note Added: 0120560