View Issue Details

IDProjectCategoryView StatusLast Update
0038907FPCCompilerpublic2021-05-19 20:29
ReporterJ. Gareth Moreton Assigned ToFlorian  
PrioritylowSeverityminorReproducibilityN/A
Status resolvedResolutionfixed 
Platformi386 and x86_64OSMicrosoft Windows 
Product Version3.3.1 
Fixed in Version3.3.1 
Summary0038907: [Patch] Memory CMP optimisation
DescriptionThis patch looks for combinations where a MOV instruction transfers data from memory into a register, and then immediately compares that register with CMP or TEST, simplifying it into just a CMP instruction that tests against the memory directly if the register in question is not used afterwards.
Steps To ReproduceApply patch and confirm correct compilation with minor speed boost.
Additional InformationFor a simple example (taken from SysUtils):

    movl 32(%rsp),%eax
    testl %eax,%eax

Becomes:

    cmpl $0,32(%rsp)

----

Assemblers seem to struggle if the reference in a TEST or CMP instruction is an absolute address (such as a global variable under i386), represented as "refaddr = addr_full" internally. As such, only references of type "addr_no" (registers and offsets only), "addr_pic" (using %rip under x86_64) and "addr_pic_no_got" are optimised.
Tagscompiler, i386, optimization, patch, x86, x86_64
Fixed in Revision49382
FPCOldBugId
FPCTarget-
Attached Files

Activities

J. Gareth Moreton

2021-05-19 07:39

developer   ~0130949

Just to reassure everyone, I have tested this on i386 and x86_64 with most configurations.

J. Gareth Moreton

2021-05-19 07:45

developer   ~0130951

cmp-mem.patch (3,938 bytes)   
Index: compiler/x86/aoptx86.pas
===================================================================
--- compiler/x86/aoptx86.pas	(revision 49374)
+++ compiler/x86/aoptx86.pas	(working copy)
@@ -2819,24 +2819,73 @@
             Result:=true;
             exit;
           end;
-        if MatchOpType(taicpu(p),top_reg,top_ref) and
-          MatchInstruction(hp1,A_CMP,A_TEST,[taicpu(p).opsize]) and
-          (taicpu(hp1).oper[1]^.typ = top_ref) and
-           RefsEqual(taicpu(p).oper[1]^.ref^, taicpu(hp1).oper[1]^.ref^) then
+        if MatchInstruction(hp1,A_CMP,A_TEST,[taicpu(p).opsize]) then
           begin
-            { change
-                mov reg1, mem1
-                test/cmp x, mem1
+            if MatchOpType(taicpu(p),top_reg,top_ref) and
+              (taicpu(hp1).oper[1]^.typ = top_ref) and
+              RefsEqual(taicpu(p).oper[1]^.ref^, taicpu(hp1).oper[1]^.ref^) then
+              begin
+                { change
+                    mov reg1, mem1
+                    test/cmp x, mem1
 
-                to
+                    to
 
-                mov reg1, mem1
-                test/cmp x, reg1
-            }
-            taicpu(hp1).loadreg(1,taicpu(p).oper[0]^.reg);
-            DebugMsg(SPeepholeOptimization + 'MovTestCmp2MovTestCmp 1',hp1);
-            AllocRegBetween(taicpu(p).oper[0]^.reg,p,hp1,usedregs);
-            exit;
+                    mov reg1, mem1
+                    test/cmp x, reg1
+                }
+                taicpu(hp1).loadreg(1,taicpu(p).oper[0]^.reg);
+                DebugMsg(SPeepholeOptimization + 'MovTestCmp2MovTestCmp 1',hp1);
+                AllocRegBetween(taicpu(p).oper[0]^.reg,p,hp1,usedregs);
+                Result := True;
+                Exit;
+              end;
+
+            if MatchOpType(taicpu(p),top_ref,top_reg) and
+              { The x86 assemblers have difficulty comparing values against absolute addresses }
+              (taicpu(p).oper[0]^.ref^.refaddr in [addr_no, addr_pic, addr_pic_no_got]) and
+              (taicpu(hp1).oper[0]^.typ <> top_ref) and
+              MatchOperand(taicpu(hp1).oper[1]^, taicpu(p).oper[1]^.reg) and
+              (
+                (
+                  (taicpu(hp1).opcode = A_TEST)
+                ) or (
+                  (taicpu(hp1).opcode = A_CMP) and
+                  { A sanity check more than anything }
+                  not MatchOperand(taicpu(hp1).oper[0]^, taicpu(p).oper[1]^.reg)
+                )
+              ) then
+              begin
+                { change
+                    mov      mem, %reg
+                    cmp/test x,   %reg / test %reg,%reg
+                    (reg deallocated)
+
+                    to
+
+                    cmp/test x,   mem  / cmp  0,   mem
+                }
+                TransferUsedRegs(TmpUsedRegs);
+                UpdateUsedRegs(TmpUsedRegs, tai(p.Next));
+                if not RegUsedAfterInstruction(taicpu(p).oper[1]^.reg, hp1, TmpUsedRegs) then
+                  begin
+                    { Convert test %reg,%reg or test $-1,%reg to cmp $0,mem }
+                    if (taicpu(hp1).opcode = A_TEST) and
+                      (
+                        MatchOperand(taicpu(hp1).oper[0]^, taicpu(p).oper[1]^.reg) or
+                        MatchOperand(taicpu(hp1).oper[0]^, -1)
+                      ) then
+                      begin
+                        taicpu(hp1).opcode := A_CMP;
+                        taicpu(hp1).loadconst(0, 0);
+                      end;
+                    taicpu(hp1).loadref(1, taicpu(p).oper[0]^.ref^);
+                    DebugMsg(SPeepholeOptimization + 'MOV/CMP -> CMP (memory check)', p);
+                    RemoveCurrentP(p, hp1);
+                    Result := True;
+                    Exit;
+                  end;
+              end;
           end;
 
         if MatchInstruction(hp1,A_LEA,[S_L{$ifdef x86_64},S_Q{$endif x86_64}]) and
cmp-mem.patch (3,938 bytes)   

440bx

2021-05-19 09:07

reporter   ~0130952

I thought a "thank you" was in order for all the work you put into optimizing code. I'm sure I'm far from the only one who appreciates your efforts.

J. Gareth Moreton

2021-05-19 12:29

developer   ~0130955

Thank you. I do have a Patreon if anyone wants to donate to my work for FPC, and maybe view some of my mathematics articles, but it's in no way mandatory. I just enjoy attempting to make the compiler as efficient as possible.

Florian

2021-05-19 20:29

administrator   ~0130965

Thanks, applied.

Issue History

Date Modified Username Field Change
2021-05-19 07:27 J. Gareth Moreton New Issue
2021-05-19 07:27 J. Gareth Moreton File Added: cmp-mem.patch
2021-05-19 07:27 J. Gareth Moreton Tag Attached: patch
2021-05-19 07:27 J. Gareth Moreton Tag Attached: compiler
2021-05-19 07:27 J. Gareth Moreton Tag Attached: optimization
2021-05-19 07:27 J. Gareth Moreton Tag Attached: i386
2021-05-19 07:27 J. Gareth Moreton Tag Attached: x86
2021-05-19 07:27 J. Gareth Moreton Tag Attached: x86_64
2021-05-19 07:27 J. Gareth Moreton Priority normal => low
2021-05-19 07:27 J. Gareth Moreton FPCTarget => -
2021-05-19 07:34 J. Gareth Moreton Description Updated View Revisions
2021-05-19 07:39 J. Gareth Moreton Note Added: 0130949
2021-05-19 07:44 J. Gareth Moreton File Deleted: cmp-mem.patch
2021-05-19 07:45 J. Gareth Moreton Note Added: 0130951
2021-05-19 07:45 J. Gareth Moreton File Added: cmp-mem.patch
2021-05-19 09:07 440bx Note Added: 0130952
2021-05-19 12:29 J. Gareth Moreton Note Added: 0130955
2021-05-19 20:29 Florian Assigned To => Florian
2021-05-19 20:29 Florian Status new => resolved
2021-05-19 20:29 Florian Resolution open => fixed
2021-05-19 20:29 Florian Fixed in Version => 3.3.1
2021-05-19 20:29 Florian Fixed in Revision => 49382
2021-05-19 20:29 Florian Note Added: 0130965