View Issue Details

ID: 0038841
Project: FPC
Category: Compiler
View Status: public
Last Update: 2021-05-06 20:50
Reporter: J. Gareth Moreton
Assigned To:
Priority: normal
Severity: minor
Reproducibility: N/A
Status: new
Resolution: open
Platform: arm and aarch64
OS: Debian GNU/Linux (Raspberry Pi)
Product Version: 3.3.1
Summary: 0038841: [Patch] ARM/AArch64 Some short-range LDR/STR optimisations
Description: The "ldrstr.patch" file provides some short-range optimisations for LDR and STR instructions that remove unnecessary instructions (e.g. storing a register to memory, then loading from the same address into the same register). These optimisations are performed on all ARM platforms. The patch also fixes a minor bug in the RedundantMovProcess routine for AArch64 (that optimisation often runs after the new "load/load -> load/move" optimisation is made).

The "peephole-string.patch" seeks to homogenise the optimisation comments that appear when DEBUG_AOPTCPU is declared, prepending all such messages with the SPeepholeOptimization string constant, much like the x86 implementations.
Steps To Reproduce: Apply patch and confirm correct compilation on all ARM and AArch64 platforms.
Additional Information: The two patches share a hunk (the declaration of SPeepholeOptimization), so a single hunk will be rejected when both are applied together. This won't cause a bad merge.

I confess that this patch hasn't been fully tested on Arm-32 platforms due to technical reasons - third-party testing would be required.

Some examples of the optimisations under aarch64-linux:

In the Sysutils unit - before:

    strh w2,[x0]
    ldrh w0,[x0]
    ldp x29,x30,[sp], 16
    ret
.Le429:

After:

    strh w2,[x0]
    uxth w0,w2 <-- ldr changed to a uxth instruction based on the postfixes of the str and ldr instructions (minimises read-after-write penalty).
    ldp x29,x30,[sp], 16
    ret
.Le429:
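
The choice of replacement instruction in this example can be sketched in isolation. The program below is a standalone illustration, not compiler code: TPostfix, TOp, OpNames and ReplacementFor are hypothetical stand-ins for the compiler's own postfix and opcode types, and the mapping simply mirrors the case statement in ldrstr.patch.

```pascal
program PostfixToOp;

{$mode objfpc}{$H+}

type
  { Hypothetical stand-ins for the compiler's load postfixes and opcodes }
  TPostfix = (pfNone, pfB, pfSB, pfH, pfSH, pfW, pfSW);
  TOp = (opMov, opUxtb, opSxtb, opUxth, opSxth, opSxtw);

const
  OpNames: array[TOp] of string =
    ('mov', 'uxtb', 'sxtb', 'uxth', 'sxth', 'sxtw');

{ Returns the register-to-register instruction that reproduces what a load
  with the given postfix would have read back from the value just stored. }
function ReplacementFor(LoadPostfix: TPostfix): TOp;
begin
  case LoadPostfix of
    pfB:  Result := opUxtb;
    pfSB: Result := opSxtb;
    pfH:  Result := opUxth;
    pfSH: Result := opSxth;
    pfSW: Result := opSxtw;
  else
    { pfW and pfNone: a plain register copy is enough }
    Result := opMov;
  end;
end;

begin
  { "strh w2,[x0]" followed by "ldrh w0,[x0]" becomes "uxth w0,w2" }
  WriteLn('ldrh  -> ', OpNames[ReplacementFor(pfH)]);
  WriteLn('ldrsb -> ', OpNames[ReplacementFor(pfSB)]);
  WriteLn('ldr   -> ', OpNames[ReplacementFor(pfNone)]);
end.
```

The sketch leaves out the conditions the patch checks first (same reference, unconditional instructions, integer registers of matching size).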

----

Also in Sysutils - before:

    str x0,[sp, 24]
    ldr x0,[sp, 24]
    str x0,[sp, 32]
.Lj3450:

After:

    stp x0,x0,[sp, 24] <-- the ldr instruction is removed because x0 already contains the value at the address specified (because it was just written there), and then the two str instructions are merged into an stp instruction later on.
.Lj3450:

----

In the Classes unit - before:

    b.ne .Lj947
    ldr x0,[sp, 16]
    ldr x1,[sp, 16]
    ldr x1,[x1, 104]
    blr x1
    str x0,[sp, 16]
.Lj947:

After:

    b.ne .Lj947
    ldr x0,[sp, 16]
    ldr x1,[x0, 104] <-- Second ldr was changed to "mov x1,x0", which was then optimised by RedundantMovProcess and merged into the 3rd ldr.
    blr x1
    str x0,[sp, 16]
.Lj947:
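
Taken together, the adjacent-pair rules in ldrstr.patch can be summarised as a small decision function. This is a simplified model under stated assumptions, not the compiler's code: TMemOp and PairAction are hypothetical names, and the function assumes both instructions are unconditional, use the same reference, and pass the register type and size checks that the patch performs.

```pascal
program PairRules;

{$mode objfpc}{$H+}

type
  TMemOp = (moLoad, moStore);

{ Hypothetical summary of what happens to two neighbouring LDR/STR
  instructions on the same address. SameReg tells whether both
  instructions name the same register. }
function PairAction(First, Second: TMemOp; SameReg: Boolean): string;
begin
  if (First = moLoad) and (Second = moLoad) then
    begin
      if SameReg then
        Result := 'remove second load'
      else
        Result := 'change second load to mov';
    end
  else if (First = moLoad) and (Second = moStore) then
    begin
      if SameReg then
        Result := 'remove store'
      else
        Result := 'no change';
    end
  else if (First = moStore) and (Second = moLoad) then
    begin
      if SameReg then
        Result := 'remove load'
      else
        Result := 'change load to mov or extend';
    end
  else
    begin
      { store followed by store }
      if SameReg then
        Result := 'remove second store'
      else
        Result := 'remove first store';
    end;
end;

begin
  { "str x0,[sp, 24]" then "ldr x0,[sp, 24]" }
  WriteLn(PairAction(moStore, moLoad, True));
  { "ldr x0,[sp, 16]" then "ldr x1,[sp, 16]" }
  WriteLn(PairAction(moLoad, moLoad, False));
end.
```

The two calls correspond to the second and third examples above: the first reports that the load is removed, the second that the load is changed to a mov.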

----

Longer-range optimisations of this kind are still being researched because references are involved - watch this space!
Tags: aarch64, arm, compiler, optimization, patch
Fixed in Revision:
FPCOldBugId:
FPCTarget: -
Attached Files

Activities

J. Gareth Moreton

2021-05-02 00:04

developer  

ldrstr.patch (14,469 bytes)   
Index: compiler/aarch64/aoptcpu.pas
===================================================================
--- compiler/aarch64/aoptcpu.pas	(revision 49298)
+++ compiler/aarch64/aoptcpu.pas	(working copy)
@@ -44,6 +44,10 @@
         function RegLoadedWithNewValue(reg: tregister; hp: tai): boolean;override;
         function InstructionLoadsFromReg(const reg: TRegister; const hp: tai): boolean;override;
         function LookForPostindexedPattern(var p : tai) : boolean;
+      public
+        { With these routines, there's optimisation code that's general for all ARM platforms }
+        function OptPass1LDR(var p: tai): Boolean; override;
+        function OptPass1STR(var p: tai): Boolean; override;
       private
         function RemoveSuperfluousFMov(const p: tai; movp: tai; const optimizer: string): boolean;
         function OptPass1Shift(var p: tai): boolean;
@@ -291,6 +295,24 @@
     end;
 
 
+  function TCpuAsmOptimizer.OptPass1LDR(var p: tai): Boolean;
+    begin
+      Result := False;
+      if inherited OptPass1LDR(p) or
+        LookForPostindexedPattern(p) then
+        Exit(True);
+    end;
+
+
+  function TCpuAsmOptimizer.OptPass1STR(var p: tai): Boolean;
+    begin
+      Result := False;
+      if inherited OptPass1STR(p) or
+        LookForPostindexedPattern(p) then
+        Exit(True);
+    end;
+
+
   function TCpuAsmOptimizer.OptPass1Shift(var p : tai): boolean;
     var
       hp1,hp2: tai;
@@ -764,9 +786,10 @@
       if p.typ=ait_instruction then
         begin
           case taicpu(p).opcode of
-            A_LDR,
+            A_LDR:
+              Result:=OptPass1LDR(p);
             A_STR:
-              Result:=LookForPostindexedPattern(p);
+              Result:=OptPass1STR(p);
             A_MOV:
               Result:=OptPass1Mov(p);
             A_STP:
Index: compiler/arm/aoptcpu.pas
===================================================================
--- compiler/arm/aoptcpu.pas	(revision 49298)
+++ compiler/arm/aoptcpu.pas	(working copy)
@@ -59,7 +59,11 @@
     function InstructionLoadsFromReg(const reg : TRegister; const hp : tai) : boolean; override;
 
     function RegLoadedWithNewValue(reg : tregister; hp : tai) : boolean; override;
-    function OptPass1And(var p: tai): Boolean; override; { There's optimisation code that's general for all ARM platforms }
+
+     { With these routines, there's optimisation code that's general for all ARM platforms }
+    function OptPass1And(var p: tai): Boolean; override;
+    function OptPass1LDR(var p: tai): Boolean; override;
+    function OptPass1STR(var p: tai): Boolean; override;
   protected
     function LookForPreindexedPattern(p: taicpu): boolean;
     function LookForPostindexedPattern(p: taicpu): boolean;
@@ -69,9 +73,7 @@
     function OptPass1DataCheckMov(var p: tai): Boolean;
     function OptPass1ADDSUB(var p: tai): Boolean;
     function OptPass1CMP(var p: tai): Boolean;
-    function OptPass1LDR(var p: tai): Boolean;
     function OptPass1STM(var p: tai): Boolean;
-    function OptPass1STR(var p: tai): Boolean;
     function OptPass1MOV(var p: tai): Boolean;
     function OptPass1MUL(var p: tai): Boolean;
     function OptPass1MVN(var p: tai): Boolean;
@@ -834,7 +836,9 @@
     var
       hp1: tai;
     begin
-      Result := False;
+      Result := inherited OptPass1LDR(p);
+      if Result then
+        Exit;
 
       { change
         ldr reg1,ref
@@ -1022,7 +1026,9 @@
     var
       hp1: tai;
     begin
-      Result := False;
+      Result := inherited OptPass1STR(p);
+      if Result then
+        Exit;
 
       { Common conditions }
       if (taicpu(p).oper[1]^.typ = top_ref) and
Index: compiler/armgen/aoptarm.pas
===================================================================
--- compiler/armgen/aoptarm.pas	(revision 49298)
+++ compiler/armgen/aoptarm.pas	(working copy)
@@ -41,12 +41,15 @@
 
     function RemoveSuperfluousMove(const p: tai; movp: tai; const optimizer: string): boolean;
     function RedundantMovProcess(var p: tai; var hp1: tai): boolean;
-    function GetNextInstructionUsingReg(Current: tai; out Next: tai; reg: TRegister): Boolean;
+    function GetNextInstructionUsingReg(Current: tai; out Next: tai; const reg: TRegister): Boolean;
 
     function OptPass1UXTB(var p: tai): Boolean;
     function OptPass1UXTH(var p: tai): Boolean;
     function OptPass1SXTB(var p: tai): Boolean;
     function OptPass1SXTH(var p: tai): Boolean;
+
+    function OptPass1LDR(var p: tai): Boolean; virtual;
+    function OptPass1STR(var p: tai): Boolean; virtual;
     function OptPass1And(var p: tai): Boolean; virtual;
   End;
 
@@ -69,15 +72,23 @@
     systems,
     cpuinfo,
     cgobj,procinfo,
-    aasmbase,aasmdata;
+    aasmbase,aasmdata,itcpugas;
 
 
 {$ifdef DEBUG_AOPTCPU}
+  const
+    SPeepholeOptimization: shortstring = 'Peephole Optimization: ';
+
   procedure TARMAsmOptimizer.DebugMsg(const s: string;p : tai);
     begin
       asml.insertbefore(tai_comment.Create(strpnew(s)), p);
     end;
 {$else DEBUG_AOPTCPU}
+  { Empty strings help the optimizer to remove string concatenations that won't
+    ever appear to the user on release builds. [Kit] }
+  const
+    SPeepholeOptimization = '';
+
   procedure TARMAsmOptimizer.DebugMsg(const s: string;p : tai);inline;
     begin
     end;
@@ -179,7 +190,7 @@
 
 
   function TARMAsmOptimizer.GetNextInstructionUsingReg(Current: tai;
-    Out Next: tai; reg: TRegister): Boolean;
+    Out Next: tai; const reg: TRegister): Boolean;
     var
       gniResult: Boolean;
     begin
@@ -395,7 +406,14 @@
                   UpdateUsedRegs(TmpUsedRegs, tai(current_hp.Next));
                   LDRChange := False;
 
-                  if (taicpu(next_hp).opcode in [A_LDR,A_STR]) and (taicpu(next_hp).ops = 2) then
+                  if (taicpu(next_hp).opcode in [A_LDR,A_STR]) and (taicpu(next_hp).ops = 2)
+{$ifdef AARCH64}
+                    { If r0 is the zero register, then this sequence of instructions will cause
+                      an access violation, but that's better than an assembler error caused by
+                      changing r0 to xzr inside the reference (Where it's illegal). [Kit] }
+                    and (getsupreg(taicpu(p).oper[1]^.reg) <> RS_XZR)
+{$endif AARCH64}
+                    then
                     begin
 
                       { Change the registers from r1 to r0 }
@@ -1018,6 +1036,196 @@
     end;
 
 
+  function TARMAsmOptimizer.OptPass1LDR(var p : tai) : Boolean;
+    var
+      hp1: tai;
+      Reference: TReference;
+      NewOp: TAsmOp;
+    begin
+      Result := False;
+      if (taicpu(p).ops <> 2) or (taicpu(p).condition <> C_None) then
+        Exit;
+
+      Reference := taicpu(p).oper[1]^.ref^;
+      if (Reference.addressmode = AM_OFFSET) and
+        not RegInRef(taicpu(p).oper[0]^.reg, Reference) and
+        { Delay calling GetNextInstruction for as long as possible }
+        GetNextInstruction(p, hp1) and
+        (hp1.typ = ait_instruction) and
+        (taicpu(hp1).condition = C_None) and
+        (taicpu(hp1).oppostfix = taicpu(p).oppostfix) then
+        begin
+
+          if (taicpu(hp1).opcode = A_STR) and
+            RefsEqual(taicpu(hp1).oper[1]^.ref^, Reference) and
+            (getregtype(taicpu(p).oper[0]^.reg) = getregtype(taicpu(hp1).oper[0]^.reg)) then
+            begin
+              { With:
+                  ldr reg1,[ref]
+                  str reg2,[ref]
+
+                If reg1 = reg2, Remove str
+              }
+              if taicpu(p).oper[0]^.reg = taicpu(hp1).oper[0]^.reg then
+                begin
+                  DebugMsg(SPeepholeOptimization + 'Removed redundant store instruction (load/store -> load/nop)', hp1);
+                  RemoveInstruction(hp1);
+                  Result := True;
+                  Exit;
+                end;
+            end
+          else if (taicpu(hp1).opcode = A_LDR) and
+            RefsEqual(taicpu(hp1).oper[1]^.ref^, Reference) then
+            begin
+              { With:
+                  ldr reg1,[ref]
+                  ldr reg2,[ref]
+
+                If reg1 = reg2, delete the second ldr
+                If reg1 <> reg2, changing the 2nd ldr to a mov might introduce
+                  a dependency, but it will likely open up new optimisations, so
+                  do it for now and handle any new dependencies later.
+              }
+              if taicpu(p).oper[0]^.reg = taicpu(hp1).oper[0]^.reg then
+                begin
+                  DebugMsg(SPeepholeOptimization + 'Removed duplicate load instruction (load/load -> load/nop)', hp1);
+                  RemoveInstruction(hp1);
+                  Result := True;
+                  Exit;
+                end
+              else if
+                (getregtype(taicpu(p).oper[0]^.reg) = R_INTREGISTER) and
+                (getregtype(taicpu(hp1).oper[0]^.reg) = R_INTREGISTER) and
+                (getsubreg(taicpu(p).oper[0]^.reg) = getsubreg(taicpu(hp1).oper[0]^.reg)) then
+                begin
+                  DebugMsg(SPeepholeOptimization + 'Changed second ldr' + oppostfix2str[taicpu(hp1).oppostfix] + ' to mov (load/load -> load/move)', hp1);
+                  taicpu(hp1).opcode := A_MOV;
+                  taicpu(hp1).oppostfix := PF_None;
+                  taicpu(hp1).loadreg(1, taicpu(p).oper[0]^.reg);
+                  AllocRegBetween(taicpu(p).oper[0]^.reg, p, hp1, UsedRegs);
+                  Result := True;
+                  Exit;
+                end;
+            end;
+        end;
+    end;
+
+
+    function TARMAsmOptimizer.OptPass1STR(var p : tai) : Boolean;
+      var
+        hp1: tai;
+        Reference: TReference;
+        SizeMismatch: Boolean;
+        SrcReg: TRegister;
+        NewOp: TAsmOp;
+      begin
+        Result := False;
+        if (taicpu(p).ops <> 2) or (taicpu(p).condition <> C_None) then
+          Exit;
+
+        Reference := taicpu(p).oper[1]^.ref^;
+        if (Reference.addressmode = AM_OFFSET) and
+          not RegInRef(taicpu(p).oper[0]^.reg, Reference) and
+          { Delay calling GetNextInstruction for as long as possible }
+          GetNextInstruction(p, hp1) and
+          (hp1.typ = ait_instruction) and
+          (taicpu(hp1).condition = C_None) and
+          (taicpu(hp1).oppostfix = taicpu(p).oppostfix) then
+
+        if GetNextInstruction(p, hp1) and
+          (hp1.typ = ait_instruction) and
+          (taicpu(hp1).condition = C_None) then
+          begin
+            { Saves constant dereferencing and makes it easier to change the size if necessary }
+            SrcReg := taicpu(p).oper[0]^.reg;
+
+            if (taicpu(hp1).opcode = A_LDR) and
+              RefsEqual(taicpu(hp1).oper[1]^.ref^, Reference) and
+              (
+                (taicpu(hp1).oppostfix = taicpu(p).oppostfix) or
+                ((taicpu(p).oppostfix = PF_B) and (taicpu(hp1).oppostfix = PF_SB)) or
+                ((taicpu(p).oppostfix = PF_H) and (taicpu(hp1).oppostfix = PF_SH)) or
+                ((taicpu(p).oppostfix = PF_W) and (taicpu(hp1).oppostfix = PF_SW))
+              ) then
+              begin
+                { With:
+                    str reg1,[ref]
+                    ldr reg2,[ref]
+
+                  If reg1 = reg2, Remove ldr.
+                  If reg1 <> reg2, replace ldr with "mov reg2,reg1"
+                }
+
+                if SrcReg = taicpu(hp1).oper[0]^.reg then
+                  begin
+                    DebugMsg(SPeepholeOptimization + 'Removed redundant load instruction (store/load -> store/nop)', hp1);
+                    RemoveInstruction(hp1);
+                    Result := True;
+                    Exit;
+                  end
+                else if (getregtype(taicpu(p).oper[0]^.reg) = R_INTREGISTER) and
+                  (getregtype(taicpu(hp1).oper[0]^.reg) = R_INTREGISTER) and
+                  (getsubreg(taicpu(p).oper[0]^.reg) = getsubreg(taicpu(hp1).oper[0]^.reg)) then
+                  begin
+                    case taicpu(hp1).oppostfix of
+                      PF_B:
+                        NewOp := A_UXTB;
+                      PF_SB:
+                        NewOp := A_SXTB;
+                      PF_H:
+                        NewOp := A_UXTH;
+                      PF_SH:
+                        NewOp := A_SXTH;
+                      PF_SW:
+                        NewOp := A_SXTW;
+                      PF_W,
+                      PF_None:
+                        NewOp := A_MOV;
+                      else
+                        InternalError(2021043001);
+                    end;
+
+                    DebugMsg(SPeepholeOptimization + 'Changed ldr' + oppostfix2str[taicpu(hp1).oppostfix] + ' to ' + gas_op2str[NewOp] + ' (store/load -> store/move)', hp1);
+
+                    taicpu(hp1).oppostfix := PF_None;
+                    taicpu(hp1).opcode := NewOp;
+                    taicpu(hp1).loadreg(1, taicpu(p).oper[0]^.reg);
+                    AllocRegBetween(taicpu(p).oper[0]^.reg, p, hp1, UsedRegs);
+                    Result := True;
+                    Exit;
+                  end;
+              end
+            else if (taicpu(hp1).opcode = A_STR) and
+              RefsEqual(taicpu(hp1).oper[1]^.ref^, Reference) then
+              begin
+                { With:
+                    str reg1,[ref]
+                    str reg2,[ref]
+
+                  If reg1 <> reg2, delete the first str
+                  IF reg1 = reg2, delete the second str
+                }
+                if SrcReg = taicpu(hp1).oper[0]^.reg then
+                  begin
+                    DebugMsg(SPeepholeOptimization + 'Removed duplicate store instruction (store/store -> store/nop)', hp1);
+                    RemoveInstruction(hp1);
+                    Result := True;
+                    Exit;
+                  end
+                else if
+                  { Registers same byte size? }
+                  (tcgsize2size[reg_cgsize(taicpu(p).oper[0]^.reg)] = tcgsize2size[reg_cgsize(taicpu(hp1).oper[0]^.reg)]) then
+                  begin
+                    DebugMsg(SPeepholeOptimization + 'Removed dominated store instruction (store/store -> nop/store)', p);
+                    RemoveCurrentP(p, hp1);
+                    Result := True;
+                    Exit;
+                  end;
+              end;
+          end;
+      end;
+
+
   function TARMAsmOptimizer.OptPass1And(var p : tai) : Boolean;
     var
       hp1, hp2: tai;
peephole-string.patch (4,770 bytes)   
Index: compiler/aarch64/aoptcpu.pas
===================================================================
--- compiler/aarch64/aoptcpu.pas	(revision 49298)
+++ compiler/aarch64/aoptcpu.pas	(working copy)
@@ -199,9 +199,9 @@
         not(RegModifiedBetween(taicpu(hp1).oper[2]^.reg,p,hp1)) then
         begin
           if taicpu(p).opcode = A_LDR then
-            DebugMsg('Peephole LdrAdd/Sub2Ldr Postindex done', p)
+            DebugMsg(SPeepholeOptimization + 'LdrAdd/Sub2Ldr Postindex done', p)
           else
-            DebugMsg('Peephole StrAdd/Sub2Str Postindex done', p);
+            DebugMsg(SPeepholeOptimization + 'StrAdd/Sub2Str Postindex done', p);
 
           taicpu(p).oper[1]^.ref^.addressmode:=AM_POSTINDEXED;
           if taicpu(hp1).opcode=A_ADD then
@@ -244,7 +244,7 @@
           dealloc:=FindRegDeAlloc(taicpu(p).oper[0]^.reg,tai(movp.Next));
           if assigned(dealloc) then
             begin
-              DebugMsg('Peephole '+optimizer+' removed superfluous vmov', movp);
+              DebugMsg(SPeepholeOptimization + optimizer+' removed superfluous vmov', movp);
               result:=true;
 
               { taicpu(p).oper[0]^.reg is not used anymore, try to find its allocation
@@ -395,7 +395,7 @@
                 RemoveInstruction(hp1);
                 RemoveCurrentp(p);
 
-                DebugMsg('Peephole FoldShiftProcess done', hp2);
+                DebugMsg(SPeepholeOptimization + 'FoldShiftProcess done', hp2);
                 Result:=true;
                 break;
               end;
@@ -488,7 +488,7 @@
           hp3.free;
           hp4.free;
           p:=hp2;
-          DebugMsg('Peephole Bl2B done', p);
+          DebugMsg(SPeepholeOptimization + 'Bl2B done', p);
           Result:=true;
         end;
     end;
@@ -503,7 +503,7 @@
        (taicpu(p).oppostfix=PF_None) then
        begin
          RemoveCurrentP(p);
-         DebugMsg('Peephole Mov2None done', p);
+         DebugMsg(SPeepholeOptimization + 'Mov2None done', p);
          Result:=true;
        end
 
@@ -669,9 +669,9 @@
                                 }
                                 taicpu(p).opcode := TargetOpcode;
                                 if TargetOpcode = A_STP then
-                                  DebugMsg('Peephole Optimization: StrStr2Stp', p)
+                                  DebugMsg(SPeepholeOptimization + 'StrStr2Stp', p)
                                 else
-                                  DebugMsg('Peephole Optimization: LdrLdr2Ldp', p);
+                                  DebugMsg(SPeepholeOptimization + 'LdrLdr2Ldp', p);
                                 taicpu(p).ops := 3;
                                 taicpu(p).loadref(2, taicpu(p).oper[1]^.ref^);
                                 taicpu(p).loadreg(1, taicpu(hp1).oper[0]^.reg);
@@ -695,9 +695,9 @@
                                 }
                                 taicpu(p).opcode := TargetOpcode;
                                 if TargetOpcode = A_STP then
-                                  DebugMsg('Peephole Optimization: StrStr2Stp (reverse)', p)
+                                  DebugMsg(SPeepholeOptimization + 'StrStr2Stp (reverse)', p)
                                 else
-                                  DebugMsg('Peephole Optimization: LdrLdr2Ldp (reverse)', p);
+                                  DebugMsg(SPeepholeOptimization + 'LdrLdr2Ldp (reverse)', p);
                                 taicpu(p).ops := 3;
                                 taicpu(p).loadref(2, taicpu(hp1).oper[1]^.ref^);
                                 taicpu(p).loadreg(1, taicpu(p).oper[0]^.reg);
@@ -752,7 +752,7 @@
           p.free;
           hp1.free;
           p:=hp2;
-          DebugMsg('Peephole CMPB.E/NE2CBNZ/CBZ done', p);
+          DebugMsg(SPeepholeOptimization + 'CMPB.E/NE2CBNZ/CBZ done', p);
           Result:=true;
         end;
     end;
Index: compiler/armgen/aoptarm.pas
===================================================================
--- compiler/armgen/aoptarm.pas	(revision 49298)
+++ compiler/armgen/aoptarm.pas	(working copy)
@@ -69,15 +69,23 @@
     systems,
     cpuinfo,
     cgobj,procinfo,
-    aasmbase,aasmdata;
+    aasmbase,aasmdata,itcpugas;
 
 
 {$ifdef DEBUG_AOPTCPU}
+  const
+    SPeepholeOptimization: shortstring = 'Peephole Optimization: ';
+
   procedure TARMAsmOptimizer.DebugMsg(const s: string;p : tai);
     begin
       asml.insertbefore(tai_comment.Create(strpnew(s)), p);
     end;
 {$else DEBUG_AOPTCPU}
+  { Empty strings help the optimizer to remove string concatenations that won't
+    ever appear to the user on release builds. [Kit] }
+  const
+    SPeepholeOptimization = '';
+
   procedure TARMAsmOptimizer.DebugMsg(const s: string;p : tai);inline;
     begin
     end;

J. Gareth Moreton

2021-05-06 18:08

developer   ~0130773

So as I feared, ARM testing showed that UXTB and the like were only introduced with ARMv6. I'll try to fix it.

Florian

2021-05-06 20:45

administrator   ~0130774

I applied the second patch as well.

Regarding the armv6 issue: if one wants to be perfect, we could use and ...,...$ff for uxtb on armv5 or older ...
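
A minimal sketch of that fallback, not taken from either patch: ZeroExtendByteAsm and its HasExtendOps parameter are hypothetical, and how the compiler would actually test for ARMv6 or newer is not shown in this issue. The function only builds the assembly text for the two variants.

```pascal
program UxtbFallback;

{$mode objfpc}{$H+}

{ Hypothetical helper: returns the assembly text that zero-extends the low
  byte of Src into Dest. HasExtendOps stands for "the target is ARMv6 or
  newer", which is an assumption made for this illustration. }
function ZeroExtendByteAsm(HasExtendOps: Boolean;
  const Dest, Src: string): string;
begin
  if HasExtendOps then
    Result := 'uxtb ' + Dest + ',' + Src
  else
    { AND with 255 clears bits 8..31, which is the same result as UXTB }
    Result := 'and ' + Dest + ',' + Src + ',#255';
end;

begin
  WriteLn(ZeroExtendByteAsm(True, 'r0', 'r2'));
  WriteLn(ZeroExtendByteAsm(False, 'r0', 'r2'));
end.
```

The signed variants (sxtb, sxth) cannot be expressed as a single AND, so they would need a different fallback.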

J. Gareth Moreton

2021-05-06 20:50

developer   ~0130775

That's something I can set up in a future patch - sure. I think even with an additional AND instruction, it will be faster than the memory latency with the read-after-write penalty.

Issue History

Date Modified Username Field Change
2021-05-02 00:04 J. Gareth Moreton New Issue
2021-05-02 00:04 J. Gareth Moreton File Added: ldrstr.patch
2021-05-02 00:04 J. Gareth Moreton File Added: peephole-string.patch
2021-05-02 00:04 J. Gareth Moreton Description Updated
2021-05-02 00:04 J. Gareth Moreton Additional Information Updated
2021-05-02 00:04 J. Gareth Moreton FPCTarget => -
2021-05-02 00:05 J. Gareth Moreton Tag Attached: patch
2021-05-02 00:05 J. Gareth Moreton Tag Attached: arm
2021-05-02 00:05 J. Gareth Moreton Tag Attached: aarch64
2021-05-02 00:05 J. Gareth Moreton Tag Attached: optimization
2021-05-02 00:05 J. Gareth Moreton Tag Attached: compiler
2021-05-02 00:08 J. Gareth Moreton Additional Information Updated
2021-05-02 00:09 J. Gareth Moreton Additional Information Updated
2021-05-06 18:08 J. Gareth Moreton Note Added: 0130773
2021-05-06 20:45 Florian Note Added: 0130774
2021-05-06 20:50 J. Gareth Moreton Note Added: 0130775