[Patch] Long-range MOV + MOVS/Z optimisation
Original Reporter info from Mantis: CuriousKit @CuriousKit
- Reporter name: J. Gareth Moreton
Description:
There are some situations where a register is written to and then zero- or sign-extended a few instructions later, without the written value having been read or modified in between. Using GetNextInstructionUsingReg, this patch finds such pairs and merges them: the MOV instruction is changed into the later MOVS/Z instruction (or kept as a MOV if a constant is being written), and the original MOVS/Z instruction is removed. For example:
movw %cx,%dx
movq U_$IDECOMMANDS_$$_IDECOMMANDLIST(%rip),%rcx (%rcx is reused)
movzwl %dx,%edx
Becomes...
movzwl %cx,%edx
movq U_$IDECOMMANDS_$$_IDECOMMANDLIST(%rip),%rcx
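The transformation described above can be sketched as a toy peephole pass. This is only an illustrative Python model, not the compiler's actual Pascal code: the `Instr` tuple, the `fold_mov_ext` function and the explicit `reads`/`writes` sets are hypothetical stand-ins for what GetNextInstructionUsingReg and the compiler's register tracking provide. Sub-registers are collapsed into one family name (for example `d` for %dx/%edx), only straight-line code is handled (labels and jumps are ignored), and the constant case is left out.

```python
from typing import FrozenSet, List, NamedTuple


class Instr(NamedTuple):
    op: str                  # mnemonic, e.g. "movw", "movzwl", "movq"
    src: str                 # source operand as text
    dst: str                 # destination operand as text
    reads: FrozenSet[str]    # register families read (hypothetical model)
    writes: FrozenSet[str]   # register families written (hypothetical model)


# Extension mnemonic -> the plain MOV that writes the sub-register it extends.
NARROW_MOV = {"movzbl": "movb", "movsbl": "movb",
              "movzwl": "movw", "movswl": "movw"}


def fold_mov_ext(code: List[Instr]) -> List[Instr]:
    """Merge 'mov src,reg ... movs/z reg,reg' into one extending move."""
    out = list(code)
    i = 0
    while i < len(out):
        mov = out[i]
        # Candidate: a narrow register write from a non-constant source.
        if (mov.op in ("movb", "movw") and len(mov.writes) == 1
                and not mov.src.startswith("$")):
            (reg,) = mov.writes
            # Look for the next instruction that touches the written register.
            for j in range(i + 1, len(out)):
                nxt = out[j]
                if reg not in nxt.reads and reg not in nxt.writes:
                    continue  # unrelated instruction, keep scanning
                # The first user must extend exactly the value just written.
                if (NARROW_MOV.get(nxt.op) == mov.op
                        and nxt.src == mov.dst
                        and nxt.reads == frozenset({reg})
                        and nxt.writes == frozenset({reg})):
                    # Extend at the point of the original write...
                    out[i] = Instr(nxt.op, mov.src, nxt.dst,
                                   mov.reads, nxt.writes)
                    # ...and drop the now-redundant extension.
                    del out[j]
                break  # any other first use blocks the optimisation
        i += 1
    return out


example = [
    Instr("movw", "%cx", "%dx", frozenset({"c"}), frozenset({"d"})),
    Instr("movq", "U_$IDECOMMANDS_$$_IDECOMMANDLIST(%rip)", "%rcx",
          frozenset(), frozenset({"c"})),
    Instr("movzwl", "%dx", "%edx", frozenset({"d"}), frozenset({"d"})),
]

for ins in fold_mov_ext(example):
    print(f"{ins.op} {ins.src},{ins.dst}")
```

Note that overwriting %rcx between the two instructions does not block the merge, because the extension is moved up to the position of the original MOV, where %cx still holds the intended value.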
Steps to reproduce:
Apply patch and confirm correct compilation.
Additional information:
When coupled with #37389 (closed), this single optimisation can sometimes trigger a cascade of further optimisations, and it can also help with that issue's potential false-dependency problems. For example, the following was observed in the TDependencyGraphOptDialog.UpdateInfo method in the Lazarus source:
movw %cx,%dx
.Ll1277:
movq U_$IDECOMMANDS_$$_IDECOMMANDLIST(%rip),%rcx
movzwl %dx,%edx
movq U_$IDECOMMANDS_$$_IDECOMMANDLIST(%rip),%rax
movq (%rax),%rax
Becomes:
# Peephole Optimization: MovMovs/z2Mov/s/z done
movzwl %cx,%edx
# Peephole Optimization: MovMov2MovMov 2
.Ll1277:
movq U_$IDECOMMANDS_$$_IDECOMMANDLIST(%rip),%rcx
# Peephole Optimization: Removed movs/z instruction and extended earlier write (MovMovs/z2Mov/s/z)
# Peephole Optimization: Mov2Nop 3 done
# Peephole Optimization: %rax = %rcx; changed to minimise pipeline stall (MovXXX2MovXXX)
movq (%rcx),%rax
(5 instructions become 3)
Mantis conversion info:
- Mantis ID: 37390
- OS: Microsoft Windows
- OS Build: 10 Professional
- Build: r45802
- Platform: i386 and x86_64
- Version: 3.3.1
- Fixed in version: 3.3.1
- Fixed in revision: 46346 (#87615458)