[Patch] MovzxCmp2CmpMovzx oversight
Original Reporter info from Mantis: CuriousKit @CuriousKit
-
Reporter name: J. Gareth Moreton
Original Reporter info from Mantis: CuriousKit @CuriousKit
- Reporter name: J. Gareth Moreton
Description:
An apparent optimisation bug as listed in #38334 (closed) is actually due to a different optimisation bug. The MovzxCmp2CmpMovzx optimisation didn't check to see if the cmp instruction actually used the same register as the movzx instruction. This patch corrects that.
Steps to reproduce:
Apply patch and confirm correct compilation and no regressions.
Additional information:
In aoptx86.s before:
# [5806] if (reg in paramanager.get_volatile_registers_int(current_procinfo.procdef.proccalloption)) and
movl U_$PROCINFO_$$_CURRENT_PROCINFO,%eax
movl 24(%eax),%eax
# Register dl allocated
movb 85(%eax),%dl
# Register eax released
# Register eax allocated
movl U_$PARAMGR_$$_PARAMANAGER,%eax
# Register ecx allocated
call CPUPARA$_$TCPUPARAMANAGER_$__$$_GET_VOLATILE_REGISTERS_INT$TPROCCALLOPTION$$TCPUREGISTERSET
# Register ecx released
movzwl %di,%edx
# Peephole Optimization: cmpl $15,%edx -> cmpw $15,%ax (smaller and minimises pipeline stall - MovzxCmp2CmpMovzx)
cmpw $15,%ax
movzwl %ax,%eax
# Register eflags allocated
jbe .Lj3446
# Register eflags released
clc
jmp .Lj3447
After:
# [5806] if (reg in paramanager.get_volatile_registers_int(current_procinfo.procdef.proccalloption)) and
movl U_$PROCINFO_$$_CURRENT_PROCINFO,%eax
movl 24(%eax),%eax
# Register dl allocated
movb 85(%eax),%dl
# Register eax released
# Register eax allocated
movl U_$PARAMGR_$$_PARAMANAGER,%eax
# Register ecx allocated
call CPUPARA$_$TCPUPARAMANAGER_$__$$_GET_VOLATILE_REGISTERS_INT$TPROCCALLOPTION$$TCPUREGISTERSET
# Register ecx released
movzwl %di,%edx
movzwl %ax,%eax
# Register eflags allocated
cmpl $15,%edx
jbe .Lj3446
# Register eflags released
clc
jmp .Lj3447
----
This shows how useful the more complex comments are, because it shows that the instruction "cmpw $15,%ax" was originally "cmpl $15,%edx", which should instantly ring alarm bells because the register changed.
It was a complete coincidence that this fault only seemed to affect the JccAdd2SetccAdd optimisation (JccAdd2SetccAdd also affected a function in cutils, but this seemed to go undetected).
Mantis conversion info:
- Mantis ID: 38339
- OS: Microsoft Windows
- OS Build: 10 Home
- Build: r48117
- Platform: i386 and x86_64
- Version: 3.3.1
- Fixed in version: 3.3.1
- Fixed in revision: 48124 (#eb81b981)