[Patch] x86 "OptPass1MOV" improvements
Original Reporter info from Mantis: CuriousKit @CuriousKit
- Reporter name: J. Gareth Moreton
Description:
"OptPass1MOV-Improvement.patch" applies some optimisations to the peephole optimisation function of the same name. Repeated checks of the "GetNextInstruction_p" Boolean variable are factored out, since every optimisation except the first requires a succeeding instruction.
More importantly, the deep 'MovMov2Mov' optimisations have been improved in three ways. They are slightly faster, because they no longer check the immediate next instruction (that is done elsewhere). They can now be performed even if the temporary register remains in use, in which case the first instruction is simply preserved. Finally, the optimiser now checks the second-next instruction under -O1 and -O2 as well, since "GetNextInstructionUsingReg" only looks one instruction ahead unless -O3 is specified; this means the -O3 check that was present in OptPass1MOV can be removed. These improvements were only made possible by the x86-specific implementation of "RegModifiedByInstruction" in #36376 (closed).
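To illustrate the idea, here is a minimal Python sketch of a MovMov2Mov-style rewrite on a toy instruction list. It is not the compiler's Pascal code: the `Instr` type and the helpers `reads`, `writes` and `reg_used_after` are hypothetical stand-ins, operands are limited to plain registers and immediates in AT&T order (source, destination), a register is assumed dead at the end of the list, and the adjacent and second-next cases are handled in one loop rather than being split between functions as the patch describes.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    op: str
    src: str   # AT&T order: source first
    dst: str   # destination second


def is_reg(operand: str) -> bool:
    return operand.startswith("%")


def reads(instr: Instr, reg: str) -> bool:
    # Toy model: MOV reads only its source; any other op also reads its destination.
    return instr.src == reg or (instr.op != "mov" and instr.dst == reg)


def writes(instr: Instr, reg: str) -> bool:
    # Toy model: every instruction writes its destination operand.
    return instr.dst == reg


def reg_used_after(instrs: List[Instr], start: int, reg: str) -> bool:
    """True if reg is read before being overwritten, scanning from index start.

    Assumption: reg is dead once the end of the list is reached.
    """
    for instr in instrs[start:]:
        if reads(instr, reg):
            return True
        if writes(instr, reg):
            return False
    return False


def mov_mov_to_mov(instrs: List[Instr]) -> List[Instr]:
    out = list(instrs)
    i = 0
    while i < len(out):
        removed = False
        first = out[i]
        if first.op == "mov" and is_reg(first.dst) and first.src != first.dst:
            tmp = first.dst
            # Look at the next instruction and the one after it.
            for j in range(i + 1, min(i + 3, len(out))):
                cand = out[j]
                if cand.op == "mov" and cand.src == tmp:
                    # Forward the original source into the second MOV.
                    out[j] = Instr("mov", first.src, cand.dst)
                    read_between = any(reads(out[k], tmp) for k in range(i + 1, j))
                    # Drop the first MOV only if the temporary is no longer needed;
                    # otherwise it is preserved.
                    if not read_between and not reg_used_after(out, j + 1, tmp):
                        del out[i]
                        removed = True
                    break
                if writes(cand, tmp) or writes(cand, first.src):
                    break  # the copied value is no longer valid; stop looking
        if not removed:
            i += 1
    return out
```

In this model, `mov %eax,%ecx; mov %ecx,%edx` collapses to a single `mov %eax,%edx` when `%ecx` is not read afterwards. If `%ecx` is read later, the first MOV is kept and only the second is rewritten to take `%eax` directly, which removes its dependency on the temporary.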
"x86_mov_call.patch" is a subtle performance boost for the compiler: Pass 1 of the peephole optimiser under i386 and x86_64 now checks whether the instruction is a MOV before testing for any other instruction (the MOV check has been removed from the case block and given its own if statement). This is because MOV instructions appear far more frequently than any other instruction: about 25% of all instructions are MOVs of some kind.
Steps to reproduce:
Apply the patches and confirm correct compilation and successful regression testing on the i386 and x86_64 platforms.
Additional information:
The improved optimisations produce smaller and faster code and can sometimes reduce the number of passes required in the peephole optimiser.
Optimiser performance would be greatly improved if register devirtualisation were delayed: the MovMov2Mov optimisation can often remove a register completely, but by then that register has already been reserved, because the registers were devirtualised earlier. The optimisation also cannot be performed if the stack is used for temporary storage (due to a lack of free registers), which would not be a problem while the registers are still virtual and assumed not to be modified by another thread. This would be an area of future research.
Mantis conversion info:
- Mantis ID: 36382
- OS: Microsoft Windows
- OS Build: 10 Professional
- Build: r43611
- Platform: i386 and x86_64
- Version: 3.3.1
- Monitored by: » Vincent (Vincent Snijders)