[Patch] "MOV REG, -1" -> "OR REG, -1" optimisation
Original Reporter info from Mantis: CuriousKit @CuriousKit
- Reporter name: J. Gareth Moreton
Description:
This patch adds an extra optimisation to "PostPeepholeOptMov" in compiler/x86/aoptx86.pas:
If the instruction "MOV REG, -1" (Intel notation) is found, where REG is either a 32- or 64-bit register, it is changed to "OR REG, -1" instead. The result is the same and it executes at the same speed, but the encoding is smaller (for example, 3 bytes instead of 5 for EAX).
For 16-bit registers, only AX is optimised this way because it has its own encoding for OR that takes fewer bytes.
Though it only saves a handful of bytes per occurrence, -1 is a common value for indicating an error or for initialising a for-loop that starts at zero, so the cumulative effect can be quite substantial (about 30 KiB was noted to have been shaved off the binary for the x86_64-win64 build of Lazarus).
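To make the size difference concrete, here is a minimal sketch in Python (not part of the patch) that builds the machine-code bytes for both forms. It assumes the usual shortest encodings (B8+r and REX.W C7 /0 for MOV, 83 /1 with a sign-extended 8-bit immediate for OR) and only handles the eight legacy registers; the function names are made up for illustration.

```python
# Illustrative encoder for two x86 instruction forms. Register numbers follow
# the standard ModRM numbering (EAX/RAX = 0, ECX/RCX = 1, ...). R8-R15 are
# left out to keep the sketch short.

def encode_mov_minus_one(reg: int, bits: int) -> bytes:
    """Encode MOV reg, -1 for a 32- or 64-bit register (reg in 0..7)."""
    if not 0 <= reg <= 7:
        raise ValueError("only the eight legacy registers are handled here")
    if bits == 32:
        # B8+r id: one opcode byte plus a full 32-bit immediate
        return bytes([0xB8 + reg]) + b"\xff\xff\xff\xff"
    if bits == 64:
        # REX.W C7 /0 id: 32-bit immediate, sign-extended to 64 bits
        return bytes([0x48, 0xC7, 0xC0 | reg]) + b"\xff\xff\xff\xff"
    raise ValueError("only 32- and 64-bit registers are handled here")


def encode_or_minus_one(reg: int, bits: int) -> bytes:
    """Encode OR reg, -1 using the sign-extended 8-bit immediate form."""
    if not 0 <= reg <= 7:
        raise ValueError("only the eight legacy registers are handled here")
    # 83 /1 ib: ModRM byte is 11 001 rrr, i.e. 0xC8 | reg
    body = bytes([0x83, 0xC8 | reg, 0xFF])
    if bits == 32:
        return body
    if bits == 64:
        # Same form with a REX.W prefix
        return bytes([0x48]) + body
    raise ValueError("only 32- and 64-bit registers are handled here")


if __name__ == "__main__":
    for bits in (32, 64):
        mov = encode_mov_minus_one(0, bits)
        orr = encode_or_minus_one(0, bits)
        print(bits, mov.hex(), len(mov), orr.hex(), len(orr))
```

Under these assumptions the saving is 2 bytes per occurrence for a 32-bit register and 3 bytes for a 64-bit one.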
Steps to reproduce:
Apply the patch, confirm that the compiler still builds and compiles correctly, and observe the size saving.
Additional information:
- The optimisation is not applied if the FLAGS register is in use at that point, because OR overwrites the flags.
- This particular optimisation has been observed in GCC as well, so it has a proven track record.
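The conditions described in this report can be summarised in a toy model. This is a sketch in Python, not the Pascal code from aoptx86.pas: the `Instr` type, the `REG_BITS` table and the `flags_in_use` parameter are hypothetical stand-ins for the compiler's own instruction records and register-usage tracking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str   # mnemonic, e.g. "mov" or "or"
    reg: str  # destination register name
    imm: int  # immediate operand


# Hypothetical register-size table, covering only a few registers.
REG_BITS = {
    "al": 8, "ax": 16, "cx": 16,
    "eax": 32, "ecx": 32, "edx": 32,
    "rax": 64, "rcx": 64, "rdx": 64,
}


def rewrite_mov_minus_one(instr: Instr, flags_in_use: bool) -> Instr:
    """Return OR reg, -1 in place of MOV reg, -1 when the report's rules allow it."""
    if instr.op != "mov" or instr.imm != -1:
        return instr  # not the pattern this optimisation looks for
    if flags_in_use:
        return instr  # OR would overwrite flags that are still needed
    bits = REG_BITS.get(instr.reg)
    # 32- and 64-bit registers qualify; among 16-bit registers only AX does.
    if bits in (32, 64) or instr.reg == "ax":
        return Instr("or", instr.reg, -1)
    return instr


if __name__ == "__main__":
    print(rewrite_mov_minus_one(Instr("mov", "eax", -1), flags_in_use=False))
    print(rewrite_mov_minus_one(Instr("mov", "eax", -1), flags_in_use=True))
    print(rewrite_mov_minus_one(Instr("mov", "cx", -1), flags_in_use=False))
```

The flags check comes before the size check so that an instruction is never rewritten while a later instruction still depends on the current flag values.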
Mantis conversion info:
- Mantis ID: 36308
- OS: Microsoft Windows
- OS Build: 10 Professional
- Build: 43455
- Platform: i386 and x86_64
- Version: 3.3.1
- Fixed in version: 3.3.1
- Fixed in revision: 43579 (#c6116258)