View Revisions: Issue #38837

Summary 0038837: [Patch] AArch64 Improved speed and efficiency with constant generation
Revision 2021-05-01 03:09 by J. Gareth Moreton
Description After the introduction of "magic division", a new class of constants was revealed: reciprocals of small divisors that are massive 64-bit numbers but can be encoded with an "orr/movk" pair, since often only the first word differs. This patch therefore permits the encoding of constants where copying the 3rd word onto the 1st word produces a value that's valid for AArch64's barrel shifter, and then uses movk to correct the 1st word. For example, instead of encoding $AAAAAAAAAAAAAAAB (reciprocal of 3) as "movz reg,#0xAAAB; movk reg,#0xAAAA,lsl 16; movk reg,#0xAAAA,lsl 32; movk reg,#0xAAAA,lsl 48", it is encoded as "orr reg,xzr,#0xAAAAAAAAAAAAAAAA; movk reg,#0xAAAB". Cycle count is identical (2), but overall code size is 8 bytes smaller.

Additionally, 32-bit numbers are now encoded as a single ORR instruction (as the synthetic MOV instruction for clarity) where possible.
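
The selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patch's actual code: the function names `is_bitmask_imm` and `plan_constant` are hypothetical, and the validity test assumes the standard AArch64 logical-immediate rule (a 2-, 4-, 8-, 16-, 32- or 64-bit element consisting of a rotated run of ones, replicated across the register).

```python
MASK64 = (1 << 64) - 1

def is_bitmask_imm(value: int, width: int = 64) -> bool:
    """True if value can be an immediate operand of ORR (hypothetical helper)."""
    mask = (1 << width) - 1
    value &= mask
    if value == 0 or value == mask:
        return False  # all-zeros and all-ones have no encoding
    # Shrink to the smallest element whose repetition reproduces the value.
    size = width
    while size > 2:
        half = size // 2
        hmask = (1 << half) - 1
        if (value & hmask) != ((value >> half) & hmask):
            break
        size = half
    elem = value & ((1 << size) - 1)
    # A rotated run of ones has exactly two bit transitions when the
    # element is viewed as a ring.
    rotated = (elem >> 1) | ((elem & 1) << (size - 1))
    return bin(elem ^ rotated).count("1") == 2

def plan_constant(value: int):
    """Return an instruction list, or None to use the movz/movk sequence."""
    value &= MASK64
    if is_bitmask_imm(value):
        return [f"orr reg,xzr,#0x{value:016X}"]
    word0 = value & 0xFFFF          # 1st 16-bit word (bits 0..15)
    word2 = (value >> 32) & 0xFFFF  # 3rd 16-bit word (bits 32..47)
    # Copy the 3rd word onto the 1st and test the result.
    candidate = (value & ~0xFFFF & MASK64) | word2
    if is_bitmask_imm(candidate):
        return [f"orr reg,xzr,#0x{candidate:016X}",
                f"movk reg,#0x{word0:04X}"]
    return None

print(plan_constant(0xAAAAAAAAAAAAAAAB))
```

For $AAAAAAAAAAAAAAAB this yields the same "orr/movk" pair quoted in the description. Calling `is_bitmask_imm(c, 32)` gives the corresponding check for the 32-bit case, where a single ORR suffices.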
Revision 2021-05-01 03:08 by J. Gareth Moreton
Description After the introduction of "magic division", a new class of constants was revealed: reciprocals of small divisors that are massive 64-bit numbers but can be encoded with an "orr/movk" pair, since often only the first word differs. This patch therefore permits the encoding of constants where copying the 3rd word onto the 1st word produces a constant that's valid for AArch64's barrel shifter, and then uses movk to correct the 1st word. e.g. instead of encoding AAAAAAAAAAAAAAAB (reciprocal of 3) as "movz reg,#0xAAAB; movk reg,#0xAAAA,lsl 16; movk reg,#0xAAAA,lsl 32; movk reg,#0xAAAA,lsl 48", it is encoded as "orr reg,xzr,#0xAAAAAAAAAAAAAAAA; movk reg,#0xAAAB". Cycle count is identical (2), but overall code size is 8 bytes smaller.

Additionally, 32-bit numbers are now encoded as a single ORR instruction (as the synthetic MOV instruction) where possible.
Revision 2021-05-01 03:07 by J. Gareth Moreton
Description After the introduction of "magic division", a new class of constants was revealed: reciprocals of small divisors that are massive 64-bit numbers but can be encoded with an "orr/movk" pair, since often only the first word differs. This patch therefore permits the encoding of constants where copying the 3rd word onto the 1st word produces a constant that's valid for AArch64's barrel shifter, and then uses movk to correct the 1st word. e.g. instead of encoding AAAAAAAAAAAAAAAB (reciprocal of 3) as "movz reg,#0xAAAB; movk reg,#0xAAAA,lsl 16; movk reg,#0xAAAA,lsl 32; movk reg,#0xAAAA,lsl 48", it is encoded as "orr reg,xzr,#0xAAAAAAAAAAAAAAAA; movk reg,#0xAAAB".

Additionally, 32-bit numbers are now encoded as a single ORR instruction (as the synthetic MOV instruction) where possible.
Revision 2021-05-01 03:06 by J. Gareth Moreton
Description After the introduction of "magic division", a new class of constants was revealed: reciprocals of small divisors that are massive 64-bit numbers but can be encoded with an "orr/movk" pair, since often only the first word differs. This patch therefore permits the encoding of constants where copying the 3rd word onto the 1st word produces a constant that's valid for AArch64's barrel shifter, and then uses movk to correct the 1st word. e.g. instead of encoding AAAAAAAAAAAAAAAB (reciprocal of 3) as "movz reg,#0xAAAB; movk reg,#0xAAAA,lsl #16; movk reg,#0xAAAA,lsl #32; movk reg,#0xAAAA,lsl #48", it is encoded as "orr reg,xzr,#0xAAAAAAAAAAAAAAAA; movk reg,#0xAAAB".

Additionally, 32-bit numbers are now encoded as a single ORR instruction (as the synthetic MOV instruction) where possible.