[Patch] AArch64 "magic division" (replace division by constant with multiplication)

Original Reporter info from Mantis: CuriousKit @CuriousKit

Reporter name: J. Gareth Moreton

Description:

This patch implements the compile-time optimisation that turns division by a constant into multiplication by a reciprocal, providing a speed boost in a wide range of applications.
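
For readers unfamiliar with the technique, here is a minimal sketch of unsigned division by a constant, written in Python rather than taken from the patch. The 32-bit width, the function names and the choice of multiplier (the ceiling of 2^(N+l)/d, one common textbook form) are assumptions for illustration; the compiler's own magic number generator may choose differently.

```python
N = 32                 # assumed operand width in bits (illustrative only)
MASK = (1 << N) - 1    # largest unsigned N-bit value


def magic_unsigned(d):
    """Return (multiplier, shift) so that n // d == (n * multiplier) >> shift
    for every unsigned N-bit n. Hypothetical helper, not the patch's code."""
    if not 1 <= d <= MASK:
        raise ValueError("divisor must be a non-zero unsigned N-bit value")
    l = (d - 1).bit_length()        # ceil(log2(d))
    shift = N + l
    m = -(-(1 << shift) // d)       # ceil(2**shift / d)
    return m, shift


def udiv_by_const(n, d):
    """Divide n by the constant d using one multiplication and one shift."""
    m, shift = magic_unsigned(d)
    return (n * m) >> shift
```

With this particular choice the multiplier can need N+1 bits (for d = 7 it is larger than 2^32), so a fixed-width implementation has to account for the extra top bit separately. Python's unbounded integers hide that detail here.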

Steps to reproduce:

Apply patch and confirm correct compilation and no regressions in test suite.

Additional information:

Optimisations are generally not applied if -Os is specified. The patch reuses the existing magic number generation code where possible.

A new bench test, which doubles as a regression test, is included in a separate patch. It was developed to measure the speed gains and to catch a coding error that other tests were not cleanly detecting (I was adding a carry bit in the wrong place due to an internal overflow).

NOTE: Signed mod operations have not yet been optimised because they sometimes returned incorrect results. Signed and unsigned division, and unsigned mod, have been optimised.
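
As a rough illustration of the unsigned mod case, the remainder can be recovered from the multiplied quotient with one multiply-subtract, r = n - q * d. The sketch below is Python, not the patch's code; the 32-bit width, the function name and the multiplier formula are assumptions. It does not attempt the signed case.

```python
N = 32  # assumed operand width in bits (illustrative only)


def umod_by_const(n, d):
    """Unsigned n mod d for a constant d, without a division of n.
    Hypothetical helper, valid for 0 <= n < 2**N and 1 <= d < 2**N."""
    l = (d - 1).bit_length()        # ceil(log2(d))
    shift = N + l
    m = -(-(1 << shift) // d)       # reciprocal multiplier, computed once per constant
    q = (n * m) >> shift            # quotient via multiplication by the reciprocal
    return n - q * d                # remainder via multiply-subtract
```

The only division left is the one that computes the multiplier, which depends on the constant alone and so can be done at compile time.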

Mantis conversion info:

Mantis ID: 38806
OS: Debian GNU/Linux (Raspberry Pi)
OS Build: 10 (buster)
Build: r49247
Platform: aarch64-linux
Version: 3.3.1
Fixed in version: 3.3.1
Fixed in revision: 49290 (#256ca9d2), 49291 (#dc13516d)
