View Revisions: Issue #36271

Summary 0036271: [Patch] Jump optimisations in code generator
Revision 2019-11-06 15:12 by J. Gareth Moreton
Additional Information Though not intended, there seems to be a major improvement in compile time under x86_64-win64. This is possibly because label stripping and other optimisations remove entries from the linked list of instructions, thereby reducing the time it takes for subsequent passes to analyse a procedure.

Time to compile (not link) Lazarus under trunk:

[72.883] 1304078 lines compiled, 72.9 sec
[74.453] 1304078 lines compiled, 74.5 sec
[82.164] 1304078 lines compiled, 82.2 sec

Time to compile (not link) Lazarus with jump optimisations:

[66.648] 1304078 lines compiled, 66.6 sec
[65.609] 1304078 lines compiled, 65.6 sec
[64.695] 1304078 lines compiled, 64.7 sec
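
As a rough check on the figures above, the following sketch averages the three runs for each build and computes the relative reduction. It only uses the six timings quoted here; the variable names are illustrative, and with three samples per build (and a visible outlier in the trunk runs) the result should be read as an approximation rather than a benchmark. It gives about 76.5 sec for trunk against about 65.7 sec with jump optimisations, a reduction of roughly 14%.

```python
from statistics import mean

# Compile times in seconds, copied from the runs quoted above.
trunk = [72.883, 74.453, 82.164]
jump_opt = [66.648, 65.609, 64.695]

trunk_mean = mean(trunk)
jump_mean = mean(jump_opt)

# Relative reduction in mean compile time, as a percentage of the trunk mean.
reduction_pct = 100.0 * (trunk_mean - jump_mean) / trunk_mean

print(f"trunk mean:              {trunk_mean:.1f} sec")
print(f"jump optimisations mean: {jump_mean:.1f} sec")
print(f"reduction:               {reduction_pct:.1f} %")
```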

----

The compiled and linked Lazarus binary is smaller as well, due to the additional stripping of unnecessary alignment hints and to newly found optimisations:

20,110,336 (trunk)
20,092,416 (jump optimisations)
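
To put the two sizes in proportion, this small sketch computes the difference and the relative saving. The unit is not stated above (bytes is an assumption), so the calculation is unit-agnostic; the names are illustrative only. The difference is 17,920, or a little under 0.09% of the trunk binary.

```python
# Binary sizes as quoted above (unit not stated in the report; presumably bytes).
trunk_size = 20_110_336
jump_opt_size = 20_092_416

saved = trunk_size - jump_opt_size
saved_pct = 100.0 * saved / trunk_size

print(f"saved: {saved:,} ({saved_pct:.3f} % of trunk)")
```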

----

The compilation of Lazarus source file "lazarus\components\codetools\basiccodetools.pas" shows a wide range of improvements and is a good showcase for the many jump optimisations - for example:

Trunk:

...
.Lj2729:
    movslq %r8d,%r9
    subq %r9,%rdx
    leaq 1(%rdx),%r9
    cmpl %r9d,%r11d
    jge .Lj2732
    .p2align 4,,10
    .p2align 3
    movl %r11d,%r9d
.Lj2732:
# Peephole Optimization: MovTestJxx2MovTestJxx done
    movq %rcx,%rdx
    testq %rcx,%rcx
...

Jump optimisations:

.Lj2729:
    movslq %r8d,%r9
    subq %r9,%rdx
    leaq 1(%rdx),%r9
    cmpl %r9d,%r11d
    cmovngel %r11d,%r9d
# Peephole Optimization: MovTestJxx2MovTestJxx done
    movq %rcx,%rdx
    testq %rcx,%rcx
...

(Note that label .Lj2732 has been removed completely)
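
The two listings should leave the same value in %r9d: in AT&T syntax "cmpl %r9d,%r11d" compares %r11d against %r9d, so the trunk code skips the move when %r11d >= %r9d, while the new code moves only when that condition does not hold. The sketch below models both sequences to illustrate the equivalence. It is not compiler code; the function names are hypothetical, and plain Python integers stand in for signed 32-bit register values.

```python
def trunk_sequence(r9: int, r11: int) -> int:
    """Model of: cmpl %r9d,%r11d / jge .Lj2732 / movl %r11d,%r9d / .Lj2732:"""
    if r11 >= r9:
        # jge taken: the movl is skipped, %r9d keeps its value.
        return r9
    # Fall through: movl %r11d,%r9d
    r9 = r11
    return r9


def cmov_sequence(r9: int, r11: int) -> int:
    """Model of: cmpl %r9d,%r11d / cmovngel %r11d,%r9d"""
    # cmovnge moves only when "greater or equal" does NOT hold (signed compare).
    return r11 if not (r11 >= r9) else r9


# Exhaustive check over a small signed range: both forms compute min(r9, r11).
values = range(-8, 9)
assert all(
    trunk_sequence(a, b) == cmov_sequence(a, b) == min(a, b)
    for a in values
    for b in values
)
```

Both forms reduce to a signed minimum of the two registers, which is why the conditional jump, the alignment hints and the label can all be dropped.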
Revision 2019-11-06 15:01 by J. Gareth Moreton
Additional Information Though not intended, there seems to be a major improvement in compile time under x86_64-win64. This is possibly because label stripping and other optimisations remove entries from the linked list of instructions, thereby reducing the time it takes for subsequent passes to analyse a procedure.

Time to compile (not link) Lazarus under trunk:

[72.883] 1304078 lines compiled, 72.9 sec
[74.453] 1304078 lines compiled, 74.5 sec
[82.164] 1304078 lines compiled, 82.2 sec

Time to compile (not link) Lazarus with jump optimisations:

[66.648] 1304078 lines compiled, 66.6 sec
[65.609] 1304078 lines compiled, 65.6 sec
[64.695] 1304078 lines compiled, 64.7 sec

----

The compiled and linked Lazarus binary is smaller as well, due to the additional stripping of unnecessary alignment hints and to newly found optimisations:

20,110,336 (trunk)
20,092,416 (jump optimisations)

----

The compilation of Lazarus source file "lazarus\components\codetools" shows a wide range of improvements and is a good showcase for the many jump optimisations - for example:

Trunk:

...
.Lj2729:
    movslq %r8d,%r9
    subq %r9,%rdx
    leaq 1(%rdx),%r9
    cmpl %r9d,%r11d
    jge .Lj2732
    .p2align 4,,10
    .p2align 3
    movl %r11d,%r9d
.Lj2732:
# Peephole Optimization: MovTestJxx2MovTestJxx done
    movq %rcx,%rdx
    testq %rcx,%rcx
...

Jump optimisations:

.Lj2729:
    movslq %r8d,%r9
    subq %r9,%rdx
    leaq 1(%rdx),%r9
    cmpl %r9d,%r11d
    cmovngel %r11d,%r9d
# Peephole Optimization: MovTestJxx2MovTestJxx done
    movq %rcx,%rdx
    testq %rcx,%rcx
...

(Note that label .Lj2732 has been removed completely)