View Issue Details

IDProjectCategoryView StatusLast Update
0034653FPCCompilerpublic2019-01-20 13:56
ReporterfrolAssigned ToFlorian 
PriorityhighSeveritymajorReproducibilityalways
Status resolvedResolutionfixed 
PlatformLinux x86_64OSDebianOS Version9.6
Product Version3.0.4Product Build 
Target Version3.3.1Fixed in Version3.3.1 
Summary0034653: Compiler misoptimizes the logical expression
DescriptionThe following code outputs "YES YES" when compiled with `-O0`, `-O1`, `-O3`, `-O4`, but outputs "NO YES" with `-O2`.

Notes:

* the `if` expressions are completely the same, so there is no reason for the output to be different!
* the first `WriteLn;` is essential to reproduce the bug, but it can be replaced with some other computations.
* the issue is reproducible on latest Debian, Ubuntu, CentOS, AlpineLinux x86_64 with FPC installed from the official repositories and the prebuilt-binaries from FreePascal.org
* the issue is NOT reproducible on Arch Linux
* the issue affects FPC 3.0.0+ and is not reproducible on FPC 2.6.4
Steps To Reproduce1. Compile the following code with `fpc -O2 qq.pas` and run the program:

var d: LongInt;
begin
    WriteLn;

    d := 828;
    if ((d mod 2) = 0) xor ((d < 0) and ((d mod 3) = 0)) then
        WriteLn('YES')
    else
        WriteLn('NO');

    if ((d mod 2) = 0) xor ((d < 0) and ((d mod 3) = 0)) then
        WriteLn('YES')
    else
        WriteLn('NO');
end.

2. The expected output is "YES YES", but the output is "NO YES"
TagsNo tags attached.
Fixed in Revision40934
FPCOldBugId
FPCTarget
Attached Files
  • optcse.pas.patch (941 bytes)
    Index: compiler/optcse.pas
    ===================================================================
    --- compiler/optcse.pas	(revision 40515)
    +++ compiler/optcse.pas	(working copy)
    @@ -53,6 +53,7 @@
           pass_1,
           symconst,symdef,symsym,
           defutil,
    +      cgbase,
           optbase;
     
         const
    @@ -74,6 +75,11 @@
               ) or
               ((n.nodetype=callparan) and not(assigned(tcallparanode(n).right))) or
               ((n.nodetype=loadn) and
    +{$if defined(cpu64bitalu)}          
    +            (tabstractnormalvarsym(tloadnode(n).symtableentry).localloc.size in [OS_64,OS_S64]) and
    +{$elseif defined(cpu32bitalu)}
    +            (tabstractnormalvarsym(tloadnode(n).symtableentry).localloc.size in [OS_32,OS_S32]) and
    +{$endif}
                 not((tloadnode(n).symtableentry.typ in [staticvarsym,localvarsym,paravarsym]) and
                     (vo_volatile in tabstractvarsym(tloadnode(n).symtableentry).varoptions))
               ) then
    
    optcse.pas.patch (941 bytes)
  • optcse.pas-clarify.patch (795 bytes)
    Index: compiler/optcse.pas
    ===================================================================
    --- compiler/optcse.pas	(revision 40521)
    +++ compiler/optcse.pas	(working copy)
    @@ -53,6 +53,7 @@
           pass_1,
           symconst,symdef,symsym,
           defutil,
    +      cgbase,
           optbase;
     
         const
    @@ -74,6 +75,7 @@
               ) or
               ((n.nodetype=callparan) and not(assigned(tcallparanode(n).right))) or
               ((n.nodetype=loadn) and
    +            (tcgsize2size[tabstractnormalvarsym(tloadnode(n).symtableentry).localloc.size] = tcgsize2size[def_cgsize(n.resultdef)]) and
                 not((tloadnode(n).symtableentry.typ in [staticvarsym,localvarsym,paravarsym]) and
                     (vo_volatile in tabstractvarsym(tloadnode(n).symtableentry).varoptions))
               ) then
    

Activities

Marģers

2018-12-05 20:11

reporter   ~0112372

Last edited: 2018-12-05 20:12

View 2 revisions

can confirm on x86_64 linux fpc 3.3.1

# [7] d := 828;
    movl $828,%ebx
.Ll5:
# [10] if ((d mod 2) = 0) xor ((d < 0) and ((d mod 3) = 0)) then
    testl %ebx,%ebx
    jnl .Lj4 ----- if taken
    movslq %ebx,%rcx
    movq %rcx,%rax
    cqto
    movl $3,%esi
    idivq %rsi
    testq %rdx,%rdx
    jne .Lj4
    movb $1,%sil
    jmp .Lj6
.Lj4:
    xorb %sil,%sil
.Lj6:
    movq %rcx,%rax ---- then rcx undefined value
    cqto
    movl $2,%ecx
    idivq %rcx
    testq %rdx,%rdx
    seteb %al
    xorb %sil,%al
.Ll6:
    je .Lj8
.Ll7: writeln('yes');
.Lj8: writeln('no');

on second if instruction order the same, but registers reordered and instead of rcx take rbx so result then is correct.

J. Gareth Moreton

2018-12-05 20:56

developer   ~0112374

Let me see what I can discover.

J. Gareth Moreton

2018-12-05 21:27

developer   ~0112375

I admit I don't quite know how to fix this one, but I can confirm that the bug is NOT in the peephole optimiser, because the assembly before it reaches pass 1 still contains the conditional jump and the leaving of %ecx undefined if the branch is taken. The misoptimisation occurs earlier.

frol

2018-12-05 22:52

reporter   ~0112387

There is a slimmer version of the code:

var a: integer;
begin
  WriteLn;
  a := 2;
  WriteLn( (a mod 2 = 0) xor (a<0) and (a mod 5=0) );
  WriteLn( (a mod 2 = 0) xor (a<0) and (a mod 5=0) );
end.

The output is `FALSE TRUE` with `-O2`, `-O3`, and `-O4` even though the expressions are exactly the same. The program compiled without optimization or with `-O1` produces the `TRUE TRUE` output.

J. Gareth Moreton

2018-12-06 03:12

developer   ~0112392

Last edited: 2018-12-06 03:13

View 2 revisions

This is a difficult one to write an automated test for, because it seems that adding anything else other than WriteLn to the if blocks or storing the conditions directly into a Boolean variable causes the error to disappear.

It looks like it's related to how the compiler handles WriteLn.

Jeppe Johansen

2018-12-06 12:04

developer   ~0112394

Could it be related to CSE, and short circuit evaluation still not being completely solid? There were some issues with that a few months ago.

Do-wan Kim

2018-12-06 13:14

reporter   ~0112397

It seems to works ok with Win64.

Benito van der Zander

2018-12-06 17:17

reporter   ~0112402

Last edited: 2018-12-06 18:09

View 2 revisions

I use r37480, Free Pascal Compiler version 3.1.1 [2018/02/12] for x86_64 on linux and it appears to work

rd0x

2018-12-06 19:48

reporter   ~0112410

Both example codes are working fine with Free Pascal Compiler version 3.2.0-beta-r40301 [2018/12/01] for x86_64 on Windows 10 Professional (64bit).

J. Gareth Moreton

2018-12-07 00:32

developer   ~0112416

frol, can you confirm that the misoptimisation occurs or not on the latest trunk version?

frol

2018-12-07 21:54

reporter   ~0112440

I confirm (just like Marģers) that the misoptimization occurs on the latest trunk version (3.3.1 Revision 40495).

The interesting part is that now I have the same FPC sources compiled on ArchLinux, CentOS Linux 7.5.1804, and Ubuntu 18.04, and the bug does not depend on the platform where the compiler is built, that is if I build the provided snippets on Arch Linux with FPC which I built on Arch Linux, then there is no bug (just like before), but if I build the same source file with the same compiler (built on Arch Linux) on CentOS/Ubuntu, I observe the bug!

I run the experiments inside Docker containers on ArchLinux host, so the kernel and the hardware are exactly the same and the only difference is the distribution. I can prepare a Dockerfile if necessary.

Just for my own reference, here is the compilation command I use:
/tmp/fpc/compiler/ppc3 -O2 -Fu/tmp/fpc/rtl/linux -Fi/tmp/fpc/rtl/unix -Fi/tmp/fpc/rtl/inc -Fi/tmp/fpc/rtl/linux -Fi/tmp/fpc/rtl/x86_64 -Fi/tmp/fpc/rtl/linux/x86_64 -Fu/tmp/fpc/rtl/units/x86_64-linux /tmp/qq2.pas -o/tmp/qq2-centos-fpc331

Do-wan Kim

2018-12-08 04:18

reporter   ~0112442

Last edited: 2018-12-08 05:11

View 3 revisions

Under Win64 debugging, it runs ok but RAX Register has invalid value. It is lucky.
Win64 still have problem.

I guess label position is changed after new optimization code insertion.

from
-------------------------------------------------------------------
# [12] d := 828;
    movl $828,%ebx
.Ll5:
# [13] if ((d mod 2) = 0) xor ((d < 0) and ((d mod 3) = 0)) then
    testl %ebx,%ebx
    jnl .Lj4
    movslq %ebx,%rcx
    movq %rcx,%rax
    cqto

to
-------------------------------------------------------------------
# [12] d := 828;
    movl $828,%ebx
.Ll5:
# [13] if ((d mod 2) = 0) xor ((d < 0) and ((d mod 3) = 0)) then
    testl %ebx,%ebx
    movslq %ebx,%rcx
    jnl .Lj4
    movq %rcx,%rax
    cqto

J. Gareth Moreton

2018-12-08 05:56

developer   ~0112443

New optimization code insertion?

Either way it's still a little bit suboptimal since "movslq %ebx,%rcx; movq %rcx,%rax" can be changed into just "movslq %ebx,%rax".

I presume %rax doesn't get written to if the code jumps to .Lj4, but still gets read from.

Do-wan Kim

2018-12-08 08:35

reporter   ~0112444

Last edited: 2018-12-10 03:11

View 13 revisions

It's CSE optimization bug.

upload temporary fix but it is not perfect.
I don't know how to get loadnode destination size.

Do-wan Kim

2018-12-10 03:07

reporter  

optcse.pas.patch (941 bytes)
Index: compiler/optcse.pas
===================================================================
--- compiler/optcse.pas	(revision 40515)
+++ compiler/optcse.pas	(working copy)
@@ -53,6 +53,7 @@
       pass_1,
       symconst,symdef,symsym,
       defutil,
+      cgbase,
       optbase;
 
     const
@@ -74,6 +75,11 @@
           ) or
           ((n.nodetype=callparan) and not(assigned(tcallparanode(n).right))) or
           ((n.nodetype=loadn) and
+{$if defined(cpu64bitalu)}          
+            (tabstractnormalvarsym(tloadnode(n).symtableentry).localloc.size in [OS_64,OS_S64]) and
+{$elseif defined(cpu32bitalu)}
+            (tabstractnormalvarsym(tloadnode(n).symtableentry).localloc.size in [OS_32,OS_S32]) and
+{$endif}
             not((tloadnode(n).symtableentry.typ in [staticvarsym,localvarsym,paravarsym]) and
                 (vo_volatile in tabstractvarsym(tloadnode(n).symtableentry).varoptions))
           ) then
optcse.pas.patch (941 bytes)

frol

2018-12-10 11:55

reporter   ~0112476

Do-wan Kim's patch fixes the issue!

J. Gareth Moreton

2018-12-10 12:52

developer   ~0112478

Referred to Florian for approval.

Do-wan Kim

2018-12-11 10:40

reporter  

optcse.pas-clarify.patch (795 bytes)
Index: compiler/optcse.pas
===================================================================
--- compiler/optcse.pas	(revision 40521)
+++ compiler/optcse.pas	(working copy)
@@ -53,6 +53,7 @@
       pass_1,
       symconst,symdef,symsym,
       defutil,
+      cgbase,
       optbase;
 
     const
@@ -74,6 +75,7 @@
           ) or
           ((n.nodetype=callparan) and not(assigned(tcallparanode(n).right))) or
           ((n.nodetype=loadn) and
+            (tcgsize2size[tabstractnormalvarsym(tloadnode(n).symtableentry).localloc.size] = tcgsize2size[def_cgsize(n.resultdef)]) and
             not((tloadnode(n).symtableentry.typ in [staticvarsym,localvarsym,paravarsym]) and
                 (vo_volatile in tabstractvarsym(tloadnode(n).symtableentry).varoptions))
           ) then

Do-wan Kim

2018-12-11 10:42

reporter   ~0112487

update patch. I find way to compare with result size.

frol

2019-01-02 14:02

reporter   ~0113095

Is there is something I can do to help to resolve this issue sooner rather later?

Florian

2019-01-20 13:56

administrator   ~0113523

The provided patch hide the problem, but didn't fix it. The real problem was very different.

Issue History

Date Modified Username Field Change
2018-12-05 17:51 frol New Issue
2018-12-05 20:11 Marģers Note Added: 0112372
2018-12-05 20:12 Marģers Note Edited: 0112372 View Revisions
2018-12-05 20:56 J. Gareth Moreton Assigned To => J. Gareth Moreton
2018-12-05 20:56 J. Gareth Moreton Status new => assigned
2018-12-05 20:56 J. Gareth Moreton Note Added: 0112374
2018-12-05 21:26 J. Gareth Moreton Assigned To J. Gareth Moreton =>
2018-12-05 21:27 J. Gareth Moreton Note Added: 0112375
2018-12-05 21:27 J. Gareth Moreton Assigned To => J. Gareth Moreton
2018-12-05 21:27 J. Gareth Moreton Status assigned => confirmed
2018-12-05 21:28 J. Gareth Moreton Assigned To J. Gareth Moreton =>
2018-12-05 21:28 J. Gareth Moreton Priority normal => high
2018-12-05 21:28 J. Gareth Moreton Severity minor => major
2018-12-05 22:52 frol Note Added: 0112387
2018-12-06 03:12 J. Gareth Moreton Note Added: 0112392
2018-12-06 03:13 J. Gareth Moreton Note Edited: 0112392 View Revisions
2018-12-06 12:04 Jeppe Johansen Note Added: 0112394
2018-12-06 13:14 Do-wan Kim Note Added: 0112397
2018-12-06 17:17 Benito van der Zander Note Added: 0112402
2018-12-06 18:09 Benito van der Zander Note Edited: 0112402 View Revisions
2018-12-06 19:48 rd0x Note Added: 0112410
2018-12-07 00:32 J. Gareth Moreton Note Added: 0112416
2018-12-07 00:32 J. Gareth Moreton Assigned To => J. Gareth Moreton
2018-12-07 00:32 J. Gareth Moreton Status confirmed => feedback
2018-12-07 00:32 J. Gareth Moreton Target Version => 3.3.1
2018-12-07 00:33 J. Gareth Moreton Assigned To J. Gareth Moreton =>
2018-12-07 21:54 frol Note Added: 0112440
2018-12-07 21:54 frol Status feedback => new
2018-12-07 22:16 J. Gareth Moreton Status new => confirmed
2018-12-08 04:18 Do-wan Kim Note Added: 0112442
2018-12-08 05:09 Do-wan Kim Note Edited: 0112442 View Revisions
2018-12-08 05:11 Do-wan Kim Note Edited: 0112442 View Revisions
2018-12-08 05:56 J. Gareth Moreton Note Added: 0112443
2018-12-08 08:35 Do-wan Kim Note Added: 0112444
2018-12-08 08:46 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 08:47 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 09:05 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 09:06 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 09:11 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 09:16 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 11:44 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 14:40 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-08 14:49 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-09 02:17 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-09 02:23 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-10 03:07 Do-wan Kim File Added: optcse.pas.patch
2018-12-10 03:11 Do-wan Kim Note Edited: 0112444 View Revisions
2018-12-10 11:55 frol Note Added: 0112476
2018-12-10 12:52 J. Gareth Moreton Note Added: 0112478
2018-12-10 12:52 J. Gareth Moreton Assigned To => Florian
2018-12-10 12:52 J. Gareth Moreton Status confirmed => feedback
2018-12-11 10:40 Do-wan Kim File Added: optcse.pas-clarify.patch
2018-12-11 10:42 Do-wan Kim Note Added: 0112487
2019-01-02 14:02 frol Note Added: 0113095
2019-01-02 14:02 frol Status feedback => assigned
2019-01-20 13:56 Florian Fixed in Revision => 40934
2019-01-20 13:56 Florian Note Added: 0113523
2019-01-20 13:56 Florian Status assigned => resolved
2019-01-20 13:56 Florian Fixed in Version => 3.3.1
2019-01-20 13:56 Florian Resolution open => fixed