View Issue Details

ID: 0037785
Project: FPC
Category: Compiler
View Status: public
Last Update: 2020-10-07 23:59
Reporter: Klaus1
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Product Version: 3.2.0
Summary: 0037785: error in inline assembler in intel syntax
Description: See the uploaded file error_in_asm.txt.
I think it would be a good idea to correct this in the next FPC release.
Regards, Klaus
Tags: assembler, compiler, i386, x86, x86_64
Fixed in Revision:
FPCOldBugId:
FPCTarget: -
Attached Files

Activities

Klaus1

2020-09-21 12:13

reporter  

error_in_asm.txt (1,278 bytes)   
The following errors occur in the inline assembler of FPC 3.2.0:

# [6771] vcvttpd2dq xmm0,ymmword ptr [rsp];     // convert double to Longint
	vcvttpd2dqx	(%rsp),%xmm0                      // gives wrong result

# [9298] vcvtpd2dq xmm0,ymmword ptr [rdx];      // ditto
	vcvtpd2dqx	(%rdx),%xmm0

# [9312] vcvtpd2ps xmm0,ymmword ptr [rdx];      // ditto
	vcvtpd2psx	(%rdx),%xmm0
  
# [9467] vcvtpd2dq xmm0,ymmword ptr [rcx+r10];
	vcvtpd2dqx	(%rcx,%r10,1),%xmm0

Correct encodings from the NASM assembler listing:
39 000000AF C4A17FE60411              vcvtpd2dq xmm0,yword [rcx+r10];  
40 000000B5 C4A17D5A0411              vcvtpd2ps xmm0,yword [rcx+r10];
42 000000C1 C4A17DE60411              vcvttpd2dq xmm0,yword [rcx+r10];

vbroadcastsd ymm1,qword ptr [rip+Potenz10_D];  
ymmfloat.pas(9026,3) Error: Asm: [vbroadcastsd reg??,mem256] invalid combination of opcode and operands

vmovsd xmm0,qword ptr[rip+Potenz10_D];
ymmfloat.pas(9024,3) Error: Asm: [vmovsd reg??,mem128] invalid combination of opcode and operands

vorpd ymm1,ymm2,ymm3; 
ymmfloat.pas(9024,3) Error: Asm: [vorpd reg??,reg??,reg??] invalid combination of opcode and operands

vmovsd  xmm0,qword ptr [rcx];
ymmfloat.pas(8919,3) Error: Asm: [vmovsd reg??,mem128] invalid combination of opcode and operands

error_in_asm.txt (1,278 bytes)   
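
The three NASM listing lines in the attachment can be checked by hand: each starts with a three-byte VEX prefix (C4 ...), and the L bit in that prefix selects the 128-bit or 256-bit form. The following is a minimal Python sketch, not part of FPC or NASM, that splits the prefix into its fields using the layout from the Intel manual. The function and dictionary names are made up for illustration.

```python
def decode_vex3(prefix: bytes) -> dict:
    """Split a three-byte VEX prefix (C4 xx yy) into its bit fields."""
    if len(prefix) != 3 or prefix[0] != 0xC4:
        raise ValueError("expected a three-byte VEX prefix starting with C4")
    b1, b2 = prefix[1], prefix[2]
    return {
        "R": ((b1 >> 7) & 1) ^ 1,         # stored inverted in the prefix
        "X": ((b1 >> 6) & 1) ^ 1,         # stored inverted; extends the index register
        "B": ((b1 >> 5) & 1) ^ 1,         # stored inverted
        "map": b1 & 0x1F,                 # opcode map selector (1 = 0F)
        "W": (b2 >> 7) & 1,
        "vvvv": ((b2 >> 3) & 0xF) ^ 0xF,  # stored inverted
        "L": (b2 >> 2) & 1,               # 0 = 128-bit, 1 = 256-bit
        "pp": b2 & 3,                     # 1 = 66, 2 = F3, 3 = F2
    }


# Byte strings copied from the NASM listing in error_in_asm.txt.
NASM_LISTING = {
    "vcvtpd2dq": bytes.fromhex("C4A17FE60411"),
    "vcvtpd2ps": bytes.fromhex("C4A17D5A0411"),
    "vcvttpd2dq": bytes.fromhex("C4A17DE60411"),
}


def vector_length_bits(encoding: bytes) -> int:
    """Return 256 if VEX.L is set in the encoding, otherwise 128."""
    return 256 if decode_vex3(encoding[:3])["L"] else 128


if __name__ == "__main__":
    for name, enc in NASM_LISTING.items():
        print(name, vector_length_bits(enc), decode_vex3(enc[:3]))
```

All three encodings have L set, which matches the yword memory operand in the listing; the X bit is set as well because r10 is used as the index register.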

Florian

2020-09-23 21:49

administrator   ~0125794

Can you please post small self-contained code examples? Is this one problem or three different ones? Do you have the possibility to test with trunk?

Klaus1

2020-09-29 15:18

reporter   ~0125956

Hello Florian,
this problem only affects the VEX instructions. Attached is my development work for YMM in assembler for 64-bit systems. It is a console program. Search ymm.pas for "DB": those are the places with the errors, where I had to enter the encoding as define-byte directives. The problems are vbroadcastsd ymm,... in Intel mode and the conversion functions.
Converting from ymm to xmm is possible here; see the Intel documentation. Development is done with the official FPC 3.2 release. For this work I cannot use trunk...
Regards, Klaus Stöhr
ymm.7z (59,943 bytes)

J. Gareth Moreton

2020-09-30 14:20

developer   ~0125987

Last edited: 2020-09-30 14:32

Just tested compilation of the instructions under FPC 3.2.0:

    vcvttpd2dq xmm0,ymmword ptr [rsp]
    vcvtpd2dq xmm0,ymmword ptr [rdx]
    vcvtpd2ps xmm0,ymmword ptr [rdx]
    vcvtpd2dq xmm0,ymmword ptr [rcx+r10];
    vbroadcastsd ymm1,qword ptr [rip+Potenz10_D]
    vmovsd xmm0,qword ptr[rip+Potenz10_D]
    vorpd ymm1,ymm2,ymm3
    vmovsd xmm0,qword ptr [rcx]

(With Potenz10_D being a QWord-type variable)

The first four are correct - they use the 256-bit instructions:

vcvttpd2dqy (%rsp),%xmm0
vcvtpd2dqy (%rdx),%xmm0
vcvtpd2psy (%rdx),%xmm0
vcvtpd2dqy (%rcx,%r10,1),%xmm0

 The last four raise compiler errors:

Error: Asm: [vbroadcastsd reg??,mem256] invalid combination of opcode and operands
Error: Asm: [vmovsd reg??,mem128] invalid combination of opcode and operands
Error: Asm: [vorpd reg??,reg??,reg??] invalid combination of opcode and operands
Error: Asm: [vmovsd reg??,mem128] invalid combination of opcode and operands

Will now move onto testing with the trunk.

J. Gareth Moreton

2020-09-30 14:30

developer   ~0125988

Last edited: 2020-09-30 14:31

Under the trunk (r47011), code compiles successfully with no warnings, but incorrect assembly language is produced (as revealed by the -a parameter):

    vcvttpd2dqx (%rsp),%xmm0
    vcvtpd2dqx (%rdx),%xmm0
    vcvtpd2psx (%rdx),%xmm0
    vcvtpd2dqx (%rcx,%r10,1),%xmm0
    vbroadcastsd U_$P$ASM_TEST_$$_POTENZ10_D(%rip),%ymm1
    vmovsd U_$P$ASM_TEST_$$_POTENZ10_D(%rip),%xmm0
    vorpd %ymm3,%ymm2,%ymm1
    vmovsd (%rcx),%xmm0

Last 4 appear correct, but first 4 use 128-bit memory operands (note the x suffix rather than y).

Partially confirmed.
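
To make the x versus y suffix difference concrete: in the three-byte VEX prefix the two forms differ only in the L bit of the last prefix byte. Below is a small Python sketch that builds that byte for vcvtpd2dq with both suffixes. It assumes the field layout from the Intel manual and pp = 3 (the F2 prefix) for vcvtpd2dq; the helper names are illustrative and do not come from FPC.

```python
# AT&T mnemonic suffix -> VEX.L bit (x = 128-bit operand, y = 256-bit operand).
SUFFIX_TO_L = {"x": 0, "y": 1}


def vex_byte2(w: int, vvvv: int, l: int, pp: int) -> int:
    """Build the last byte of a three-byte VEX prefix from its fields.

    vvvv is passed as the logical register number and stored inverted,
    so an unused vvvv (0) is emitted as 1111b.
    """
    return (w << 7) | (((~vvvv) & 0xF) << 3) | (l << 2) | pp


def byte2_for_suffix(suffix: str, pp: int) -> int:
    """Last VEX prefix byte for a two-operand instruction with the given suffix."""
    return vex_byte2(w=0, vvvv=0, l=SUFFIX_TO_L[suffix], pp=pp)


if __name__ == "__main__":
    wrong = byte2_for_suffix("x", pp=3)  # what the x suffix asks the assembler for
    right = byte2_for_suffix("y", pp=3)  # what ymmword ptr should produce
    print(f"x: {wrong:#04x}  y: {right:#04x}  differing bits: {wrong ^ right:#04x}")
```

The y form gives 0x7F, the same byte that appears in the reporter's NASM listing for vcvtpd2dq, while the x form gives 0x7B.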

J. Gareth Moreton

2020-09-30 14:53

developer   ~0125991

Also to note, "vcvttpd2dq xmm0,ymm0" (Intel convention) raises a compiler error, even though it's correct according to Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A, Page 3-238.

In compiler/x86/x86ins.dat, line 4288:

; VCVTTPD2DQ xmmreg_mz,mem256 must come first - map MemRefSize 256bits correct
; map all other MemrefSize (without broasdcast MemRef) to xmmreg, xmmrm
[VCVTTPD2DQ,vcvttpd2dqM]
(Ch_Wop2, Ch_Rop1)
xmmreg_mz,mem256 \350\352\361\362\364\370\1\xE6\110 AVX,SANDYBRIDGE,AVX512,TFV
xmmreg_mz,ymmreg \350\352\361\362\364\370\1\xE6\110 AVX,SANDYBRIDGE
xmmreg_mz,xmmrm \350\352\361\362\370\1\xE6\110 AVX,SANDYBRIDGE,AVX512,TFV
xmmreg_mz,bmem64 \350\352\361\370\1\xE6\110 AVX512,BCST2,TFV
xmmreg_mz,bmem64 \350\352\361\364\370\1\xE6\110 AVX512,BCST4,TFV
ymmreg_mz,mem512 \350\351\352\361\370\1\xE6\110 AVX512,TFV
ymmreg_mz,bmem64 \350\351\352\361\370\1\xE6\110 AVX512,BCST8,TFV
ymmreg_mz,zmmreg_sae \350\351\352\361\370\1\xE6\110 AVX512

Given that entries that appear first should take priority in syntax checking, and the octal encoding is correct (\364 equals VEX.256), something is going wrong with how the compiler is interpreting these table entries.
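
As a quick cross-check of the table entries quoted above, the escape strings can be tokenised and searched for the \364 code. This is a naive Python sketch written for this note, not FPC's own table reader; the only code meaning it relies on is the one stated above (\364 equals VEX.256), and it treats every escape as a code, including literal opcode bytes such as \xE6.

```python
import re

# One escape is either \xHH (hex) or \N, \NN, \NNN (octal).
_TOKEN = re.compile(r"\\(x[0-9A-Fa-f]{2}|[0-7]{1,3})")

VEX_256 = 0o364  # per the note above; other codes are left uninterpreted


def parse_codes(encoding: str) -> list[int]:
    """Turn an x86ins.dat style escape string into a list of integers."""
    codes = []
    for tok in _TOKEN.findall(encoding):
        if tok.startswith("x"):
            codes.append(int(tok[1:], 16))
        else:
            codes.append(int(tok, 8))
    return codes


def uses_vex256(encoding: str) -> bool:
    """Naive check: does the code sequence contain \\364?"""
    return VEX_256 in parse_codes(encoding)


# First three VCVTTPD2DQ entries, copied from the excerpt above.
ENTRIES = [
    ("xmmreg_mz,mem256", r"\350\352\361\362\364\370\1\xE6\110"),
    ("xmmreg_mz,ymmreg", r"\350\352\361\362\364\370\1\xE6\110"),
    ("xmmreg_mz,xmmrm", r"\350\352\361\362\370\1\xE6\110"),
]

if __name__ == "__main__":
    for operands, enc in ENTRIES:
        print(f"{operands:18} VEX.256: {uses_vex256(enc)}")
```

Only the mem256 and ymmreg entries carry \364, so a 256-bit source that ends up with the 128-bit encoding has presumably been matched against the xmmrm entry.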

J. Gareth Moreton

2020-10-01 13:02

developer   ~0126021

One tricky thing is that the code changes if -a is specified, which makes it a bigger bug because -a shouldn't logically change the generated machine code (although there are some minor differences because -a sends the code through an external assembler).

J. Gareth Moreton

2020-10-02 12:12

developer   ~0126032

See if this works. Tests on x86_64-win64 have passed with no regressions, although additional testing is needed on other platforms.
avx-asm-fix.patch (2,929 bytes)   
Index: compiler/x86/rax86.pas
===================================================================
--- compiler/x86/rax86.pas	(revision 47011)
+++ compiler/x86/rax86.pas	(working copy)
@@ -1402,6 +1402,44 @@
 
 
 procedure Tx86Instruction.SetInstructionOpsize;
+
+  function CheckSSEAVX: Boolean;
+    begin
+      Result := False;
+
+      if not MemRefInfo(opcode).ExistsSSEAVX then
+        Exit;
+
+      { This check also covers MMX instructions that move data to and from
+        32-bit and 64-bit registers or memory, since such instructions are
+        replicated in SSE2 for use with XMM registers }
+      if tx86operand(operands[1]).opsize in [S_B,S_W,S_L,S_Q] then
+        begin
+          opsize := S_NO;
+          Exit(True);
+        end;
+
+      if (tx86operand(operands[1]).opsize <> S_NO) and (operands[1].opr.typ = OPR_REFERENCE) then
+        begin
+          { Memory sizes of 64 bits and under are handled above }
+          opsize:=tx86operand(operands[1]).opsize;
+          Exit(True);
+        end;
+
+      { If the source operand is larger than the destination (e.g.
+        "VCVTTPD2DQ XMM0, YMM1" in Intel notation), use the source operand }
+      if ((tx86operand(operands[1]).opsize = S_YMM) and (tx86operand(operands[2]).opsize = S_XMM)) or
+        (tx86operand(operands[1]).opsize = S_ZMM) and (tx86operand(operands[2]).opsize = S_XMM) or
+        (tx86operand(operands[1]).opsize = S_ZMM) and (tx86operand(operands[2]).opsize = S_YMM) then
+        begin
+          opsize:=tx86operand(operands[1]).opsize;
+          Exit(True);
+        end;
+
+      { If none of the conditions are met, this function returns False and the
+        opsize is set to the last operand's opsize }
+    end;
+
 begin
   if opsize<>S_NO then
    exit;
@@ -1466,33 +1504,22 @@
                   ;
               end;
             end;
-          A_MOVSS,
-          A_VMOVSS,
-          A_MOVD : { movd is a move from a mmx register to a
-                     32 bit register or memory, so no opsize is correct here PM }
-            exit;
-          A_MOVQ :
-            opsize:=S_IQ;
-          A_CVTSI2SS,
-          A_CVTSI2SD,
           A_OUT :
             opsize:=tx86operand(operands[1]).opsize;
           else
-            opsize:=tx86operand(operands[2]).opsize;
+            begin
+              if not CheckSSEAVX then
+                opsize:=tx86operand(operands[2]).opsize;
+            end;
         end;
       end;
     3 :
       begin
-        case opcode of
-          A_VCVTSI2SS,
-          A_VCVTSI2SD:
-            opsize:=tx86operand(operands[1]).opsize;
-        else
-          opsize:=tx86operand(operands[ops]).opsize;
-        end;
+        if not CheckSSEAVX then
+          opsize:=tx86operand(operands[3]).opsize;
       end;
-    4 :
-        opsize:=tx86operand(operands[ops]).opsize;
+    else
+      opsize:=tx86operand(operands[ops]).opsize;
 
   end;
 end;
avx-asm-fix.patch (2,929 bytes)   
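
For readers who do not want to trace the diff, the decision order of CheckSSEAVX for the two-operand case can be paraphrased as follows. This is a Python model of the patch logic only, with operand sizes represented as strings; it is not compiler code, the function names are invented, and it ignores the other case branches (such as A_OUT) that remain in SetInstructionOpsize.

```python
SMALL_SIZES = {"S_B", "S_W", "S_L", "S_Q"}
WIDER_SOURCE = {("S_YMM", "S_XMM"), ("S_ZMM", "S_XMM"), ("S_ZMM", "S_YMM")}


def check_sse_avx(exists_sse_avx, op1_size, op1_is_reference, op2_size):
    """Return the opsize chosen by the patch, or None if it does not decide.

    op1 is operands[1] in the patch (the source operand), op2 is operands[2].
    """
    if not exists_sse_avx:
        return None
    # Sources of 64 bits and under: no size suffix at all.
    if op1_size in SMALL_SIZES:
        return "S_NO"
    # Sized memory source (e.g. ymmword ptr [...]): keep its size.
    if op1_size != "S_NO" and op1_is_reference:
        return op1_size
    # Register source wider than the destination (e.g. ymm -> xmm).
    if (op1_size, op2_size) in WIDER_SOURCE:
        return op1_size
    return None


def select_opsize(exists_sse_avx, op1_size, op1_is_reference, op2_size):
    """Two-operand fallback: use the last operand's size if the check declines."""
    chosen = check_sse_avx(exists_sse_avx, op1_size, op1_is_reference, op2_size)
    return chosen if chosen is not None else op2_size


if __name__ == "__main__":
    # vcvttpd2dq xmm0, ymmword ptr [rsp]  -> source size wins (y form)
    print(select_opsize(True, "S_YMM", True, "S_XMM"))
    # vcvttpd2dq xmm0, ymm0               -> source size wins as well
    print(select_opsize(True, "S_YMM", False, "S_XMM"))
```

Under this model the reported instructions take the size of the ymmword source rather than that of the xmm destination, which is the behaviour the patch is meant to restore.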

Klaus1

2020-10-04 12:28

reporter   ~0126075

Hello,
I have tested your patch against the original FPC 3.2.0 and it works. The only remaining error occurs during assembling: when I use the compiler switch -al
for the assembler listing, I get (ymmfloat.pas(13903,0) Error: Error while assembling exitcode 1). This is normally
the error message shown when the assembler code contains an error and the assembler cannot pinpoint it.
Regards Klaus

J. Gareth Moreton

2020-10-04 13:54

developer   ~0126079

Does that mean the patch is okay? If the above error only occurs with -al (and not -a), then it implies something is a little amiss.

Klaus1

2020-10-06 15:51

reporter   ~0126118

Hello,
the patch is OK. With the compiler switch -a there is no error; the error only appears when -al is set. Your patch gives the correct encoding. Thanks for your excellent work.
Regards Klaus

J. Gareth Moreton

2020-10-07 23:59

developer   ~0126135

Well I'm glad you approve. It depends if Florian and the others approve of the patch though, since some questions have been raised regarding the impact on AVX-512 instructions.

The -al issue should probably be looked at though.

Issue History

Date Modified Username Field Change
2020-09-21 12:13 Klaus1 New Issue
2020-09-21 12:13 Klaus1 File Added: error_in_asm.txt
2020-09-23 21:49 Florian Note Added: 0125794
2020-09-26 16:22 Florian Status new => feedback
2020-09-26 16:22 Florian FPCTarget => -
2020-09-29 15:18 Klaus1 Note Added: 0125956
2020-09-29 15:18 Klaus1 File Added: ymm.7z
2020-09-29 15:18 Klaus1 Status feedback => new
2020-09-30 14:20 J. Gareth Moreton Note Added: 0125987
2020-09-30 14:21 J. Gareth Moreton Note Edited: 0125987
2020-09-30 14:30 J. Gareth Moreton Note Added: 0125988
2020-09-30 14:31 J. Gareth Moreton Note Edited: 0125988
2020-09-30 14:31 J. Gareth Moreton Status new => confirmed
2020-09-30 14:32 J. Gareth Moreton Note Edited: 0125987
2020-09-30 14:53 J. Gareth Moreton Note Added: 0125991
2020-10-01 13:02 J. Gareth Moreton Note Added: 0126021
2020-10-01 13:35 J. Gareth Moreton Assigned To => J. Gareth Moreton
2020-10-01 13:35 J. Gareth Moreton Status confirmed => assigned
2020-10-02 12:11 J. Gareth Moreton Tag Attached: x86
2020-10-02 12:11 J. Gareth Moreton Tag Attached: x86_64
2020-10-02 12:11 J. Gareth Moreton Tag Attached: assembler
2020-10-02 12:11 J. Gareth Moreton Tag Attached: compiler
2020-10-02 12:11 J. Gareth Moreton Tag Attached: i386
2020-10-02 12:12 J. Gareth Moreton Note Added: 0126032
2020-10-02 12:12 J. Gareth Moreton File Added: avx-asm-fix.patch
2020-10-02 12:12 J. Gareth Moreton Assigned To J. Gareth Moreton =>
2020-10-02 12:12 J. Gareth Moreton Status assigned => feedback
2020-10-02 23:33 J. Gareth Moreton Assigned To => Florian
2020-10-02 23:33 J. Gareth Moreton Status feedback => assigned
2020-10-02 23:33 J. Gareth Moreton Status assigned => feedback
2020-10-03 10:27 Florian Assigned To Florian =>
2020-10-04 12:28 Klaus1 Note Added: 0126075
2020-10-04 12:28 Klaus1 Status feedback => new
2020-10-04 13:54 J. Gareth Moreton Note Added: 0126079
2020-10-06 15:51 Klaus1 Note Added: 0126118
2020-10-07 23:59 J. Gareth Moreton Note Added: 0126135