View Issue Details

ID: 0035700
Project: FPC
Category: Compiler
View Status: public
Last Update: 2019-09-14 17:24
Reporter: J. Gareth Moreton
Assigned To:
Priority: normal
Severity: minor
Reproducibility: N/A
Status: new
Resolution: open
Platform: i386 and x86_64
OS: Microsoft Windows
OS Version: 10 Professional
Product Version: 3.3.1
Product Build: r42196
Target Version:
Fixed in Version:
Summary: 0035700: [Assembler] Fix for SSE/AVX instructions with 32- and 64-bit operands
Description: There is a subtle bug in the compiler: instructions with 64-bit operands (e.g. VADDSD) do not assemble correctly when an explicit size such as "QWORD PTR" is given in Intel mode. "assembler-operand-fix.patch" fixes this for 64-bit operands.

The other three patches modify the SSE, SSE2, AVX and FMA instruction tables so that instructions operating on scalar (single-element, not packed) data compile successfully when given an explicit memory size, either directly via "QWORD PTR" etc. or by referencing a variable.
Steps To Reproduce: Apply "assembler-operand-fix.patch" and confirm that the compiler builds correctly under i386 and x86_64 on Windows and Linux. When applying the other patches, ensure that "compiler/utils/mkx86ins" and "compiler/utils/mkx86ins x86_64" are executed and that the generated .inc files are placed in the correct locations ("compiler/i386" and "compiler/x86_64" respectively).

After building the compiler, confirm that the test from issue 0032219 builds without errors.
Additional Information: A number of SSE and AVX instructions operate on scalar, rather than packed, data. Because of how these instructions were initially configured, any memory operand was treated as having the full vector size rather than the size of a single element.
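
As a rough illustration of why the table change matters, the following Python sketch models operand matching in a deliberately simplified way. It is not the compiler's matching code. The helper names (operand_matches, instruction_matches), the operand representation, and the rule that an unsized memory operand matches any memory template are all assumptions made for this example. Only the template names (xmmreg, xmmrm, mem32) and the before/after ADDSS entries are taken from the patch.

```python
# Toy model of operand-template matching; NOT the compiler's real logic.
# An operand is ("xmm", None) for an XMM register or ("mem", bits) for memory,
# where bits is None when the source gives no explicit size.

# Memory size in bits implied by each template kind (None = registers only).
TEMPLATE_MEM_BITS = {"xmmreg": None, "xmmrm": 128, "mem32": 32, "mem64": 64}


def operand_matches(template, operand):
    kind, bits = operand
    if kind == "xmm":
        # Only register-capable templates accept an XMM register.
        return template in ("xmmreg", "xmmrm")
    implied = TEMPLATE_MEM_BITS[template]
    if implied is None:
        return False  # template does not accept memory at all
    # Assumption: an unsized memory operand matches any memory template.
    return bits is None or bits == implied


def instruction_matches(templates, operands):
    return any(
        len(t) == len(operands)
        and all(operand_matches(a, b) for a, b in zip(t, operands))
        for t in templates
    )


ADDSS_BEFORE = [("xmmreg", "xmmrm")]
ADDSS_AFTER = [("xmmreg", "mem32"), ("xmmreg", "xmmreg")]

# Models: addss xmm0, DWORD PTR [...]
explicit_dword = [("xmm", None), ("mem", 32)]

print(instruction_matches(ADDSS_BEFORE, explicit_dword))  # False
print(instruction_matches(ADDSS_AFTER, explicit_dword))   # True
```

In this model the register-to-register form matches both before and after the change, which is why each patched entry keeps a separate xmmreg,xmmreg line.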

Specifying 64-bit memory operands initially failed because code in the compiler forced 64-bit operands to be treated differently outside of FPU instructions (on i386 they are seen as 32-bit, per the comment in rax86.pas).
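
The size rule described above can be sketched as a small Python function. This is a simplified model based on the comment and condition changed in "assembler-operand-fix.patch", not a translation of rax86.pas. The function name, its parameters, and the "patched" switch are hypothetical and exist only for illustration.

```python
def memory_operand_bits(declared_bits, is_fpu, is_sse_avx, patched):
    """Width in bits this toy model assigns to a memory operand on i386."""
    if declared_bits != 64:
        # Only the 64-bit case is special-cased in this sketch.
        return declared_bits
    if is_fpu or (patched and is_sse_avx):
        # FPU opcodes (and, with the patch, SSE/AVX opcodes) keep 64 bits.
        return 64
    # All other (integer) opcodes see a 64-bit value as 32-bit on i386.
    return 32


# Models: movsd xmm0, QWORD PTR [...] before and after the patch
print(memory_operand_bits(64, is_fpu=False, is_sse_avx=True, patched=False))  # 32
print(memory_operand_bits(64, is_fpu=False, is_sse_avx=True, patched=True))   # 64
```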
Tags: assembler, compiler, patch, x86
Fixed in Revision
FPCOldBugId
FPCTarget
Attached Files
  • assembler-sse-instruction-fixes.patch (7,026 bytes)
    Index: compiler/x86/x86ins.dat
    ===================================================================
    --- compiler/x86/x86ins.dat	(revision 42196)
    +++ compiler/x86/x86ins.dat	(working copy)
    @@ -204,11 +204,6 @@
     (Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
     void                  \332\1\xA6                      8086
     
    -[CMPSD,cmpsl]
    -(Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
    -void                  \332\325\1\xA7                  386
    -xmmreg,xmmrm,imm      \334\2\x0F\xC2\110\26           WILLAMETTE,SSE2,SM2,SB,AR2
    -
     [CMPSW]
     (Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
     void                  \332\324\1\xA7                  8086
    @@ -2238,7 +2234,8 @@
     
     [ADDSS]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x58\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x58\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x58\110              KATMAI,SSE
     
     [ANDNPS]
     (Ch_Mop2, Ch_Rop1)
    @@ -2254,7 +2251,8 @@
     
     [CMPEQSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x00        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x00        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x00        KATMAI,SSE
     
     [CMPLEPS]
     (Ch_All)
    @@ -2262,7 +2260,8 @@
     
     [CMPLESS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x02        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x02        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x02        KATMAI,SSE
     
     [CMPLTPS]
     (Ch_All)
    @@ -2270,7 +2269,8 @@
     
     [CMPLTSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x01        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x01        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x01        KATMAI,SSE
     
     [CMPNEQPS]
     (Ch_All)
    @@ -2278,7 +2278,8 @@
     
     [CMPNEQSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x04        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x04        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x04        KATMAI,SSE
     
     [CMPNLEPS]
     (Ch_All)
    @@ -2286,7 +2287,8 @@
     
     [CMPNLESS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x06        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x06        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x06        KATMAI,SSE
     
     [CMPNLTPS]
     (Ch_All)
    @@ -2294,7 +2296,8 @@
     
     [CMPNLTSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x05        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x05        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x05        KATMAI,SSE
     
     [CMPORDPS]
     (Ch_All)
    @@ -2302,7 +2305,8 @@
     
     [CMPORDSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x07        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x07        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x07        KATMAI,SSE
     
     [CMPUNORDPS]
     (Ch_All)
    @@ -2310,7 +2314,8 @@
     
     [CMPUNORDSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x03        KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\xC2\110\1\x03        KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x03        KATMAI,SSE
     
     ;
     ; CMPPS/CMPSS must come after the specific ops; that way the disassembler will find the
    @@ -2323,15 +2328,17 @@
     
     [CMPSS]
     (Ch_All)
    -xmmreg,xmmrm,imm      \333\2\x0F\xC2\110\22           KATMAI,SSE,SB,AR2
    +xmmreg,mem32,imm      \333\2\x0F\xC2\110\22           KATMAI,SSE,SB,AR2
    +xmmreg,xmmreg,imm     \333\2\x0F\xC2\110\22           KATMAI,SSE,SB,AR2
     
     [COMISS]
     (Ch_Rop1, Ch_Rop2, Ch_WFlags)
    -xmmreg,xmmrm          \2\x0F\x2F\110                  KATMAI,SSE
    +xmmreg,mem32          \2\x0F\x2F\110                  KATMAI,SSE
    +xmmreg,xmmreg         \2\x0F\x2F\110                  KATMAI,SSE
     
     [CVTPI2PS]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,mmxrm         \331\2\x0F\x2A\110             KATMAI,SSE,MMX
    +xmmreg,mmxrm          \331\2\x0F\x2A\110              KATMAI,SSE,MMX
     
     [CVTPS2PI]
     (Ch_Wop2, Ch_Rop1)
    @@ -2364,7 +2371,8 @@
     
     [DIVSS]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x5E\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x5E\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x5E\110              KATMAI,SSE
     
     [LDMXCSR]
     (Ch_All)
    @@ -2376,7 +2384,8 @@
     
     [MAXSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\x5F\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x5F\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x5F\110              KATMAI,SSE
     
     [MINPS]
     (Ch_All)
    @@ -2384,7 +2393,8 @@
     
     [MINSS]
     (Ch_All)
    -xmmreg,xmmrm          \333\2\x0F\x5D\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x5D\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x5D\110              KATMAI,SSE
     
     [MOVAPS]
     (Ch_ROp1, Ch_WOp2)
    @@ -2421,7 +2431,6 @@
     (Ch_Wop2, Ch_Rop1)
     xmmreg,xmmreg         \333\2\x0F\x10\110              KATMAI,SSE
     xmmreg,mem32          \333\2\x0F\x10\110              KATMAI,SSE
    -xmmreg,xmmreg         \333\2\x0F\x11\101              KATMAI,SSE
     mem32,xmmreg          \333\2\x0F\x11\101              KATMAI,SSE
     
     [MOVUPS]
    @@ -2435,7 +2444,8 @@
     
     [MULSS]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x59\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x59\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x59\110              KATMAI,SSE
     
     [ORPS]
     (Ch_Mop2, Ch_Rop1)
    @@ -2447,7 +2457,8 @@
     
     [RCPSS]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x53\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x53\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x53\110              KATMAI,SSE
     
     [RSQRTPS]
     (Ch_Wop2, Ch_Rop1)
    @@ -2455,7 +2466,8 @@
     
     [RSQRTSS]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x52\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x52\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x52\110              KATMAI,SSE
     
     [SHUFPS]
     (Ch_Mop3, Ch_Rop2)
    @@ -2467,7 +2479,8 @@
     
     [SQRTSS]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x51\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x51\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x51\110              KATMAI,SSE
     
     [STMXCSR]
     (Ch_All)
    @@ -2479,11 +2492,13 @@
     
     [SUBSS]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm          \333\2\x0F\x5C\110              KATMAI,SSE
    +xmmreg,mem32          \333\2\x0F\x5C\110              KATMAI,SSE
    +xmmreg,xmmreg         \333\2\x0F\x5C\110              KATMAI,SSE
     
     [UCOMISS]
     (Ch_Rop1, Ch_Rop2, Ch_WZeroFlag, Ch_WParityFlag, Ch_WCarryFlag, Ch_W0OverflowFlag, Ch_W0SignFlag, Ch_W0AuxiliaryFlag)
    -xmmreg,xmmrm          \2\x0F\x2E\110                  KATMAI,SSE
    +xmmreg,mem32          \2\x0F\x2E\110                  KATMAI,SSE
    +xmmreg,xmmreg         \2\x0F\x2E\110                  KATMAI,SSE
     
     [UNPCKHPS]
     (Ch_Mop2, Ch_Rop1)
    
  • assembler-sse2-instruction-fixes.patch (9,059 bytes)
    Index: compiler/x86/x86ins.dat
    ===================================================================
    --- compiler/x86/x86ins.dat	(revision 42196)
    +++ compiler/x86/x86ins.dat	(working copy)
    @@ -1121,8 +1116,9 @@
     ; Change flags aren't correct for the sse move, so it is handled as a special case in the compiler code
     (Ch_RWESI, Ch_WMemEDI, Ch_RWEDI, Ch_RDirFlag)
     void                  \325\1\xA5                      386
    -xmmreg,xmmrm          \334\2\x0F\x10\110              WILLAMETTE,SSE2
    -xmmrm,xmmreg          \334\2\x0F\x11\101              WILLAMETTE,SSE2
    +xmmreg,mem64          \334\2\x0F\x10\110              WILLAMETTE,SSE2
    +xmmreg,xmmreg         \334\2\x0F\x10\110              WILLAMETTE,SSE2
    +mem64,xmmreg          \334\2\x0F\x11\101              WILLAMETTE,SSE2
     
     [MOVSQ]
     (Ch_RWRSI, Ch_WMemEDI, Ch_RWRDI, Ch_RDirFlag)
    @@ -2746,7 +2761,8 @@
     
     [ADDSD]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm            \334\2\x0F\x58\110              WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x58\110              WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x58\110              WILLAMETTE,SSE2
     
     [ANDNPD]
     (Ch_Mop2, Ch_Rop1)
    @@ -2763,7 +2779,8 @@
     ; note: no SM flag on CMPxxSD, they use 64-bit memory location, not 128-bit
     [CMPEQSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x00        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x00        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x00        WILLAMETTE,SSE2
     
     [CMPLEPD]
     (Ch_All)
    @@ -2771,7 +2788,8 @@
     
     [CMPLESD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x02        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x02        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x02        WILLAMETTE,SSE2
     
     [CMPLTPD]
     (Ch_All)
    @@ -2779,7 +2797,8 @@
     
     [CMPLTSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x01        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x01        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x01        WILLAMETTE,SSE2
     
     [CMPNEQPD]
     (Ch_All)
    @@ -2787,7 +2806,8 @@
     
     [CMPNEQSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x04        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x04        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x04        WILLAMETTE,SSE2
     
     [CMPNLEPD]
     (Ch_All)
    @@ -2795,7 +2815,8 @@
     
     [CMPNLESD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x06        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x06        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x06        WILLAMETTE,SSE2
     
     [CMPNLTPD]
     (Ch_All)
    @@ -2803,7 +2824,8 @@
     
     [CMPNLTSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x05        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x05        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x05        WILLAMETTE,SSE2
     
     [CMPORDPD]
     (Ch_All)
    @@ -2811,7 +2833,8 @@
     
     [CMPORDSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x07        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x07        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x07        WILLAMETTE,SSE2
     
     [CMPUNORDPD]
     (Ch_All)
    @@ -2819,7 +2842,8 @@
     
     [CMPUNORDSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x03        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\xC2\110\1\x03        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x03        WILLAMETTE,SSE2
     
     ; CMPPD/CMPSD must come after the specific ops; that way the disassembler will find the
     ; specific ops first and only disassemble illegal ones as cmppd/cmpsd.
    @@ -2827,14 +2851,21 @@
     (Ch_All)
     xmmreg,xmmrm,imm        \361\2\x0F\xC2\110\26           WILLAMETTE,SSE2,SM2,SB,AR2
     
    +[CMPSD,cmpsl]
    +(Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
    +void                  \332\325\1\xA7                    386
    +xmmreg,mem64,imm      \334\2\x0F\xC2\110\26             WILLAMETTE,SSE2,SM2,SB,AR2
    +xmmreg,xmmrm,imm      \334\2\x0F\xC2\110\26             WILLAMETTE,SSE2,SM2,SB,AR2
    +
     [COMISD]
     (Ch_Rop1, Ch_Rop2, Ch_WFlags)
    -xmmreg,xmmrm            \361\2\x0F\x2F\110              WILLAMETTE,SSE2
    +xmmreg,mem64            \361\2\x0F\x2F\110              WILLAMETTE,SSE2
    +xmmreg,xmmreg           \361\2\x0F\x2F\110              WILLAMETTE,SSE2
     
     [CVTDQ2PD]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmreg            \333\2\x0F\xE6\110              WILLAMETTE,SSE2
    -xmmreg,mem64             \333\2\x0F\xE6\110              WILLAMETTE,SSE2
    +xmmreg,xmmreg            \333\2\x0F\xE6\110             WILLAMETTE,SSE2
    +xmmreg,mem64             \333\2\x0F\xE6\110             WILLAMETTE,SSE2
     
     [CVTDQ2PS]
     (Ch_Wop2, Ch_Rop1)
    @@ -2862,15 +2893,15 @@
     
     [CVTPS2PD]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmreg            \2\x0F\x5A\110                WILLAMETTE,SSE2 ;,SQ
    -xmmreg,mem64             \2\x0F\x5A\110                WILLAMETTE,SSE2 ;,SQ
    +xmmreg,xmmreg            \2\x0F\x5A\110                 WILLAMETTE,SSE2 ;,SQ
    +xmmreg,mem64             \2\x0F\x5A\110                 WILLAMETTE,SSE2 ;,SQ
     
     [CVTSD2SI,cvtsd2siX]
     (Ch_Wop2, Ch_Rop1)
    -reg32,xmmreg 	        \334\2\x0F\x2D\110        WILLAMETTE,SSE2
    -reg32,mem64          	\334\2\x0F\x2D\110        WILLAMETTE,SSE2
    -reg64,xmmreg 	        \334\320\2\x0F\x2D\110        WILLAMETTE,SSE2,X86_64
    -reg64,mem64 	        \334\320\2\x0F\x2D\110        WILLAMETTE,SSE2,X86_64
    +reg32,xmmreg 	        \334\2\x0F\x2D\110              WILLAMETTE,SSE2
    +reg32,mem64          	\334\2\x0F\x2D\110              WILLAMETTE,SSE2
    +reg64,xmmreg 	        \334\320\2\x0F\x2D\110          WILLAMETTE,SSE2,X86_64
    +reg64,mem64 	        \334\320\2\x0F\x2D\110          WILLAMETTE,SSE2,X86_64
     
     [CVTSD2SS]
     (Ch_Wop2, Ch_Rop1)
    @@ -2884,8 +2915,8 @@
     
     [CVTSS2SD]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmreg            \333\2\x0F\x5A\110            WILLAMETTE,SSE2 ;,SD
    -xmmreg,mem32             \333\2\x0F\x5A\110            WILLAMETTE,SSE2 ;,SD
    +xmmreg,xmmreg            \333\2\x0F\x5A\110             WILLAMETTE,SSE2 ;,SD
    +xmmreg,mem32             \333\2\x0F\x5A\110             WILLAMETTE,SSE2 ;,SD
     
     [CVTTPD2PI]
     (Ch_Wop2, Ch_Rop1)
    @@ -2910,7 +2941,8 @@
     
     [DIVSD]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm            \334\2\x0F\x5E\110          WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x5E\110          WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x5E\110          WILLAMETTE,SSE2
     
     [MAXPD]
     (Ch_All)
    @@ -2918,7 +2950,8 @@
     
     [MAXSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\x5F\110          WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x5F\110          WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x5F\110          WILLAMETTE,SSE2
     
     [MINPD]
     (Ch_All)
    @@ -2926,7 +2959,8 @@
     
     [MINSD]
     (Ch_All)
    -xmmreg,xmmrm            \334\2\x0F\x5D\110          WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x5D\110          WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x5D\110          WILLAMETTE,SSE2
     
     [MOVAPD]
     (Ch_ROp1, Ch_WOp2)
    @@ -2958,7 +2992,8 @@
     
     [MULSD]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm            \334\2\x0F\x59\110        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x59\110        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x59\110        WILLAMETTE,SSE2
     
     [ORPD]
     (Ch_Mop2, Ch_Rop1)
    @@ -2974,9 +3009,9 @@
     
     [SQRTSD]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmrm            \334\2\x0F\x51\110        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x51\110        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x51\110        WILLAMETTE,SSE2
     
    -
     [SUBPD]
     (Ch_Mop2, Ch_Rop1)
     xmmreg,xmmrm            \361\2\x0F\x5C\110        WILLAMETTE,SSE2,SM
    @@ -2983,11 +3018,13 @@
     
     [SUBSD]
     (Ch_Mop2, Ch_Rop1)
    -xmmreg,xmmrm            \334\2\x0F\x5C\110        WILLAMETTE,SSE2
    +xmmreg,mem64            \334\2\x0F\x5C\110        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \334\2\x0F\x5C\110        WILLAMETTE,SSE2
     
     [UCOMISD]
     (Ch_Rop1, Ch_Rop2, Ch_WZeroFlag, Ch_WParityFlag, Ch_WCarryFlag, Ch_W0OverflowFlag, Ch_W0SignFlag, Ch_W0AuxiliaryFlag)
    -xmmreg,xmmrm            \361\2\x0F\x2E\110        WILLAMETTE,SSE2
    +xmmreg,mem64            \361\2\x0F\x2E\110        WILLAMETTE,SSE2
    +xmmreg,xmmreg           \361\2\x0F\x2E\110        WILLAMETTE,SSE2
     
     [UNPCKHPD]
     (Ch_All)
    @@ -3487,11 +3524,13 @@
     
     [ROUNDSS]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmrm,imm      \361\3\x0F\x3A\x0A\110\26            SSE41,SM2,SB,AR2
    +xmmreg,mem32,imm      \361\3\x0F\x3A\x0A\110\26            SSE41,SM2,SB,AR2
    +xmmreg,xmmreg,imm     \361\3\x0F\x3A\x0A\110\26            SSE41,SM2,SB,AR2
     
     [ROUNDSD]
     (Ch_Wop2, Ch_Rop1)
    -xmmreg,xmmrm,imm      \361\3\x0F\x3A\x0B\110\26            SSE41,SM2,SB,AR2
    +xmmreg,mem64,imm      \361\3\x0F\x3A\x0B\110\26            SSE41,SM2,SB,AR2
    +xmmreg,xmmreg,imm     \361\3\x0F\x3A\x0B\110\26            SSE41,SM2,SB,AR2
     
     ;*******************************************************************************
     ;**********SSE4.2***************************************************************
    
  • assembler-avx-instruction-fixes.patch (8,577 bytes)
    Index: compiler/x86/x86ins.dat
    ===================================================================
    --- compiler/x86/x86ins.dat	(revision 42196)
    +++ compiler/x86/x86ins.dat	(working copy)
    @@ -4055,7 +4094,7 @@
     
     [VCMPSS]
     (Ch_All)
    -xmmreg,xmmreg,mem64,imm8             \333\362\370\1\xC2\75\120\27         AVX,SANDYBRIDGE
    +xmmreg,xmmreg,mem32,imm8             \333\362\370\1\xC2\75\120\27         AVX,SANDYBRIDGE
     xmmreg,xmmreg,xmmreg,imm8            \333\362\370\1\xC2\75\120\27         AVX,SANDYBRIDGE
     
     [VCOMISD]
    @@ -4407,7 +4446,6 @@
     (Ch_Wop2, Ch_Rop1)
     xmmreg,xmmreg,xmmreg                 \334\362\370\1\x10\75\120            AVX,SANDYBRIDGE
     xmmreg,mem64                         \334\362\370\1\x10\110               AVX,SANDYBRIDGE
    -xmmreg,xmmreg,xmmreg                 \334\362\370\1\x11\75\102            AVX,SANDYBRIDGE
     mem64,xmmreg                         \334\362\370\1\x11\101               AVX,SANDYBRIDGE
     
     [VMOVSHDUP]
    @@ -4425,7 +4463,6 @@
     (Ch_Wop2, Ch_Rop1)
     xmmreg,xmmreg,xmmreg                 \333\362\370\1\x10\75\120            AVX,SANDYBRIDGE
     xmmreg,mem32                         \333\362\370\1\x10\110               AVX,SANDYBRIDGE
    -xmmreg,xmmreg,xmmreg                 \333\362\370\1\x11\75\102            AVX,SANDYBRIDGE
     mem32,xmmreg                         \333\362\370\1\x11\101               AVX,SANDYBRIDGE
     
     [VMOVUPD]
    @@ -5641,27 +5678,33 @@
     
     [VFMADD132SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x99\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\x99\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x99\75\120        FMA
     
     [VFMADD213SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xa9\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xa9\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xa9\75\120        FMA
     
     [VFMADD231SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xb9\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xb9\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xb9\75\120        FMA
     
     [VFMADD132SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\x99\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\x99\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\x99\75\120            FMA
     
     [VFMADD213SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xA9\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xA9\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xA9\75\120            FMA
     
     [VFMADD231SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xb9\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xb9\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xb9\75\120            FMA
     
     [VFMADDSUB132PD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    @@ -5755,27 +5798,33 @@
     
     [VFMSUB132SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x9B\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\x9B\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x9B\75\120        FMA
     
     [VFMSUB213SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xAB\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xAB\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xAB\75\120        FMA
     
     [VFMSUB231SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xBB\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xBB\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xBB\75\120        FMA
     
     [VFMSUB132SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\x9B\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\x9B\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\x9B\75\120            FMA
     
     [VFMSUB213SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xAB\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xAB\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xAB\75\120            FMA
     
     [VFMSUB231SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xBB\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xBB\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xBB\75\120            FMA
     
     [VFNMADD132PD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    @@ -5809,27 +5858,33 @@
     
     [VFNMADD132SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x9D\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\x9D\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x9D\75\120        FMA
     
     [VFNMADD213SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xAD\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xAD\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xAD\75\120        FMA
     
     [VFNMADD231SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xBD\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xBD\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xBD\75\120        FMA
     
     [VFNMADD132SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\x9D\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\x9D\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\x9D\75\120            FMA
     
     [VFNMADD213SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xAD\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xAD\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xAD\75\120            FMA
     
     [VFNMADD231SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xBD\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xBD\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xBD\75\120            FMA
     
     [VFNMSUB132PD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    @@ -5863,27 +5918,33 @@
     
     [VFNMSUB132SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x9F\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\x9F\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x9F\75\120        FMA
     
     [VFNMSUB213SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xAF\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xAF\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xAF\75\120        FMA
     
     [VFNMSUB231SD]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xBF\75\120        FMA
    +xmmreg,xmmreg,mem64                  \361\362\371\363\1\xBF\75\120        FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xBF\75\120        FMA
     
     [VFNMSUB132SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\x9F\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\x9F\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\x9F\75\120            FMA
     
     [VFNMSUB213SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xAF\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xAF\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xAF\75\120            FMA
     
     [VFNMSUB231SS]
     (Ch_Mop3, Ch_Rop2, Ch_Rop1)
    -xmmreg,xmmreg,xmmrm                  \361\362\371\1\xBF\75\120            FMA
    +xmmreg,xmmreg,mem32                  \361\362\371\1\xBF\75\120            FMA
    +xmmreg,xmmreg,xmmreg                 \361\362\371\1\xBF\75\120            FMA
     
     ;*******************************************************************************
     ;********** TSX ****************************************************************
    
  • assembler-operand-fix.patch (2,211 bytes)
    Index: compiler/x86/rax86.pas
    ===================================================================
    --- compiler/x86/rax86.pas	(revision 42196)
    +++ compiler/x86/rax86.pas	(working copy)
    @@ -239,11 +239,13 @@
               ;
           end;
         end;
    +{$ifndef x86_64}
       end
       else
         begin
           if size=OS_64 then
             opsize:=S_Q;
    +{$endif not x86_64}
         end;
     end;
     
    @@ -530,7 +532,22 @@
                     ;
                 end;
     
    -            if memopsize = 0 then memopsize := topsize2memsize[tx86operand(operands[i]).opsize];
    +            if memopsize = 0 then
    +              begin
    +{$ifdef i386}
    +                { 64-bit operands are allowed for SSE and AVX instructions, so
    +                  go by the byte size instead for these families of opcodes }
    +                if (MemRefInfo(opcode).ExistsSSEAVX) then
    +                  begin
    +                    memopsize := tx86operand(operands[i]).typesize * 8;
    +                    if tx86operand(operands[i]).typesize = 8 then
    +                      { Will be S_L otherwise and won't be corrected in time }
    +                      tx86operand(operands[i]).opsize := S_Q;
    +                  end
    +                else
    +{$endif i386}
    +                  memopsize := topsize2memsize[tx86operand(operands[i]).opsize];
    +              end;
     
                 if (memopsize > 0) and
                    (memrefsize > 0) then
    @@ -1398,12 +1415,12 @@
                          asize:=OT_BITS32;
                        OS_64,OS_S64:
                          begin
    -                       { Only FPU operations know about 64bit values, for all
    -                         integer operations it is seen as 32bit
    +                       { Only FPU and SSE/AVX operations know about 64bit
    +                         values, for all integer operations it is seen as 32bit
     
                              this applies only to i386, see tw16622}
     
    -                       if gas_needsuffix[opcode] in [attsufFPU,attsufFPUint] then
    +                       if (gas_needsuffix[opcode] in [attsufFPU,attsufFPUint]) or (MemRefInfo(opcode).ExistsSSEAVX) then
                              asize:=OT_BITS64
     {$ifdef i386}
                            else
    

Relationships

parent of 0032219 (feedback, Florian): AVX addition does not compile
parent of 0035701 (resolved, Jonas Maebe): [Test] Test "tests/webtbs/tw13294" is possibly invalid
Not all the children of this issue are yet resolved or closed.

Activities

J. Gareth Moreton

2019-06-10 22:35

developer  

assembler-sse-instruction-fixes.patch (7,026 bytes)
Index: compiler/x86/x86ins.dat
===================================================================
--- compiler/x86/x86ins.dat	(revision 42196)
+++ compiler/x86/x86ins.dat	(working copy)
@@ -204,11 +204,6 @@
 (Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
 void                  \332\1\xA6                      8086
 
-[CMPSD,cmpsl]
-(Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
-void                  \332\325\1\xA7                  386
-xmmreg,xmmrm,imm      \334\2\x0F\xC2\110\26           WILLAMETTE,SSE2,SM2,SB,AR2
-
 [CMPSW]
 (Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
 void                  \332\324\1\xA7                  8086
@@ -2238,7 +2234,8 @@
 
 [ADDSS]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x58\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x58\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x58\110              KATMAI,SSE
 
 [ANDNPS]
 (Ch_Mop2, Ch_Rop1)
@@ -2254,7 +2251,8 @@
 
 [CMPEQSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x00        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x00        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x00        KATMAI,SSE
 
 [CMPLEPS]
 (Ch_All)
@@ -2262,7 +2260,8 @@
 
 [CMPLESS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x02        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x02        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x02        KATMAI,SSE
 
 [CMPLTPS]
 (Ch_All)
@@ -2270,7 +2269,8 @@
 
 [CMPLTSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x01        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x01        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x01        KATMAI,SSE
 
 [CMPNEQPS]
 (Ch_All)
@@ -2278,7 +2278,8 @@
 
 [CMPNEQSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x04        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x04        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x04        KATMAI,SSE
 
 [CMPNLEPS]
 (Ch_All)
@@ -2286,7 +2287,8 @@
 
 [CMPNLESS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x06        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x06        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x06        KATMAI,SSE
 
 [CMPNLTPS]
 (Ch_All)
@@ -2294,7 +2296,8 @@
 
 [CMPNLTSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x05        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x05        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x05        KATMAI,SSE
 
 [CMPORDPS]
 (Ch_All)
@@ -2302,7 +2305,8 @@
 
 [CMPORDSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x07        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x07        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x07        KATMAI,SSE
 
 [CMPUNORDPS]
 (Ch_All)
@@ -2310,7 +2314,8 @@
 
 [CMPUNORDSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\xC2\110\1\x03        KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\xC2\110\1\x03        KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\xC2\110\1\x03        KATMAI,SSE
 
 ;
 ; CMPPS/CMPSS must come after the specific ops; that way the disassembler will find the
@@ -2323,15 +2328,17 @@
 
 [CMPSS]
 (Ch_All)
-xmmreg,xmmrm,imm      \333\2\x0F\xC2\110\22           KATMAI,SSE,SB,AR2
+xmmreg,mem32,imm      \333\2\x0F\xC2\110\22           KATMAI,SSE,SB,AR2
+xmmreg,xmmreg,imm     \333\2\x0F\xC2\110\22           KATMAI,SSE,SB,AR2
 
 [COMISS]
 (Ch_Rop1, Ch_Rop2, Ch_WFlags)
-xmmreg,xmmrm          \2\x0F\x2F\110                  KATMAI,SSE
+xmmreg,mem32          \2\x0F\x2F\110                  KATMAI,SSE
+xmmreg,xmmreg         \2\x0F\x2F\110                  KATMAI,SSE
 
 [CVTPI2PS]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,mmxrm         \331\2\x0F\x2A\110             KATMAI,SSE,MMX
+xmmreg,mmxrm          \331\2\x0F\x2A\110              KATMAI,SSE,MMX
 
 [CVTPS2PI]
 (Ch_Wop2, Ch_Rop1)
@@ -2364,7 +2371,8 @@
 
 [DIVSS]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x5E\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x5E\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x5E\110              KATMAI,SSE
 
 [LDMXCSR]
 (Ch_All)
@@ -2376,7 +2384,8 @@
 
 [MAXSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\x5F\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x5F\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x5F\110              KATMAI,SSE
 
 [MINPS]
 (Ch_All)
@@ -2384,7 +2393,8 @@
 
 [MINSS]
 (Ch_All)
-xmmreg,xmmrm          \333\2\x0F\x5D\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x5D\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x5D\110              KATMAI,SSE
 
 [MOVAPS]
 (Ch_ROp1, Ch_WOp2)
@@ -2421,7 +2431,6 @@
 (Ch_Wop2, Ch_Rop1)
 xmmreg,xmmreg         \333\2\x0F\x10\110              KATMAI,SSE
 xmmreg,mem32          \333\2\x0F\x10\110              KATMAI,SSE
-xmmreg,xmmreg         \333\2\x0F\x11\101              KATMAI,SSE
 mem32,xmmreg          \333\2\x0F\x11\101              KATMAI,SSE
 
 [MOVUPS]
@@ -2435,7 +2444,8 @@
 
 [MULSS]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x59\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x59\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x59\110              KATMAI,SSE
 
 [ORPS]
 (Ch_Mop2, Ch_Rop1)
@@ -2447,7 +2457,8 @@
 
 [RCPSS]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x53\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x53\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x53\110              KATMAI,SSE
 
 [RSQRTPS]
 (Ch_Wop2, Ch_Rop1)
@@ -2455,7 +2466,8 @@
 
 [RSQRTSS]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x52\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x52\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x52\110              KATMAI,SSE
 
 [SHUFPS]
 (Ch_Mop3, Ch_Rop2)
@@ -2467,7 +2479,8 @@
 
 [SQRTSS]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x51\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x51\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x51\110              KATMAI,SSE
 
 [STMXCSR]
 (Ch_All)
@@ -2479,11 +2492,13 @@
 
 [SUBSS]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm          \333\2\x0F\x5C\110              KATMAI,SSE
+xmmreg,mem32          \333\2\x0F\x5C\110              KATMAI,SSE
+xmmreg,xmmreg         \333\2\x0F\x5C\110              KATMAI,SSE
 
 [UCOMISS]
 (Ch_Rop1, Ch_Rop2, Ch_WZeroFlag, Ch_WParityFlag, Ch_WCarryFlag, Ch_W0OverflowFlag, Ch_W0SignFlag, Ch_W0AuxiliaryFlag)
-xmmreg,xmmrm          \2\x0F\x2E\110                  KATMAI,SSE
+xmmreg,mem32          \2\x0F\x2E\110                  KATMAI,SSE
+xmmreg,xmmreg         \2\x0F\x2E\110                  KATMAI,SSE
 
 [UNPCKHPS]
 (Ch_Mop2, Ch_Rop1)
assembler-sse2-instruction-fixes.patch (9,059 bytes)
Index: compiler/x86/x86ins.dat
===================================================================
--- compiler/x86/x86ins.dat	(revision 42196)
+++ compiler/x86/x86ins.dat	(working copy)
@@ -1121,8 +1116,9 @@
 ; Change flags aren't correct for the sse move, so it is handled as a special case in the compiler code
 (Ch_RWESI, Ch_WMemEDI, Ch_RWEDI, Ch_RDirFlag)
 void                  \325\1\xA5                      386
-xmmreg,xmmrm          \334\2\x0F\x10\110              WILLAMETTE,SSE2
-xmmrm,xmmreg          \334\2\x0F\x11\101              WILLAMETTE,SSE2
+xmmreg,mem64          \334\2\x0F\x10\110              WILLAMETTE,SSE2
+xmmreg,xmmreg         \334\2\x0F\x10\110              WILLAMETTE,SSE2
+mem64,xmmreg          \334\2\x0F\x11\101              WILLAMETTE,SSE2
 
 [MOVSQ]
 (Ch_RWRSI, Ch_WMemEDI, Ch_RWRDI, Ch_RDirFlag)
@@ -2746,7 +2761,8 @@
 
 [ADDSD]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm            \334\2\x0F\x58\110              WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x58\110              WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x58\110              WILLAMETTE,SSE2
 
 [ANDNPD]
 (Ch_Mop2, Ch_Rop1)
@@ -2763,7 +2779,8 @@
 ; note: no SM flag on CMPxxSD, they use 64-bit memory location, not 128-bit
 [CMPEQSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x00        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x00        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x00        WILLAMETTE,SSE2
 
 [CMPLEPD]
 (Ch_All)
@@ -2771,7 +2788,8 @@
 
 [CMPLESD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x02        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x02        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x02        WILLAMETTE,SSE2
 
 [CMPLTPD]
 (Ch_All)
@@ -2779,7 +2797,8 @@
 
 [CMPLTSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x01        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x01        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x01        WILLAMETTE,SSE2
 
 [CMPNEQPD]
 (Ch_All)
@@ -2787,7 +2806,8 @@
 
 [CMPNEQSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x04        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x04        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x04        WILLAMETTE,SSE2
 
 [CMPNLEPD]
 (Ch_All)
@@ -2795,7 +2815,8 @@
 
 [CMPNLESD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x06        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x06        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x06        WILLAMETTE,SSE2
 
 [CMPNLTPD]
 (Ch_All)
@@ -2803,7 +2824,8 @@
 
 [CMPNLTSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x05        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x05        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x05        WILLAMETTE,SSE2
 
 [CMPORDPD]
 (Ch_All)
@@ -2811,7 +2833,8 @@
 
 [CMPORDSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x07        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x07        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x07        WILLAMETTE,SSE2
 
 [CMPUNORDPD]
 (Ch_All)
@@ -2819,7 +2842,8 @@
 
 [CMPUNORDSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\xC2\110\1\x03        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\xC2\110\1\x03        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\xC2\110\1\x03        WILLAMETTE,SSE2
 
 ; CMPPD/CMPSD must come after the specific ops; that way the disassembler will find the
 ; specific ops first and only disassemble illegal ones as cmppd/cmpsd.
@@ -2827,14 +2851,21 @@
 (Ch_All)
 xmmreg,xmmrm,imm        \361\2\x0F\xC2\110\26           WILLAMETTE,SSE2,SM2,SB,AR2
 
+[CMPSD,cmpsl]
+(Ch_RWESI, Ch_RMemEDI, Ch_RWEDI, Ch_RDirFlag, Ch_WOverflowFlag, Ch_WSignFlag, Ch_WZeroFlag, Ch_WAuxiliaryFlag, Ch_WCarryFlag, Ch_WParityFlag)
+void                  \332\325\1\xA7                    386
+xmmreg,mem64,imm      \334\2\x0F\xC2\110\26             WILLAMETTE,SSE2,SM2,SB,AR2
+xmmreg,xmmrm,imm      \334\2\x0F\xC2\110\26             WILLAMETTE,SSE2,SM2,SB,AR2
+
 [COMISD]
 (Ch_Rop1, Ch_Rop2, Ch_WFlags)
-xmmreg,xmmrm            \361\2\x0F\x2F\110              WILLAMETTE,SSE2
+xmmreg,mem64            \361\2\x0F\x2F\110              WILLAMETTE,SSE2
+xmmreg,xmmreg           \361\2\x0F\x2F\110              WILLAMETTE,SSE2
 
 [CVTDQ2PD]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmreg            \333\2\x0F\xE6\110              WILLAMETTE,SSE2
-xmmreg,mem64             \333\2\x0F\xE6\110              WILLAMETTE,SSE2
+xmmreg,xmmreg            \333\2\x0F\xE6\110             WILLAMETTE,SSE2
+xmmreg,mem64             \333\2\x0F\xE6\110             WILLAMETTE,SSE2
 
 [CVTDQ2PS]
 (Ch_Wop2, Ch_Rop1)
@@ -2862,15 +2893,15 @@
 
 [CVTPS2PD]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmreg            \2\x0F\x5A\110                WILLAMETTE,SSE2 ;,SQ
-xmmreg,mem64             \2\x0F\x5A\110                WILLAMETTE,SSE2 ;,SQ
+xmmreg,xmmreg            \2\x0F\x5A\110                 WILLAMETTE,SSE2 ;,SQ
+xmmreg,mem64             \2\x0F\x5A\110                 WILLAMETTE,SSE2 ;,SQ
 
 [CVTSD2SI,cvtsd2siX]
 (Ch_Wop2, Ch_Rop1)
-reg32,xmmreg 	        \334\2\x0F\x2D\110        WILLAMETTE,SSE2
-reg32,mem64          	\334\2\x0F\x2D\110        WILLAMETTE,SSE2
-reg64,xmmreg 	        \334\320\2\x0F\x2D\110        WILLAMETTE,SSE2,X86_64
-reg64,mem64 	        \334\320\2\x0F\x2D\110        WILLAMETTE,SSE2,X86_64
+reg32,xmmreg 	        \334\2\x0F\x2D\110              WILLAMETTE,SSE2
+reg32,mem64          	\334\2\x0F\x2D\110              WILLAMETTE,SSE2
+reg64,xmmreg 	        \334\320\2\x0F\x2D\110          WILLAMETTE,SSE2,X86_64
+reg64,mem64 	        \334\320\2\x0F\x2D\110          WILLAMETTE,SSE2,X86_64
 
 [CVTSD2SS]
 (Ch_Wop2, Ch_Rop1)
@@ -2884,8 +2915,8 @@
 
 [CVTSS2SD]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmreg            \333\2\x0F\x5A\110            WILLAMETTE,SSE2 ;,SD
-xmmreg,mem32             \333\2\x0F\x5A\110            WILLAMETTE,SSE2 ;,SD
+xmmreg,xmmreg            \333\2\x0F\x5A\110             WILLAMETTE,SSE2 ;,SD
+xmmreg,mem32             \333\2\x0F\x5A\110             WILLAMETTE,SSE2 ;,SD
 
 [CVTTPD2PI]
 (Ch_Wop2, Ch_Rop1)
@@ -2910,7 +2941,8 @@
 
 [DIVSD]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm            \334\2\x0F\x5E\110          WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x5E\110          WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x5E\110          WILLAMETTE,SSE2
 
 [MAXPD]
 (Ch_All)
@@ -2918,7 +2950,8 @@
 
 [MAXSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\x5F\110          WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x5F\110          WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x5F\110          WILLAMETTE,SSE2
 
 [MINPD]
 (Ch_All)
@@ -2926,7 +2959,8 @@
 
 [MINSD]
 (Ch_All)
-xmmreg,xmmrm            \334\2\x0F\x5D\110          WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x5D\110          WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x5D\110          WILLAMETTE,SSE2
 
 [MOVAPD]
 (Ch_ROp1, Ch_WOp2)
@@ -2958,7 +2992,8 @@
 
 [MULSD]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm            \334\2\x0F\x59\110        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x59\110        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x59\110        WILLAMETTE,SSE2
 
 [ORPD]
 (Ch_Mop2, Ch_Rop1)
@@ -2974,9 +3009,9 @@
 
 [SQRTSD]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmrm            \334\2\x0F\x51\110        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x51\110        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x51\110        WILLAMETTE,SSE2
 
-
 [SUBPD]
 (Ch_Mop2, Ch_Rop1)
 xmmreg,xmmrm            \361\2\x0F\x5C\110        WILLAMETTE,SSE2,SM
@@ -2983,11 +3018,13 @@
 
 [SUBSD]
 (Ch_Mop2, Ch_Rop1)
-xmmreg,xmmrm            \334\2\x0F\x5C\110        WILLAMETTE,SSE2
+xmmreg,mem64            \334\2\x0F\x5C\110        WILLAMETTE,SSE2
+xmmreg,xmmreg           \334\2\x0F\x5C\110        WILLAMETTE,SSE2
 
 [UCOMISD]
 (Ch_Rop1, Ch_Rop2, Ch_WZeroFlag, Ch_WParityFlag, Ch_WCarryFlag, Ch_W0OverflowFlag, Ch_W0SignFlag, Ch_W0AuxiliaryFlag)
-xmmreg,xmmrm            \361\2\x0F\x2E\110        WILLAMETTE,SSE2
+xmmreg,mem64            \361\2\x0F\x2E\110        WILLAMETTE,SSE2
+xmmreg,xmmreg           \361\2\x0F\x2E\110        WILLAMETTE,SSE2
 
 [UNPCKHPD]
 (Ch_All)
@@ -3487,11 +3524,13 @@
 
 [ROUNDSS]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmrm,imm      \361\3\x0F\x3A\x0A\110\26            SSE41,SM2,SB,AR2
+xmmreg,mem32,imm      \361\3\x0F\x3A\x0A\110\26            SSE41,SM2,SB,AR2
+xmmreg,xmmreg,imm     \361\3\x0F\x3A\x0A\110\26            SSE41,SM2,SB,AR2
 
 [ROUNDSD]
 (Ch_Wop2, Ch_Rop1)
-xmmreg,xmmrm,imm      \361\3\x0F\x3A\x0B\110\26            SSE41,SM2,SB,AR2
+xmmreg,mem64,imm      \361\3\x0F\x3A\x0B\110\26            SSE41,SM2,SB,AR2
+xmmreg,xmmreg,imm     \361\3\x0F\x3A\x0B\110\26            SSE41,SM2,SB,AR2
 
 ;*******************************************************************************
 ;**********SSE4.2***************************************************************
assembler-avx-instruction-fixes.patch (8,577 bytes)
Index: compiler/x86/x86ins.dat
===================================================================
--- compiler/x86/x86ins.dat	(revision 42196)
+++ compiler/x86/x86ins.dat	(working copy)
@@ -4055,7 +4094,7 @@
 
 [VCMPSS]
 (Ch_All)
-xmmreg,xmmreg,mem64,imm8             \333\362\370\1\xC2\75\120\27         AVX,SANDYBRIDGE
+xmmreg,xmmreg,mem32,imm8             \333\362\370\1\xC2\75\120\27         AVX,SANDYBRIDGE
 xmmreg,xmmreg,xmmreg,imm8            \333\362\370\1\xC2\75\120\27         AVX,SANDYBRIDGE
 
 [VCOMISD]
@@ -4407,7 +4446,6 @@
 (Ch_Wop2, Ch_Rop1)
 xmmreg,xmmreg,xmmreg                 \334\362\370\1\x10\75\120            AVX,SANDYBRIDGE
 xmmreg,mem64                         \334\362\370\1\x10\110               AVX,SANDYBRIDGE
-xmmreg,xmmreg,xmmreg                 \334\362\370\1\x11\75\102            AVX,SANDYBRIDGE
 mem64,xmmreg                         \334\362\370\1\x11\101               AVX,SANDYBRIDGE
 
 [VMOVSHDUP]
@@ -4425,7 +4463,6 @@
 (Ch_Wop2, Ch_Rop1)
 xmmreg,xmmreg,xmmreg                 \333\362\370\1\x10\75\120            AVX,SANDYBRIDGE
 xmmreg,mem32                         \333\362\370\1\x10\110               AVX,SANDYBRIDGE
-xmmreg,xmmreg,xmmreg                 \333\362\370\1\x11\75\102            AVX,SANDYBRIDGE
 mem32,xmmreg                         \333\362\370\1\x11\101               AVX,SANDYBRIDGE
 
 [VMOVUPD]
@@ -5641,27 +5678,33 @@
 
 [VFMADD132SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x99\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\x99\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x99\75\120        FMA
 
 [VFMADD213SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xa9\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xa9\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xa9\75\120        FMA
 
 [VFMADD231SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xb9\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xb9\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xb9\75\120        FMA
 
 [VFMADD132SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\x99\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\x99\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\x99\75\120            FMA
 
 [VFMADD213SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xA9\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xA9\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xA9\75\120            FMA
 
 [VFMADD231SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xb9\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xb9\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xb9\75\120            FMA
 
 [VFMADDSUB132PD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
@@ -5755,27 +5798,33 @@
 
 [VFMSUB132SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x9B\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\x9B\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x9B\75\120        FMA
 
 [VFMSUB213SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xAB\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xAB\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xAB\75\120        FMA
 
 [VFMSUB231SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xBB\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xBB\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xBB\75\120        FMA
 
 [VFMSUB132SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\x9B\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\x9B\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\x9B\75\120            FMA
 
 [VFMSUB213SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xAB\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xAB\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xAB\75\120            FMA
 
 [VFMSUB231SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xBB\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xBB\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xBB\75\120            FMA
 
 [VFNMADD132PD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
@@ -5809,27 +5858,33 @@
 
 [VFNMADD132SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x9D\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\x9D\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x9D\75\120        FMA
 
 [VFNMADD213SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xAD\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xAD\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xAD\75\120        FMA
 
 [VFNMADD231SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xBD\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xBD\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xBD\75\120        FMA
 
 [VFNMADD132SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\x9D\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\x9D\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\x9D\75\120            FMA
 
 [VFNMADD213SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xAD\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xAD\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xAD\75\120            FMA
 
 [VFNMADD231SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xBD\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xBD\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xBD\75\120            FMA
 
 [VFNMSUB132PD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
@@ -5863,27 +5918,33 @@
 
 [VFNMSUB132SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\x9F\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\x9F\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\x9F\75\120        FMA
 
 [VFNMSUB213SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xAF\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xAF\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xAF\75\120        FMA
 
 [VFNMSUB231SD]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\363\1\xBF\75\120        FMA
+xmmreg,xmmreg,mem64                  \361\362\371\363\1\xBF\75\120        FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\363\1\xBF\75\120        FMA
 
 [VFNMSUB132SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\x9F\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\x9F\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\x9F\75\120            FMA
 
 [VFNMSUB213SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xAF\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xAF\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xAF\75\120            FMA
 
 [VFNMSUB231SS]
 (Ch_Mop3, Ch_Rop2, Ch_Rop1)
-xmmreg,xmmreg,xmmrm                  \361\362\371\1\xBF\75\120            FMA
+xmmreg,xmmreg,mem32                  \361\362\371\1\xBF\75\120            FMA
+xmmreg,xmmreg,xmmreg                 \361\362\371\1\xBF\75\120            FMA
 
 ;*******************************************************************************
 ;********** TSX ****************************************************************
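
The effect of the x86ins.dat patches above can be pictured with a toy model. This is not compiler code: TTemplate and AcceptsMem are hypothetical names, and the rule that an xmmrm template only accepts a 128-bit memory operand is a simplification of what the issue description reports (memory operands being treated as the full vector size rather than the size of a single element).

```pascal
program OperandTemplateSketch;
{$mode objfpc}

type
  { Hypothetical, simplified operand templates named after the
    x86ins.dat operand columns touched by the patches }
  TTemplate = (tXmmReg, tXmmRM, tMem32, tMem64);

{ Returns True if a memory operand with an explicit size of MemBits
  would be accepted by template T in this simplified model. }
function AcceptsMem(T: TTemplate; MemBits: Integer): Boolean;
begin
  case T of
    tXmmRM: Result := MemBits = 128; { assumed: treated as a full vector }
    tMem32: Result := MemBits = 32;
    tMem64: Result := MemBits = 64;
  else
    Result := False;                 { tXmmReg takes no memory operand }
  end;
end;

begin
  { Old scalar-double entry "xmmreg,xmmrm" against a QWORD PTR operand }
  WriteLn(AcceptsMem(tXmmRM, 64));   { prints FALSE }
  { New entry "xmmreg,mem64" against the same operand }
  WriteLn(AcceptsMem(tMem64, 64));   { prints TRUE }
  { New scalar-single entry "xmmreg,mem32" against a DWORD PTR operand }
  WriteLn(AcceptsMem(tMem32, 32));   { prints TRUE }
end.
```

In this model the register-to-register form is covered by the separate "xmmreg,xmmreg" line that each patched entry gains.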

J. Gareth Moreton

2019-06-10 23:12

developer  

assembler-operand-fix.patch (2,211 bytes)
Index: compiler/x86/rax86.pas
===================================================================
--- compiler/x86/rax86.pas	(revision 42196)
+++ compiler/x86/rax86.pas	(working copy)
@@ -239,11 +239,13 @@
           ;
       end;
     end;
+{$ifndef x86_64}
   end
   else
     begin
       if size=OS_64 then
         opsize:=S_Q;
+{$endif not x86_64}
     end;
 end;
 
@@ -530,7 +532,22 @@
                 ;
             end;
 
-            if memopsize = 0 then memopsize := topsize2memsize[tx86operand(operands[i]).opsize];
+            if memopsize = 0 then
+              begin
+{$ifdef i386}
+                { 64-bit operands are allowed for SSE and AVX instructions, so
+                  go by the byte size instead for these families of opcodes }
+                if (MemRefInfo(opcode).ExistsSSEAVX) then
+                  begin
+                    memopsize := tx86operand(operands[i]).typesize * 8;
+                    if tx86operand(operands[i]).typesize = 8 then
+                      { Will be S_L otherwise and won't be corrected in time }
+                      tx86operand(operands[i]).opsize := S_Q;
+                  end
+                else
+{$endif i386}
+                  memopsize := topsize2memsize[tx86operand(operands[i]).opsize];
+              end;
 
             if (memopsize > 0) and
                (memrefsize > 0) then
@@ -1398,12 +1415,12 @@
                      asize:=OT_BITS32;
                    OS_64,OS_S64:
                      begin
-                       { Only FPU operations know about 64bit values, for all
-                         integer operations it is seen as 32bit
+                       { Only FPU and SSE/AVX operations know about 64bit
+                         values, for all integer operations it is seen as 32bit
 
                          this applies only to i386, see tw16622}
 
-                       if gas_needsuffix[opcode] in [attsufFPU,attsufFPUint] then
+                       if (gas_needsuffix[opcode] in [attsufFPU,attsufFPUint]) or (MemRefInfo(opcode).ExistsSSEAVX) then
                          asize:=OT_BITS64
 {$ifdef i386}
                        else
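
As a rough illustration of the second hunk above, the size selection it introduces on i386 can be modelled in a standalone program. This is a sketch under stated assumptions, not the code in rax86.pas: TOpSize, OpSizeBits and SelectMemOpSize are hypothetical stand-ins for the compiler's opsize type, its topsize2memsize table and the patched branch, and the IsSSEAVX flag stands in for MemRefInfo(opcode).ExistsSSEAVX.

```pascal
program MemOpSizeSketch;
{$mode objfpc}

type
  { Hypothetical stand-in for the compiler's operand size type }
  TOpSize = (S_NO, S_B, S_W, S_L, S_Q);

const
  { Assumed bit widths, playing the role of topsize2memsize }
  OpSizeBits: array[TOpSize] of Integer = (0, 8, 16, 32, 64);

{ Returns the memory operand size in bits. For SSE/AVX opcodes the size
  comes from the byte size of the operand's type; otherwise from OpSize. }
function SelectMemOpSize(IsSSEAVX: Boolean; TypeSize: Integer;
  var OpSize: TOpSize): Integer;
begin
  if IsSSEAVX then
    begin
      Result := TypeSize * 8;
      { An 8-byte operand would otherwise keep its 32-bit operand size }
      if TypeSize = 8 then
        OpSize := S_Q;
    end
  else
    Result := OpSizeBits[OpSize];
end;

var
  Size: TOpSize;
  Bits: Integer;
begin
  { 8-byte operand on an SSE/AVX opcode: 64 bits, size promoted to S_Q }
  Size := S_L;
  Bits := SelectMemOpSize(True, 8, Size);
  WriteLn(Bits, ' ', Size = S_Q);   { prints: 64 TRUE }

  { Same operand on any other opcode: size still taken from OpSize }
  Size := S_L;
  Bits := SelectMemOpSize(False, 8, Size);
  WriteLn(Bits, ' ', Size = S_Q);   { prints: 32 FALSE }
end.
```

The second call shows the pre-existing i386 behaviour that the patch leaves in place for opcodes outside the SSE/AVX families.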

J. Gareth Moreton

2019-07-11 08:37

developer   ~0117166

Any word on the validity of this fix?

rd0x

2019-07-11 09:04

reporter   ~0117167

Last edited: 2019-07-11 09:04

Don't forget to also add the test tw32219.patch from https://bugs.freepascal.org/view.php?id=32219

J. Gareth Moreton

2019-07-11 09:06

developer   ~0117168

I haven't. Even better, I made 0032219 a related issue.

J. Gareth Moreton

2019-08-08 17:05

developer   ~0117592

Can it be confirmed that this fix is valid?

Marco van de Voort

2019-09-14 17:24

manager   ~0118077

Bump, ran into this too.

Issue History

Date Modified Username Field Change
2019-06-10 22:35 J. Gareth Moreton New Issue
2019-06-10 22:35 J. Gareth Moreton File Added: assembler-operand-fix.patch
2019-06-10 22:35 J. Gareth Moreton File Added: assembler-sse-instruction-fixes.patch
2019-06-10 22:35 J. Gareth Moreton File Added: assembler-sse2-instruction-fixes.patch
2019-06-10 22:35 J. Gareth Moreton File Added: assembler-avx-instruction-fixes.patch
2019-06-10 22:36 J. Gareth Moreton Relationship added parent of 0032219
2019-06-10 22:51 J. Gareth Moreton Relationship added parent of 0035701
2019-06-10 23:12 J. Gareth Moreton File Deleted: assembler-operand-fix.patch
2019-06-10 23:12 J. Gareth Moreton File Added: assembler-operand-fix.patch
2019-06-14 05:44 J. Gareth Moreton Tag Attached: patch
2019-06-14 05:44 J. Gareth Moreton Tag Attached: compiler
2019-06-14 05:44 J. Gareth Moreton Tag Attached: assembler
2019-06-14 05:44 J. Gareth Moreton Tag Attached: x86
2019-07-11 08:37 J. Gareth Moreton Note Added: 0117166
2019-07-11 09:04 rd0x Note Added: 0117167
2019-07-11 09:04 rd0x Note Edited: 0117167
2019-07-11 09:04 rd0x Note Edited: 0117167
2019-07-11 09:04 rd0x Note Edited: 0117167
2019-07-11 09:06 J. Gareth Moreton Note Added: 0117168
2019-08-08 17:05 J. Gareth Moreton Note Added: 0117592
2019-09-14 17:24 Marco van de Voort Note Added: 0118077