View Issue Details

ID: 0037757
Project: FPC
Category: Compiler
View Status: public
Last Update: 2020-09-20 03:34
Reporter: nanobit
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Platform: win32
OS: Windows
Product Version: 3.2.0
Summary: 0037757: Untyped hex-literal with (bit63 = 1) should be positive too
Description:
Bug behaviour: Untyped hex-literal values with (bit63 = 1),
thus hex-literals in range [$8000000000000000..$FFFFFFFFFFFFFFFF],
are interpreted as negative int64-values.

Correct behaviour would be:
[$8000000000000000..$FFFFFFFFFFFFFFFF] should mean uint64 (like in Delphi).
See unsigned integer: https://www.freepascal.org/docs-html/ref/refse6.html

Current FPC already supports sufficient hex-notations for all int64 values:
1) [-$8000000000000000..-$0000000000000001..$7FFFFFFFFFFFFFFF]
2) [int64($8000000000000000)..int64($FFFFFFFFFFFFFFFF)..$7FFFFFFFFFFFFFFF]
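
A minimal sketch of the two notations listed above, showing that the negated form and the explicitly typecast bit pattern denote the same int64 value. The program name and the constant names a and b are illustrative only and do not come from the report.

```pascal
program hexnotations;
{$mode objfpc}

const
  // Notation 1: negated magnitude.
  a: int64 = -$0000000000000001;
  // Notation 2: explicit typecast of the 64-bit pattern.
  b: int64 = int64($FFFFFFFFFFFFFFFF);

begin
  // Both constants hold the same int64 value.
  writeln(a = b);
end.
```

Because the typecast is explicit, the comparison should not depend on whether the untyped literal itself is read as int64 or as uint64.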

Additionally, for convenience and Delphi compatibility:
Support an implicit typecast from a hex-notated uint64 value to int64
if the target type int64 is declared explicitly.

For FPC users, the int64 compliance rule is very simple:
Typecast (bit63 = 1)-literals to represent int64 values in all compiler versions.
The new (bug-fixed) FPC needs to document this in the wiki under FPC_User_Changes:
Typecast old (bit63 = 1)-literals to become compliant with the new FPC.

Forum thread with more details and about Delphi compatibility:
https://forum.lazarus.freepascal.org/index.php/topic,50968.0.html
Tags: No tags attached.

Activities

Florian

2020-09-16 20:58

administrator   ~0125577

The important question is: may val('$ffffffffffffffff',<int64>,code); fail. If yes, it would be possible. If not, we will end up with hacking and inconsistent behaviour.

Jonas Maebe

2020-09-16 21:16

manager   ~0125578

Last edited: 2020-09-16 21:37

It also depends on which Delphi version you want to be compatible with.

$ dcc --version
dcc (Borland Delphi for Linux) 14.5
Borland Delphi for Linux Version 14.5
Copyright (c) 1983,2002 Borland Software Corporation

$ cat int64.pas
{$r+,q+}

const
  c = $ffffffffffffffff;

procedure testbyte(b: byte);
begin
  writeln(b);
end;

procedure testshortint(s: shortint);
begin
  writeln(s);
end;

begin
  testbyte(c); // line 17
  testshortint(c);
end.

$ dcc int64.pas
Borland Delphi for Linux Version 14.5
Copyright (c) 1983,2002 Borland Software Corporation
int64.pas(17) Error: Constant expression violates subrange bounds
int64.pas(20)

Maybe we can change it for {$mode delphiunicode}, since just like changing the default "string" in {$h+} from ansistring to unicodestring, this is a compatibility-breaking change. While I don't know which Delphi version changed this particular behaviour, it's probably one that is at least close to the switch to unicodestring. And we can also make it available separately as a modeswitch. All of that provided that Florian's test indeed also fails.

Sven Barth

2020-09-16 22:32

manager   ~0125582

In Delphi 10.2 the following code:

program tval64;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  u: UInt64;
  i: Int64;
  code: Integer;
begin
  Val('$ffffffffffffffff', u, code);
  Writeln(IntToHex(u, 16));
  Writeln(code);
  Val('$ffffffffffffffff', i, code);
  Writeln(IntToHex(i, 16));
  Writeln(code);
end.


will print the following:

FFFFFFFFFFFFFFFF
0
FFFFFFFFFFFFFFFF
0


FPC will print the same.

@Jonas: for your test Delphi 10.2 complains about both calls with that error.

Florian

2020-09-16 22:48

administrator   ~0125583

Last edited: 2020-09-16 22:49

> FPC will print the same.

Yes. But not after the proposed change. Either $ffffffffffffffff is a valid int64, in which case val accepts it, or it is not, in which case val will no longer accept it (not to mention that the compiler relies on exactly this behaviour to find out what type a constant has).

Edit: Sorry, mixed up posters.

Jonas Maebe

2020-09-16 22:54

manager   ~0125584

> for your test Delphi 10.2 complains about both calls with that error.
Yes, that makes sense if $ffffffffffffffff is now high(qword).

Serge Anvarov

2020-09-16 23:16

reporter   ~0125585

In older versions of Delphi, there was no UInt64 type. The documentation said that the maximum value is $7FFFFFFFFFFFFFFF. You may also be interested in this quote from the Delphi 7 documentation: "The dollar-sign prefix indicates a hexadecimal numeral--for example, $8F. Hexadecimal numbers without a preceding '-' unary operator are taken to be positive values. During an assignment, if a hexadecimal value lies outside the range of the receiving type an error is raised, except in the case of the Integer (32-bit Integer) where a warning is raised. In this case, values exceeding the positive range for Integer are taken to be negative numbers in a manner consistent with 2's complement integer representation."
New versions of Delphi support UInt64, and the documentation explicitly states which type a constant has: http://docwiki.embarcadero.com/RADStudio/Rio/en/Declared_Constants#True_Constants . It also states that hexadecimal constants are positive by default: http://docwiki.embarcadero.com/RADStudio/Rio/en/Fundamental_Syntactic_Elements_%28Delphi%29#Numerals

nanobit

2020-09-17 10:28

reporter   ~0125590

In new Delphi:
Val('$ffffffffffffffff', u64, code); // (u64=high(uint64)), (code=0), minimum requirement for new FPC
Val('$ffffffffffffffff', i64, code); // (i64=-1), (code=0), includes implicit typecast from uint64 to int64
Val('$ffffffff', i32, code); // (i32=-1), (code=0), includes implicit typecast from uint32 to int32
Val('$ffffffffffffffff', i32, code); // (i32=-1), (code=10), inputStr[10] is first invalid (ignored)
Val('$ffffffffefffffff', i32, code); // (i32=-1), (code=10), inputStr[10] is first invalid (ignored)

Florian

2020-09-17 21:26

administrator   ~0125597

The second will not work anymore. To summarize: we break people's code, remove existing functionality, and as soon as there is an int128 type things change again. So what is the gain of changing this?

nanobit

2020-09-17 23:11

reporter   ~0125602

Last edited: 2020-09-17 23:14

I would allow both (like in Delphi):
Val('$ffffffffffffffff', u64, code); // required (means u64 := $ffffffffffffffff)
Val('$ffffffffffffffff', i64, code); // for convenience (means i64 := $ffffffffffffffff), implicit typecast.
From the user's point of view I don't see any complication, only improvements.

If FPC does not allow the implicit typecast of this literal to int64, then the user has more work:
Compiletime range-error would require him to explicitly cast int64($ffffffffffffffff),
which is just the same as the implicit typecast (no functional gain by explicit typecast).
If you don't like the implicit typecast, then this convenience-option could be disabled by a mode, now or later.

There is not much code that would break (if any): Users can easily find their (bit63=1)-literals
with "\$[8-9a-fA-F][0-9a-fA-F]{15}" and typecast them. The users don't even need to know the context,
the old literal was int64 and should remain int64. If implicit typecast is supported,
then even less work: all int64-typed locations (arrays, ..) work without change.
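
As a rough illustration of that search, the sketch below applies the pattern above to a sample source line using TRegExpr from the RegExpr unit shipped with FPC. The function name CountBit63Literals and the sample string are hypothetical; a real check would read the project's source files instead.

```pascal
program findlits;
{$mode objfpc}{$H+}

uses
  RegExpr;

const
  // Hypothetical sample line; only the first and third literal have bit63 set.
  Sample = 'a := $FFFFFFFFFFFFFFFF; b := $7FFFFFFFFFFFFFFF; c := $8000000000000000;';

// Prints every 16-digit hex literal whose leading digit is 8..F
// and returns how many were found.
function CountBit63Literals(const src: string): integer;
var
  re: TRegExpr;
begin
  Result := 0;
  re := TRegExpr.Create;
  try
    re.Expression := '\$[8-9a-fA-F][0-9a-fA-F]{15}';
    if re.Exec(src) then
      repeat
        writeln(re.Match[0]);
        Inc(Result);
      until not re.ExecNext;
  finally
    re.Free;
  end;
end;

begin
  writeln('literals to typecast: ', CountBit63Literals(Sample));
end.
```

The pattern only looks at the first 16 digits after the dollar sign, so literals written with extra leading zeros would need a separate check.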

The actual gain of all is ($ffffffffffffffff > 0) and all its benefits:
including smooth transition to int128.
Val('$ffffffffffffffff', i128, code); // is in range and would return a positive value
Val('$ffffffffffffffffffffffffffffffff', i128, code); // is out of range, thus -1 with implicit typecast.

Florian

2020-09-17 23:23

administrator   ~0125603

> I would allow both (like in Delphi):

This is not possible (or only with hacking and introducing inconsistencies). The compiler uses val (for consistency between the RTL and the compiler, which Delphi apparently does not have) to determine whether a certain digit sequence can be a certain type. The first check is int64, then qword, then the reals. So the only proper solution to change this is to let val(...,int64,...) return an error if $ffffffffffffffff is passed.
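
The check order described above can be sketched roughly as follows. This is not the compiler's actual code: ClassifyLiteral is a hypothetical helper that only mirrors the sequence int64, then qword, then real, using the RTL's Val.

```pascal
program classify;
{$mode objfpc}{$H+}

// Hypothetical helper: returns the first type for which Val accepts s.
function ClassifyLiteral(const s: string): string;
var
  i: int64;
  q: qword;
  d: double;
  code: longint;
begin
  Val(s, i, code);          // first check: int64
  if code = 0 then
    exit('int64');
  Val(s, q, code);          // second check: qword
  if code = 0 then
    exit('qword');
  Val(s, d, code);          // last check: real
  if code = 0 then
    exit('real');
  Result := 'invalid';
end;

begin
  writeln(ClassifyLiteral('$7fffffffffffffff'));
  writeln(ClassifyLiteral('$ffffffffffffffff'));
  writeln(ClassifyLiteral('18446744073709551615'));
  writeln(ClassifyLiteral('1.5'));
end.
```

With this ordering, the second call only reaches the qword check if Val rejects the string for int64. According to the test results reported earlier in this thread, the current RTL accepts it, so the int64 branch is taken.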

nanobit

2020-09-18 10:22

reporter   ~0125608

Last edited: 2020-09-20 03:34

Formally, untyped $ffffffffffffffff always is uint64 at first stage, regardless of target.
i64 := $ffffffffffffffff; actually (in sense of internally) is: i64 := int64( uint64($ffffffffffffffff));
uint64() is from language definition of (bit63=1)-literals.
int64() occurs only if target has declared int64 type and the compiler
supports implicit typecast (for (bit63=1)-literals only) to int64 target.
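
A small sketch of the two-stage reading i64 := int64(uint64(...)) described above, written with explicit casts on variables so that it does not depend on how the compiler types the literal. The program and variable names are illustrative.

```pascal
program reinterpret;
{$mode objfpc}

var
  u: qword;
  i: int64;
begin
  u := high(qword);   // stage 1: the value $ffffffffffffffff as unsigned 64-bit
  i := int64(u);      // stage 2: explicit cast keeps the bit pattern
  writeln(u);         // 18446744073709551615
  writeln(i);         // -1
end.
```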

i128 := $ffffffffffffffff; actually is: i128 := uint64($ffffffffffffffff);
i32 := $ffffffffffffffff; actually is: i32 := uint64($ffffffffffffffff); // range-error
i128 := $ffffffffffffffffffffffffffffffff; actually is:
i128 := int128(uint128($ffffffffffffffffffffffffffffffff));
The language has to define individually for every signed target-type (sizeOf())
whether the implicit typecast (only from unsigned source-literal representing the same sizeOf()) is supported or not.
It's a decision only about suppressing larger-than-signed-range error or not.

The rule is: Implicit typecast (for (bit63=1)-literals) is allowable only at locations
where a compile-time range-error (hex-literal value is too large (> high(signedType))) would occur.
If the implicit typecast is unsupported there, then the compile-time range-error occurs.
Increasing the strictness is no problem: Fewer implicit typecasts means more compile-time range-errors.
Every implicit typecast gives a compile-time range warning. Possible reasons for the warning despite functional safety:
1) The large literal could be a programmer's mistake: it was not intended to be a negative int64 bit pattern.
2) The negative int64 bit pattern at that place is supported, but deprecated (better alternatives exist).
In Delphi, implicit-typecast was introduced to make upgrading old source-code more silent.
Pascal compilers are free to reduce the support of implicit typecast to any degree (until zero).
And the degree of support can be different between Pascal compilers.

Internal compiler-work is a different realm (black box), I will have to wait/accept here.

Issue History

Date Modified      Username        Field
2020-09-16 20:42   nanobit         New Issue
2020-09-16 20:58   Florian         Note Added: 0125577
2020-09-16 21:16   Jonas Maebe     Note Added: 0125578
2020-09-16 21:37   Jonas Maebe     Note Edited: 0125578
2020-09-16 22:32   Sven Barth      Note Added: 0125582
2020-09-16 22:48   Florian         Note Added: 0125583
2020-09-16 22:49   Florian         Note Edited: 0125583
2020-09-16 22:54   Jonas Maebe     Note Added: 0125584
2020-09-16 23:16   Serge Anvarov   Note Added: 0125585
2020-09-17 10:28   nanobit         Note Added: 0125590
2020-09-17 21:26   Florian         Note Added: 0125597
2020-09-17 23:11   nanobit         Note Added: 0125602
2020-09-17 23:14   nanobit         Note Edited: 0125602
2020-09-17 23:23   Florian         Note Added: 0125603
2020-09-18 10:22   nanobit         Note Added: 0125608
2020-09-18 10:36   nanobit         Note Edited: 0125608
2020-09-20 02:24   nanobit         Note Edited: 0125608
2020-09-20 03:04   nanobit         Note Edited: 0125608
2020-09-20 03:34   nanobit         Note Edited: 0125608