View Issue Details

ID: 0026803
Project: FPC
Category: RTL
View Status: public
Last Update: 2018-01-07 17:46
Reporter: Sergey Smirnov
Assigned To: Michael Van Canneyt
Priority: normal
Severity: minor
Reproducibility: always
Status: feedback
Resolution: open
Platform: Intel x64
OS: Microsoft Windows
OS Version: 8.1 Professional
Product Version: 2.6.4
Product Build: 45510
Target Version:
Fixed in Version:
Summary: 0026803: FloatToStrF inserts wrong thousand separator
Description:
When the Format parameter is ffNumber, FloatToStrF inserts a letter ('a' in my case) as the thousand separator. My default OS region setting is RU-RU.
When I change it to EN-US, everything works.
The RU-RU region settings define the digit grouping symbol as ' ' (blank), while the EN-US digit grouping symbol is ',' (comma).
Steps To Reproduce:
var
  S1: String; //AnsiString
  S2: WideString;
  S3: UnicodeString;
begin
  S1:=FloatToStrF(1.0E+7, ffNumber, 0, 0);
  S2:=FloatToStrF(1.0E+7, ffNumber, 0, 0);
  S3:=FloatToStrF(1.0E+7, ffNumber, 0, 0);
  Writeln('S1 = '+S1);
  Writeln('S2 = '+S2);
  Writeln('S3 = '+S3);
end.
Additional Information:
RU-RU Console output:
S1 = 10a000a000
S2 = 10a000a000
S3 = 10a000a000

EN-US Console output:
S1 = 10,000,000
S2 = 10,000,000
S3 = 10,000,000
Tags: No tags attached.
Fixed in Revision
FPCOldBugId
FPCTarget
Attached Files
  • ThSepDemo.zip (56,574 bytes)
  • ThSepDemo_UPDATED.zip (69,978 bytes)
  • ThSepDemo_ru_RU_Screenshot.7z (11,725 bytes)
  • ThSepDemo_UPDATE2.zip (70,013 bytes)
  • bug26803.pp (532 bytes)
    program formatsettings;
    uses
      SysUtils;
    var
      S1: AnsiString;
    
    procedure testsep(lcid:integer;const lang : string);
    begin
      GetLocaleFormatSettings(lcid,defaultformatsettings);
      S1:=FloatToStrF(1.0E+7, ffNumber, 0, 0);
      Writeln(lang,': S1 = '+S1);
    end;
    
    begin
      SetMultiByteConversionCodePage(CP_UTF8);
      testsep(1049,'ru-ru');  // the problem
      testsep(1043,'nl-nl');  // test a locale with ,=decimal and . = thousands
      testsep(1033,'en-us');  // test a locale with .=decimal and , = thousands
      ReadLn;
    end.

Relationships

related to 0026065 feedback FPC DefaultFormatSettings.ThousandSeparator gives nothing 
related to 0029781 closed Michael Van Canneyt FPC Problems DisplayFormat Lazarus 1.6 fpc 3.0 
related to 0030253 resolved Ondrej Pokorny Lazarus FormatFloat('#,'...); Become working incorrect after changing screen resolution 

Activities

Sergey Smirnov

2014-10-01 03:16

reporter   ~0077833

Last edited: 2014-10-14 14:06

RU-RU means that language is russian and country is Russia (ru-RU locale).
EN-US is English (USA) (en-US locale).

Bart Broersma

2014-10-05 11:22

reporter   ~0077997

> RU-RU region settings defined digit grouping symbol as ' ' (blank)
Blanks as in Chr(32) or some other Unicode character?

Sergei Gorelkin

2014-10-05 12:12

developer   ~0077999

It should be a non-breaking space (U+00A0, decimal code 160). At least that's what I have on Windows 7 with Russian locale.

Bart Broersma

2014-10-05 17:44

reporter   ~0078007

Does the console (OEM codepage) support the non-breaking space character?
Using CharToOEMW before writing to the console might work.
(With it, my console (cp850, I think) outputs a blank; without it, it outputs an "á".)

Sergey Smirnov

2014-10-10 11:33

reporter   ~0078132

Last edited: 2014-10-10 11:34

The thousand separator (LOCALE_STHOUSAND) is the Unicode character U+00A0 in Windows 8.1 (ru-RU locale).

Sergey Smirnov

2014-10-14 13:59

reporter  

ThSepDemo.zip (56,574 bytes)

Sergey Smirnov

2014-10-14 15:29

reporter   ~0078220

Oops. The source code of the demo (ThSepDemo) contains numerous errors. The demo was originally designed as a GUI app. The source code must be rewritten...

Sergey Smirnov

2014-10-20 13:08

reporter  

ThSepDemo_UPDATED.zip (69,978 bytes)

Sergey Smirnov

2014-10-20 13:17

reporter   ~0078417

Last edited: 2014-10-21 06:31

A very interesting situation: while investigating one problem, I found a second one.
The demo shows two failures:
1. Writeln fails to produce correct output when the console codepage is OEM.
2. FloatToStrF does not pick up the change when one of the NLS parameters is changed.

Sergey Smirnov

2014-10-20 13:25

reporter  

ThSepDemo_ru_RU_Screenshot.7z (11,725 bytes)

Sergey Smirnov

2014-10-21 08:13

reporter  

ThSepDemo_UPDATE2.zip (70,013 bytes)

Sergey Smirnov

2014-10-21 08:28

reporter   ~0078450

ThSepDemo minor changes (mostly cosmetic).

Ondrej Pokorny

2016-06-12 08:18

developer   ~0093143

Use GetFormatSettingsUTF8 from LazUTF8 (LazUtils) as a workaround.

Marco van de Voort

2016-06-12 22:13

manager   ~0093158

I ran the program with 3.0 and trunk, and I see the Russian locale with spaces (which I assume are shifted ($A0), but the program is not very redirect friendly).

So I assume this is resolved?

Ondrej Pokorny

2016-06-13 10:01

developer   ~0093170

Last edited: 2016-06-13 10:03

> So I assume this is resolved?

No, unfortunately not. I fixed it in LazUtils/LazUTF8 yesterday. If you use a plain FCL program, the bug is still there:

program formatsettings;
uses
  SysUtils;
var
  S1: AnsiString;
begin
  SetMultiByteConversionCodePage(CP_UTF8);
  GetFormatSettings;
  S1:=FloatToStrF(1.0E+7, ffNumber, 0, 0);
  Writeln('S1 = '+S1);
  ReadLn;
end.


Prints:
10?000?000


Whereas this works well:
program formatsettings;
uses
  SysUtils, LazUTF8;
var
  S1: AnsiString;
begin
  SetMultiByteConversionCodePage(CP_UTF8);
  GetFormatSettingsUTF8; // use LazUTF8
  S1:=FloatToStrF(1.0E+7, ffNumber, 0, 0);
  Writeln('S1 = '+S1);
  ReadLn;
end.


You need a similar approach (maybe the same :) ) as used in GetLocaleCharUTF8(). For GetLocaleStr() there's a patch in http://bugs.freepascal.org/view.php?id=27086 from Mattias (0027086 is the same issue).

Michael Van Canneyt

2016-06-13 10:59

administrator   ~0093171

As long as characters in TFormatSettings are ansi characters, this is not fixed.

Any character outside the system codepage will be unrepresentable, even with SetMultiByteConversionCodePage(CP_UTF8);
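
One way to sidestep the representability limit described in this note, without LazUtils, is to pass an explicit TFormatSettings record to FloatToStrF instead of relying on the locale defaults. The following is only a minimal sketch: the program name is made up, and replacing the locale's non-breaking space with a plain ASCII space (#32) is an assumption that may not suit every application.

```pascal
program ExplicitSeparator;
{$mode objfpc}{$H+}
uses
  SysUtils;
const
  Value: Double = 1.0E+7;
var
  FS: TFormatSettings;  // local copy; the global settings stay untouched
begin
  FS := DefaultFormatSettings;
  // Assumption: a plain ASCII space (#32) is an acceptable stand-in
  // for the locale's non-breaking space (U+00A0).
  FS.ThousandSeparator := ' ';
  // Use the overload that takes the settings record explicitly.
  Writeln(FloatToStrF(Value, ffNumber, 15, 0, FS));
end.
```

Because the separator is now a 7-bit ASCII character, it has the same encoding in the ANSI, OEM and UTF-8 codepages, so no conversion step can change it.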

Ondrej Pokorny

2016-06-13 11:12

developer   ~0093172

> Any character outside the system codepage will be unrepresentable, even with SetMultiByteConversionCodePage(CP_UTF8);

On my locale the thousand separator is set to #$00A0 (nbsp - Windows default for many European languages). It is in my ANSI system codepage (ANSI=#$A0, UTF8=#$C2#$A0).

GetFormatSettings still fails for it when using SetMultiByteConversionCodePage(CP_UTF8). (0026803:0093170)
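
The byte values quoted in this note can be checked with a small program. This is an illustrative sketch (the program name is made up); it prints the UTF-8 encoding of U+00A0 and the size of the ThousandSeparator field, which shows why a two-byte UTF-8 sequence cannot be stored in a one-byte field.

```pascal
program NbspBytes;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  U: UnicodeString;
  S: RawByteString;
  I: Integer;
begin
  U := WideChar($00A0);   // non-breaking space as a UTF-16 string
  S := UTF8Encode(U);     // the same character encoded as UTF-8
  Write('UTF-8 bytes:');
  for I := 1 to Length(S) do
    Write(' ', IntToHex(Ord(S[I]), 2));
  Writeln;
  // Size in bytes of the field that has to hold the separator.
  Writeln('SizeOf(ThousandSeparator) = ',
          SizeOf(DefaultFormatSettings.ThousandSeparator));
end.
```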

Marco van de Voort

2018-01-07 17:06

manager  

bug26803.pp (532 bytes)

Marco van de Voort

2018-01-07 17:24

manager   ~0105445

Last edited: 2018-01-07 17:27

How can the utf8 one work if tformatsettings still defines thousandseparator as a char ?

(rtl/inc/objpas/sysinth.inc
    ThousandSeparator: Char;
)

I can upgrade the locale retrieval in the win32/64 sysutils quite easily, but TFormatSettings will have to change first.

I also uploaded a test (for windows only) that doesn't need the locale to be set.

Note that the related open bug report is for ffCurrency, which uses slightly different locale settings.

Ondrej Pokorny

2018-01-07 17:45

developer   ~0105447

Last edited: 2018-01-07 17:46

> How can the utf8 one work if tformatsettings still defines thousandseparator as a char ?

Just check the sources: GetLocaleCharUTF8 replaces special UTF-8 characters with ASCII alternatives (e.g. the Unicode non-breaking space is replaced with a simple #32 space). It also truncates multi-character values where only one character is needed. For example, the Slovak date separator is two characters, a dot followed by a space: in Windows the setting gives "7. 1. 2018", and Lazarus turns it into 7.1.2018, which is acceptable.
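
The idea described in this note can be sketched as follows. This is not the LazUTF8 source: ToSingleAsciiSep is a hypothetical helper, and the list of space-like code points it maps to #32 is an assumption chosen for illustration.

```pascal
program NormalizeSep;
{$mode objfpc}{$H+}

// Hypothetical helper, not the LazUTF8 implementation: reduce a locale
// separator string to a single ASCII character.
function ToSingleAsciiSep(const Sep: UnicodeString; Fallback: AnsiChar): AnsiChar;
var
  Code: Word;
begin
  if Sep = '' then
    Exit(Fallback);
  Code := Ord(Sep[1]);        // only the first character is considered
  case Code of
    $00A0, $2009, $202F:      // no-break, thin and narrow no-break space
      Result := ' ';
    $0020..$007E:             // printable ASCII passes through unchanged
      Result := AnsiChar(Byte(Code));
  else
    Result := Fallback;       // anything else cannot be kept in one ASCII char
  end;
end;

var
  Nbsp, DotSpace: UnicodeString;
begin
  Nbsp := WideChar($00A0);    // thousand separator reported for ru-RU
  DotSpace := '. ';           // two-character date separator from the Slovak example
  Writeln('[', ToSingleAsciiSep(Nbsp, ','), ']');      // prints [ ]
  Writeln('[', ToSingleAsciiSep(DotSpace, '/'), ']');  // prints [.]
end.
```

The trade-off is the one mentioned above: the result always fits in a Char field, at the cost of losing the exact locale character.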

Issue History

Date Modified Username Field Change
2014-09-30 17:17 Sergey Smirnov New Issue
2014-10-01 03:16 Sergey Smirnov Note Added: 0077833
2014-10-01 03:22 Sergey Smirnov Note Edited: 0077833
2014-10-01 08:36 Reinier Olislagers Relationship added related to 0026065
2014-10-05 11:22 Bart Broersma Note Added: 0077997
2014-10-05 12:12 Sergei Gorelkin Note Added: 0077999
2014-10-05 17:44 Bart Broersma Note Added: 0078007
2014-10-10 11:33 Sergey Smirnov Note Added: 0078132
2014-10-10 11:34 Sergey Smirnov Note Edited: 0078132
2014-10-14 13:59 Sergey Smirnov File Added: ThSepDemo.zip
2014-10-14 14:06 Sergey Smirnov Note Edited: 0077833
2014-10-14 15:29 Sergey Smirnov Note Added: 0078220
2014-10-16 13:55 Michael Van Canneyt Assigned To => Michael Van Canneyt
2014-10-16 13:55 Michael Van Canneyt Status new => assigned
2014-10-20 13:08 Sergey Smirnov File Added: ThSepDemo_UPDATED.zip
2014-10-20 13:17 Sergey Smirnov Note Added: 0078417
2014-10-20 13:25 Sergey Smirnov File Added: ThSepDemo_ru_RU_Screenshot.7z
2014-10-21 06:31 Sergey Smirnov Note Edited: 0078417
2014-10-21 08:13 Sergey Smirnov File Added: ThSepDemo_UPDATE2.zip
2014-10-21 08:28 Sergey Smirnov Note Added: 0078450
2016-03-20 19:21 Michael Van Canneyt Relationship added related to 0029781
2016-06-12 08:18 Ondrej Pokorny Note Added: 0093143
2016-06-12 15:36 Martin Friebe Relationship added related to 0030253
2016-06-12 22:13 Marco van de Voort Note Added: 0093158
2016-06-12 22:13 Marco van de Voort Status assigned => feedback
2016-06-13 10:01 Ondrej Pokorny Note Added: 0093170
2016-06-13 10:02 Ondrej Pokorny Note Edited: 0093170
2016-06-13 10:02 Ondrej Pokorny Note Edited: 0093170
2016-06-13 10:03 Ondrej Pokorny Note Edited: 0093170
2016-06-13 10:59 Michael Van Canneyt Note Added: 0093171
2016-06-13 11:12 Ondrej Pokorny Note Added: 0093172
2018-01-07 17:06 Marco van de Voort File Added: bug26803.pp
2018-01-07 17:24 Marco van de Voort Note Added: 0105445
2018-01-07 17:25 Marco van de Voort Note Edited: 0105445
2018-01-07 17:27 Marco van de Voort Note Edited: 0105445
2018-01-07 17:45 Ondrej Pokorny Note Added: 0105447
2018-01-07 17:46 Ondrej Pokorny Note Edited: 0105447