View Issue Details

ID: 0034561
Project: Lazarus
Category: IDE
View Status: public
Last Update: 2018-11-19 19:00
Reporter: Anton Kavalenka
Assigned To: Juha Manninen
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: closed
Resolution: fixed
Platform: x86_64-linux
OS: Debian
OS Version:
Product Version: 2.1 (SVN)
Product Build:
Target Version:
Fixed in Version:
Summary: 0034561: propedit: Strange handling of symbol "°" (degree sign) in collection item with widestring property
Description: Symbol "°" stored in lfm as #176 cannot be read back.
Steps To Reproduce:
Install the attached component package testpack.lpk.
Try to edit the widestring property of one of the collection items, for instance enter:
"Т, °C"

Save the form, close unit1, then open it again.
Compare the widestring property content (see the attached picture).

Another collection item with both Cyrillic and Greek symbols (λ, нм), where the upper byte of the widechar is nonzero, is stored and restored properly.
Tags: No tags attached.
Fixed in Revision: r59586
LazTarget: -
Widgetset: GTK 2
Attached Files

Relationships

Related to 0017746 (resolved, Juha Manninen): WideString properties not show in Inspector in Lazarus for Mac OS X

Activities

Anton Kavalenka

2018-11-16 20:59

reporter  

laztest116.zip (67,313 bytes)

Anton Kavalenka

2018-11-16 21:00

reporter  

Anton Kavalenka

2018-11-16 21:01

reporter  

Juha Manninen

2018-11-17 12:24

developer   ~0112016

Yes, I can also reproduce it with 'äö', which are on my keyboard.
As a workaround I recommend using "String" instead of "WideString". It would even be fully Delphi compatible.
Why exactly do you want to use "WideString"?

Anton Kavalenka

2018-11-17 12:51

reporter   ~0112017

Last edited: 2018-11-17 12:53

I need widestring to be fully Delphi compatible (lots of legacy code).

Btw, why do Cyrillic ($04**) and Greek ($03**) work, but a UTF-16 character ($00B0) that fits in 8 bits does not?

The form writer even writes widechars in a Delphi-compatible way:
item
   wi.ws = #955', '#1085#1084
end

Juha Manninen

2018-11-17 15:03

developer   ~0112018

Then the problem must be in reading the streamed data.
Widechar is written in a Delphi-compatible way, OK, but where is it read in? Is it in FPC's streaming libs?

My debugging shows that TUnicodeStringPropertyEditor is used while editing the data. TWideStringPropertyEditor is not used, but that is OK; they are nearly identical.

> Btw - why cyrillic ($04**) and greek ($03**) works, but UTF16 ($00B0) that fits 8-bit does not?

Good question, I don't know.

Anton Kavalenka

2018-11-17 18:20

reporter  

laztest116.1.zip (67,843 bytes)

Anton Kavalenka

2018-11-17 18:23

reporter   ~0112024

laztest116.1.zip tests the FPC streaming on button click.

ObjectBinaryToText produces exactly the same dump that Lazarus saves in the lfm (see dump1.txt).

ObjectTextToBinary properly restores the binary data.

So the problem is somewhere in Lazarus.
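
The FPC-side check described above can be sketched as a small console program. This is only a minimal sketch, not the code in laztest116.1.zip: it assumes the text from dump1.txt as input, uses `ObjectTextToBinary` and `ObjectBinaryToText` from FPC's `Classes` unit, and the program and variable names are made up for illustration.

```pascal
program StreamRoundTrip;
{$mode objfpc}{$H+}

uses
  Classes;

const
  // Object text as it appears in dump1.txt attached to this report.
  LfmText =
    'object TComp'#10 +
    '  Cols = <'#10 +
    '    item'#10 +
    '      wi.ws = #1058'',''#176''C'''#10 +
    '    end'#10 +
    '    item'#10 +
    '      wi.ws = #955'', ''#1085#1084'#10 +
    '    end>'#10 +
    'end'#10;

var
  TextIn, TextOut: TStringStream;
  Bin: TMemoryStream;
begin
  TextIn := TStringStream.Create(LfmText);
  Bin := TMemoryStream.Create;
  TextOut := TStringStream.Create('');
  try
    // Text form -> binary form, using FPC's own parser (not Lazarus' TUTF8Parser).
    ObjectTextToBinary(TextIn, Bin);
    // Binary form -> text form again, to compare with the input by eye.
    Bin.Position := 0;
    ObjectBinaryToText(Bin, TextOut);
    WriteLn(TextOut.DataString);
  finally
    TextOut.Free;
    Bin.Free;
    TextIn.Free;
  end;
end.
```

If the printed text still contains #176 for the degree sign, the FPC streaming routines handle the value and the loss has to happen in the Lazarus reader.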

Anton Kavalenka

2018-11-17 18:24

reporter  

dump1.txt (130 bytes)
object TComp
  Cols = <  
    item
      wi.ws = #1058','#176'C'
    end  
    item
      wi.ws = #955', '#1085#1084
    end>
end

Anton Kavalenka

2018-11-17 21:20

reporter   ~0112028

Last edited: 2018-11-17 21:23

Lazarus uses its own text reader, which in turn uses TUTF8Parser (lresources.pas:5645).

The parser makes a very bold assumption while reading a widestring:

function TUTF8Parser.HandleDecimalString(var IsWideString: Boolean): string;


  if not TryStrToInt(Result,i) then
    i:=0;
  if i > 255 then begin
    Result:=UnicodeToUTF8(i); // widestring
    IsWideString:=true;
  end else if i > 127 then
    Result:=SysToUTF8(chr(i)) // windows codepage
  else
    Result:=chr(i); // ascii, does not happen

Decimal strings are always Delphi-like and should be treated as wide.

Lazarus writes UTF-8 as-is (using the #-notation only for codes less than 0x20, i.e. 0x09, 0x0d, 0x0a).

This code effectively maps Unicode code points 128-255 to the Windows single-byte code page.
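
To make the difference concrete, here is a minimal sketch that uses only the FPC system unit. It is not the Lazarus source: `DecodeAsUnicode` and `DecodeAsAnsiByte` are hypothetical helpers that mimic the two interpretations of the value 176. Read as a Unicode code point, U+00B0 becomes the two UTF-8 bytes C2 B0; read as a single byte, its meaning depends on whichever code page is applied afterwards.

```pascal
program DecimalCharDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: treat the decimal value as a Unicode code point
// (BMP only in this sketch) and encode it as UTF-8.
function DecodeAsUnicode(Code: Integer): RawByteString;
var
  W: UnicodeString;
begin
  W := WideChar(Code);
  Result := UTF8Encode(W);
end;

// Hypothetical helper: keep the decimal value as one raw byte, which is
// what happens before a code page conversion is applied to it.
function DecodeAsAnsiByte(Code: Integer): RawByteString;
begin
  Result := Chr(Byte(Code));
end;

// Print the bytes of S in hexadecimal.
procedure DumpBytes(const Caption: string; const S: RawByteString);
var
  I: Integer;
begin
  Write(Caption, ':');
  for I := 1 to Length(S) do
    Write(' $', HexStr(Ord(S[I]), 2));
  WriteLn;
end;

begin
  DumpBytes('176 as Unicode code point', DecodeAsUnicode(176));
  DumpBytes('176 as single byte', DecodeAsAnsiByte(176));
end.
```

The single-byte form only turns back into "°" if the system code page happens to place the degree sign at 176, which is why the result depends on the machine.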

Anton Kavalenka

2018-11-17 21:34

reporter  

lresources.diff (486 bytes)
--- /tmp/meld-tmpu9fqzx0d
+++ /projects/lazarus/lcl/lresources.pp
@@ -5642,13 +5642,8 @@
   end;
   if not TryStrToInt(Result,i) then
     i:=0;
-  if i > 255 then begin
-    Result:=UnicodeToUTF8(i); // widestring
-    IsWideString:=true;
-  end else if i > 127 then
-    Result:=SysToUTF8(chr(i)) // windows codepage
-  else
-    Result:=chr(i); // ascii, does not happen
+  Result:=UnicodeToUTF8(i); // widestring
+  IsWideString:=true;
 end;
 
 procedure TUTF8Parser.HandleString;


Anton Kavalenka

2018-11-17 21:35

reporter   ~0112029

patch provided

Juha Manninen

2018-11-18 11:36

developer   ~0112036

Last edited: 2018-11-18 11:51

You have studied the code! I had no memory of where the relevant code is.
Applied with minor modifications, thanks.
Please check the related issue, too.

Anton Kavalenka

2018-11-18 20:19

reporter   ~0112053

Last edited: 2018-11-18 20:23

In rev 18304 the widestring writing was fixed.

Now the widestring reading is fixed.

If there had been some tests for LRSObject(TextToBinary|BinaryToText) covering most corner cases in .lfm streaming,
this problem would have been caught 9 (NINE) years ago.

I think this patch resolves 0017746.
I will also try to check on MacOS/X, but I'm sure this bug is platform-independent.

Anton Kavalenka

2018-11-19 19:00

reporter   ~0112078

MacOS/X tested

Issue History

Date Modified Username Field Change
2018-11-16 20:58 Anton Kavalenka New Issue
2018-11-16 20:59 Anton Kavalenka File Added: laztest116.zip
2018-11-16 21:00 Anton Kavalenka File Added: Здымак экрана, 2018-11-16 22-44-14.png
2018-11-16 21:01 Anton Kavalenka File Added: Здымак экрана, 2018-11-16 22-56-36.png
2018-11-17 12:24 Juha Manninen Note Added: 0112016
2018-11-17 12:51 Anton Kavalenka Note Added: 0112017
2018-11-17 12:53 Anton Kavalenka Note Edited: 0112017
2018-11-17 15:03 Juha Manninen Note Added: 0112018
2018-11-17 18:20 Anton Kavalenka File Added: laztest116.1.zip
2018-11-17 18:23 Anton Kavalenka Note Added: 0112024
2018-11-17 18:24 Anton Kavalenka File Added: dump1.txt
2018-11-17 21:20 Anton Kavalenka Note Added: 0112028
2018-11-17 21:21 Anton Kavalenka Note Edited: 0112028
2018-11-17 21:23 Anton Kavalenka Note Edited: 0112028
2018-11-17 21:34 Anton Kavalenka File Added: lresources.diff
2018-11-17 21:35 Anton Kavalenka Note Added: 0112029
2018-11-18 00:00 Juha Manninen Assigned To => Juha Manninen
2018-11-18 00:00 Juha Manninen Status new => assigned
2018-11-18 00:35 Juha Manninen Relationship added related to 0017746
2018-11-18 11:36 Juha Manninen Fixed in Revision => r59586
2018-11-18 11:36 Juha Manninen LazTarget => -
2018-11-18 11:36 Juha Manninen Note Added: 0112036
2018-11-18 11:36 Juha Manninen Status assigned => resolved
2018-11-18 11:36 Juha Manninen Resolution open => fixed
2018-11-18 11:51 Juha Manninen Note Edited: 0112036
2018-11-18 20:19 Anton Kavalenka Note Added: 0112053
2018-11-18 20:23 Anton Kavalenka Note Edited: 0112053
2018-11-19 19:00 Anton Kavalenka Note Added: 0112078
2018-11-19 19:00 Anton Kavalenka Status resolved => closed