View Issue Details

ID: 0032731
Project: Lazarus
Category: LCL
View Status: public
Last Update: 2017-12-10 11:57
Reporter: Fedon Kadifeli
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Platform: x86_64
OS: Windows
Product Version: 1.6.4
Summary: 0032731: SelLength and SelText return incorrect values when there are codepoints outside Unicode BMP
Description: For TEdit or TMemo (and possibly other controls), the SelLength property returns the wrong value for selections containing codepoints above $FFFF.


Environment: Windows 7 + Lazarus 1.6.4 (64 bit).
Steps To Reproduce: Create a TMemo or TEdit and, at run time, paste the character 𝛁 (U+1D6C1, 'MATHEMATICAL BOLD NABLA') into it. Then select all (Ctrl+A) and run the following code:

    // Shows: codepoints in Text, SelLength, codepoints in SelText
    ShowMessage(IntToStr(UTF8Length(Memo1.Text))
      + ' ' + IntToStr(Memo1.SelLength)
      + ' ' + IntToStr(UTF8Length(Memo1.SelText)));


It will display 1 2 1.

Strangely enough, if you add

    Memo1.SelectAll;

before the code above, it will display 1 1 1.
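
The "2" is what one would expect if the selection length is being counted in UTF-16 code units rather than codepoints. This is an assumption that matches the surrogate-pair explanation in the notes; it has not been checked against the LCL source. A minimal standalone sketch, using only the FPC System unit (the helper `CodePointCount` is hypothetical and is not an LCL function):

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: counts codepoints in a UTF-16 string by
// treating each high/low surrogate pair as a single codepoint.
function CodePointCount(const S: UnicodeString): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(S) do
  begin
    // High surrogate ($D800..$DBFF) followed by low surrogate
    // ($DC00..$DFFF): one codepoint outside the BMP.
    if (Ord(S[i]) >= $D800) and (Ord(S[i]) <= $DBFF) and (i < Length(S))
       and (Ord(S[i + 1]) >= $DC00) and (Ord(S[i + 1]) <= $DFFF) then
      Inc(i, 2)
    else
      Inc(i);
    Inc(Result);
  end;
end;

var
  Nabla: UnicodeString;
begin
  // U+1D6C1 encoded in UTF-16 is the surrogate pair $D835 $DEC1.
  SetLength(Nabla, 2);
  Nabla[1] := WideChar($D835);
  Nabla[2] := WideChar($DEC1);
  WriteLn(Length(Nabla));          // prints 2 (UTF-16 code units)
  WriteLn(CodePointCount(Nabla));  // prints 1 (codepoints)
end.
```

The two printed values correspond to the "2" and "1" seen for SelLength in the two runs above.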
Additional Information: Please run the attached code in a Windows + Lazarus (1.6.4) environment and select 1, 2, and then 3 characters starting from the beginning of the edit box. The relevant values are displayed on the right.

I tested the same code in Linux + Lazarus (1.8.0~rc4) and it did not have the problem described above.

Forum post: http://forum.lazarus.freepascal.org/index.php/topic,39045.0.html
Tags: No tags attached.
Fixed in Revision:
LazTarget: -
Widgetset: Win32/Win64
Attached Files

Relationships

related to 0032101 (new): Memo, Edit controls: lost input characters when you enter Unicode via IME on Windows

Activities

Fedon Kadifeli

2017-11-26 12:53

reporter  

SelLength.zip (2,207 bytes)

Fedon Kadifeli

2017-11-26 13:13

reporter  

1.png (3,349 bytes)   

Fedon Kadifeli

2017-11-26 13:13

reporter  

2.png (3,235 bytes)   

Juha Manninen

2017-11-26 13:22

developer   ~0104284

Yes, surrogate pairs are not handled correctly.

Thaddy de Koning

2017-12-05 21:26

reporter   ~0104478

This depends on the Windows version: XP and later use true UTF-16, earlier versions use UCS-2. It is a hornet's nest.
For developers: UTF-16 is a variable-width encoding, while UCS-2 is fixed at two bytes per character.
http://www.differencebetween.net/technology/software-technology/difference-between-ucs-2-and-utf-16/
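
To illustrate the variable-width point, here is a small sketch of how many 16-bit units UTF-16 needs per codepoint and how a codepoint above $FFFF splits into a surrogate pair. The names `Utf16Units` and `ToSurrogates` are made up for this example and are not part of FPC or the LCL.

```pascal
program Utf16Width;
{$mode objfpc}{$H+}

// Hypothetical helper: number of 16-bit code units that UTF-16
// needs to encode one codepoint.
function Utf16Units(CodePoint: Cardinal): Integer;
begin
  if CodePoint > $FFFF then
    Result := 2   // surrogate pair
  else
    Result := 1;  // single unit, same as UCS-2
end;

// Hypothetical helper: splits a codepoint above $FFFF into its
// high and low surrogate code units.
procedure ToSurrogates(CodePoint: Cardinal; out HighUnit, LowUnit: Word);
begin
  Dec(CodePoint, $10000);
  HighUnit := Word($D800 + (CodePoint shr 10));
  LowUnit := Word($DC00 + (CodePoint and $3FF));
end;

var
  HighUnit, LowUnit: Word;
begin
  WriteLn(Utf16Units($2207));   // U+2207 NABLA, inside the BMP: prints 1
  WriteLn(Utf16Units($1D6C1));  // U+1D6C1, outside the BMP: prints 2
  ToSurrogates($1D6C1, HighUnit, LowUnit);
  WriteLn(HexStr(HighUnit, 4), ' ', HexStr(LowUnit, 4));  // prints D835 DEC1
end.
```

Code that assumes one 16-bit unit per character (the UCS-2 view) will therefore over-count any text containing such pairs.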

Fedon Kadifeli

2017-12-06 10:36

reporter   ~0104509

Last edited: 2017-12-10 11:57

My environment was Windows 7 (64-bit) + Lazarus 1.6.4 (64-bit). After the new release (2017-12-03) I tested the code and saw that the problem still persists on Lazarus 1.8.0.

I also tested the code on Windows 10 (1709) + Lazarus 1.8.0 (64-bit), and it showed the same problem.

In Linux this problem *does not* occur.

Juha Manninen

2017-12-07 14:44

developer   ~0104565

I edited the summary and description to be more accurate.
I believe Windows 7 and 10 behave identically in this case.

Issue History

Date Modified | Username | Field | Change
2017-11-26 12:53 | Fedon Kadifeli | New Issue |
2017-11-26 12:53 | Fedon Kadifeli | File Added: SelLength.zip |
2017-11-26 13:13 | Fedon Kadifeli | File Added: 1.png |
2017-11-26 13:13 | Fedon Kadifeli | File Added: 2.png |
2017-11-26 13:22 | Juha Manninen | Relationship added | related to 0030478
2017-11-26 13:22 | Juha Manninen | Note Added: 0104284 |
2017-12-05 19:43 | Juha Manninen | Relationship deleted | related to 0030478
2017-12-05 19:43 | Juha Manninen | Relationship added | related to 0032101
2017-12-05 21:26 | Thaddy de Koning | Note Added: 0104478 |
2017-12-06 10:36 | Fedon Kadifeli | Note Added: 0104509 |
2017-12-07 09:24 | Fedon Kadifeli | Note Edited: 0104509 |
2017-12-07 14:41 | Juha Manninen | LazTarget | => -
2017-12-07 14:41 | Juha Manninen | Summary | SelLength and SelText return incorrect values when there are UTF8 characters > $FFFF in the selection => SelLength and SelText return incorrect values when there are codepoints outside Unicode BMP
2017-12-07 14:41 | Juha Manninen | Description Updated |
2017-12-07 14:44 | Juha Manninen | Note Added: 0104565 |
2017-12-10 11:57 | Fedon Kadifeli | Note Edited: 0104509 |