View Issue Details

ID: 0014189
Project: Lazarus
Category: IDE
View Status: public
Last Update: 2009-10-23 00:40
Reporter: Sam Bliss
Assigned To: Vincent Snijders
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
OS: Windows with Chinese language
Product Version: 0.9.27 (SVN)
Fixed in Version: 0.9.27 (SVN)
Summary: 0014189: ANSI character map does not work on some computers.
Description: The language of the computer is Chinese.
When I use the ANSI character map and put the mouse on 0000128..0000255,
this dialog is shown.
                  Lazarus
============================================
Access violation.

Press OK to ignore and risk data corruption.
Press Cancel to kill the program.
============================================
Tags: multi-language support, utf8
Fixed in Revision: 21089
LazTarget: -
Widgetset: Win32/Win64
Attached Files

Activities

2009-07-25 08:51

 

Bug.PNG (27,197 bytes)   

Sam Bliss

2009-07-25 08:53

reporter   ~0029241

Sorry, that line should read:
"When I use the ANSI character map and put the mouse on chr(128)..chr(255),
this dialog is shown."

Dmitry Boyarintsev

2009-07-25 12:06

developer   ~0029242

Last edited: 2009-07-25 12:07

* For some system encodings it is impossible to convert ANSI chr(128) to UTF-8. For example, if the system encoding is UTF-8, there is no single-byte chr(128) character.
* The screenshot makes it obvious that the character "decoding" fails, because the grid cells are empty: Cells[x,y] = ''.


please test this patch.

Sam Bliss

2009-07-27 07:47

reporter   ~0029272

Thanks a lot.
But you gave me a ".diff" file.
It is hard to read and use.
Could you give me a ".pas" file, please?

(Sorry, my English is not good. I might have made grammar mistakes.)

Dmitry Boyarintsev

2009-07-27 09:15

developer   ~0029273

OK. You need to replace charactermapdlg.pas in the lazarus\ide directory and rebuild the IDE.

2009-07-27 09:16

 

Sam Bliss

2009-07-28 07:58

reporter   ~0029304

I used your patch.
It is still broken.
The grid cells chr(128)..chr(255) and chr(127) are empty.
When I move my mouse over chr(128)..chr(254),
CharacterMapDialog.PageControl1.TabSheet1.CharInfoLabel.Text is '-'.
But it did not raise an error.
And I cannot insert chr(128)..chr(255) and chr(127) into the document.
See attached file "fixed.PNG".

Dmitry Boyarintsev

2009-07-28 08:05

developer   ~0029305

What is the Windows code page? GB18030?

Sam Bliss

2009-07-28 08:20

reporter   ~0029307

I used your patch.
It is still broken.
The grid cells chr(128)..chr(255) and chr(127) are empty.
When I move my mouse over chr(128)..chr(254),
CharacterMapDialog.PageControl1.TabSheet1.CharInfoLabel.Text is '-'.
But it did not raise an error.
And I cannot insert chr(128)..chr(255) and chr(127) into the document.
See attached file "fixed.PNG".

2009-07-28 08:21

 

Fixed.PNG (19,856 bytes)   

Dmitry Boyarintsev

2009-07-28 08:40

developer   ~0029308

try this fix, then

2009-07-28 08:41

 

Sam Bliss

2009-07-28 09:30

reporter   ~0029312

Last edited: 2009-07-29 10:38

Re: 0029305,
The Windows code page is GB18030 or GB2312. But the Lazarus IDE uses UTF-8, so it does not matter.

I wrote this code just now. It works! I hope this code can be added to Lazarus 0.9.28 in the future:

// Replace from Lazarus IDE v0.9.27 beta r20911 FPC 2.2.5
// ide/charactermapdlg.pas (170,1)..(193,1)
// By Brilliant <m13253 at hotmail dot com> ©2009
procedure TCharacterMapDialog.StringGrid1MouseMove(Sender: TObject;
  Shift: TShiftState; X, Y: Integer);
var Row, Col, i: Integer;
     S:Cardinal;
     tmp,tmp2:String;
begin
  if StringGrid1.MouseToGridZone(X, Y) = gzNormal then
  begin
    StringGrid1.MouseToCell(X, Y, Col, Row);
    tmp:=StringGrid1.Cells[Col, Row];
    S:=ord(UTF8Decode(tmp)[1]);
    tmp2:='';
    for i:=1 to Length(tmp) do tmp2:=tmp2+'$'+IntToHex(Ord(tmp[i]),2);
    CharInfoLabel.Caption:=inttohex(S,2)+', UTF-8 = '+tmp2;
  end
  else
  begin
    CharInfoLabel.Caption := '-';
  end;
end;

// Replace from Lazarus IDE v0.9.27 beta r20911 FPC 2.2.5
// ide/charactermapdlg.pas (229,1)..(258,1)
// By Brilliant <m13253 at hotmail dot com> ©2009
procedure TCharacterMapDialog.FillCharMap;
var
  i: Byte;
  j: Byte;
  c: Byte;
begin
  StringGrid1.ColCount:=17;
  StringGrid1.RowCount:=17;
  for i:=0 to 15 do begin
    for j:=0 to 15 do begin
      c:=i shl 4 or j;
      if (c>0) and (c<128) then
        StringGrid1.Cells[j+1, i+1]:=chr(c)
      else
        StringGrid1.Cells[j+1, i+1]:=chr($C0 or (i div $4))+chr($80 or c mod $40);
    end;
    StringGrid1.Cells[0, i+1]:=Format('%.1x', [i]);
    StringGrid1.Cells[i+1, 0]:=Format('%.1x', [i]);
  end;
  StringGrid1.Cells[0, 0]:=' ';
end;
// TODO: I have not tried in other OS or the OS that use big endian.

Dmitry Boyarintsev

2009-07-28 10:06

developer   ~0029314

charactermapdlg.pas-2.zip contains similar code.
Could you test it as well?

Sam Bliss

2009-07-28 10:15

reporter   ~0029315

No, charactermapdlg.pas-2.zip does not work.
For the problem, see 0029307.
I modified it as described in 0029312.

2009-07-28 10:34

 

Sam Bliss

2009-07-28 10:36

reporter   ~0029316

I uploaded it as the attached file "charactermapdlg.pas_Patch.zip".
I hope you can add it to a new version of the Lazarus IDE.
Do not forget Sam Bliss wrote that! :-)

Dmitry Boyarintsev

2009-07-28 10:38

developer   ~0029317

Please check:

The symbols (#$0A..#$FF) in the ANSI table you now get should be identical to the Unicode Latin-1 Supplement symbols table.

Could you also do the following:
* back up charactermapdlg.pas
* download and unzip charactermapdlg.pas-3.zip
* replace charactermapdlg.pas with the new one
* rebuild IDE and make a screenshot of Character Map dialog.
* restore backed up charactermapdlg.pas and rebuild IDE

thanks

2009-07-28 10:39

 

Sam Bliss

2009-07-28 11:15

reporter   ~0029318

No, That is bad!

Sam Bliss

2009-07-28 11:15

reporter   ~0029319

No, That is bad!

2009-07-28 11:17

 

charactermapdlg.pas-3.PNG (22,253 bytes)   

Sam Bliss

2009-07-28 11:18

reporter   ~0029320

I uploaded the screenshot "charactermapdlg.pas-3.PNG"

2009-07-28 11:22

 

charactermapdlg-4.zip (2,828 bytes)

Sam Bliss

2009-07-28 11:23

reporter   ~0029321

Last edited: 2009-07-29 10:45

I uploaded the newer patch "charactermapdlg-4.zip".
The new version can show the decimal and hexadecimal values of a character.

This code has changed:

// By Brilliant <m13253 at hotmail dot com> ©2009
// Replace from Lazarus IDE v0.9.27 beta r20911 FPC 2.2.5
// ide/charactermapdlg.pas (170,1)..(193,1)
procedure TCharacterMapDialog.StringGrid1MouseMove(Sender: TObject;
  Shift: TShiftState; X, Y: Integer);
var Row, Col, i: Integer;
     S:Cardinal;
     tmp,tmp2:String;
begin
  if StringGrid1.MouseToGridZone(X, Y) = gzNormal then
  begin
    StringGrid1.MouseToCell(X, Y, Col, Row);
    tmp:=StringGrid1.Cells[Col, Row];
    S:=Row*$10+Col-$11;
    tmp2:='';
    for i:=1 to Length(tmp) do tmp2:=tmp2+'$'+IntToHex(Ord(tmp[i]),2);
    CharInfoLabel.Caption:='Decimal = '+IntToStr(S)+', Hex = $'+inttohex(S,2)
    +', UTF-8 = '+tmp2;
  end
  else
  begin
    CharInfoLabel.Caption := '-';
  end;
end;

// Replace from Lazarus IDE v0.9.27 beta r20911 FPC 2.2.5
// ide/charactermapdlg.pas (229,1)..(258,1)

procedure TCharacterMapDialog.FillCharMap;
var
  i: Byte;
  j: Byte;
  c: Byte;
begin
  StringGrid1.ColCount:=17;
  StringGrid1.RowCount:=17;
  for i:=0 to 15 do begin
    for j:=0 to 15 do begin
      c:=i shl 4 or j;
      if (c>0) and (c<128) then
        StringGrid1.Cells[j+1, i+1]:=chr(c)
      else
        StringGrid1.Cells[j+1, i+1]:=chr($C0 or (i div $4))+chr($80 or c mod $40);
    end;
    StringGrid1.Cells[0, i+1]:=Format('%.1x', [i]);
    StringGrid1.Cells[i+1, 0]:=Format('%.1x', [i]);
  end;
  StringGrid1.Cells[0, 0]:=' ';
end;
// TODO: I have not tried in other OS or the OS that use big endian.

theo

2009-07-28 12:00

reporter   ~0029322

Be careful. For this reason the Unicode Tab was introduced.
See also: http://bugs.freepascal.org/view.php?id=12992

Dmitry Boyarintsev

2009-07-28 12:55

developer   ~0029323

Thanks for the help, Sam Bliss.

The 3rd version was not meant as a fix, but as a test of why the 2nd version failed to work. I'm uploading a 5th version, please test it.

The problem with your patches is that they replace the system ANSI code page with the first 256 Unicode characters (0..255).
It *might* be acceptable for multibyte encodings, because there might be no single-byte characters in 128..255. But it cannot be done for European code pages.
For example, Russian characters must be present in the cp1251 ANSI table.
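
The difference can be illustrated with a small Free Pascal sketch. It is not taken from any of the attached patches: the function names are hypothetical, and the cp1251 decoder is deliberately partial, relying only on the standard Windows-1251 layout in which bytes $C0..$FF hold the Russian letters U+0410..U+044F.

```pascal
program CodePageDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Reads the byte as one of the first 256 Unicode code points (Latin-1),
// which is what replacing the ANSI table with 0..255 amounts to.
function Latin1ToUnicode(b: Byte): Cardinal;
begin
  Result := b;
end;

// Hypothetical, partial cp1251 decoder: only the ASCII range and the
// letter range $C0..$FF are handled; $80..$BF is passed through unchanged
// and is NOT decoded correctly by this sketch.
function Cp1251LettersToUnicode(b: Byte): Cardinal;
begin
  if b >= $C0 then
    Result := $0410 + (b - $C0)   // $C0..$FF -> U+0410..U+044F
  else
    Result := b;
end;

begin
  // The same byte $C0 means different characters in the two tables.
  WriteLn('Latin-1: U+', IntToHex(Latin1ToUnicode($C0), 4));
  WriteLn('cp1251:  U+', IntToHex(Cp1251LettersToUnicode($C0), 4));
end.
```

Under these assumptions the byte $C0 comes out as U+00C0 in the Latin-1 reading and as U+0410 (Cyrillic capital A) in the cp1251 reading, so a table filled with the first 256 Unicode characters would not show Russian letters on a cp1251 system.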

2009-07-28 12:58

 

Dmitry Boyarintsev

2009-07-28 12:59

developer   ~0029324

also, please provide a screen shot of the correct character table.

2009-07-29 08:59

 

charactermapdlg.pas-5.PNG (22,438 bytes)   

Sam Bliss

2009-07-29 09:01

reporter   ~0029356

Attached the screen shot charactermapdlg.pas-5.PNG

theo

2009-07-29 09:44

reporter   ~0029358

I would apply the patch from Dmitry Boyarintsev against IDE crash and leave everything else as is.

It doesn't even make a lot of sense imho to show the UTF-hex-sequence in the ANSI Tab. There is the Unicode Tab for Unicode.

Sam Bliss

2009-07-29 10:15

reporter   ~0029359

(charactermapdlg.pas-5.pas)
But when I move my mouse on a cell in the stringgrid,
the caption of this form changed!
Look at the screen shot, please.

Sam Bliss

2009-07-29 10:26

reporter   ~0029361

Re: 0029358
"It doesn't even make a lot of sense imho to show the UTF-hex-sequence in the ANSI Tab. There is the Unicode Tab for Unicode. "
Do you mean there are too many informations show in CharInfoLabel?
If you think so, you can delete something.
But I need it to look up the UTF-8 encoding of an ANSI character.

And a small suggestion about the button panel:
why can't I press ESC to close the form?
There is no property such as ButtonPanel.something.Cancel.

2009-07-29 10:26

 

ANSIOSX.png (72,766 bytes)   

Dmitry Boyarintsev

2009-07-29 10:28

developer   ~0029362

The patch to fix the crash is simple (charactermapdlg.pas.diff.fixcrash.zip).

But this would leave the ANSI table empty for the Chinese code page (BUG.png). The ANSI table on OSX (with system encoding UTF-8) is filled with incorrect characters (ANSIOSX.png).

Of course, the dialog could tell the user that the current system encoding is multi-byte, so the Unicode table should be used instead of the ANSI one.

Index: charactermapdlg.pas
===================================================================
--- charactermapdlg.pas (revision 20985)
+++ charactermapdlg.pas (working copy)
@@ -161,7 +161,7 @@
   if (Button = mbLeft) and (StringGrid1.MouseToGridZone(X, Y) = gzNormal) then
   begin
     StringGrid1.MouseToCell(X, Y, Col, Row);
-    if Assigned(OnInsertCharacter) then
+    if (StringGrid1.Cells[Col, Row] <> '') and (Assigned(OnInsertCharacter)) then
       OnInsertCharacter(StringGrid1.Cells[Col, Row]);
   end;
 end;
@@ -176,9 +176,14 @@
   begin
     StringGrid1.MouseToCell(X, Y, Col, Row);
     
-    CharOrd := Ord(UTF8ToAnsi(StringGrid1.Cells[Col, Row])[1]);
-    CharInfoLabel.Caption := 'Decimal = ' + IntToStr(CharOrd) +
-      ', Hex = $' + HexStr(CharOrd, 2);
+    if StringGrid1.Cells[Col, Row] <> '' then
+    begin
+      CharOrd := Ord(UTF8ToAnsi(StringGrid1.Cells[Col, Row])[1]);
+      CharInfoLabel.Caption := 'Decimal = ' + IntToStr(CharOrd) +
+        ', Hex = $' + HexStr(CharOrd, 2);
+    end
+    else
+      CharInfoLabel.Caption := '-';
   end
   else
   begin

2009-07-29 10:43

 

Sam Bliss

2009-07-29 10:43

reporter   ~0029364

Maybe this problem happened because MacOSX uses big endian.
I said "// TODO: I have not tried in other OS or the OS that uses big endian."
Please help me:
Could the code be rewritten with that problem in mind, so that it runs correctly there as well?

Sam Bliss

2009-07-29 10:54

reporter   ~0029365

Oh, Dmitry Boyarintsev 0029362,
which version of charactermapdlg.pas did you run on OSX?
The original version from Lazarus 0.9.27?
Or version 4 or 5?

Dmitry Boyarintsev

2009-07-29 11:04

developer   ~0029366

Last edited: 2009-07-29 11:09

Both of them. Both patches work as expected: characters 128..255 are replaced by their Unicode equivalents.

The ANSIOSX screenshot shows how OSX behaves without any patches.

I'm using an Intel Mac. There's no difference in UTF-8 encoding between big-endian and little-endian byte order.
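
As a rough illustration of the byte-order point (a sketch, not code from any patch in this issue): UTF-8 is defined as a sequence of single bytes, so there is nothing to swap, whereas a 16-bit code unit is stored differently depending on byte order. The program below uses the Free Pascal RTL functions NtoLE and NtoBE; the program name and variables are only illustrative.

```pascal
program EndianDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  utf8: AnsiString;
  w: Word;
  // View the two bytes of w in memory order.
  bytes: array[0..1] of Byte absolute w;
  i: Integer;
begin
  // UTF-8 for U+00E9 is the byte sequence C3 A9. Each element is one byte,
  // so the sequence is the same on big-endian and little-endian machines.
  utf8 := #$C3#$A9;
  Write('UTF-8:    ');
  for i := 1 to Length(utf8) do
    Write(' ', IntToHex(Ord(utf8[i]), 2));
  WriteLn;

  // A 16-bit code unit is a multi-byte integer, so byte order matters.
  w := NtoLE(Word($00E9));
  WriteLn('UTF-16 LE: ', IntToHex(bytes[0], 2), ' ', IntToHex(bytes[1], 2));
  w := NtoBE(Word($00E9));
  WriteLn('UTF-16 BE: ', IntToHex(bytes[0], 2), ' ', IntToHex(bytes[1], 2));
end.
```

Only the last two lines of output depend on the requested byte order; the UTF-8 line is built and printed byte by byte and does not involve any multi-byte integer.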

theo

2009-07-29 13:59

reporter   ~0029375

@Sam Bliss
> Do you mean there are too many informations show in CharInfoLabel?

No, I mean that ANSI is ANSI and unicode is unicode. UTF-8 is unicode.
You can find the values you are looking for under the "Latin 1 Supplement" selection in the unicode tab.

Sam Bliss

2009-07-30 07:19

reporter   ~0029394

re: theo
Well, I agree.
Deleting the UTF-8 information from the ANSI tab is better.

Sam Bliss

2009-07-30 07:23

reporter   ~0029395

Re: Dmitry Boyarintsev
Do you mean the patch does not work on OSX?

Sam Bliss

2009-07-30 07:37

reporter   ~0029396

UTF-8 Encoding Table (Binary)

ANSI = Unicode/UTF-16/UCS-2 = UTF-8
0xxx xxxx = 0000 0000 0xxx xxxx = 0xxx xxxx
yyxx xxxx = 0000 0000 yyxx xxxx = 1100 00yy 10xx xxxx

For this table,
I think it is better not to use UnicodeToUtf8 or Utf8ToUnicode,
because I think these functions call the WinAPI (on Windows) or other system functions (on other systems).
Their results might differ if a different default code page is chosen.
Calculating the bytes manually is better and does not depend on the default code page.

See more at http://en.wikipedia.org/wiki/UTF-8
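
The two rows of the table above can be sketched as a small Free Pascal program. This is only an illustration, not the code from any attached patch: the names ByteToUtf8 and HexBytes are hypothetical, and the sketch assumes the byte value is to be read as a Unicode code point U+0000..U+00FF (that is, as Latin-1), which does not hold for every ANSI code page.

```pascal
program Utf8TableDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: encodes a code point in 0..255 as UTF-8 by hand,
// following the two rows of the table, without any system conversion call.
function ByteToUtf8(c: Byte): AnsiString;
begin
  if c < $80 then
    // 0xxx xxxx -> 0xxx xxxx (one byte)
    Result := Chr(c)
  else
    // yyxx xxxx -> 1100 00yy 10xx xxxx (two bytes)
    Result := Chr($C0 or (c shr 6)) + Chr($80 or (c and $3F));
end;

// Formats each byte of s as $XX for display.
function HexBytes(const s: AnsiString): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(s) do
    Result := Result + '$' + IntToHex(Ord(s[i]), 2);
end;

begin
  WriteLn(HexBytes(ByteToUtf8($41)));  // $41
  WriteLn(HexBytes(ByteToUtf8($A9)));  // $C2$A9
  WriteLn(HexBytes(ByteToUtf8($FF)));  // $C3$BF
end.
```

Unlike the else branch in the posted FillCharMap, this sketch keeps code point 0 as a single byte rather than emitting the two-byte form for it.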

Sam Bliss

2009-07-30 07:43

reporter   ~0029397

I wrote my patch this way,
so the characters are converted to Unicode.
Do we use Unicode LE or BE?
That is why I was worried about this problem.

You said, "I'm using Intel Mac. There's no difference in UTF-8 encoding for Big Endian and Little Endian byte order."

But it does not work! How do you explain that?

Dmitry Boyarintsev

2009-07-30 08:15

developer   ~0029398

Last edited: 2009-07-30 08:16

Sam Bliss, the 4th and 5th patches work on MacOSX ('?' chars are replaced with the proper Unicode characters).

The ANSIOSX.png screenshot shows the Character Table without any patches applied.

Also, please keep discussions out of the bug tracker. Open a discussion about the ANSI table on the Lazarus forum or the Lazarus mailing list.

Sam Bliss

2009-07-30 08:26

reporter   ~0029399

For charactermapdlg.pas-5,
when I move my mouse over #$80..#$FE,
it shows incorrect decimal and hex numbers!

ASCII  Dec  Hex  UTF-8  Displayed as (see the screenshot)
129    63   3F   C281   square
130    63   3F   C282   square
131    63   3F   C283   square
...
159    63   3F   C29F   square
160    63   3F   C2A0   space
161    63   3F   C2A1   letter i
162    51   33   C2A2   letter c with a line
163    52   34   C2A3   letter L with a line
164    161  A1   C2A4   small square
165    54   36   C2A5   letter Y with two lines
166    63   3F   C2A6   splitter
167    161  A1   C2A7   two letter S's
168    161  A1   C2A8   two dots
169    63   3F   C2A9   letter C with circle
170    63   3F   C2AA   small number 2
171    63   3F   C2AB   two "less than" characters
...

Sam Bliss

2009-07-30 08:58

reporter   ~0029403

Last edited: 2009-07-30 09:18

How to do that?
"Open the discussion about ANSI table on the lazarus forum or lazarus mailling list."

Dmitry Boyarintsev

2009-07-30 10:00

developer   ~0029405

Please, visit the site: http://forum.lazarus.freepascal.org/
and register there (if you have not already).
You can start a new topic there, about how ANSI table should behave on multibyte code pages.

Thanks.

Sam Bliss

2009-07-30 10:16

reporter   ~0029406

I did that,
http://forum.lazarus.freepascal.org/index.php/topic,7139.0.html

Vincent Snijders

2009-08-03 16:24

manager   ~0029496

I committed the last patch. Thanks.

Issue History

Date Modified Username Field Change
2009-07-25 08:51 Sam Bliss New Issue
2009-07-25 08:51 Sam Bliss File Added: Bug.PNG
2009-07-25 08:51 Sam Bliss Widgetset => Win32/Win64
2009-07-25 08:53 Sam Bliss Note Added: 0029241
2009-07-25 09:05 Sam Bliss Tag Attached: utf8
2009-07-25 09:05 Sam Bliss Tag Attached: multi-language support
2009-07-25 12:06 Dmitry Boyarintsev Note Added: 0029242
2009-07-25 12:06 Dmitry Boyarintsev Note Edited: 0029242
2009-07-25 12:07 Dmitry Boyarintsev File Added: charactermapdlg.diff
2009-07-25 12:07 Dmitry Boyarintsev Note Edited: 0029242
2009-07-27 07:47 Sam Bliss Note Added: 0029272
2009-07-27 09:15 Dmitry Boyarintsev Note Added: 0029273
2009-07-27 09:16 Dmitry Boyarintsev File Added: charactermapdlg.pas.zip
2009-07-28 07:58 Sam Bliss Note Added: 0029304
2009-07-28 08:05 Dmitry Boyarintsev Note Added: 0029305
2009-07-28 08:20 Sam Bliss Note Added: 0029307
2009-07-28 08:21 Sam Bliss File Added: Fixed.PNG
2009-07-28 08:38 Dmitry Boyarintsev File Deleted: charactermapdlg.diff
2009-07-28 08:40 Dmitry Boyarintsev Note Added: 0029308
2009-07-28 08:41 Dmitry Boyarintsev File Added: charactermapdlg.pas-2.zip
2009-07-28 09:30 Sam Bliss Note Added: 0029312
2009-07-28 09:35 Sam Bliss Note Edited: 0029312
2009-07-28 09:35 Sam Bliss Note Edited: 0029312
2009-07-28 09:37 Sam Bliss Note Edited: 0029312
2009-07-28 09:44 Sam Bliss Note Edited: 0029312
2009-07-28 09:55 Sam Bliss Note Edited: 0029312
2009-07-28 09:58 Sam Bliss Note Edited: 0029312
2009-07-28 10:06 Dmitry Boyarintsev Note Added: 0029314
2009-07-28 10:15 Sam Bliss Note Added: 0029315
2009-07-28 10:16 Sam Bliss Note Edited: 0029312
2009-07-28 10:18 Sam Bliss Note Edited: 0029312
2009-07-28 10:22 Sam Bliss Note Edited: 0029312
2009-07-28 10:27 Sam Bliss Note Edited: 0029312
2009-07-28 10:32 Sam Bliss Note Edited: 0029312
2009-07-28 10:34 Sam Bliss File Added: charactermapdlg.pas_Patch.zip
2009-07-28 10:36 Sam Bliss Note Added: 0029316
2009-07-28 10:38 Dmitry Boyarintsev Note Added: 0029317
2009-07-28 10:39 Dmitry Boyarintsev File Added: charactermapdlg.pas-3.zip
2009-07-28 11:15 Sam Bliss Note Added: 0029318
2009-07-28 11:15 Sam Bliss Note Added: 0029319
2009-07-28 11:17 Sam Bliss File Added: charactermapdlg.pas-3.PNG
2009-07-28 11:18 Sam Bliss Note Added: 0029320
2009-07-28 11:22 Sam Bliss File Added: charactermapdlg-4.zip
2009-07-28 11:23 Sam Bliss Note Added: 0029321
2009-07-28 11:26 Sam Bliss Note Edited: 0029321
2009-07-28 12:00 theo Note Added: 0029322
2009-07-28 12:55 Dmitry Boyarintsev Note Added: 0029323
2009-07-28 12:58 Dmitry Boyarintsev File Added: charactermapdlg.pas-5.zip
2009-07-28 12:59 Dmitry Boyarintsev Note Added: 0029324
2009-07-29 08:59 Sam Bliss File Added: charactermapdlg.pas-5.PNG
2009-07-29 09:01 Sam Bliss Note Added: 0029356
2009-07-29 09:44 theo Note Added: 0029358
2009-07-29 10:15 Sam Bliss Note Added: 0029359
2009-07-29 10:26 Sam Bliss Note Added: 0029361
2009-07-29 10:26 Dmitry Boyarintsev File Added: ANSIOSX.png
2009-07-29 10:28 Dmitry Boyarintsev Note Added: 0029362
2009-07-29 10:38 Sam Bliss Note Edited: 0029312
2009-07-29 10:43 Dmitry Boyarintsev File Added: charactermapdlg.pas.diff.fixcrash.zip
2009-07-29 10:43 Sam Bliss Note Added: 0029364
2009-07-29 10:45 Sam Bliss Note Edited: 0029321
2009-07-29 10:54 Sam Bliss Note Added: 0029365
2009-07-29 11:04 Dmitry Boyarintsev Note Added: 0029366
2009-07-29 11:09 Dmitry Boyarintsev Note Edited: 0029366
2009-07-29 13:59 theo Note Added: 0029375
2009-07-30 07:19 Sam Bliss Note Added: 0029394
2009-07-30 07:23 Sam Bliss Note Added: 0029395
2009-07-30 07:37 Sam Bliss Note Added: 0029396
2009-07-30 07:43 Sam Bliss Note Added: 0029397
2009-07-30 08:15 Dmitry Boyarintsev Note Added: 0029398
2009-07-30 08:16 Dmitry Boyarintsev Note Edited: 0029398
2009-07-30 08:26 Sam Bliss Note Added: 0029399
2009-07-30 08:58 Sam Bliss Note Added: 0029403
2009-07-30 09:18 Sam Bliss Note Edited: 0029403
2009-07-30 10:00 Dmitry Boyarintsev Note Added: 0029405
2009-07-30 10:16 Sam Bliss Note Added: 0029406
2009-08-03 13:22 Vincent Snijders Status new => assigned
2009-08-03 13:22 Vincent Snijders Assigned To => Vincent Snijders
2009-08-03 16:24 Vincent Snijders Fixed in Revision => 21089
2009-08-03 16:24 Vincent Snijders LazTarget => -
2009-08-03 16:24 Vincent Snijders Status assigned => resolved
2009-08-03 16:24 Vincent Snijders Fixed in Version => 0.9.27 (SVN)
2009-08-03 16:24 Vincent Snijders Resolution open => fixed
2009-08-03 16:24 Vincent Snijders Note Added: 0029496
2009-10-23 00:40 Marc Weustink Status resolved => closed