| View Issue Details | ||||||||
| ID | Project | Category | View Status | Date Submitted | Last Update | ||||
| 0021668 | FPC | Compiler | public | 2012-04-06 18:49 | 2012-09-27 11:03 | ||||
| Reporter | Anton Kavalenka | ||||||||
| Assigned To | Jonas Maebe | ||||||||
| Priority | normal | Severity | major | Reproducibility | always | ||||
| Status | closed | Resolution | suspended | ||||||
| Platform | win32 | OS | Windows XP | OS Version | |||||
| Product Version | 2.7.1 | Product Build | |||||||
| Target Version | Fixed in Version | ||||||||
| Summary | 0021668: Concatenation rules for ansistrings lead to impossibility saving utf-8-encoded strings via TIniFile on windows | ||||||||
| Description | When the result of a concatenation is assigned to the default string type String, it is always converted to the ANSI code page. string := string + utf8string always yields AnsiString(string). Examine the attached test under a debugger. | ||||||||
| Tags | No tags attached. | ||||||||
| FPCOldBugId | |||||||||
| Fixed in Revision | |||||||||
| Attached Files | |||||||||
Notes |
|
|
(0058399) Marco van de Voort (manager) 2012-04-07 22:44 edited on: 2012-04-07 22:44 |
I'm not sure whether RawByteString rvalues are correct. I readily believe that this program doesn't do what you expect it to do. However, the question is: why do you think it is correct? |
|
(0058439) Anton Kavalenka (reporter) 2012-04-09 17:53 |
I think that if I pass a Unicode or RawByteString value to TIniFile, it should save exactly what I passed. Instead, data is lost in any concatenation. None of the TIniFile internals use string(CP_ACP), yet the result is translated to the current ANSI code page. Just a minute ago I ran into another example: SomeMenuItem.caption:=format('%d. ',[ANumber])+ UnicodeString(Name); The Unicode content is lost. I have to force the concatenation to work in Unicode manually: SomeMenuItem.caption:=(UnicodeString(format('%d. ',[ANumber]))+ Unicodestring(Name)); I'd rather not write my own TIniFile :) |
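The difference described in the note above can be reproduced outside the LCL with a minimal sketch. It assumes a compiler with code-page-aware strings (FPC 2.7.1 trunk or later); `Prefix` and `Name` are hypothetical stand-ins for the `Format()` result and the menu item name, and the outcome depends on the ANSI code page of the machine it runs on.

```pascal
program concatdemo;
{$mode objfpc}{$H+}

var
  Prefix: AnsiString;        // hypothetical stand-in for the Format() result
  Name: UnicodeString;       // hypothetical stand-in for the item name
  ViaAnsi: AnsiString;
  ViaUnicode: UnicodeString;
begin
  Prefix := '1. ';
  Name := UnicodeString('item ') + WideChar($0416); // Cyrillic capital ZHE

  // The target is an AnsiString, so the result is converted to
  // DefaultSystemCodePage; characters outside that code page cannot survive.
  ViaAnsi := Prefix + Name;

  // Both operands and the target are UnicodeString, so nothing is narrowed.
  ViaUnicode := UnicodeString(Prefix) + Name;

  WriteLn('system code page: ', DefaultSystemCodePage);
  WriteLn('lossless via AnsiString: ', UnicodeString(ViaAnsi) = ViaUnicode);
end.
```

Whether the second line reports TRUE or FALSE depends on whether the system code page can represent the Cyrillic character, which is the platform dependence this report is about.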
|
(0058440) Anton Kavalenka (reporter) 2012-04-09 17:59 |
http://docwiki.embarcadero.com/RADStudio/XE/en/Unicode_in_RAD_Studio#Code_Constructs_Independent_of_Character_Size |
|
(0058441) Marco van de Voort (manager) 2012-04-09 19:20 edited on: 2012-04-09 19:21 |
The default string type is string in the default system encoding (which is, as far as I know, the same as CP_ACP). So any string parameter of TIniFile is a string(CP_ACP). This is pretty much the same as in pre-Unicode Delphi, and it applies to the entire classes hierarchy. So this is exactly the result I would expect for your code. |
|
(0058460) Anton Kavalenka (reporter) 2012-04-10 12:20 |
I cannot bisect exactly which revision broke concatenation, but it worked two weeks ago. According to your message, all Lazarus (LCL) captions declared as String would work in CP_ACP, and the LCL would become strictly non-Unicode? I insist: it works when a Unicode string is assigned directly, i.e. string property := Unicode String value, but it does not work with string property := string + Unicode string |
|
(0059130) Anton Kavalenka (reporter) 2012-04-30 19:05 edited on: 2012-04-30 19:07 |
The problem becomes even funnier (truly anisotropic). I can't write a UTF8String, but I can READ it! The attached project contains an .INI file with a UTF-8-encoded string. Uploaded the next version with an .INI file containing a UTF-8 string. |
|
(0059142) Marco van de Voort (manager) 2012-04-30 21:11 |
No. Lazarus assumes that they are UTF-8 and manually inserts conversions where necessary. FPC does not follow that convention. |
|
(0059148) Anton Kavalenka (reporter) 2012-04-30 23:22 |
The problem is demonstrated in a pure FPC Object Pascal example. I can read a UTF-8-encoded string using the TIniFile class, but not write one. A month ago I was able to both read and write. That is because TIniFile contains no explicit code page conversions; it is not code-page-aware at all (agnostic). That was before. Now a single concatenation inside TIniFile during .Write() leads to data loss. The problem is not inside TIniFile (Object Pascal RTL) but inside concatenation, which IMO is broken. |
|
(0059202) Paul Ishenin (developer) 2012-05-03 04:44 |
That's because the RTL classes are not yet prepared for code page strings. First the compiler changes and the base RTL routines need to be finished, then the other RTL/FCL classes need to follow. |
|
(0059208) Marco van de Voort (manager) 2012-05-03 09:58 |
Anything that streams is not handled at compile time. In Delphi, all load-file and stream methods take a parameter that signals the encoding to load or save with. Actually, there is no reason why this couldn't already be done; only the default would be different now. |
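The idea of choosing the encoding at the point where data is streamed, rather than relying on the type of a string variable, can be sketched as follows. `SaveTextAsUtf8` is a hypothetical helper written for this illustration; it is not part of the FPC RTL or of TIniFile, and the file name is arbitrary.

```pascal
program saveutf8;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper, not an RTL routine: writes Text as UTF-8 bytes,
// independent of the default system code page.
procedure SaveTextAsUtf8(const FileName: string; const Text: UnicodeString);
var
  Bytes: UTF8String;
  Stream: TFileStream;
begin
  Bytes := UTF8Encode(Text);           // explicit UTF-16 -> UTF-8 conversion
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    if Length(Bytes) > 0 then
      Stream.WriteBuffer(Bytes[1], Length(Bytes));
  finally
    Stream.Free;
  end;
end;

begin
  // 'key=' followed by Cyrillic capital ZHE
  SaveTextAsUtf8('demo.txt', UnicodeString('key=') + WideChar($0416));
end.
```

Because the conversion is explicit, the bytes on disk do not depend on any implicit AnsiString narrowing along the way.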
|
(0059236) Anton Kavalenka (reporter) 2012-05-04 18:51 edited on: 2012-05-04 18:52 |
Yet another test. Compare the results of Delphi XE and FPC 2.7.1 trunk. Case No. 3 in the test project3.lpr is good for me, but TIniFile works like case No. 2. Thank you, Paul, you cleared the horizon for me. |
|
(0062638) Anton Kavalenka (reporter) 2012-09-26 12:43 |
Forcibly specifying System.DefaultSystemCodePage:=CP_UTF8; at an early stage of program initialization (InitUnits) makes my program behave identically on Linux and Windows. |
|
(0062642) Jonas Maebe (manager) 2012-09-26 13:50 |
It's a bit cleaner to use SetMultiByteConversionCodePage(CP_UTF8) instead. It doesn't do anything different than setting DefaultSystemCodePage right now, but maybe that will change in the future or on different platforms. |
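A minimal sketch of the workaround discussed in the last two notes, assuming a compiler with code-page-aware strings (FPC 2.7.1 trunk or later). The test string and variable names are made up for illustration; the `cwstring` unit is pulled in on Unix targets only so that a widestring manager is available for the conversions.

```pascal
program utf8default;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring; // installs a widestring manager on Unix targets
{$endif}

var
  Original, RoundTrip: UnicodeString;
  Narrow: AnsiString;
begin
  // Make UTF-8 the code page used for implicit AnsiString conversions.
  SetMultiByteConversionCodePage(CP_UTF8);

  Original := UnicodeString('item ') + WideChar($0416); // Cyrillic capital ZHE
  Narrow := Original;      // UTF-16 -> DefaultSystemCodePage
  RoundTrip := Narrow;     // DefaultSystemCodePage -> UTF-16

  WriteLn('system code page: ', DefaultSystemCodePage);
  WriteLn('round trip preserved: ', RoundTrip = Original);
end.
```

The call belongs as early as possible in program start-up, before any string conversion takes place, which matches the reporter's choice of setting it during unit initialization.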
|
(0062645) Anton Kavalenka (reporter) 2012-09-26 16:13 |
OK, I did this. The next step for me is to make Windows I/O (TStream descendants) work with UTF-8-encoded file names supplied to TStream.Create and FileCreate(). This issue can be closed. In short, the problem was the runtime behaviour of strings when the default system encoding is single-byte (ANSI). |
|
(0062646) Jonas Maebe (manager) 2012-09-26 16:23 |
I already started working locally on the sysutils part (I've already done FileCreate for all platforms, although it's only tested for Unix platforms currently). I'll commit what I've got to a separate branch, so you can then also work on it without having to duplicate anything. |
|
(0062670) Jonas Maebe (manager) 2012-09-27 11:03 |
I've committed my current changes in the new cpstrrtl branch |
Issue History |
|||
| Date Modified | Username | Field | Change |
| 2012-04-06 18:49 | Anton Kavalenka | New Issue | |
| 2012-04-06 18:49 | Anton Kavalenka | File Added: project2.lpr | |
| 2012-04-07 22:44 | Marco van de Voort | Note Added: 0058399 | |
| 2012-04-07 22:44 | Marco van de Voort | Note Edited: 0058399 | |
| 2012-04-09 17:53 | Anton Kavalenka | Note Added: 0058439 | |
| 2012-04-09 17:59 | Anton Kavalenka | Note Added: 0058440 | |
| 2012-04-09 19:20 | Marco van de Voort | Note Added: 0058441 | |
| 2012-04-09 19:21 | Marco van de Voort | Note Edited: 0058441 | |
| 2012-04-10 12:20 | Anton Kavalenka | Note Added: 0058460 | |
| 2012-04-30 19:05 | Anton Kavalenka | Note Added: 0059130 | |
| 2012-04-30 19:07 | Anton Kavalenka | Note Edited: 0059130 | |
| 2012-04-30 19:08 | Anton Kavalenka | File Added: project2-2.lpr | |
| 2012-04-30 19:08 | Anton Kavalenka | File Added: test.ini | |
| 2012-04-30 21:11 | Marco van de Voort | Note Added: 0059142 | |
| 2012-04-30 23:22 | Anton Kavalenka | Note Added: 0059148 | |
| 2012-05-03 04:44 | Paul Ishenin | Note Added: 0059202 | |
| 2012-05-03 09:58 | Marco van de Voort | Note Added: 0059208 | |
| 2012-05-04 18:51 | Anton Kavalenka | Note Added: 0059236 | |
| 2012-05-04 18:52 | Anton Kavalenka | File Added: project3.lpr | |
| 2012-05-04 18:52 | Anton Kavalenka | Note Edited: 0059236 | |
| 2012-09-26 12:43 | Anton Kavalenka | Note Added: 0062638 | |
| 2012-09-26 13:50 | Jonas Maebe | Note Added: 0062642 | |
| 2012-09-26 14:58 | Jonas Maebe | Relationship added | related to 0022982 |
| 2012-09-26 16:13 | Anton Kavalenka | Note Added: 0062645 | |
| 2012-09-26 16:23 | Jonas Maebe | Status | new => resolved |
| 2012-09-26 16:23 | Jonas Maebe | Resolution | open => suspended |
| 2012-09-26 16:23 | Jonas Maebe | Assigned To | => Jonas Maebe |
| 2012-09-26 16:23 | Jonas Maebe | Note Added: 0062646 | |
| 2012-09-26 16:24 | Anton Kavalenka | Status | resolved => closed |
| 2012-09-27 11:03 | Jonas Maebe | Note Added: 0062670 | |



