ID: 0037315 | Project: FPC | Category: RTL | View Status: public | Last Update: 2020-07-09 23:23
Reporter: CudaText man | Assigned To: Michael Van Canneyt
Status: assigned | Resolution: open
Summary: 0037315: WideStrUtils - more implemented

Orig docs

A user asked for some functions to be added there, so I added N functions.
CudaText man

2020-07-08 10:53

reporter (3,629 bytes)

Tomas Hajny

2020-07-08 11:29

manager   ~0123815

Oh well, yet another half-baked invention created in Delphi. :-( I feel somehow that TEncodeType should contain at least something like etUTF16 (and preferably also etUTF32 for completeness) in addition to the other values defined for that type... Obviously, this comment is not meant as criticism of the contribution provided above; the contribution is appreciated as a compatibility improvement. I realize that the problem is in the original Delphi definition of the mentioned type.

CudaText man

2020-07-09 23:07

reporter   ~0123857

Updated unit fpdetectutf8 (optimized). (929 bytes)

CudaText man

2020-07-09 23:09

reporter   ~0123858

> that TEncodeType should contain at least something like etUTF16 (and preferably also etUTF32

No, the Delphi type is fully OK; there is no need for UTF-16 (it is very hard to detect even for Slavic text, and very hard for CJK).
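
For reference, the three-way classification under discussion can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the attached unit: `EncodeType` and `detect_utf8_encoding` are hypothetical names that only mirror Delphi's `etUSASCII`, `etUTF8` and `etANSI` values, and the validity check simply relies on Python's strict UTF-8 decoder.

```python
from enum import Enum


class EncodeType(Enum):
    # Hypothetical mirror of the three Delphi TEncodeType values.
    US_ASCII = "etUSASCII"
    UTF8 = "etUTF8"
    ANSI = "etANSI"


def detect_utf8_encoding(data: bytes) -> EncodeType:
    """Classify a BOM-less byte buffer as pure ASCII, valid UTF-8, or 'other 8-bit'."""
    # Only 7-bit bytes: nothing distinguishes UTF-8 from any 8-bit codepage.
    if all(b < 0x80 for b in data):
        return EncodeType.US_ASCII
    try:
        # Strict decoding rejects malformed sequences, overlongs and surrogates.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return EncodeType.ANSI
    return EncodeType.UTF8
```

Under these assumptions, BOM-less UTF-16 input never gets a result of its own: it typically lands in the ANSI bucket, or in US-ASCII when every byte happens to be below 0x80 (for example `"hi".encode("utf-16-le")`).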

Tomas Hajny

2020-07-09 23:19

manager   ~0123859

Last edited: 2020-07-09 23:23

What's so difficult about reading the different BOMs? You cannot completely reliably differentiate UTF-8 without a BOM from a complex 8-bit codepage text file that uses the full range of characters either...

And if you say that there's no need - well, there's obviously no such need for people interested in Delphi compatibility, which is probably the whole point of this unit. Apart from that, one could hardly say that there are no UTF-16 encoded text files which might need to be handled in Pascal code, right?
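
The BOM check referred to above could look roughly like this. It is a minimal sketch in Python that assumes only the five standard Unicode signatures; `detect_bom` is a hypothetical helper, not part of WideStrUtils, and the BOM constants come from Python's standard `codecs` module.

```python
import codecs

# Order matters: the UTF-32 LE signature (FF FE 00 00) begins with the
# UTF-16 LE signature (FF FE), so the longer one must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding name implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A buffer without a signature returns `None`, and the caller then has to fall back to a heuristic. That fallback is inherently ambiguous: the two bytes C3 A9 are valid UTF-8 for "é" and equally valid Windows-1252 for "Ã©".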

