View Issue Details

ID: 0011791
Project: FPC
Category: RTL
View Status: public
Last Update: 2009-03-17 20:10
Reporter: José Mejuto
Assigned To: Jonas Maebe
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 2.2.0
Target Version: 2.4.0
Fixed in Version: 2.4.0
Summary: 0011791: UTF8ToUnicode destroy string when encounter invalid sequence
Description: The UTF8ToUnicode procedure in 'wustrings.inc' always returns an empty string when the source UTF-8 string contains an invalid sequence. I have checked the current trunk code and the same routine is there, so this problem should be present in all FPC RTL versions, not only 2.2.0.
Additional Information: Attached are a new implementation and the well-known UTF-8 stress test:

http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

which this implementation passes (for 2-byte Unicode characters). The implementation can easily be expanded to 4-, 5- and 6-byte UTF-8 sequences, but as those cannot fit in two bytes they are handled as invalid UTF-8 sequences.
The implementation also handles the LF to CR+LF conversion (as specified in UTF8), but this has been disabled to preserve compatibility with the current code.
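
As a rough illustration of how the attached code derives the length of a sequence from its first byte, the sketch below counts the leading 1 bits the same way the attachment's inner `while` loop does. The function name `LeadingOnes` is hypothetical and not part of the RTL or of the attachment.

```pascal
program leadbyte;
{$mode objfpc}{$H+}

// Hypothetical helper: counts the leading 1 bits of a byte.
// 0 means plain ASCII, 1 means a continuation byte (10xxxxxx),
// 2 or more is the sequence length announced by a lead byte.
function LeadingOnes(B: Byte): Integer;
begin
  Result := 0;
  while (B and $80) <> 0 do
  begin
    B := (B shl 1) and $FF;  // drop the top bit, keep the value in byte range
    Inc(Result);
  end;
end;

begin
  WriteLn(LeadingOnes($41));  // 0: ASCII 'A'
  WriteLn(LeadingOnes($80));  // 1: lone continuation byte
  WriteLn(LeadingOnes($C3));  // 2: lead byte of a 2-byte sequence
  WriteLn(LeadingOnes($E2));  // 3: lead byte of a 3-byte sequence
  WriteLn(LeadingOnes($F0));  // 4: lead byte of a 4-byte sequence
  WriteLn(LeadingOnes($FF));  // 8: never valid in UTF-8
end.
```

Note that a byte of $FF yields 8, so any dispatch on this length has to treat values above 7 as invalid as well.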

It has been tested in WinXP with FPC 2.2.0 and 2.2.2.

The code has a constant (UNICODE_INVALID) that sets the character used to mark an invalid UTF-8 sequence.
Tags: No tags attached.
Fixed in Revision: 12902
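
A minimal sketch of how that constant comes into play, modelled on the three-byte branch of the attachment. The helper name `Decode3` is hypothetical, and for brevity it assumes the caller has already checked that the second and third bytes are continuation bytes.

```pascal
program replmark;
{$mode objfpc}{$H+}

const
  // '?' as in the attachment; $FFFD (U+FFFD) would be the usual alternative.
  UNICODE_INVALID = 63;

// Hypothetical helper mirroring the attachment's three-byte case.
// Assumes B1 and B2 were already verified to match 10xxxxxx.
function Decode3(B0, B1, B2: Byte): Word;
var
  UC: Cardinal;
begin
  UC := (Cardinal(B0 and $0F) shl 12) or
        (Cardinal(B1 and $3F) shl 6) or
        (B2 and $3F);
  // Reject overlong forms, U+FFFE/U+FFFF and the surrogate range.
  if (UC <= $7FF) or (UC >= $FFFE) or ((UC >= $D800) and (UC <= $DFFF)) then
    UC := UNICODE_INVALID;
  Result := Word(UC);
end;

begin
  WriteLn(HexStr(Decode3($E2, $82, $AC), 4));  // 20AC: euro sign
  WriteLn(HexStr(Decode3($ED, $A0, $80), 4));  // 003F: surrogate U+D800 rejected
  WriteLn(HexStr(Decode3($E0, $80, $AF), 4));  // 003F: overlong '/' rejected
end.
```

Changing the constant to $FFFD would make the last two lines print FFFD instead, without touching the decoding logic.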
FPCOldBugId
FPCTarget
Attached Files

Relationships

has duplicate: 0011976 (closed, Jonas Maebe): UTF8Encode/Decode should work with UTF-16 and not UCS-2
related to: 0013075 (closed, Jonas Maebe): UnicodeToUtf8 does not convert UTF-8 surrogate characters correctly

Activities

2008-08-04 17:56

 

UTF8ToUnicode.pas (7,329 bytes)   
const UNICODE_INVALID=63; //Invalid unicode utf-8 sequence char '?'

function UTF8ToUnicode(Dest: PWideChar; MaxDestChars: SizeUInt; Source: PChar; SourceBytes: SizeUInt): SizeUInt;
var
  InputUTF8: SizeUInt;
  IBYTE: BYTE;
  OutputUnicode: SizeUInt;
  PRECHAR: SizeUInt;
  TempBYTE: BYTE;
  CharLen: SizeUint;
  LookAhead: SizeUInt;
  UC: SizeUInt;
begin
  if not assigned(Source) then begin
    result:=0;
    exit;
  end;
  result:=SizeUInt(-1);
  InputUTF8:=0;
  OutputUnicode:=0;
  PreChar:=0;
  if Assigned(Dest) Then begin
    while (OutputUnicode<MaxDestChars) and (InputUTF8<SourceBytes) do begin
      IBYTE:=byte(Source[InputUTF8]);
      if (IBYTE and $80) = 0 then begin
        //One character US-ASCII, convert it to unicode
        if IBYTE = 10 then begin
          If (PreChar<>13) and FALSE Then begin
            //Expand to CR+LF, conforming to UTF-8.
            //This branch would break the memory allocation done by
            //FPC for the widestring, so never use it. The condition is never true due to the "and FALSE".
            if OutputUnicode+1<MaxDestChars Then begin
              Dest[OutputUnicode]:=WideChar(13);
              inc(OutputUnicode);
              Dest[OutputUnicode]:=WideChar(10);
              inc(OutputUnicode);
              PreChar:=10;
            end else begin
              Dest[OutputUnicode]:=WideChar(13);
              inc(OutputUnicode);
            end;
          end else begin
            Dest[OutputUnicode]:=WideChar(IBYTE);
            inc(OutputUnicode);
            PreChar:=IBYTE;
          end;
        end else begin
          Dest[OutputUnicode]:=WideChar(IBYTE);
          inc(OutputUnicode);
          PreChar:=IBYTE;
        end;
        inc(InputUTF8);
      end else begin
        TempByte:=IBYTE;
        CharLen:=0;
        while (TempBYTE and $80)<>0 do begin
          TempBYTE:=(TempBYTE shl 1) and $FE;
          inc(CharLen);
        end;
        //Test for the "CharLen" conforms UTF-8 string
        //This means the 10xxxxxx pattern.
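{$HINTS OFF} //-HINT about converting to int64
        //The last byte of the sequence sits at index InputUTF8+CharLen-1,
        //which must stay below SourceBytes.
        if (InputUTF8+CharLen)>SourceBytes Then begin
{$HINTS ON}
          //Insufficient chars in string to decode
          //UTF-8 array. Fallback to single char.
          CharLen:= 1;
        end;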
        for LookAhead := 1 to CharLen-1 do begin
          if ((byte(Source[InputUTF8+LookAhead]) and $80)<>$80) or
             ((byte(Source[InputUTF8+LookAhead]) and $40)<>$00) then begin
              //Invalid UTF-8 sequence, fallback.
              CharLen:= LookAhead;
              break;
          end;
        end;
        UC:=$FFFF;
        Case CharLen of
          1:  begin
                //Not valid UTF-8 sequence
                UC:=UNICODE_INVALID;
              end;
          2:  begin
                //Two bytes UTF, convert it
                UC:=(byte(Source[InputUTF8]) and $1F) shl 6;
                UC:=UC or (byte(Source[InputUTF8+1]) and $3F);
                if UC <= $7F then begin
                    //Invalid UTF sequence.
                    UC:=UNICODE_INVALID;
                end;
              end;
          3:  begin
                //Three bytes, convert it to unicode
                UC:= (byte(Source[InputUTF8]) and $0F) shl 12;
                UC:= UC or ((byte(Source[InputUTF8+1]) and $3F) shl 6);
                UC:= UC or ((byte(Source[InputUTF8+2]) and $3F));
                If (UC <= $7FF) or (UC >= $FFFE) or ((UC >= $D800) and (UC <= $DFFF)) Then begin
                    //Invalid UTF-8 sequence
                    UC:= UNICODE_INVALID;
                End;
              end;
          4,5,6,7,8: begin
                       //Invalid UTF8 to unicode conversion,
                       //mask it as invalid UNICODE too.
                       //(8 covers the lead byte $FF.)
                       UC:=UNICODE_INVALID;
                     end;
        end;
        If CharLen > 0 Then begin
            PreChar:=UC;
            Dest[OutputUnicode]:=WideChar(UC);
            inc(OutputUnicode);
        End;
        InputUTF8:= InputUTF8 + CharLen;
      end;
    end;
    Result:=OutputUnicode+1;
  end else begin
    while (InputUTF8<SourceBytes) do begin
      IBYTE:=byte(Source[InputUTF8]);
      if (IBYTE and $80) = 0 then begin
        //One character US-ASCII, convert it to unicode
        if IBYTE = 10 then begin
          If (PreChar<>13) and FALSE Then begin
            //Expand to CR+LF, conforming to UTF-8.
            //This branch would break the memory allocation done by
            //FPC for the widestring, so never use it. The condition is never true due to the "and FALSE".
            inc(OutputUnicode,2);
            PreChar:=10;
          end else begin
            inc(OutputUnicode);
            PreChar:=IBYTE;
          end;
        end else begin
          inc(OutputUnicode);
          PreChar:=IBYTE;
        end;
        inc(InputUTF8);
      end else begin
        TempByte:=IBYTE;
        CharLen:=0;
        while (TempBYTE and $80)<>0 do begin
          TempBYTE:=(TempBYTE shl 1) and $FE;
          inc(CharLen);
        end;
        //Test for the "CharLen" conforms UTF-8 string
        //This means the 10xxxxxx pattern.
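{$HINTS OFF} //-HINT about converting to int64
        //The last byte of the sequence sits at index InputUTF8+CharLen-1,
        //which must stay below SourceBytes.
        if (InputUTF8+CharLen)>SourceBytes Then begin
{$HINTS ON}
          //Insufficient chars in string to decode
          //UTF-8 array. Fallback to single char.
          CharLen:= 1;
        end;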
        for LookAhead := 1 to CharLen-1 do begin
          if ((byte(Source[InputUTF8+LookAhead]) and $80)<>$80) or
             ((byte(Source[InputUTF8+LookAhead]) and $40)<>$00) then begin
              //Invalid UTF-8 sequence, fallback.
              CharLen:= LookAhead;
              break;
          end;
        end;
        UC:=$FFFF;
        Case CharLen of
          1:  begin
                //Not valid UTF-8 sequence
                UC:=UNICODE_INVALID;
              end;
          2:  begin
                //Two bytes UTF, convert it
                UC:=(byte(Source[InputUTF8]) and $1F) shl 6;
                UC:=UC or (byte(Source[InputUTF8+1]) and $3F);
                if UC <= $7F then begin
                    //Invalid UTF sequence.
                    UC:=UNICODE_INVALID;
                end;
              end;
          3:  begin
                //Three bytes, convert it to unicode
                UC:= (byte(Source[InputUTF8]) and $0F) shl 12;
                UC:= UC or ((byte(Source[InputUTF8+1]) and $3F) shl 6);
                UC:= UC or ((byte(Source[InputUTF8+2]) and $3F));
                If (UC <= $7FF) or (UC >= $FFFE) or ((UC >= $D800) and (UC <= $DFFF)) Then begin
                    //Invalid UTF-8 sequence
                    UC:= UNICODE_INVALID;
                End;
              end;
          4,5,6,7,8: begin
                       //Invalid UTF8 to unicode conversion,
                       //mask it as invalid UNICODE too.
                       //(8 covers the lead byte $FF.)
                       UC:=UNICODE_INVALID;
                     end;
        end;
        If CharLen > 0 Then begin
            PreChar:=UC;
            inc(OutputUnicode);
        End;
        InputUTF8:= InputUTF8 + CharLen;
      end;
    end;
    Result:=OutputUnicode+1;
  end;
end;

2008-08-04 17:57

 

UTF-8-test.txt (20,334 bytes)   
UTF-8 decoder capability and stress test
----------------------------------------

Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2003-02-19

This test file can help you examine, how your UTF-8 decoder handles
various types of correct, malformed, or otherwise interesting UTF-8
sequences. This file is not meant to be a conformance test. It does
not prescribe any particular outcome and therefore there is no way to
"pass" or "fail" this test file, even though the text suggests a
preferable decoder behaviour at some places. The aim is instead to
help you think about and test the behaviour of your UTF-8 decoder on a
systematic collection of unusual inputs. Experience so far suggests
that most first-time authors of UTF-8 decoders find at least one
serious problem in their decoder by using this file.

The test lines below cover boundary conditions, malformed UTF-8
sequences as well as correctly encoded UTF-8 sequences of Unicode code
points that should never occur in a correct UTF-8 file.

According to ISO 10646-1:2000, sections D.7 and 2.3c, a device
receiving UTF-8 shall interpret a "malformed sequence in the same way
that it interprets a character that is outside the adopted subset" and
"characters that are not within the adopted subset shall be indicated
to the user" by a receiving device. A quite commonly used approach in
UTF-8 decoders is to replace any malformed UTF-8 sequence by a
replacement character (U+FFFD), which looks a bit like an inverted
question mark, or a similar symbol. It might be a good idea to
visually distinguish a malformed UTF-8 sequence from a correctly
encoded Unicode character that is just not available in the current
font but otherwise fully legal, even though ISO 10646-1 doesn't
mandate this. In any case, just ignoring malformed sequences or
unavailable characters does not conform to ISO 10646, will make
debugging more difficult, and can lead to user confusion.

Please check, whether a malformed UTF-8 sequence is (1) represented at
all, (2) represented by exactly one single replacement character (or
equivalent signal), and (3) the following quotation mark after an
illegal UTF-8 sequence is correctly displayed, i.e. proper
resynchronization takes place immediately after any malformed
sequence. This file says "THE END" in the last line, so if you don't
see that, your decoder crashed somehow before, which should always be
cause for concern.

All lines in this file are exactly 79 characters long (plus the line
feed). In addition, all lines end with "|", except for the two test
lines 2.1.1 and 2.2.1, which contain non-printable ASCII controls
U+0000 and U+007F. If you display this file with a fixed-width font,
these "|" characters should all line up in column 79 (right margin).
This allows you to test quickly, whether your UTF-8 decoder finds the
correct number of characters in every line, that is whether each
malformed sequence is replaced by a single replacement character.

Note that as an alternative to the notion of malformed sequence used
here, it is also a perfectly acceptable (and in some situations even
preferable) solution to represent each individual byte of a malformed
sequence by a replacement character. If you follow this strategy in
your decoder, then please ignore the "|" column.


Here come the tests:                                                          |
                                                                              |
1  Some correct UTF-8 text                                                    |
                                                                              |
You should see the Greek word 'kosme':       "κόσμε"                          |
                                                                              |
2  Boundary condition test cases                                              |
                                                                              |
2.1  First possible sequence of a certain length                              |
                                                                              |
2.1.1  1 byte  (U-00000000):        ""                                        
2.1.2  2 bytes (U-00000080):        "€"                                       |
2.1.3  3 bytes (U-00000800):        "ࠀ"                                       |
2.1.4  4 bytes (U-00010000):        "𐀀"                                       |
2.1.5  5 bytes (U-00200000):        "�����"                                       |
2.1.6  6 bytes (U-04000000):        "������"                                       |
                                                                              |
2.2  Last possible sequence of a certain length                               |
                                                                              |
2.2.1  1 byte  (U-0000007F):        ""                                        
2.2.2  2 bytes (U-000007FF):        "߿"                                       |
2.2.3  3 bytes (U-0000FFFF):        "￿"                                       |
2.2.4  4 bytes (U-001FFFFF):        "����"                                       |
2.2.5  5 bytes (U-03FFFFFF):        "�����"                                       |
2.2.6  6 bytes (U-7FFFFFFF):        "������"                                       |
                                                                              |
2.3  Other boundary conditions                                                |
                                                                              |
2.3.1  U-0000D7FF = ed 9f bf = "퟿"                                            |
2.3.2  U-0000E000 = ee 80 80 = ""                                            |
2.3.3  U-0000FFFD = ef bf bd = "�"                                            |
2.3.4  U-0010FFFF = f4 8f bf bf = "􏿿"                                         |
2.3.5  U-00110000 = f4 90 80 80 = "�"                                         |
                                                                              |
3  Malformed sequences                                                        |
                                                                              |
3.1  Unexpected continuation bytes                                            |
                                                                              |
Each unexpected continuation byte should be separately signalled as a         |
malformed sequence of its own.                                                |
                                                                              |
3.1.1  First continuation byte 0x80: "�"                                      |
3.1.2  Last  continuation byte 0xbf: "�"                                      |
                                                                              |
3.1.3  2 continuation bytes: "��"                                             |
3.1.4  3 continuation bytes: "���"                                            |
3.1.5  4 continuation bytes: "����"                                           |
3.1.6  5 continuation bytes: "�����"                                          |
3.1.7  6 continuation bytes: "������"                                         |
3.1.8  7 continuation bytes: "�������"                                        |
                                                                              |
3.1.9  Sequence of all 64 possible continuation bytes (0x80-0xbf):            |
                                                                              |
   "����������������                                                          |
    ����������������                                                          |
    ����������������                                                          |
    ����������������"                                                         |
                                                                              |
3.2  Lonely start characters                                                  |
                                                                              |
3.2.1  All 32 first bytes of 2-byte sequences (0xc0-0xdf),                    |
       each followed by a space character:                                    |
                                                                              |
   "� � � � � � � � � � � � � � � �                                           |
    � � � � � � � � � � � � � � � � "                                         |
                                                                              |
3.2.2  All 16 first bytes of 3-byte sequences (0xe0-0xef),                    |
       each followed by a space character:                                    |
                                                                              |
   "� � � � � � � � � � � � � � � � "                                         |
                                                                              |
3.2.3  All 8 first bytes of 4-byte sequences (0xf0-0xf7),                     |
       each followed by a space character:                                    |
                                                                              |
   "� � � � � � � � "                                                         |
                                                                              |
3.2.4  All 4 first bytes of 5-byte sequences (0xf8-0xfb),                     |
       each followed by a space character:                                    |
                                                                              |
   "� � � � "                                                                 |
                                                                              |
3.2.5  All 2 first bytes of 6-byte sequences (0xfc-0xfd),                     |
       each followed by a space character:                                    |
                                                                              |
   "� � "                                                                     |
                                                                              |
3.3  Sequences with last continuation byte missing                            |
                                                                              |
All bytes of an incomplete sequence should be signalled as a single           |
malformed sequence, i.e., you should see only a single replacement            |
character in each of the next 10 tests. (Characters as in section 2)          |
                                                                              |
3.3.1  2-byte sequence with last byte missing (U+0000):     "�"               |
3.3.2  3-byte sequence with last byte missing (U+0000):     "�"               |
3.3.3  4-byte sequence with last byte missing (U+0000):     "�"               |
3.3.4  5-byte sequence with last byte missing (U+0000):     "����"               |
3.3.5  6-byte sequence with last byte missing (U+0000):     "�����"               |
3.3.6  2-byte sequence with last byte missing (U-000007FF): "�"               |
3.3.7  3-byte sequence with last byte missing (U-0000FFFF): "�"               |
3.3.8  4-byte sequence with last byte missing (U-001FFFFF): "���"               |
3.3.9  5-byte sequence with last byte missing (U-03FFFFFF): "����"               |
3.3.10 6-byte sequence with last byte missing (U-7FFFFFFF): "�����"               |
                                                                              |
3.4  Concatenation of incomplete sequences                                    |
                                                                              |
All the 10 sequences of 3.3 concatenated, you should see 10 malformed         |
sequences being signalled:                                                    |
                                                                              |
   "������������������������"                                                               |
                                                                              |
3.5  Impossible bytes                                                         |
                                                                              |
The following two bytes cannot appear in a correct UTF-8 string               |
                                                                              |
3.5.1  fe = "�"                                                               |
3.5.2  ff = "�"                                                               |
3.5.3  fe fe ff ff = "����"                                                   |
                                                                              |
4  Overlong sequences                                                         |
                                                                              |
The following sequences are not malformed according to the letter of          |
the Unicode 2.0 standard. However, they are longer then necessary and         |
a correct UTF-8 encoder is not allowed to produce them. A "safe UTF-8         |
decoder" should reject them just like malformed sequences for two             |
reasons: (1) It helps to debug applications if overlong sequences are         |
not treated as valid representations of characters, because this helps        |
to spot problems more quickly. (2) Overlong sequences provide                 |
alternative representations of characters, that could maliciously be          |
used to bypass filters that check only for ASCII characters. For              |
instance, a 2-byte encoded line feed (LF) would not be caught by a            |
line counter that counts only 0x0a bytes, but it would still be               |
processed as a line feed by an unsafe UTF-8 decoder later in the              |
pipeline. From a security point of view, ASCII compatibility of UTF-8         |
sequences means also, that ASCII characters are *only* allowed to be          |
represented by ASCII bytes in the range 0x00-0x7f. To ensure this             |
aspect of ASCII compatibility, use only "safe UTF-8 decoders" that            |
reject overlong UTF-8 sequences for which a shorter encoding exists.          |
                                                                              |
4.1  Examples of an overlong ASCII character                                  |
                                                                              |
With a safe UTF-8 decoder, all of the following five overlong                 |
representations of the ASCII character slash ("/") should be rejected         |
like a malformed UTF-8 sequence, for instance by substituting it with         |
a replacement character. If you see a slash below, you do not have a          |
safe UTF-8 decoder!                                                           |
                                                                              |
4.1.1 U+002F = c0 af             = "��"                                        |
4.1.2 U+002F = e0 80 af          = "�"                                        |
4.1.3 U+002F = f0 80 80 af       = "�"                                        |
4.1.4 U+002F = f8 80 80 80 af    = "�����"                                        |
4.1.5 U+002F = fc 80 80 80 80 af = "������"                                        |
                                                                              |
4.2  Maximum overlong sequences                                               |
                                                                              |
Below you see the highest Unicode value that is still resulting in an         |
overlong sequence if represented with the given number of bytes. This         |
is a boundary test for safe UTF-8 decoders. All five characters should        |
be rejected like malformed UTF-8 sequences.                                   |
                                                                              |
4.2.1  U-0000007F = c1 bf             = "��"                                   |
4.2.2  U-000007FF = e0 9f bf          = "�"                                   |
4.2.3  U-0000FFFF = f0 8f bf bf       = "�"                                   |
4.2.4  U-001FFFFF = f8 87 bf bf bf    = "�����"                                   |
4.2.5  U-03FFFFFF = fc 83 bf bf bf bf = "������"                                   |
                                                                              |
4.3  Overlong representation of the NUL character                             |
                                                                              |
The following five sequences should also be rejected like malformed           |
UTF-8 sequences and should not be treated like the ASCII NUL                  |
character.                                                                    |
                                                                              |
4.3.1  U+0000 = c0 80             = "��"                                       |
4.3.2  U+0000 = e0 80 80          = "�"                                       |
4.3.3  U+0000 = f0 80 80 80       = "�"                                       |
4.3.4  U+0000 = f8 80 80 80 80    = "�����"                                       |
4.3.5  U+0000 = fc 80 80 80 80 80 = "������"                                       |
                                                                              |
5  Illegal code positions                                                     |
                                                                              |
The following UTF-8 sequences should be rejected like malformed               |
sequences, because they never represent valid ISO 10646 characters and        |
a UTF-8 decoder that accepts them might introduce security problems           |
comparable to overlong UTF-8 sequences.                                       |
                                                                              |
5.1 Single UTF-16 surrogates                                                  |
                                                                              |
5.1.1  U+D800 = ed a0 80 = "�"                                                |
5.1.2  U+DB7F = ed ad bf = "�"                                                |
5.1.3  U+DB80 = ed ae 80 = "�"                                                |
5.1.4  U+DBFF = ed af bf = "�"                                                |
5.1.5  U+DC00 = ed b0 80 = "�"                                                |
5.1.6  U+DF80 = ed be 80 = "�"                                                |
5.1.7  U+DFFF = ed bf bf = "�"                                                |
                                                                              |
5.2 Paired UTF-16 surrogates                                                  |
                                                                              |
5.2.1  U+D800 U+DC00 = ed a0 80 ed b0 80 = "��"                               |
5.2.2  U+D800 U+DFFF = ed a0 80 ed bf bf = "��"                               |
5.2.3  U+DB7F U+DC00 = ed ad bf ed b0 80 = "��"                               |
5.2.4  U+DB7F U+DFFF = ed ad bf ed bf bf = "��"                               |
5.2.5  U+DB80 U+DC00 = ed ae 80 ed b0 80 = "��"                               |
5.2.6  U+DB80 U+DFFF = ed ae 80 ed bf bf = "��"                               |
5.2.7  U+DBFF U+DC00 = ed af bf ed b0 80 = "��"                               |
5.2.8  U+DBFF U+DFFF = ed af bf ed bf bf = "��"                               |
                                                                              |
5.3 Other illegal code positions                                              |
                                                                              |
5.3.1  U+FFFE = ef bf be = "￾"                                                |
5.3.2  U+FFFF = ef bf bf = "￿"                                                |
                                                                              |
THE END                                                                       |

Florian

2008-08-22 21:56

administrator   ~0021609

Can you please also attach the test program you used?

José Mejuto

2008-08-23 20:11

reporter   ~0021642

Sure. It is a small Lazarus project (I'm not using FPC alone).
I have also modified the UTF8ToUnicode function a bit to speed it up when it is only counting the bytes that will be needed (i.e. UC chars * 2).
A small bug that parsed one invalid sequence as a good char has also been fixed.
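
For readers unfamiliar with the counting mode mentioned above, here is a minimal sketch of the two-pass calling pattern. It assumes the conventions of the attached function: a nil `Dest` only counts, and the returned value is the number of output characters plus one. The wrapper name `DecodeUtf8` is hypothetical, and this is not how the RTL's own UTF8Decode is necessarily written.

```pascal
program utf8twopass;
{$mode objfpc}{$H+}

// Hypothetical wrapper, not part of the RTL.
function DecodeUtf8(const S: AnsiString): WideString;
var
  Needed: SizeUInt;
begin
  Result := '';
  if S = '' then
    Exit;
  // Pass 1: with Dest = nil the routine only counts.
  // Assumption: the result is the character count plus one.
  Needed := UTF8ToUnicode(nil, 0, PChar(S), SizeUInt(Length(S)));
  if (Needed <= 1) or (Needed = SizeUInt(-1)) then
    Exit;
  SetLength(Result, SizeInt(Needed - 1));
  // Pass 2: convert into the freshly sized buffer.
  UTF8ToUnicode(PWideChar(Result), Needed, PChar(S), SizeUInt(Length(S)));
end;

var
  W: WideString;
begin
  W := DecodeUtf8('abc');
  WriteLn(Length(W));
end.
```

The counting pass costs a second walk over the input, which is why a faster count-only path is worth having.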

2008-08-23 20:12

 

utf8test.zip (8,755 bytes)

Florian

2008-08-23 20:29

administrator   ~0021643

I just tested with Delphi, and Delphi behaves the same as FPC without the change, so I wonder whether we should change this?

José Mejuto

2008-08-24 00:06

reporter   ~0021645

Hmmm... yes, I tested in Delphi 7, and it is to be expected that it produces almost the same result, since the code is very similar, both in its flow and in how it checks the sequences. On the other hand, I must check my implementation against NULL-terminated "pchar" strings, because I'm used to working only with length-defined strings (I hate 'C' strings).

Maybe somebody should check the behavior of Delphi 2006 or 2009 on this topic. If the code is fixed in recent versions (it is obviously buggy in D7), replace the routine; if it is not fixed, maybe add this one as "UTF8ToUnicodeSafe"?

I became aware of this problem while porting ANSI code to Unicode: one of the files, when converted to WideString, simply came back as an empty string, which in my case was a very bad outcome because I lost all the content, not only the malformed piece.

I think it is time for the experts to decide whether to fix the code or keep compatibility with the buggy implementation. Both solutions are fine, not for me, but this (FPC) is not a personalized solution ;)

I can only verify that it works with NULL-terminated pchars and suggest the new code.

Thank you for your support.

Jonas Maebe

2009-03-15 16:49

manager   ~0026145

I've fixed the implementation based on your patch, and also added support for 4-byte encoded code points.
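
As background on why 4-byte sequences need special handling in a 16-bit destination: their code points lie above $FFFF and have to be split into a UTF-16 surrogate pair. The sketch below shows the standard arithmetic only; the helper names `Decode4` and `ToSurrogates` are hypothetical and this is not the code from revision 12902.

```pascal
program surrogates;
{$mode objfpc}{$H+}

// Hypothetical helper: assembles the code point of a 4-byte sequence.
// Assumes the bytes were already validated (lead 11110xxx, then 10xxxxxx).
function Decode4(B0, B1, B2, B3: Byte): Cardinal;
begin
  Result := (Cardinal(B0 and $07) shl 18) or
            (Cardinal(B1 and $3F) shl 12) or
            (Cardinal(B2 and $3F) shl 6) or
            (B3 and $3F);
end;

// Hypothetical helper: splits a code point in $10000..$10FFFF
// into a UTF-16 high/low surrogate pair.
procedure ToSurrogates(CodePoint: Cardinal; out HighSur, LowSur: Word);
begin
  Dec(CodePoint, $10000);
  HighSur := Word($D800 or (CodePoint shr 10));
  LowSur := Word($DC00 or (CodePoint and $3FF));
end;

var
  H, L: Word;
begin
  // f4 8f bf bf is test 2.3.4 of the stress test (U-0010FFFF).
  ToSurrogates(Decode4($F4, $8F, $BF, $BF), H, L);
  WriteLn(HexStr(H, 4), ' ', HexStr(L, 4));  // DBFF DFFF
end.
```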

José Mejuto

2009-03-17 20:10

reporter   ~0026190

I have not checked the code (I'm not using SVN), but it should work.
I'm writing a slightly different version which is a bit faster, but I'll create a new bug report when it is ready.

Note: UNICODE_INVALID could be 63 ($3F, '?') or $FFFD, which is the "right" replacement character and looks nicer, but both are valid according to UTF-8.

Issue History

Date Modified Username Field Change
2008-08-04 17:56 José Mejuto New Issue
2008-08-04 17:56 José Mejuto File Added: UTF8ToUnicode.pas
2008-08-04 17:57 José Mejuto File Added: UTF-8-test.txt
2008-08-22 21:56 Florian Note Added: 0021609
2008-08-23 20:11 José Mejuto Note Added: 0021642
2008-08-23 20:12 José Mejuto File Added: utf8test.zip
2008-08-23 20:29 Florian Note Added: 0021643
2008-08-24 00:06 José Mejuto Note Added: 0021645
2008-08-27 00:18 Jonas Maebe Relationship added has duplicate 0011976
2009-01-30 16:27 Jonas Maebe Relationship added related to 0013075
2009-03-15 16:49 Jonas Maebe Fixed in Revision => 12902
2009-03-15 16:49 Jonas Maebe Status new => resolved
2009-03-15 16:49 Jonas Maebe Fixed in Version => 2.3.1
2009-03-15 16:49 Jonas Maebe Resolution open => fixed
2009-03-15 16:49 Jonas Maebe Assigned To => Jonas Maebe
2009-03-15 16:49 Jonas Maebe Note Added: 0026145
2009-03-15 16:49 Jonas Maebe Target Version => 2.4.0
2009-03-17 20:10 José Mejuto Status resolved => closed
2009-03-17 20:10 José Mejuto Note Added: 0026190