csvreadwrite failing to read files with BOM correctly
Original Reporter info from Mantis: wp @wpam
- Reporter name:
Description:
Since version 3.0, FPC contains the csvdocument library. There is an issue if the input file contains a BOM (byte order mark). The CSV parser always begins its work at the beginning of the stream. If the stream begins with a BOM and the first cell is, for example, a string representing a date, then the conversion from string to date fails because the BOM adds unrecognized characters to the string.
The attached patch reads the first few bytes of the input stream, checks whether they match a UTF-8, UTF-16 LE or UTF-16 BE BOM, and continues parsing right after the BOM. The BOM found is stored as a read-only property.
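
For readers who want to see the idea outside the patch, here is a minimal sketch of the detection step. It is written in Python purely for illustration and is not the code from the patch. The function name `skip_bom`, the `BOMS` table and the sample bytes are hypothetical. It covers only the three BOMs named above.

```python
import io

# BOM byte signatures named in the report (UTF-8, UTF-16 LE, UTF-16 BE).
BOMS = (
    ("utf-8", b"\xef\xbb\xbf"),
    ("utf-16-le", b"\xff\xfe"),
    ("utf-16-be", b"\xfe\xff"),
)


def skip_bom(stream):
    """Return the name of the BOM found at the stream start, or None.

    The stream is left positioned just after the BOM, or at its
    original position when no BOM is present.
    """
    start = stream.tell()
    head = stream.read(3)  # the longest signature checked here is 3 bytes
    for name, signature in BOMS:
        if head.startswith(signature):
            stream.seek(start + len(signature))
            return name
    stream.seek(start)
    return None


# Hypothetical sample data: a UTF-8 CSV line preceded by a BOM.
stream = io.BytesIO(b"\xef\xbb\xbf2016-11-12,abc\r\n")
detected = skip_bom(stream)     # which BOM was found, if any
first_line = stream.readline()  # parsing would continue from here
```

A parser that calls such a helper before reading the first cell never sees the BOM bytes, and the returned name plays the role of the read-only property described above.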
Steps to reproduce:
Run the attached demo. It reads a UTF-8 file with a BOM; the first cell contains a date string. Reading aborts at that cell because the BOM is part of the string that ScanDateTime is asked to convert to a date.
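
The attached demo is not reproduced here. The same failure mode can be illustrated with a small, hypothetical Python sketch: the sample bytes and the date format are assumptions, and Python's `strptime` stands in for ScanDateTime.

```python
from datetime import datetime

# Hypothetical file content: UTF-8 BOM followed by a date in the first cell.
raw = b"\xef\xbb\xbf2016-11-12,abc"

# Plain UTF-8 decoding keeps the BOM as U+FEFF at the start of the cell.
first_cell = raw.decode("utf-8").split(",")[0]

try:
    datetime.strptime(first_cell, "%Y-%m-%d")
    parsed_with_bom = True
except ValueError:
    # The leading U+FEFF is not part of the date format, so parsing fails.
    parsed_with_bom = False

# Dropping the BOM first ("utf-8-sig" strips it) lets the same cell parse.
clean_cell = raw.decode("utf-8-sig").split(",")[0]
parsed_clean = datetime.strptime(clean_cell, "%Y-%m-%d")
```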
Mantis conversion info:
- Mantis ID: 30897
- Version: 3.0.0
- Fixed in version: 3.1.1
- Fixed in revision: 34871 (#5af24e94)
- Target version: 3.2.0