View Issue Details

ID: 0036679
Project: Lazarus
Category: IDE
View Status: public
Last Update: 2020-03-21 13:44
Reporter: Sven Barth
Assigned To: Bart Broersma
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: X86_64
OS: Windows NT
Product Version: 2.1 (SVN)
Fixed in Version: 2.0.8
Summary: 0036679: Lazarus detects file with a long, first line as "binary"
Description: When opening an "HTML resource" file generated by pas2js, which may contain long lines, Lazarus might detect the file as binary because there is no newline within the first 1024 bytes. If the file is kept open across recompilations this is especially annoying, because in addition to the "file was changed" dialog (*) there is also this "file might not be text" dialog.

(*) Using the "use contents instead of time stamp" option improves this at least as long as nothing changed
Steps To Reproduce:
- open the attached file in Lazarus
- notice the "file might not be text" dialog
Tags: No tags attached.
Fixed in Revision: r62787
LazTarget: 2.0.8
Widgetset: Win32/Win64
Attached Files

Activities

Sven Barth

2020-02-07 23:19

manager  

tp2j-res.html (3,271 bytes)   
<link rel="preload" as="script" id="resource-unit1" data-unit="Unit1" href="data:application/octet-stream;base64,b2JqZWN0IFdGb3JtMTogVFdGb3JtMQ0KICBMZWZ0ID0gNzUwDQogIEhlaWdodCA9IDc3Nw0KICBUb3AgPSAzNzMNCiAgV2lkdGggPSAxNDM3DQogIEFscGhhQmxlbmQgPSBGYWxzZQ0KICBBbHBoYUJsZW5kVmFsdWUgPSAyNTUNCiAgQ2FwdGlvbiA9ICdXRm9ybTEnDQogIENsaWVudEhlaWdodCA9IDc3Nw0KICBDbGllbnRXaWR0aCA9IDE0MzcNCiAgRGVzaWduVGltZVBQSSA9IDE5Mg0KICBvYmplY3QgV0J1dHRvbjE6IFRXQnV0dG9uDQogICAgTGVmdCA9IDEwNA0KICAgIEhlaWdodCA9IDUwDQogICAgVG9wID0gNjQNCiAgICBXaWR0aCA9IDE1MA0KICAgIENhcHRpb24gPSAnV0J1dHRvbjEnDQogICAgVGFiT3JkZXIgPSAwDQogIGVuZA0KICBvYmplY3QgV1BhZ2VDb250cm9sMTogVFdQYWdlQ29udHJvbA0KICAgIExlZnQgPSAxNDYNCiAgICBIZWlnaHQgPSA0MDANCiAgICBUb3AgPSAyMDcNCiAgICBXaWR0aCA9IDQwMA0KICAgIEFjdGl2ZVBhZ2UgPSBUYWJTaGVldDENCiAgICBUYWJJbmRleCA9IDANCiAgICBUYWJPcmRlciA9IDENCiAgICBvYmplY3QgVGFiU2hlZXQxOiBUVGFiU2hlZXQNCiAgICAgIENhcHRpb24gPSAnVGFiU2hlZXQxJw0KICAgIGVuZA0KICBlbmQNCiAgb2JqZWN0IFdJbWFnZTE6IFRXSW1hZ2UNCiAgICBMZWZ0ID0gNjM2DQogICAgSGVpZ2h0ID0gMTgwDQogICAgVG9wID0gMTQ2DQogICAgV2lkdGggPSAxODANCiAgZW5kDQogIG9iamVjdCBXRWRpdDE6IFRXRWRpdA0KICAgIExlZnQgPSAyOTYNCiAgICBIZWlnaHQgPSA0MA0KICAgIFRvcCA9IDEwMg0KICAgIFdpZHRoID0gMTYwDQogICAgVGFiT3JkZXIgPSAyDQogICAgVGV4dCA9ICdXRWRpdDEnDQogIGVuZA0KICBvYmplY3QgV0NvbWJvQm94MTogVFdDb21ib0JveA0KICAgIExlZnQgPSA1NTkNCiAgICBIZWlnaHQgPSA0MA0KICAgIFRvcCA9IDY2DQogICAgV2lkdGggPSAyMDANCiAgICBJdGVtSGVpZ2h0ID0gMzINCiAgICBUYWJPcmRlciA9IDMNCiAgICBUZXh0ID0gJ1dDb21ib0JveDEnDQogIGVuZA0KICBvYmplY3QgV0xhYmVsMTogVFdMYWJlbA0KICAgIExlZnQgPSA0MDANCiAgICBIZWlnaHQgPSAzNA0KICAgIFRvcCA9IDE2NA0KICAgIFdpZHRoID0gMTMwDQogICAgQXV0b1NpemUgPSBGYWxzZQ0KICAgIENhcHRpb24gPSAnV0xhYmVsMScNCiAgICBQYXJlbnRDb2xvciA9IEZhbHNlDQogIGVuZA0KICBvYmplY3QgV1BhbmVsMTogVFdQYW5lbA0KICAgIExlZnQgPSA3MDcNCiAgICBIZWlnaHQgPSAxMDANCiAgICBUb3AgPSA0MzQNCiAgICBXaWR0aCA9IDM0MA0KICAgIENhcHRpb24gPSAnV1BhbmVsMScNCiAgICBUYWJPcmRlciA9IDQNCiAgZW5kDQogIG9iamVjdCBXQ2hlY2tib3gxOiBUV0NoZWNrYm94DQogICAgTGVmdCA9IDk1NA0KICAgIEhlaWdodCA9IDQ2DQogICAgVG9wID0gMjc2DQogICAgV2l
kdGggPSAxODANCiAgICBDYXB0aW9uID0gJ1dDaGVja2JveDEnDQogICAgVGFiT3JkZXIgPSA1DQogIGVuZA0KICBvYmplY3QgV0ZpbGVCdXR0b24xOiBUV0ZpbGVCdXR0b24NCiAgICBMZWZ0ID0gOTI4DQogICAgSGVpZ2h0ID0gNTANCiAgICBUb3AgPSAxMzINCiAgICBXaWR0aCA9IDE1MA0KICAgIENhcHRpb24gPSAnV0ZpbGVCdXR0b24xJw0KICAgIFRhYk9yZGVyID0gNg0KICBlbmQNCiAgb2JqZWN0IFdGbG9hdEVkaXQxOiBUV0Zsb2F0RWRpdA0KICAgIExlZnQgPSA2OTcNCiAgICBIZWlnaHQgPSA0MA0KICAgIFRvcCA9IDM4MQ0KICAgIFdpZHRoID0gMTYwDQogICAgQWxpZ25tZW50ID0gdGFSaWdodEp1c3RpZnkNCiAgICBEZWNpbWFsUGxhY2VzID0gMg0KICAgIFRhYk9yZGVyID0gNw0KICAgIFRleHQgPSAnMCwwMCcNCiAgICBWYWx1ZSA9IDANCiAgZW5kDQogIG9iamVjdCBXSW50ZWdlckVkaXQxOiBUV0ludGVnZXJFZGl0DQogICAgTGVmdCA9IDg5NQ0KICAgIEhlaWdodCA9IDQwDQogICAgVG9wID0gMzUwDQogICAgV2lkdGggPSAxNjANCiAgICBBbGlnbm1lbnQgPSB0YVJpZ2h0SnVzdGlmeQ0KICAgIFRhYk9yZGVyID0gOA0KICAgIFRleHQgPSAnMCcNCiAgZW5kDQogIG9iamVjdCBXRGF0ZUVkaXRCb3gxOiBUV0RhdGVFZGl0Qm94DQogICAgTGVmdCA9IDMyNw0KICAgIEhlaWdodCA9IDQwDQogICAgVG9wID0gMjgNCiAgICBXaWR0aCA9IDE2MA0KICAgIFRhYk9yZGVyID0gOQ0KICAgIFRleHQgPSAnMzAuMTIuMTg5OScNCiAgICBWYWx1ZSA9IDANCiAgZW5kDQogIG9iamVjdCBXVGltZUVkaXRCb3gxOiBUV1RpbWVFZGl0Qm94DQogICAgTGVmdCA9IDc5Mg0KICAgIEhlaWdodCA9IDQwDQogICAgVG9wID0gNTYNCiAgICBXaWR0aCA9IDE2MA0KICAgIFRhYk9yZGVyID0gMTANCiAgICBUZXh0ID0gJzAwOjAwJw0KICAgIFZhbHVlID0gMA0KICBlbmQNCmVuZA0K" />

Bart Broersma

2020-02-08 00:15

developer   ~0120933

Last edited: 2020-02-08 00:21


Maybe FileIsText should only return False if it finds a character that we are sure is not text.
What are the chances you don't find an "offending" byte in a binary file within 1024 bytes?

I have a GetTextFileType function that uses that approach and that says the offending file is of type tftASCII.
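
A rough estimate for the question above, as a Python sketch. It assumes that the bytes of a binary file are uniformly random (real formats often are not) and that the "offending" set is every control character except tab, LF, FF, CR and Ctrl-Z. Both are illustrative assumptions and are not taken from the Lazarus source.

```python
# Assumption: "offending" bytes are the control characters 0..31 except
# tab (9), LF (10), FF (12), CR (13) and Ctrl-Z / EOF (26).
ALLOWED_CONTROLS = {9, 10, 12, 13, 26}
OFFENDING = {b for b in range(32) if b not in ALLOWED_CONTROLS}


def chance_of_no_offending_byte(window: int) -> float:
    """Probability that `window` uniformly random bytes contain no offending byte."""
    p_clean = 1 - len(OFFENDING) / 256
    return p_clean ** window


if __name__ == "__main__":
    for window in (64, 1024, 2048):
        print(window, chance_of_no_offending_byte(window))
```

Under these assumptions the chance of a clean 1024-byte window is vanishingly small, so scanning for offending bytes alone should already reject most binary files.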

Sven Barth

2020-02-08 18:56

manager   ~0120949

I personally agree with that assumption.

Bart Broersma

2020-02-08 19:31

developer   ~0120952

Last edited: 2020-02-08 19:31


The code also accepts malformed Unicode (UTF-8, UTF-16LE and UTF-16BE) text.
For example, a file with a UTF-8 BOM, then one Chr(10), and after that only $FF bytes will result in the function returning True.
(My GetTextFileType will return tftUnknown.)
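
The case described above can be reproduced in a few lines. This is a Python sketch, not the Lazarus code: `passes_control_byte_scan` is a hypothetical stand-in for a check that only rejects control bytes, and the allowed-control set is an assumption.

```python
# Assumption: control characters that a text file may legitimately contain.
ALLOWED_CONTROLS = {9, 10, 12, 13, 26}


def passes_control_byte_scan(data: bytes) -> bool:
    """True if no byte is a disallowed control character (0..31)."""
    return not any(b < 32 and b not in ALLOWED_CONTROLS for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# UTF-8 BOM, one line feed, then only $FF bytes.
sample = b"\xef\xbb\xbf" + b"\n" + b"\xff" * 16

print(passes_control_byte_scan(sample))  # True
print(is_valid_utf8(sample))             # False
```

A scan that only looks for control bytes accepts the sample, while a strict decoder rejects it, because $FF never occurs in well-formed UTF-8.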

Bart Broersma

2020-03-01 14:21

developer   ~0121294

I would propose to skip the test for NewLine and increase the buffer to 2048.
This should catch most cases of non-text files.
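
To illustrate the effect of the proposal, here is a Python sketch that models both rules. The old rule is reconstructed from the description in this issue (a full buffer without a line break counts as not text); the real FileIsText may differ in detail, and `looks_like_text` and its parameters are hypothetical names.

```python
# Assumption: control characters that a text file may legitimately contain.
ALLOWED_CONTROLS = {9, 10, 12, 13, 26}


def looks_like_text(data: bytes, buf_size: int, require_newline: bool) -> bool:
    """Scan the first `buf_size` bytes for disallowed control characters."""
    head = data[:buf_size]
    if any(b < 32 and b not in ALLOWED_CONTROLS for b in head):
        return False
    if require_newline and len(head) == buf_size:
        # Old-style rule: a full buffer without a line break is suspicious.
        return b"\n" in head or b"\r" in head
    return True


# One long line, similar in size and shape to the attached tp2j-res.html.
long_line = b'<link href="data:' + b"A" * 3248 + b'" />\r\n'
binary = bytes(range(256)) * 8

print(looks_like_text(long_line, 1024, require_newline=True))   # False
print(looks_like_text(long_line, 2048, require_newline=False))  # True
print(looks_like_text(binary, 2048, require_newline=False))     # False
```

Dropping the newline test removes the false positive for the long single-line file, and the byte scan still rejects the binary sample.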

Bart Broersma

2020-03-21 13:42

developer   ~0121672

Fixed as proposed.
Please test and close if OK.

Issue History

Date Modified Username Field Change
2020-02-07 23:19 Sven Barth New Issue
2020-02-07 23:19 Sven Barth File Added: tp2j-res.html
2020-02-08 00:15 Bart Broersma Note Added: 0120933
2020-02-08 00:21 Bart Broersma Note Edited: 0120933
2020-02-08 00:21 Bart Broersma Note Edited: 0120933
2020-02-08 18:56 Sven Barth Note Added: 0120949
2020-02-08 19:31 Bart Broersma Note Added: 0120952
2020-02-08 19:31 Bart Broersma Note Edited: 0120952
2020-03-01 14:21 Bart Broersma Note Added: 0121294
2020-03-21 13:42 Bart Broersma Assigned To => Bart Broersma
2020-03-21 13:42 Bart Broersma Status new => resolved
2020-03-21 13:42 Bart Broersma Resolution open => fixed
2020-03-21 13:42 Bart Broersma Fixed in Version => 2.0.8
2020-03-21 13:42 Bart Broersma Fixed in Revision => r62787
2020-03-21 13:42 Bart Broersma LazTarget => 2.0.8
2020-03-21 13:42 Bart Broersma Widgetset Win32/Win64 => Win32/Win64
2020-03-21 13:42 Bart Broersma Note Added: 0121672