View Issue Details

IDProjectCategoryView StatusLast Update
0037370FPCOtherpublic2020-07-21 18:27
ReporterLuca Olivetti Assigned To 
PrioritynormalSeverityminorReproducibilityalways
Status newResolutionopen 
OSwin32/linux x86_64 
Summary0037370: TSdfDataset causes segfault if used with lazarus
DescriptionThe attached project causes a segfault if compiled with fpc 3.2.0. Specifically it segfaults in the 3rd call to DataSet.Post.
On my machine (linux x86_64) the segfault occurs in line 963 of rtl/inc/heap.inc ("if poc^.used = 0 then"), under win32 occurs in the same function however I don't know the exact line (I don't have a version of fpc with debug information).

A simple pascal project that does the same (creates a tsdfdataset and appends records) doesn't cause a segfault, that's why I reported the bug under lazarus but I'm not sure it's lazarus or fpc.

If I compile the project with fpc 3.0.4 there's no segfault (unfortunately I cannot use 3.0.4 due to other bugs). For the time being I'm compiling my project with the old and trusty lazarus 1.6.4/fpc 2.6.4.
Additional InformationThe stack trace in case it is useful (I'm using the qt5 widgetset):

#0 SYSGETMEM_FIXED(544) at ../inc/heap.inc:963
0000001 SYSGETMEM(544) at ../inc/heap.inc:1082
0000002 GETMEM(0x45624a, 527) at ../inc/heap.inc:284
0000003 NEWANSISTRING(502) at ../inc/astrings.inc:115
0000004 fpc_ansistr_setlength(0x0, 502, 0) at ../inc/astrings.inc:776
0000005 STORETOBUF(0x7ffff3c451d0, 0x7ffff7fb7318 'ABC,ABC') at fcl-db/src/sdf/sdfdata.pp:1137
0000006 GETRECORD(0x7ffff3c451d0, 0x7fffe9f01360 'ABC', GMCURRENT, false) at fcl-db/src/sdf/sdfdata.pp:571
0000007 RESYNC(0x7ffff3c451d0, []) at fcl-db/src/base/dataset.inc:2150
0000008 POST(0x7ffff3c451d0) at fcl-db/src/base/dataset.inc:2098
0000009 LOADDATA(0x7ffff3c451d0) at unit1.pas:49
0000010 FORMSHOW(0x7ffff3c43370, 0x7ffff3c43370) at unit1.pas:62
0000011 DOSHOW(0x7ffff3c43370) at include/customform.inc:1019
0000012 CMSHOWINGCHANGED(0x7ffff3c43370, {MSG = 45081, UNUSEDMSG = 0, WPARAM = 0, LPARAM = 0, RESULT = 0}) at include/customform.inc:649
0000013 DISPATCH(0x7ffff3c43370, ) at ../inc/objpas.inc:684
0000014 WNDPROC(0x7ffff3c43370, {MSG = 45081, UNUSEDMSG = 0, WPARAM = 0, LPARAM = 0, RESULT = 0}) at include/control.inc:2241
0000015 WNDPROC(0x7ffff3c43370, {MSG = 45081, UNUSEDMSG = 0, WPARAM = 0, LPARAM = 0, RESULT = 0}) at include/wincontrol.inc:5411
0000016 WNDPROC(0x7ffff3c43370, {MSG = 45081, UNUSEDMSG = 0, WPARAM = 0, LPARAM = 0, RESULT = 0}) at include/customform.inc:1477
0000017 PERFORM(0x7ffff3c43370, 45081, 0, 0) at include/control.inc:1581
0000018 CHANGESHOWING(0x7fffffffe1e0, true) at include/wincontrol.inc:4385
0000019 UPDATESHOWING(0x7ffff3c43370) at include/wincontrol.inc:4435
0000020 UPDATESHOWING(0x7ffff3c43370) at include/customform.inc:2799
0000021 DOALLAUTOSIZE(0x7ffff3c43370) at include/wincontrol.inc:3589
0000022 ENABLEAUTOSIZING(0x7ffff3c43370) at include/control.inc:5754
0000023 SETVISIBLE(0x7ffff3c43370, true) at include/control.inc:4557
0000024 SETVISIBLE(0x7ffff3c43370, true) at include/customform.inc:492
0000025 SHOW(0x7ffff3c43370) at include/customform.inc:2330
0000026 RUN(0x7ffff3c42bb0) at include/application.inc:1404
0000027 main at project1.lpr:19
TagsNo tags attached.
Fixed in Revision
FPCOldBugId
FPCTarget
Attached Files

Activities

Luca Olivetti

2020-07-16 08:50

reporter  

project1.zip (2,836 bytes)

Luca Olivetti

2020-07-16 08:52

reporter   ~0124075

A simple pascal program that, in spite of the name, doesn't cause a segfault (zipped because mantis wouldn't let me upload the lpr directly)
sdfsegfault.lpr.zip (660 bytes)

Luca Olivetti

2020-07-16 09:00

reporter   ~0124076

I realized that the simple pascal program is a version where I included a thread to see if it'd make a difference (threading vs. no threading) but it doesn't.

Cyrax

2020-07-16 12:34

reporter   ~0124086

Does FPC trunk work?

Luca Olivetti

2020-07-16 12:41

reporter   ~0124087

I didn't try

Luca Olivetti

2020-07-16 13:01

reporter   ~0124088

I tried with the snapshot from here ftp://ftp.freepascal.org/pub/fpc/snapshot/trunk/x86_64-linux/ and it segfaults too.

Luca Olivetti

2020-07-16 13:36

reporter   ~0124090

I also tried the project with lazarus 1.8.4 and lazarus 1.6.4 and they segfault too if using the 3.2.0 compiler, so I'm starting to think this is either an fpc issue or a long standing lazarus issue that didn't reveal until now.
Any hint on how to debug this? heaptrc doesn't help, -Ct -Cr -Co doesn't help either, I tried to compile fpc with -dDUMPGROW and also didn't help,

wp

2020-07-16 15:53

reporter   ~0124094

I even wonder why the demo (project1.zip) even works for two posting actions. There are no fields because there are no FieldDefs, and there is no schema. In sdfsegfault.lpr.zip, on the other hand, you define a schema at runtime, and the program works.

Please check your project1 whether it really represents the error case that you are trying to report.

Luca Olivetti

2020-07-16 16:26

reporter   ~0124096

Last edited: 2020-07-16 16:26

View 2 revisions

Both projects use the same values for the schema property, check unit1.lfm (or open the schema property of Modelos in the object inspector).
In fact to generate sdfsegfault.lpr I just set the properties that are saved in the lfm.
I don't know if the FieldDefs is created automatically from the Schema or it isn't needed, I know that the project I took this from has been working 24/7 since 2013 (lcl version 0.9.30) and a segfault here seems a serious regression.

wp

2020-07-16 16:44

reporter   ~0124097

Sorry for the noise - I had not seen the Schema property in the code and had missed to open it in the OI.

Luca Olivetti

2020-07-16 18:12

reporter   ~0124099

If I revert this change the segfault goes away:

https://svn.freepascal.org/cgi-bin/viewvc.cgi?view=revision&revision=43245

Luca Olivetti

2020-07-16 18:46

reporter   ~0124100

In fact, just changing the line

FieldDefs.Add(Trim(LstFields.Names[i]), ftString, Len, 0, False,False,FieldDefs.Count+1,FEnc);

to the previous version

FieldDefs.Add(Trim(LstFields.Names[i]), ftString, Len, False);

(which I guess it has the same net effect) is enough to "fix" the segfault.

Luca Olivetti

2020-07-17 13:11

reporter   ~0124122

It turns out that lazarus uses CP_UTF8 by default and in that case the sdf used too short a buffer.
Fix is in the patch.
sdfsegfault.patch (450 bytes)   
--- sdf/sdfdata.pp	2019-10-19 16:43:38.000000000 +0200
+++ packages/fcl-db/src/sdf/sdfdata.pp	2020-07-17 13:06:13.769103569 +0200
@@ -411,6 +411,8 @@
       else
         FEnc:=DefaultSystemCodePage;
       FieldDefs.Add(Trim(LstFields.Names[i]), ftString, Len, 0, False,False,FieldDefs.Count+1,FEnc);
+      if FEnc=CP_UTF8 then
+        Len:=Len*4;
       Inc(Len);
 {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
       Len := Align(Len, SizeOf(PtrInt));
sdfsegfault.patch (450 bytes)   

Luca Olivetti

2020-07-18 13:24

reporter   ~0124143

Could somebody change the project of this issue to FPC? (I cannot)

jamie philbrook

2020-07-18 18:17

reporter   ~0124146

3.0.4 also used UTF8 by default so how is this different ?

Your choice of simply multiply it * 4 is a hack and holds no real reason as to why this value is the magic number ?

 Also being concerned about alignment is another trouble area that kind of irks me.
 
 Wouldn't it be better to find the real reason for the issue? Maybe it worked in 3.0.4 and maybe you were just lucky.

 I haven't looked into the sources between now and then but have you done a comparison between 3.2 and 3.0.4 have they sources changed in those areas ?

 If not I would think a deeper look into the problem is warranted .

Luca Olivetti

2020-07-18 18:30

reporter   ~0124147

In 3.0.4 sdfdata.pp didn't use the encoding at all, as per the hack, is the same that is in packages/fcl-db/src/base/fields.inc. I did look at the sources, that's how I finally found where the problem is and the possible solution.

jamie philbrook

2020-07-18 19:00

reporter   ~0124148

Finding a problem in your view still does not answer the question I asked.

IS THE SOURCE DIFFERENT in that Area between 3.0.4 and 3.2 ?

 and if it is, your winging the numbers like that is what we call a Friday afternoon answer to a problem to make sure the weekend comes without being at work...

 You need to show some reasoning why you came up with * 4 ? I don't buy the guessing logic it needs to fit in my book and if that is the case then there must be a way to measure the actually needed amount instead of just winging it.

 This is how a large code base gets washed down the stream when things like this does not have any rhyme or reason.
 
 I hope you aren't taking this the wrong way but I've been in this scene for many years and to much of this hackery code has crossed my eyes. It needs to have a valid reason for your numbers and not just force alignment to fix an issue, that just shows you haven't really figured out a real solution, one that flows with any content.

Luca Olivetti

2020-07-18 20:20

reporter   ~0124152

Please keep out of the discussion if you didn't check the differences between 3.0.4 and 3.2.0 and don't know anything about how utf-8 works.

jamie philbrook

2020-07-18 21:25

reporter   ~0124154

Excuse me ?
I don't know how UTF-8 works ? I beg the differ that maybe the situation is on the other foot, mainly yours..

  This is one of the reasons I am forced to use Delphi at work due to the hit, miss and just down right guessing game that is found here in an attempt to resolve issues.. It basically boils down to "I found out how to make my app work" right or wrong that solution should be pushed on to every one else with no research on it, just a shot in the dark.

 And I asked you if you have check the difference between sources, you won't give a direct answer to that which tells me you are just tip toeing in the tulips hoping no one notices.

  I work with a couple that have a mentality like yours, one of them are on their way out the door and the other just refuses to leave, he'll be terminated the hard way which I don't wish that on anyone.

.

Luca Olivetti

2020-07-20 21:02

reporter   ~0124193

I wasn't sure if you were just ignorant or were deliberately trolling, now I'm sure you're just trolling. If you knew anything about utf-8 you wouldn't think that 4 is a magic number that I just pulled out of thin air. Please keep using delphi and stop trolling this bug tracker.

Thaddy de Koning

2020-07-21 09:02

reporter   ~0124199

"Your choice of simply multiply it * 4 is a hack and holds no real reason as to why this value is the magic number ?"
No it is not: UTF8 requires space for all possible combinations, so should reserve 4 bytes.
UTF8 range is from 1 to 4 bytes per character.

wp

2020-07-21 09:38

reporter   ~0124201

Last edited: 2020-07-21 13:42

View 3 revisions

While this argument is correct the root cause for the current crash, however, must be somewhere else because all strings used in the demo are plain ASCII.

Luca Olivetti

2020-07-21 13:54

reporter   ~0124208

yes, they are plain text but the buffer is shorter than the buffer allocated for the field and that's what causes the heap corruption.

Luca Olivetti

2020-07-21 14:08

reporter   ~0124209

Last edited: 2020-07-21 14:12

View 2 revisions

In fact my patch just fixes the segfault but I think there is more brokennes (i.e. the offsets are calculated based on the size of the fielddef that doesn't take into account the encoding).
Edit: I'm referring to StoreToBuf/BufToStore, in other places the code uses a field (which takes the encoding into account) instead of a fielddef (which doesn't).

Luca Olivetti

2020-07-21 17:16

reporter   ~0124212

I changed the code to limit the fields to 4 characters (adding =4 to the schema lines).
If I then assign 4 chars that encode to a 2 bytes sequence each, (e.g. DataSet.Fields[f].AsString:='áéíó'), only 2 chars are visible in the dbstringrid, and if I use 4 byte characters only one shows.
I changed the methods StoreToBuf/BufToStore to multiply MaxLen by 4 if the fielddef is utf-8 but it didn't change the result. Also, it seems that ExtractDelimited couldn't possibly cope with multi byte encodings (I don't know if one of the codes corresponding to the delimiter or quotes could possibly occur in a utf-8 sequence).
I'm happy enough to avoid the segfault (in my application I'm only going to use plain ascii characters), but I think that sdfdata.pp (or maybe some of the lower levels in the db abstractions) needs more changes if it needs to be used with utf8.

BrunoK

2020-07-21 17:38

reporter   ~0124213

Last edited: 2020-07-21 17:39

View 2 revisions

The bug is due to buffer (ed. overwrite) overflow in TFixedFormatDataSet.SetFieldData
line 748 Move(Buffer^, RecBuf[0], Field.DataSize);
that should probably be
line 748 Move(Buffer^, RecBuf[0], Field.Size);

There is a lot of possible confusion with CodePage'd TStringFields as what DataSize means vs. Size

Luca Olivetti

2020-07-21 17:42

reporter   ~0124214

I think that changing that line as you say would definitely make sdfdata.pp only work with single byte codepages.

wp

2020-07-21 17:50

reporter   ~0124215

Last edited: 2020-07-21 18:27

View 3 revisions

Hey BrunoK - that's it! Which confirms Luca's factor 4:

function TStringField.GetDataSize: Integer;
begin
  case FCodePage of
    CP_UTF8: Result := 4*Size+1;
    else Result := Size+1;
  end;
end;

Probably the same has to be done with GetFieldData.

Issue History

Date Modified Username Field Change
2020-07-16 08:50 Luca Olivetti New Issue
2020-07-16 08:50 Luca Olivetti File Added: project1.zip
2020-07-16 08:52 Luca Olivetti Note Added: 0124075
2020-07-16 08:52 Luca Olivetti File Added: sdfsegfault.lpr.zip
2020-07-16 09:00 Luca Olivetti Note Added: 0124076
2020-07-16 12:34 Cyrax Note Added: 0124086
2020-07-16 12:41 Luca Olivetti Note Added: 0124087
2020-07-16 13:01 Luca Olivetti Note Added: 0124088
2020-07-16 13:36 Luca Olivetti Note Added: 0124090
2020-07-16 15:53 wp Note Added: 0124094
2020-07-16 16:26 Luca Olivetti Note Added: 0124096
2020-07-16 16:26 Luca Olivetti Note Edited: 0124096 View Revisions
2020-07-16 16:44 wp Note Added: 0124097
2020-07-16 18:12 Luca Olivetti Note Added: 0124099
2020-07-16 18:46 Luca Olivetti Note Added: 0124100
2020-07-17 13:11 Luca Olivetti Note Added: 0124122
2020-07-17 13:11 Luca Olivetti File Added: sdfsegfault.patch
2020-07-18 13:24 Luca Olivetti Note Added: 0124143
2020-07-18 16:36 Sven Barth Project Lazarus => FPC
2020-07-18 18:17 jamie philbrook Note Added: 0124146
2020-07-18 18:30 Luca Olivetti Note Added: 0124147
2020-07-18 19:00 jamie philbrook Note Added: 0124148
2020-07-18 20:20 Luca Olivetti Note Added: 0124152
2020-07-18 21:25 jamie philbrook Note Added: 0124154
2020-07-20 21:02 Luca Olivetti Note Added: 0124193
2020-07-21 09:02 Thaddy de Koning Note Added: 0124199
2020-07-21 09:38 wp Note Added: 0124201
2020-07-21 13:41 wp Note Edited: 0124201 View Revisions
2020-07-21 13:42 wp Note Edited: 0124201 View Revisions
2020-07-21 13:54 Luca Olivetti Note Added: 0124208
2020-07-21 14:08 Luca Olivetti Note Added: 0124209
2020-07-21 14:12 Luca Olivetti Note Edited: 0124209 View Revisions
2020-07-21 17:16 Luca Olivetti Note Added: 0124212
2020-07-21 17:38 BrunoK Note Added: 0124213
2020-07-21 17:39 BrunoK Note Edited: 0124213 View Revisions
2020-07-21 17:42 Luca Olivetti Note Added: 0124214
2020-07-21 17:50 wp Note Added: 0124215
2020-07-21 17:56 wp Note Edited: 0124215 View Revisions
2020-07-21 18:27 wp Note Edited: 0124215 View Revisions