View Issue Details

IDProjectCategoryView StatusLast Update
0036747FPCDatabasepublic2020-03-07 15:53
ReporterOndrej Pokorny Assigned ToOndrej Pokorny  
Status resolvedResolutionfixed 
Platformlinux-arm Raspberry Pi 4OSRaspbian 
Product Version3.3.1 
Target Version3.2.0Fixed in Version3.3.1 
Summary0036747: Memory corruption in DB on platforms with FPC_REQUIRES_PROPER_ALIGNMENT
DescriptionMemory is being overwritten in DB / BufDataset. I debugged into it and found out that the compiler define FPC_REQUIRES_PROPER_ALIGNMENT is causing it.

The problem is the difference between DataSize returned in TStringField:

function TStringField.GetDataSize: Integer;
  case FCodePage of
    CP_UTF8: Result := 4*Size+1;
    else Result := Size+1;

and the DataSize returned in TCustomBufDataset.GetFieldSize:

function TCustomBufDataset.GetFieldSize(FieldDef : TFieldDef) : longint;
  // ...

As you can see the result from TCustomBufDataset.GetFieldSize is aligned and thus it can be bigger than the result from TStringField.GetDataSize.

The problem is that the buffer is allocated with TStringField.GetDataSize and it is filled with TCustomBufDataset.GetFieldSize that causes memory corruption.

See the attached patch where I commented the places where the buffer is allocated and where it is filled. The patch also includes a (temporary) solution for TStringField.GetDataSize. I don't know if it is a correct solution. Maybe more places have to be fixed or the fix should be different. A solution could also be to disable alignment for strings in TCustomBufDataset.GetFieldSize.
Steps To ReproduceSee DBMemCorr.lpr
TagsNo tags attached.
Fixed in Revision44280
Attached Files


Ondrej Pokorny

2020-03-02 02:13


DBMemCorr.lpr (713 bytes)   
program DBMemCorr;

{$mode objfpc}{$H+}

  Classes, PQConnection, DB, SQLDB;

  C: TPQConnection;
  T: TSQLTransaction;
  Q: TSQLQuery;
  S: String;
  C := TPQConnection.Create(nil);
  C.Connected := True;
  T := TSQLTransaction.Create(nil);
  C.Transaction := T;
  Q := TSQLQuery.Create(nil);
  Q.SQLConnection := C;
  Q.Transaction := T;
  Q.SQL.Text := 'SELECT constraint_name FROM information_schema.constraint_column_usage';
  if Q.Fields[0].IsNull then
    S := 'NULL'
    S := Q.Fields[0].AsString; // crash here

DBMemCorr.lpr (713 bytes)   

Ondrej Pokorny

2020-03-02 02:14


memory-corruption.diff (1,589 bytes)   
Index: src/base/bufdataset.pas
--- src/base/bufdataset.pas	(revision 44251)
+++ src/base/bufdataset.pas	(working copy)
@@ -2482,7 +2482,7 @@
-  result:=Align(result,4);
+  result:=Align(result,4); // here the DataSize is increased
@@ -2607,7 +2607,7 @@
     if assigned(Buffer) then
-      Move(CurrBuff^, Buffer^, GetFieldSize(FieldDefs[Field.FieldNo-1]));
+      Move(CurrBuff^, Buffer^, GetFieldSize(FieldDefs[Field.FieldNo-1])); // write outside the Buffer here because GetFieldSize returns a bigger size than TStringField.DataSize
     Result := True;
Index: src/base/
--- src/base/	(revision 44251)
+++ src/base/	(working copy)
@@ -1221,6 +1221,9 @@
     CP_UTF8: Result := 4*Size+1;
     else     Result :=   Size+1;
+  Result:=Align(Result,4); // temporary fix
+  {$ENDIF}
 function TStringField.GetDefaultWidth: Longint;
@@ -1258,7 +1261,7 @@
-    SetLength(DynBuf,DataSize);
+    SetLength(DynBuf,DataSize); // buffer allocated here with TStringField.DataSize that is not aligned
     DynBuf[DataSize-1]:=#0;  //limit string to Size
     If Result then
memory-corruption.diff (1,589 bytes)   

Michael Van Canneyt

2020-03-02 09:14

administrator   ~0121303

It seems to me that the simplest is to align the stringfield size too.
I don't know what kind of tricks need to be done when proper alignment is required and unaligned data needs to be accessed.

Ondrej Pokorny

2020-03-02 10:34

developer   ~0121306

IMO we should get rid of this ambiguity. That means either
1.) Remove TCustomBufDataset.GetFieldSize and do the alignment in TField.GetDataSize overrides.
- or -
2.) Create a link between TCustomBufDataset.GetFieldSize and TField.GetDataSize so that TField.GetDataSize calls TCustomBufDataset.GetFieldSize. For this we need a new GetFieldSize virtual method in TDataSet that TCustomBufDataset overrides and that TField.GetDataSize calls.

Now we have 2 independent methods to get the field data size. It is impossible to keep them synchronized. If we don't unite them we can never be 100% sure the bug is not present in any field type or that the bug won't appear again in the future.

I am not familiar with the alignment either, so we can talk about the best solution with the compiler people at the weekend. But when we consider other code in FPC - arrays and strings are usually not aligned to 4, so IMO it is not necessary to align them here either.

Michael Van Canneyt

2020-03-02 10:45

administrator   ~0121307

We cannot get rid of this "ambiguity"

GetDatasize and FieldSize are 2 different things.
The first is the actual data size of the data.
The second is the size needed in the dataset buffer to get the actual data.
For strings this is usually the same because the data is stored in the dataset buffer, but this is not guaranteed.

For Blob fields for example, FieldSize usually is the size of the pointer to the data, the data size is the size of the actual data.

Thaddy de Koning

2020-03-02 10:46

reporter   ~0121308

Ondrej, can this be a side effect of record field re-ordering?
If I turn of
- optimizations
and pack I get proper results at first test. (have to do more, I know)

Ondrej Pokorny

2020-03-02 11:05

developer   ~0121309

> We cannot get rid of this "ambiguity"
> GetDatasize and FieldSize are 2 different things.
> The first is the actual data size of the data.
> The second is the size needed in the dataset buffer to get the actual data.
> For strings this is usually the same because the data is stored in the dataset buffer, but this is not guaranteed.

OK, I understand. If it is not guaranteed that the data sizes are the same, it means that it doesn't make sense to add the alignment into TStringField.GetDataSize to actually *make them the same* (what I did in the first patch).

So we should add a BufferSize parameter to TDataSet.GetFieldData and TDataSet.SetFieldData. Otherwise the TDataSet cannot know what size the allocated buffer has. IMO there is no other solution. Since they are public virtual methods, we have to introduce new overloads and call them directly:
    function GetFieldData(Field: TField; Buffer: Pointer; BufferSize: NativeInt; NativeFormat: Boolean): Boolean; overload; virtual;
    procedure SetFieldData(Field: TField; Buffer: Pointer; BufferSize: NativeInt; NativeFormat: Boolean); overload; virtual;

Maybe even deprecate the other two overloads to encourage people not to use them?

Michael Van Canneyt

2020-03-02 11:13

administrator   ~0121310

Last edited: 2020-03-02 11:16

View 2 revisions

TDataset knows the size: it allocates it.

Also, the interface of TDataset is fixed. You can't change it, this would break possibly a lot of descendents

I don't see why we would need to change something that works for 20+ years :-)

Ondrej Pokorny

2020-03-02 11:38

developer   ~0121311

> TDataset knows the size: it allocates it.
In this case the buffer is allocated in TField and not in TDataset. I checked again and TCustomBufDataset.GetFieldData has the Field parameter - so it can read Field.DataSize. Problem solved :)

So we should either move only Field.DataSize bytes (if we can assume that Field.DataSize is always smaller than GetFieldSize (see new patch) or we should use the minimum of the both values: Min(GetFieldSize, Field.DataSize). What do you prefer?
bufdataset-datasize.patch (772 bytes)   
Index: bufdataset.pas
--- bufdataset.pas	(revision 44199)
+++ bufdataset.pas	(working copy)
@@ -2607,7 +2607,7 @@
     if assigned(Buffer) then
-      Move(CurrBuff^, Buffer^, GetFieldSize(FieldDefs[Field.FieldNo-1]));
+      Move(CurrBuff^, Buffer^, Field.DataSize);
     Result := True;
@@ -2649,7 +2649,7 @@
     if assigned(buffer) then
-      Move(Buffer^, CurrBuff^, GetFieldSize(FieldDefs[Field.FieldNo-1]));
+      Move(Buffer^, CurrBuff^, Field.DataSize);
bufdataset-datasize.patch (772 bytes)   

Ondrej Pokorny

2020-03-02 19:00

developer   ~0121316

Huh :/ Unfortunately bufdataset-datasize.patch doesn't work as-is for TBlobField, because TBlobField.DataSize always returns 0. (This is the same in Delphi.)

So now we have the situation that sometimes the buffer is allocated with Field.DataSize (TStringField) and sometimes with TCustomBufDataset.GetFieldSize (TBlobField - see TBufBlobStream.Create).

So either all fields must allocate the Buffer with TCustomBufDataset.GetFieldSize and not TField.DataSize or we need the BufferSize parameter in GetFieldData.

Issue History

Date Modified Username Field Change
2020-03-02 02:13 Ondrej Pokorny New Issue
2020-03-02 02:13 Ondrej Pokorny File Added: DBMemCorr.lpr
2020-03-02 02:14 Ondrej Pokorny File Added: memory-corruption.diff
2020-03-02 09:14 Michael Van Canneyt Note Added: 0121303
2020-03-02 10:34 Ondrej Pokorny Note Added: 0121306
2020-03-02 10:45 Michael Van Canneyt Note Added: 0121307
2020-03-02 10:46 Thaddy de Koning Note Added: 0121308
2020-03-02 11:05 Ondrej Pokorny Note Added: 0121309
2020-03-02 11:13 Michael Van Canneyt Note Added: 0121310
2020-03-02 11:16 Michael Van Canneyt Note Edited: 0121310 View Revisions
2020-03-02 11:38 Ondrej Pokorny File Added: bufdataset-datasize.patch
2020-03-02 11:38 Ondrej Pokorny Note Added: 0121311
2020-03-02 19:00 Ondrej Pokorny Note Added: 0121316
2020-03-07 15:53 Ondrej Pokorny Assigned To => Ondrej Pokorny
2020-03-07 15:53 Ondrej Pokorny Status new => resolved
2020-03-07 15:53 Ondrej Pokorny Resolution open => fixed
2020-03-07 15:53 Ondrej Pokorny Fixed in Version => 3.3.1
2020-03-07 15:53 Ondrej Pokorny Fixed in Revision => 44280
2020-03-07 15:53 Ondrej Pokorny FPCTarget => 3.2.0