View Issue Details

IDProjectCategoryView StatusLast Update
0037153FPCRTLpublic2020-05-30 09:09
ReporterPangea Assigned ToMichael Van Canneyt  
PrioritynormalSeverityminorReproducibilityalways
Status resolvedResolutionfixed 
Product Version3.0.4 
Fixed in Version3.3.1 
Summary0037153: TStringList.Sort() exhibits quadratic runtime complexity when most or all elements are identical
DescriptionTStringList.Sort() exhibits quadratic runtime complexity when most or all elements are identical.
Steps To Reproduce A simple program to demonstrate the performance in the "normal" and "degenerate" case can be written as:

program pListSort;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}
uses {$IFDEF UNIX} {$IFDEF UseCThreads}cthreads, {$ENDIF} {$ENDIF} Classes, SysUtils, dateutils;
{$R *.res}

const
  MaxTestSize = 1000000;
var
  ls: TStringList;
  testsize, i: integer;
  t1, t2: TDateTime;
  equal: boolean;

begin
  for equal in [False, True] do
  begin
    // Write out table header
    if equal then
      WriteLn('Performance test: All elements are identical.')
    else
      WriteLn('Performance test: All elements are mutually distinct.');
    WriteLn(LineEnding,'Number of elements': 20, ' |', 'Elapsed time [ms]': 25);
    WriteLn('-----------------------------------------------');

    // Test loop
    testsize:= 512;
    while testsize < MaxTestSize do
    begin
      ls := TStringList.Create;
      // Add the elements
      for i := 0 to testsize - 1 do
        if equal then
          ls.Add('0')
        else
          ls.Add(IntToStr(i));

      // Sort the list and measure the elapsed wall clock time
      t1 := Now;
      ls.Sort;
      t2 := Now;

      // Write out the result
      WriteLn(testsize: 20, ' |', MilliSecondsBetween(t2, t1): 25);

      // Prepare the next iteration
      FreeAndNil(ls);
      testsize := testsize * 2;
    end;
    WriteLn(LineEnding);
  end;

  WriteLn(LineEnding, '[ENTER]');
  ReadLn;
end.
Additional InformationThe reason for this behavior seems to be the choice of sorting algorithm: QuickSort with random pivoting. This special choice of pivot avoids quadratic runtime behavior in the case of an already sorted list, but still fails when most or all elements are equal. A possible fix would be to switch to a different sorting algorithm (e.g. HeapSort) once the recursion depth exceeds a certain threshold (e.g. c*log(N) for some constant factor c).
TagsNo tags attached.
Fixed in Revision
FPCOldBugId
FPCTarget3.2.0
Attached Files

Activities

Pangea

2020-05-30 08:26

reporter   ~0123136

I chose the wrong category. It should be tagged RTL and not FCL.

Michael Van Canneyt

2020-05-30 09:09

administrator   ~0123138

This is already fixed in the trunk and upcoming 3.2. You can specify a sorting algorithm:
    procedure Sort(SortingAlgorithm: PSortingAlgorithm); virtual;
    procedure CustomSort(CompareFn: TStringListSortCompare); virtual;
    procedure CustomSort(CompareFn: TStringListSortCompare; SortingAlgorithm: PSortingAlgorithm); virtual;

Issue History

Date Modified Username Field Change
2020-05-30 08:17 Pangea New Issue
2020-05-30 08:26 Pangea Note Added: 0123136
2020-05-30 09:09 Michael Van Canneyt Assigned To => Michael Van Canneyt
2020-05-30 09:09 Michael Van Canneyt Status new => resolved
2020-05-30 09:09 Michael Van Canneyt Resolution open => fixed
2020-05-30 09:09 Michael Van Canneyt Fixed in Version => 3.3.1
2020-05-30 09:09 Michael Van Canneyt FPCTarget => 3.2.0
2020-05-30 09:09 Michael Van Canneyt Note Added: 0123138
2020-05-30 09:09 Michael Van Canneyt Category FCL => RTL