TStringList.Sort() exhibits quadratic runtime complexity when most or all elements are identical
Original Reporter info from Mantis: Pangea
-
Reporter name: Pangea
Original Reporter info from Mantis: Pangea
- Reporter name: Pangea
Description:
TStringList.Sort() exhibits quadratic runtime complexity when most or all elements are identical.
Steps to reproduce:
A simple program to demonstrate the performance in the "normal" and "degenerate" case can be written as:
program pListSort;
{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}
uses {$IFDEF UNIX} {$IFDEF UseCThreads}cthreads, {$ENDIF} {$ENDIF} Classes, SysUtils, dateutils;
{$R *.res}
const
MaxTestSize = 1000000;
var
ls: TStringList;
testsize, i: integer;
t1, t2: TDateTime;
equal: boolean;
begin
for equal in [False, True] do
begin
// Write out table header
if equal then
WriteLn('Performance test: All elements are identical.')
else
WriteLn('Performance test: All elements are mutually distinct.');
WriteLn(LineEnding,'Number of elements': 20, ' |', 'Elapsed time [ms]': 25);
WriteLn('-----------------------------------------------');
// Test loop
testsize:= 512;
while testsize < MaxTestSize do
begin
ls := TStringList.Create;
// Add the elements
for i := 0 to testsize - 1 do
if equal then
ls.Add('0')
else
ls.Add(IntToStr(i));
// Sort the list and measure the elapsed wall clock time
t1 := Now;
ls.Sort;
t2 := Now;
// Write out the result
WriteLn(testsize: 20, ' |', MilliSecondsBetween(t2, t1): 25);
// Prepare the next iteration
FreeAndNil(ls);
testsize := testsize * 2;
end;
WriteLn(LineEnding);
end;
WriteLn(LineEnding, '[ENTER]');
ReadLn;
end.
Additional information:
The reason for this behavior seems to be the choice of sorting algorithm: QuickSort with random pivoting. This special choice of pivot avoids quadratic runtime behavior in the case of an already sorted list, but still fails when most or all elements are equal. A possible fix would be to switch to a different sorting algorithm (e.g. HeapSort) once the recursion depth exceeds a certain threshold (e.g. c*log(N) for some constant factor c).
Mantis conversion info:
- Mantis ID: 37153
- Version: 3.0.4
- Fixed in version: 3.3.1
- Target version: 3.2.0