View Issue Details

IDProjectCategoryView StatusLast Update
0036666FPCCompilerpublic2020-10-26 09:28
ReporterJan Bruns Assigned ToFlorian  
PrioritynormalSeverityminorReproducibilityhave not tried
Status resolvedResolutionreopened 
Product Version3.0.4 
Fixed in Version3.3.1 
Summary0036666: Heavy method overloading affects compile-time
DescriptionOverloading a static mathod name of a class some 100 times raises compilaton time from 0.3s to above 12s (on linux).

Absolutely no problem, but I thought it might be noteworthy relative to usual fpc compilation time experience.
Steps To Reproducemkdir vulkan
svn co svn://abnuto.de/vulkan
cd vulkan
cd examples
fpc -B -Fu../units -Fi../units example
fpc -B -Fu../units -Fi../units example2

see vulkan_auto.pp for what I thiink is the cause.
TagsNo tags attached.
Fixed in Revision47111
FPCOldBugId
FPCTarget-
Attached Files

Relationships

related to 0037969 assignedFlorian "Internal error 202002162" when compiling package lazreport V1.0 

Activities

Florian

2020-02-08 19:06

administrator   ~0120950

The overload search is O(n²). Currently, I have no idea how to overcome this. I will commit a patch which introduces some "early" exit code path to make searching faster (0000004:0000004 times), but the complexity stays the same unfortunatly.

Florian

2020-02-08 21:38

administrator   ~0120954

I were wrong, the patch doesn't work. Actually, I see no way to improve this significantly.

Thaddy de Koning

2020-02-08 21:56

reporter   ~0120956

an Incremental search may improve it? But that also requires a sort.

Jan Bruns

2020-02-08 23:50

reporter   ~0120958

It's strange that this not only affects unit compilation, but also uses of that unit without rebuild.

In that case, even a a simplemost implementation would probably be m*n*cc time (refcnt, methodcount, chars per name or something), which would be nearly 0 if only m was small enough (as in the example given).

Maybe something related to ppu-file handling?

Florian

2020-02-09 10:27

administrator   ~0120962

@Jan Bruns: Not really. When a certain symbol is encountered, all visible symtables are searched for that symbol and in case of subroutine calls all overloaded subroutines with the same name in all symtables must be collected. During this collection the O(n^2) takes place: every newly found subroutine must be compared with all other so far found. If there is no other with exactly the same parameters, the found subroutine is added to the subroutines to be considered. As I see no way to define a metric enabling sorting this list, this problem cannot be overcome. Further, the generated list is dynamic, as the visible symtables change with each new subroutine (or even a with statement), so it cannot be built during unit compilation. There might be some hacks possible to overcome a particular case but in general it will most fail again. This expensive search was the reason why early FPC versions didn't support cross symtable overloading. However, people didn't like this.

Jan Bruns

2020-02-09 20:57

reporter   ~0120979

@florian: if I get you correcctly, normal syms are looked up using the normal hash-tab way, but for overloaded funcs, the actual parameter list is then scanned for in a linear fashion.

I tried to rebuild fpc with dbg symbs + lineinfo to see what linux perf tools thinks about where the compile time is spent, but failed (don't have that perf tool for current kernel, downgrade kernel too complicated for gpu driver reasons)...

perf record fpc -B -Fu../units -Fi../units example2

Florian

2020-02-09 21:51

administrator   ~0120980

You could compile with -gv and use valgrind with tool callgrind. But I did this already ;) The culprit is the for loop with the nested while loop in fpc/compiler/htypechk.pas:2575+. Most of the time is spent in the call to compare_paras at line 2624.

Jan Bruns

2020-02-09 22:54

reporter   ~0120982

True, got perf to work temporarily and it reported
27% in defcmp.compare_defs_ext() plus
20% in symsym.Find_procdef_assignment_operator() // calls compare_defs_ext()

But why? The program "eample2" contains only some tens of symbols that might be looked up expensively.

Florian

2020-02-09 22:59

administrator   ~0120983

Finding the right overloaded procedure to call takes the time. It requires the steps I described above.

Jan Bruns

2020-02-09 23:21

reporter   ~0120984

Last edited: 2020-02-09 23:22

View 2 revisions

I see: tried adding just about 400 more such calls in "example2" and compiletime became 5Minutes.

So a single call lookup takes about 0.5..1 second to choose from about 500 alternatives (of the same name).

Jan Bruns

2020-02-10 04:54

reporter   ~0120989

Almost all the CPU consumption can be tracked to be initiated within

tcallcandidates.create_candidate_list().

And it appears to me that there is loop structure accidently doubled up.

The method is called once per usage of an overloaded function. It seems to copy all the overloads to the candidate list. But then there is some more walk over all the overloads, inside that first loop. Probably to sort out multiple compatible ones (but we're interested in compatibility with our reference, not in the relation of overloads to each other at this point, I think).

Jan Bruns

2020-02-10 05:53

reporter   ~0120990

Commenting that inner loop out make fpc compile "example2.pp" uperfast, but still gives a working binary:

svn://abnuto.de/fpc

Jan Bruns

2020-02-10 19:52

reporter   ~0121001

When using the patched compiler to compile fpc, that costly inner loop managed to filter the following duplicates:

existing candidate: FPC_PCHAR_LENGTH
new candidate: FPC_PCHAR_LENGTH

existing candidate: SYSUTILS_$$_STRPAS$PCHAR$$ANSISTRING
new candidate: SYSTEM_$$_STRPAS$PCHAR$$SHORTSTRING

existing candidate: SYSUTILS_$$_TRIM$WIDESTRING$$WIDESTRING
new candidate: SYSUTILS_$$_TRIM$UNICODESTRING$$UNICODESTRING

So there is a way for duplicates to occur, but if filtering them out does help with anything, given that the best-candidate search later on would see matches in a specific order...?

Jan Bruns

2020-02-11 19:08

reporter   ~0121015

Also, note that in actual candidate choosing, there's some condition saying "if (validchoices=1) ...", which sounds like it relies on absence of such dubios "duplicates" (wich that inner loop in question probably won't reliably sort out anyway, because of it's dependence on element order).

[wishlist] Furthermore, from experments on earlier fpc versions I know that ret-type of overloaded functions doesn't have any weight in candidate choosing. This is very natural, given references will typically be somewhere deep in rhs-expressions, but it's also very often the compilter could try to obey a "requested rettype", where it is obvious (inside explicit typecasts, ...). Well, that's just personal perference, but relatively useful compared to "all_encountered_overloads"-relations one could imganine, given the code I commented out.

Florian

2020-02-11 20:57

administrator   ~0121020

Overload selection is a very fragile process. Did you run regression tests? The FPC code itself is pretty simple regarding overloading, the "ugly" cases are hidden in the regression tests.

Jan Bruns

2020-02-11 23:07

reporter   ~0121022

No, the test-cases don't make here. Depending on my fpc.cfg's -M choice, there's some steps advance or not. And these makefile call /usr/bin/install. Wouldn't have tried it out, if I had seen that earlier.

J. Gareth Moreton

2020-02-11 23:19

reporter   ~0121024

You should try to run the regression tests either way, usually as simple as making and installing the compiler, then going into the tests subfolder and calling "sudo make full FPC=/usr/local/lib/fpc/3.3.1/ppcx64", for example, and then comparing the log file ("output/x86_64-linux/log", for example) against that when the tests are run on the trunk.

Jan Bruns

2020-02-11 23:37

reporter   ~0121027

@J. Gareth Moreton: Sorry, I don't understand what you're talking about. I don't have anything to do with makefiles, i compile fpc using fpc, and don't need root/admin privilegs for that.

J. Gareth Moreton

2020-02-11 23:44

reporter   ~0121029

Try calling "sudo make clean all install" from the FPC source directory first. You may or may not need root privileges, but I use it just to avoid write errors, especially during installation.

Jan Bruns

2020-02-12 00:58

reporter   ~0121030

@J. Gareth Moreton: Aehm, "try reinstalling fpc" is what you think we should discuss here?

What I did was.
svn co https://svn.freepascal.org/svn/fpc/trunk fpc
cd fpc
cd tests
make full TEST_FPC=/usr/lib/x86_64-linux-gnu/fpc/3.0.4/ppcx64

and it turned out that the makefile doesn't work here. So what?

J. Gareth Moreton

2020-02-12 01:30

reporter   ~0121031

Well, what error are you getting?

Jonas Maebe

2020-02-12 22:09

manager   ~0121063

You indeed don't need to install fpc. First do a "make all -j x FPMAKEOPT="-T x" at in fpc directory. Then go in the tests directory and do a
  make TEST_FPC=/full/path/to/fpc/compiler/ppcx64 full -j x

(with x the number of cores you have)

First do the above without your patch, and copy output/<platform>/faillist to a different location to get a baseline. Then do a "make clean" in the fpc directory, repeat with your patch and compare the new faillist with the old one.

Jan Bruns

2020-02-13 00:07

reporter   ~0121071

No, @Jonas, this doesn't work for current trunk, as discussed 0036694.
Live with that I won't yet run these tests.

Jan Bruns

2020-02-13 01:06

reporter   ~0121072

So here's the same patch relative to trunk Revision 44153 (the patched method changed significantly since my version of 3.0.4, and I have no idea what this change is about). Absolutely untested, even not tried to compile.
patch2.diff (1,488 bytes)   
Index: htypechk.pas
===================================================================
--- htypechk.pas	(Revision 44153)
+++ htypechk.pas	(Arbeitskopie)
@@ -2497,6 +2497,7 @@
       end;
 
 
+
     procedure tcallcandidates.create_candidate_list(ignorevisibility,allowdefaultparas,objcidcall,explicitunit,searchhelpers,anoninherited:boolean;spezcontext:tspecializationcontext);
       var
         j     : integer;
@@ -2575,8 +2576,11 @@
         for j:=0 to ProcdefOverloadList.Count-1 do
           begin
             pd:=tprocdef(ProcdefOverloadList[j]);
+{$ifdef CHECK_FOR_POVERL_DUPLICATES}
             added:=false;
-
+{$else}
+            added:=true;
+{$endif}
             { only when the # of parameter are supported by the procedure and
               it is visible }
             if (FParalength>=pd.minparacount) and
@@ -2619,6 +2623,7 @@
                   cpoptions:=cpoptions+[cpo_rtlproc];
                 found:=false;
                 hp:=FCandidateProcs;
+{$ifdef CHECK_FOR_POVERL_DUPLICATES}
                 while assigned(hp) do
                   begin
                     if (compare_paras(hp^.data.paras,pd.paras,cp_value_equal_const,cpoptions)>=te_equal) and
@@ -2635,6 +2640,9 @@
                     proc_add(st,pd,objcidcall);
                     added:=true;
                   end;
+{$else}
+                proc_add(st,pd,objcidcall);
+{$endif}
               end;
 
             { we need to remove all specializations that were not used from their
patch2.diff (1,488 bytes)   

Jan Bruns

2020-02-13 01:20

reporter   ~0121073

update
patch3.diff (1,103 bytes)   
Index: htypechk.pas
===================================================================
--- htypechk.pas	(Revision 44153)
+++ htypechk.pas	(Arbeitskopie)
@@ -2497,6 +2497,7 @@
       end;
 
 
+
     procedure tcallcandidates.create_candidate_list(ignorevisibility,allowdefaultparas,objcidcall,explicitunit,searchhelpers,anoninherited:boolean;spezcontext:tspecializationcontext);
       var
         j     : integer;
@@ -2619,6 +2620,7 @@
                   cpoptions:=cpoptions+[cpo_rtlproc];
                 found:=false;
                 hp:=FCandidateProcs;
+{$ifdef CHECK_FOR_POVERL_DUPLICATES}
                 while assigned(hp) do
                   begin
                     if (compare_paras(hp^.data.paras,pd.paras,cp_value_equal_const,cpoptions)>=te_equal) and
@@ -2635,6 +2637,10 @@
                     proc_add(st,pd,objcidcall);
                     added:=true;
                   end;
+{$else}
+                added:=true;
+                proc_add(st,pd,objcidcall);
+{$endif}
               end;
 
             { we need to remove all specializations that were not used from their
patch3.diff (1,103 bytes)   

Jonas Maebe

2020-02-13 09:09

manager   ~0121078

The way I explained to run the tests does work. It is how I run the tests every time. The "make all" in the top-level FPC directory compiles a new compiler and then all packages/utils with this new compiler. Next, in the tests directory, with the posted "make full" command you specify that the new compiler should be used to compile the tests.

Jan Bruns

2020-02-13 11:23

reporter   ~0121081

One thing that seems to be toally broken is operator overloading.
Here's some dump from compiling the classes.pp.
dump.txt (2,317 bytes)   
Compiling ./unix/classes.pp
Compiling ./objpas/types.pp
typshrd.inc(39,16) Warning: Location (LOC_REG) not equal to expectloc (LOC_CREG): temprefn
typshrd.inc(39,16) Warning: Location (LOC_REG) not equal to expectloc (LOC_CREG): blockn
typshrd.inc(39,31) Warning: Location (LOC_REG) not equal to expectloc (LOC_CREG): temprefn
typshrd.inc(39,31) Warning: Location (LOC_REG) not equal to expectloc (LOC_CREG): blockn
typshrd.inc(193,18) Warning: new call candidate: TYPES../build/pp2TRECT_$__$$_equal$TRECT$TRECT$$BOOLEAN existing:TYPES../build/pp2TRECT_$__$$_equal$TRECT$TRECT$$BOOLEAN
typshrd.inc(193,18) Warning: Overloaded callnode: equal(TRect,TRect)
typshrd.inc(193,18) Warning:   operator =(TRect;TRect):Boolean;
typshrd.inc(193,18) Warning:    ex: 0 eq: 0 l1: 0 l2: 0 l3: 0 l4: 0 l5: 0 l6: 0 oper: 0 ord:  0.00000000000000000000E+0000
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning:   operator =(TRect;TRect):Boolean;
typshrd.inc(193,18) Warning:    ex: 0 eq: 0 l1: 0 l2: 0 l3: 0 l4: 0 l5: 0 l6: 0 oper: 0 ord:  0.00000000000000000000E+0000
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning: Overloaded callnode: equal(TRect,TRect)
typshrd.inc(193,18) Warning:   operator =(const Variant;const Variant):Boolean;
typshrd.inc(193,18) Warning:    ex: 0 eq: 0 l1: 0 l2: 0 l3: 0 l4: 0 l5: 0 l6: 0 oper: 0 ord:  1.00000000000000000000E+0000
typshrd.inc(193,18) Warning:     - Variant : incompatible
typshrd.inc(193,18) Warning:     - Variant : incompatible
typshrd.inc(193,18) Warning:   operator =(TRect;TRect):Boolean;
typshrd.inc(193,18) Warning:    ex: 0 eq: 0 l1: 0 l2: 0 l3: 0 l4: 0 l5: 0 l6: 0 oper: 0 ord:  0.00000000000000000000E+0000
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning:   operator =(TRect;TRect):Boolean;
typshrd.inc(193,18) Warning:    ex: 0 eq: 0 l1: 0 l2: 0 l3: 0 l4: 0 l5: 0 l6: 0 oper: 0 ord:  0.00000000000000000000E+0000
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Warning:     - TRect : incompatible
typshrd.inc(193,18) Error: Can't determine which overloaded function to call
dump.txt (2,317 bytes)   

Jan Bruns

2020-02-13 14:31

reporter   ~0121083

For now, it helped a lot to filter out duplicate candidates already in list using a marker in Tprocdef (the same instance is often passed multple times with oo).

I still don't get tests to work here, even with the debian compiler. But if anyone is interested, the patched 3.0.4 compiler src is at: svn://abnuto.de/fpcsrc

If someone should only be interested in what I have changed, svn://abnuto.de/fpc has the 2 modified files as svn mod.

Again, I still don't have plans to do anything more.

Florian

2020-02-13 20:38

administrator   ~0121088

So I mark it as won't fix. Currently I see no way to overcome this in general, i. e. find an algorithm which is better than O(n^2). There might be ways to fix the particular case but I am pretty sure it's easy to create other cases which show again this behaviour.

Jan Bruns

2020-02-14 16:50

reporter   ~0121098

No, Florian.

Assuming the current sortout machanism works, failing to obey the sortout in a way that could lead to wrong candidate-choice WCC would mean:
1. WCC has compatible paramlist
2. WCC it as least "as good" as any other compatible choice (in terms of of the choosing mechanics)
3. There are no other choices as good as WCC

If 3. doesn't hold, the user would be presented with a compile-time error.

So assuming we can't get to see such a compile-time error (may that only be for "too difficult for us to construct"-reasons), rule 3. can be assumed to hold, and WCC must be the sole best possible choice (apart from should have been sorted out).

But to become the sole best possible choice, WCC is in the bestchoice-testchain, where it is checked for the sortout-condition. That seems to be enough in practice.

However, my patch yet also checks the final choice against sortout-condition (theoretically allowing some false-positives) for all the candidates that might have led to sortout, so that there can't ever be an WCC (but compile-time errors might happen).

Jan Bruns

2020-02-15 23:23

reporter   ~0121115

It's of course very unattractive to apply a patch that temporarily might improve very specific compiler-performance issues.

On the other hand, these specific performance issues might in turn have been temporarily introduced to simplify implementation of new language features. At least it doesn't match typical fpc coding style to let sensitive language specifics up to some stupid allinone list-processor.

So maybe consequent future compiler improvements would adress the dirty-hack issue, and as a side-effect also solve the performance issue.

In the meantime, users would have to either accept the compiler-performance issue (long app compile time) or use other langauage constructs that don't come with the performance issue.

Jan Bruns

2020-02-17 06:35

reporter   ~0121134

The current patch version (revr393 svn://abnuto.de/fpc , 2 files) has a very slight, intentional difference to reference compler, probably only of theoretical interest:

If there is a candidate c1 hidden by c2, but c2 gets hidden by one more candidate c3, the unpatched compiler's sortout loop might probably sometimes allow c1 to be called (in terms of choosing alogrithm's input), whereas in the patched version, a candidate once hidden stays hidden.

I'm unsure if it is possible at all to create a real world difference example (given above condition is only about some algorithm's input), and even more unsure wether patched or not would be preferable/intended in that case.

Jan Bruns

2020-02-21 11:09

reporter   ~0121181

Here's a test-program generator to demonstrate what this issue is all about. It generates programs like this:

type
T00000001 = record
  f : byte;
end;


procedure someproc(a : T00000001);
begin
  write(a.f);
end;

procedure makecalls();
var
  v00000001 : T00000001;
begin
  v00000001.f := 140; someproc(v00000001);
end;


begin
  makecalls();
  writeln;
end.

As you probably noticed, determining wich overloaded funtion to call is perfectly well something one might expect a compiling-tool to handle for the user.

And here are some stats gathered on my computer:
b=50: unpatched 0.2 sec
n=100: unpatched 1.3 sec
n=200: unpatched 9.5 sec, patched 0.4 sec
n=400: unpatched 74.9 sec, patched 1.3 sec
n=800: patched 4.9 sec
n=1600: patched 19.2 sec
n=3200: patched 81.7 sec

So unpatched grows cubic with n, patched square, where theoretical optimum would be n*log(n).
testgen.pas (1,791 bytes)   
type
T_uint = dword;


function maketypename(n : T_uint) : string;
begin
  maketypename := 'T'+hexstr(n,2*sizeof(T_uint));
end;

function makevarname(n : T_uint) : string;
begin
  makevarname := 'v'+hexstr(n,2*sizeof(T_uint));
end;


procedure maketype(n : T_uint);
begin
  writeln(maketypename(n),' = record');
  writeln('  f : byte;');
  writeln('end;');
  writeln;
end;

procedure maketypes(n : T_uint);
begin
  writeln('type');
  while (n>0) do begin
    maketype(n);
    dec(n);
  end;
  writeln;
end;

procedure makeoverloadproc(n : T_uint);
begin
  writeln('procedure someproc(a : ',maketypename(n),');');
  writeln('begin');
  writeln('  write(a.f);');
  writeln('end;');
  writeln;
end;

procedure makeoverloadprocs(n : T_uint);
begin
  while (n>0) do begin
    makeoverloadproc(n);
    dec(n);
  end;
end;


procedure makecall(n : T_uint);
begin
  write('  '+makevarname(n),'.f := ',random(256),';');
  writeln('  someproc(',makevarname(n),');');
end;

procedure makecalls(n : T_uint);
begin
  while (n>0) do begin
    makecall(n);
    dec(n);
  end;
end;

procedure makedecl(n : T_uint);
begin
  writeln('  '+makevarname(n),' : ',maketypename(n),';');
end;

procedure makedecls(n : T_uint);
begin
  writeln('var');
  while (n>0) do begin
    makedecl(n);
    dec(n);
  end;
end;

procedure makeallcallings(n : T_uint);
begin
  writeln('procedure makecalls();');
  makedecls(n);
  writeln('begin');
  makecalls(n);
  writeln('end;');
  writeln;
end;


var
n : T_uint;
w : word;

begin
  if (paramcount>=1) then begin
   val(paramstr(1),n);
  end else begin
    w := 1;    
  end;
  if not(w=0) then n:=100;
  maketypes(n);
  makeoverloadprocs(n);
  makeallcallings(n);
  writeln;
  writeln('begin');
  writeln('  makecalls();');
  writeln('  writeln;');
  writeln('end.');
end.


testgen.pas (1,791 bytes)   

Jan Bruns

2020-02-27 12:22

reporter   ~0121241

And here's the patch relative to current trunk (tried to minimize number of modiefied blocks, and give a single define to deactivate the patch).

The fpc test-cases don't show up any difference relative to trunk compiler. But this isn't very meaningful: Due to test-case lack of complexity, even early versions didn't show diffrent test-case behaviour.

Would be nice if someone could try to create a more complicate test-case (probably based on tb0224).
patch3-2.diff (10,804 bytes)   
Index: compiler/symdef.pas
===================================================================
--- compiler/symdef.pas	(Revision 44246)
+++ compiler/symdef.pas	(Arbeitskopie)
@@ -849,6 +849,9 @@
             a routine that has to be internally generated by the compiler }
           synthetickind: tsynthetickind;
           visibility   : tvisibility;
+{$ifndef DISABLE_FAST_OVERLOAD_PATCH}
+          seenmarker : pointer; // used for filtering in tcandidate
+{$endif}
           constructor create(level:byte;doregister:boolean);virtual;
           constructor ppuload(ppufile:tcompilerppufile);
           destructor  destroy;override;
@@ -6010,6 +6013,9 @@
 {$else symansistr}
          _mangledname:=nil;
 {$endif symansistr}
+{$ifndef DISABLE_FAST_OVERLOAD_PATCH}
+         seenmarker := nil;
+{$endif}
          fileinfo:=current_filepos;
          extnumber:=$ffff;
          aliasnames:=TCmdStrList.create;
Index: compiler/htypechk.pas
===================================================================
--- compiler/htypechk.pas	(Revision 44246)
+++ compiler/htypechk.pas	(Arbeitskopie)
@@ -62,7 +62,10 @@
          cl6_count,
          coper_count : integer; { should be signed }
          ordinal_distance : double;
-         invalid     : boolean;
+         invalid : boolean;
+{$ifndef DISABLE_FAST_OVERLOAD_PATCH}
+         saved_validity : boolean;
+{$endif}
          wrongparanr : byte;
       end;
 
@@ -2579,7 +2582,11 @@
 
             { only when the # of parameter are supported by the procedure and
               it is visible }
+{$ifdef DISABLE_FAST_OVERLOAD_PATCH}
             if (FParalength>=pd.minparacount) and
+{$else}
+            if (pd.seenmarker<>pointer(self)) and (FParalength>=pd.minparacount) and
+{$endif}
                (
                 (
                  allowdefaultparas and
@@ -2619,6 +2626,7 @@
                   cpoptions:=cpoptions+[cpo_rtlproc];
                 found:=false;
                 hp:=FCandidateProcs;
+{$ifdef DISABLE_FAST_OVERLOAD_PATCH}
                 while assigned(hp) do
                   begin
                     if (compare_paras(hp^.data.paras,pd.paras,cp_value_equal_const,cpoptions)>=te_equal) and
@@ -2630,10 +2638,14 @@
                       end;
                     hp:=hp^.next;
                   end;
+{$endif}
                 if not found then
                   begin
                     proc_add(st,pd,objcidcall);
                     added:=true;
+{$ifndef DISABLE_FAST_OVERLOAD_PATCH}
+                    pd.seenmarker:=self;
+{$endif}
                   end;
               end;
 
@@ -2647,6 +2659,14 @@
                 pd.free;
               end;
           end;
+{$ifndef DISABLE_FAST_OVERLOAD_PATCH}
+        {cleanup modified duplicate pd markers}
+        hp := FCandidateProcs;
+        while assigned(hp) do begin
+          hp^.data.seenmarker := nil;
+          hp := hp^.next;
+        end;
+{$endif}
 
         calc_distance(st,objcidcall);
 
@@ -3233,6 +3253,8 @@
       end;
 
 
+
+
     function is_better_candidate(currpd,bestpd:pcandidate):integer;
       var
         res : integer;
@@ -3483,6 +3505,9 @@
       end;
 
 
+
+{$ifdef DISABLE_FAST_OVERLOAD_PATCH}
+
     function tcallcandidates.choose_best(var bestpd:tabstractprocdef; singlevariant: boolean):integer;
       var
         pd: tprocdef;
@@ -3570,6 +3595,199 @@
       end;
 
 
+{$else}
+
+    function compare_by_old_sortout_check(pd,bestpd:pcandidate):integer;
+      var cpoptions : tcompare_paras_options;
+      begin
+        { don't add duplicates, only compare visible parameters for the user }
+        cpoptions:=[cpo_ignorehidden];
+        if (po_compilerproc in bestpd^.data.procoptions) then cpoptions:=cpoptions+[cpo_compilerproc];
+        if (po_rtlproc in bestpd^.data.procoptions) then cpoptions:=cpoptions+[cpo_rtlproc];
+
+        compare_by_old_sortout_check := 0; // can't decide, bestpd probably wasn't sorted out in unpatched
+        if (compare_paras(pd^.data.paras,bestpd^.data.paras,cp_value_equal_const,cpoptions)>=te_equal) and
+             ( not(po_objc in bestpd^.data.procoptions) or (bestpd^.data.messageinf.str^=pd^.data.messageinf.str^)) 
+        then begin
+          compare_by_old_sortout_check := 1; // bestpd was sorted out before patch
+        end;
+     end;
+
+    function decide_restart(pd,bestpd:pcandidate) : boolean;
+    begin
+      decide_restart := false;
+      if assigned(bestpd) then begin
+        { don't restart if bestpd is marked invalid already }
+        if not bestpd^.invalid then begin
+          decide_restart := 0<>compare_by_old_sortout_check(pd,bestpd);
+        end;
+      end;
+    end;
+
+    procedure save_validity(c : pcandidate);
+    begin
+      while assigned(c) do begin
+        c^.saved_validity := c^.invalid;
+        c := c^.next;
+      end;
+    end;
+
+    procedure restore_validity(c : pcandidate);
+    begin
+      while assigned(c) do begin
+        c^.invalid := c^.saved_validity;
+        c := c^.next;
+      end;
+    end;
+
+
+    function tcallcandidates.choose_best(var bestpd:tabstractprocdef; singlevariant: boolean):integer;
+      var
+        pd: tprocdef;
+        besthpstart,
+        hp,hp2        : pcandidate;
+        cntpd,
+        res           : integer;
+        restart : boolean;
+      begin
+        {
+          Returns the number of candidates left and the
+          first candidate is returned in pdbest
+        }
+        if not(assigned(FCandidateProcs)) then begin
+          choose_best := 0;
+          exit;
+        end;
+
+        bestpd:=FCandidateProcs^.data;
+        if FCandidateProcs^.invalid 
+        then cntpd:=0
+        else cntpd:=1;
+
+        if assigned(FCandidateProcs^.next) then
+         begin
+           save_validity(FCandidateProcs);
+           restart := false;
+           { keep restarting, until there wasn't a sorted-out besthpstart }
+           repeat
+             besthpstart:=FCandidateProcs;
+             bestpd:=FCandidateProcs^.data;
+             if restart then begin
+               restore_validity(FCandidateProcs);
+               restart := false;
+             end;
+             { Setup the first procdef as best, only count it as a result
+               when it is valid }
+             if FCandidateProcs^.invalid 
+             then cntpd:=0
+             else cntpd:=1;
+             hp:=FCandidateProcs^.next;
+             while assigned(hp) and not(restart) do begin
+               restart := decide_restart(hp,besthpstart);
+               if not restart then begin
+                 if not singlevariant
+                 then res:=is_better_candidate(hp,besthpstart)
+                 else res:=is_better_candidate_single_variant(hp,besthpstart);
+               end;
+               if restart then begin
+                 { mark the sorted out invalid globally }
+                 besthpstart^.saved_validity := true;
+               end else if (res>0) then begin
+                 { hp is better, flag all procs to be incompatible }
+                 while (besthpstart<>hp) do begin
+                   besthpstart^.invalid:=true;
+                   besthpstart:=besthpstart^.next;
+                 end;
+                 { besthpstart is already set to hp }
+                 bestpd:=besthpstart^.data;
+                 cntpd:=1;
+               end else if (res<0) then begin
+                  { besthpstart is better, flag current hp to be incompatible }
+                  hp^.invalid:=true;
+               end else begin
+                 { res=0, both are valid }
+                 if not hp^.invalid then inc(cntpd);
+               end;
+               hp:=hp^.next;
+
+             end;
+           until not(restart);
+         end;
+
+        { check the alternate choices if they would have been sorted out before patch... }
+
+        { note we have procadded the candidates, so order is reversed procadd order here.
+          this was also used above: each sorted-out always has an "outsorter" counterpart 
+          deeper down the next chain 
+        }
+
+        { for the intial implementation, let's first do some more consistency checking}
+        res := 0;
+        hp := FCandidateProcs;
+        while assigned(hp) do begin
+          if not(hp^.invalid) then inc(res);
+          hp := hp^.next;
+        end;
+        if (res<>cntpd) then internalerror(202002161);
+
+        { check all valid choices for sortout }
+        cntpd := 0;
+        hp := FCandidateProcs;
+        while assigned(hp) do begin
+          if not(hp^.invalid) then begin
+            hp2 := hp^.next;
+            while assigned(hp2) do begin
+              if compare_by_old_sortout_check(hp2,hp)<>0 then begin
+                hp^.invalid := true;
+                hp2 := nil;
+              end else hp2:=hp2^.next;
+            end;
+            if not(hp^.invalid) then begin
+              inc(cntpd);
+              { check for the impossible event bestpd had become invalid}
+              if (cntpd=1)and(hp^.data<>bestpd) 
+              then internalerror(202002162);
+            end;
+          end;
+          hp := hp^.next;
+        end;
+
+
+        { if we've found one, check the procdefs ignored for overload choosing
+          to see whether they contain one from a child class with the same
+          parameters (so the overload choosing was not influenced by their
+          presence, but now that we've decided which overloaded version to call,
+          make sure we call the version closest in terms of visibility }
+        if cntpd=1 then
+          begin
+            for res:=0 to FIgnoredCandidateProcs.count-1 do
+              begin
+                pd:=tprocdef(FIgnoredCandidateProcs[res]);
+                { stop searching when we start comparing methods of parent of
+                  the struct in which the current best method was found }
+                if assigned(pd.struct) and
+                   (pd.struct<>tprocdef(bestpd).struct) and
+                   def_is_related(tprocdef(bestpd).struct,pd.struct) then
+                  break;
+                if (pd.proctypeoption=bestpd.proctypeoption) and
+                   ((pd.procoptions*[po_classmethod,po_methodpointer])=(bestpd.procoptions*[po_classmethod,po_methodpointer])) and
+                   (compare_paras(pd.paras,bestpd.paras,cp_all,[cpo_ignorehidden,cpo_ignoreuniv,cpo_openequalisexact])=te_exact) then
+                  begin
+                    { first one encountered is closest in terms of visibility }
+                    bestpd:=pd;
+                    break;
+                  end;
+              end;
+          end;
+        result:=cntpd;
+      end;
+
+{$endif}
+
+
+
+
+
     procedure tcallcandidates.find_wrong_para;
       var
         currparanr : smallint;
patch3-2.diff (10,804 bytes)   

Jan Bruns

2020-04-26 14:16

reporter   ~0122447

Any prob with it?

The trick is to move the "compare all candidates against each other" to a point where a candidate probably gets chosen. Even more clever: During traversing the candidate list for the best-match search. as shown in patch3-2.dff (note 0036666:0121241 ).

Remember that by language specification, typically most overloads are invalid choices for an occurence of a call construct. So it doesn't make much sense to compare all these invalid choices against each other.

Jan Bruns

2020-09-11 13:31

reporter   ~0125478

Maybe it helps a little if I talk a bit more about.

When the (non-patched) compiler needs to generate a call, some logic searches for functions associated with the requested name (without taking care about the parameter signature).

Most parts of the resulting pre-list are then taken over into a candidate list. This is done in a square runtime fashion by comparing the parameter signature for any element of the pre-list to the signature of any candidate already taken over.

Things that get filtered out this expensive way include for example identical signature alternatives hidden by object-inheritance.

Then the candidate list elements are checked against the requested signature of the node to generate code for, resulting in some compatibility-measure, that a final best candidate search uses to identify a best choice.

Everything would work a lot faster, if compatibilty with the call-node (to generate code for) could be checked on the fly on generating the candidate list from the pre-list. However, this quick and easy solution would (theoretically) risk correctness of the sort-out machanism.

So my patch suggests to do the opposite thing: takeover all elemets of the pre-list to the candidate list, and delay the check for the sort-out criterion to the point when a candidate-function is about to get chosen (so that the heavy work gets done only for relatively "good" candidates compatible with the call-node). Except for the theoretical "reappearing zombie" problem (of the original compiler code) as outlined in previous posts, and probable freshly introduced implementation bugs, this should give fairly exact the same results as the original algorithm.

Florian

2020-10-14 22:08

administrator   ~0126312

Thanks for the patch, I committed it finally.

Issue History

Date Modified Username Field Change
2020-02-06 15:14 Jan Bruns New Issue
2020-02-08 19:06 Florian Note Added: 0120950
2020-02-08 21:38 Florian Note Added: 0120954
2020-02-08 21:56 Thaddy de Koning Note Added: 0120956
2020-02-08 23:50 Jan Bruns Note Added: 0120958
2020-02-09 10:27 Florian Note Added: 0120962
2020-02-09 20:57 Jan Bruns Note Added: 0120979
2020-02-09 21:51 Florian Note Added: 0120980
2020-02-09 22:54 Jan Bruns Note Added: 0120982
2020-02-09 22:59 Florian Note Added: 0120983
2020-02-09 23:21 Jan Bruns Note Added: 0120984
2020-02-09 23:22 Jan Bruns Note Edited: 0120984 View Revisions
2020-02-10 04:54 Jan Bruns Note Added: 0120989
2020-02-10 05:53 Jan Bruns Note Added: 0120990
2020-02-10 19:52 Jan Bruns Note Added: 0121001
2020-02-11 19:08 Jan Bruns Note Added: 0121015
2020-02-11 20:57 Florian Note Added: 0121020
2020-02-11 23:07 Jan Bruns Note Added: 0121022
2020-02-11 23:19 J. Gareth Moreton Note Added: 0121024
2020-02-11 23:37 Jan Bruns Note Added: 0121027
2020-02-11 23:44 J. Gareth Moreton Note Added: 0121029
2020-02-12 00:58 Jan Bruns Note Added: 0121030
2020-02-12 01:30 J. Gareth Moreton Note Added: 0121031
2020-02-12 22:09 Jonas Maebe Note Added: 0121063
2020-02-13 00:07 Jan Bruns Note Added: 0121071
2020-02-13 01:06 Jan Bruns File Added: patch2.diff
2020-02-13 01:06 Jan Bruns Note Added: 0121072
2020-02-13 01:20 Jan Bruns File Added: patch3.diff
2020-02-13 01:20 Jan Bruns Note Added: 0121073
2020-02-13 09:09 Jonas Maebe Note Added: 0121078
2020-02-13 11:23 Jan Bruns File Added: dump.txt
2020-02-13 11:23 Jan Bruns Note Added: 0121081
2020-02-13 14:31 Jan Bruns Note Added: 0121083
2020-02-13 20:38 Florian Assigned To => Florian
2020-02-13 20:38 Florian Status new => resolved
2020-02-13 20:38 Florian Resolution open => won't fix
2020-02-13 20:38 Florian FPCTarget => -
2020-02-13 20:38 Florian Note Added: 0121088
2020-02-14 16:50 Jan Bruns Status resolved => feedback
2020-02-14 16:50 Jan Bruns Resolution won't fix => reopened
2020-02-14 16:50 Jan Bruns Note Added: 0121098
2020-02-14 17:44 Florian Assigned To Florian =>
2020-02-15 23:23 Jan Bruns Note Added: 0121115
2020-02-15 23:23 Jan Bruns Status feedback => new
2020-02-17 06:35 Jan Bruns Note Added: 0121134
2020-02-21 11:09 Jan Bruns File Added: testgen.pas
2020-02-21 11:09 Jan Bruns Note Added: 0121181
2020-02-27 12:22 Jan Bruns File Added: patch3-2.diff
2020-02-27 12:22 Jan Bruns Note Added: 0121241
2020-04-26 14:16 Jan Bruns Note Added: 0122447
2020-09-11 13:31 Jan Bruns Note Added: 0125478
2020-10-14 22:08 Florian Assigned To => Florian
2020-10-14 22:08 Florian Status new => resolved
2020-10-14 22:08 Florian Fixed in Version => 3.3.1
2020-10-14 22:08 Florian Fixed in Revision => 47111
2020-10-14 22:08 Florian Note Added: 0126312
2020-10-26 09:28 Stephano Relationship added related to 0037969