View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0014431FPCRTLpublic2009-08-25 15:322013-06-19 14:56
ReporterColin Haywood 
Assigned ToSven Barth 
PrioritynormalSeveritymajorReproducibilityalways
StatusresolvedResolutionfixed 
Platformx86OSWindowsOS VersionXP
Product Version2.2.2Product Build 
Target VersionFixed in Version2.7.1 
Summary0014431: Multiple (near-)simultaneous calls to TThread.Synchronize causing program to hang
DescriptionSee the attached code (from my post at http://community.freepascal.org:10000/bboards/message?message_id=334220&forum_id=24092 [^]), for which the equivalent Delphi version works correctly for any (sensible) value of MAXTHREADS. However under FP the program only works for MAXTHREADS = 1, and hangs for MAXTHREADS > 1.

The Delphi implementation of TThread.Synchronize (for Delphi 7, classes.pas line 9556) is much more sophisticated than that of FP (for 0.9.26.2 beta, classes.inc line 123), as it keeps a list of the procedures that need to be called (the parameter passed to Synchronize(...)) and uses Windows threadsafe API calls to organise this list. However the FP one appears to just overwrite the details of any previous call, whether or not it has been executed, and this I believe to be the cause of this problem.
Steps To ReproduceSee code attached. When the button is clicked, the program should display the 'Total =' message box with a number. However with MAXTHREADS > 1 the program will hang instead.
Additional InformationSometimes Lazarus stops responding to program reset (CTRL-F2) commands once the program has hung.
TagsNo tags attached.
FPCOldBugId
Fixed in Revision23227
Attached Files? file icon threadtestform.pas [^] (2,157 bytes) 2009-08-25 15:32

- Relationships
related to 0017297resolvedSven Barth Missing tthread.queue 

-  Notes
(0030150)
Micha Nelissen (manager)
2009-08-25 22:48

I notice you're calling Suspend in a synchronized procedure. This sounds like a bad idea to me. (Note: haven't run any code yet, just commenting).
(0030151)
Micha Nelissen (manager)
2009-08-25 22:57

Thinking about it, I guess this happens: you suspend the secondary thread from the main thread; this causes the secondary thread to not give up the lock to the synchronize data (because it has been suspended). Therefore other secondary threads cannot enter the critical section for .Synchronize anymore -> deadlock. (Apart from the main/GUI thread, I assume?)

Try moving your Suspend call to after the Synchronize call.
(0030153)
Colin Haywood (reporter)
2009-08-26 03:17

Thanks Micha - that _almost_ fixed it. I had the Suspend call where you suggested in a previous version under Delphi 7, where the program hung as the thread was actually being asked to resume before it had finished suspending. Moving the Suspend call to the synchronized procedure fixed the problem in Delphi 7, but fails in FP for the reason I gave in the original post.

However, after implementing Micha's idea of having the Suspend call immediately follow Synchronize(...), a "tacky fix" of a loop that does nothing can be used to waste enough time so that the thread has completed suspending before being asked to resume. I inserted this code

asm
  MOV ECX, 28014
@TackyLoop:
  DEC ECX
  JNZ @TackyLoop
end;

after the repeat...until block containing the CheckSynchronize command. The number 28014 seems to be the smallest value that makes the program work correctly, although it will certainly vary from program to program and probably also from machine to machine (and so this fix is generally a bit useless).

I have, however, recently discovered a workaround for this problem using the event Windows API calls (SetEvent, ResetEvent, WaitForSingleObject, WaitForMultipleObjects) instead of Synchronize/CheckSynchronize. Obviously the program I attached was just a test, so I'll see if this workaround works for our main project, and if so I'll post the code here.
(0030156)
Vincent Snijders (manager)
2009-08-26 07:33

> but fails in FP for the reason I gave in the original post.

No, it fails because you suspend the thread waiting to finish the synchronize call as explained by Micha in note 30151.

FP does not overwrite a previous call, but uses a critical section to guard against that.
(0030181)
Micha Nelissen (manager)
2009-08-26 20:07

Colin, I suggest to not use Suspend/Resume in this case, but Events (as you discovered). Begin Thread.Execute with waiting for event instead of Suspend; set event instead of using Resume. FP provides an abstraction for that interface as well, RTLEvent*:

http://www.freepascal.org/docs-html/rtl/system/rtleventcreate.html [^]
(0030189)
Colin Haywood (reporter)
2009-08-26 23:30

Yep - thanks. As a bonus, I've discovered that using events seems to be faster than Suspend/Resume :)
(0030196)
Jonas Maebe (manager)
2009-08-27 13:23

So can this bug report be closed?
(0030208)
Colin Haywood (reporter)
2009-08-27 17:35

I don't know how much you want the FPC classes to emulate their equivalent Delphi classes, so if you're not particularly worried about multiple simultaneous calls to Suspend (inside the sync procedure) not being processed as neatly as they are in Delphi and/or a delay being needed in case of Suspend being called by the thread itself then yes it can be closed.

I'm wondering if the latter could be fixed with some event code being added (when Suspend is called, reset an event local to the thread then set it again once the Suspend routine completes, and insist that the event is set before the Resume routine starts)?
(0034838)
Marco van de Voort (manager)
2010-02-27 19:32

That is becoming an emulation rather than compatibility, and such things are hell to support with multiple platforms (with all slightly different threadlibs).
(0035767)
Marco van de Voort (manager)
2010-03-18 16:11

Last call, any comments before closing?
(0040586)
Marco van de Voort (manager)
2010-08-29 13:59

We will need a queue anyway in time, to support tthread.queue

(which schedules a method of a thread in the mainthread, without waiting for it).
(0049328)
Marco van de Voort (manager)
2011-06-23 12:09

Meanwhile Delphi XE has deprecated suspend, and so have we. And I'm talking to myself.
(0049388)
Thaddy de Koning (reporter)
2011-06-25 13:59

About the queue: TThreadlist is there, isn't it? Can be used as such if enqueue/dequeue performance is not an issue.
(0049392)
Marco van de Voort (manager)
2011-06-25 15:01
edited on: 2011-06-25 15:02

Synchronize requests are said to be also in that queue. The trouble is that it is not just the queue, but also storage of the event to signal the synchronizing() thread to continue. (I don't know why, but I assume to avoid the prefered kind to starve the other they are both put in the queue and processed in order)

IOW it reworks most of the current synchonization. I think it is wiser to start that only after 2.6.0 branched off.

And this is in sysutils, so can't use classes and "higher" units. (like syncobj). One needs to work directly with the primitives (threadmgr, which are roughly substitutes for abstract versions of the corresponding windows calls)

(0059283)
Marco van de Voort (manager)
2012-05-06 00:28

There is more to that. I've started coding, but ran into the problem that during a synchronize() event, queue must still be allowed.

So there probably should be two locks. One critical that blocks synchronize from running multiple times (like now), and one that guards the queue only when it is accessed (begin of synchronize, end of synchronize)
(0065327)
Sven Barth (manager)
2013-01-30 21:06

Let's try one last time with a feedback request. :)

Recently the CheckSynchronize implementation was brought up to par with Delphi XE3, so you could try again whether you still experience the problem with a recent trunk version.
Please note however Marco's comment regarding the deprecation of "Resume"/"Suspend" (see also http://wiki.freepascal.org/User_Changes_2.4.4#TThread.Suspend_and_TThread.Resume_have_been_deprecated [^] ). Also there will be a compiler error, because the "TThread_1.FSubtotal" field is declared after the "FSyncProc" method.
Additionally your program won't work on non-Windows systems (besides the usage of the "Windows" unit), because on those systems using "Suspend" from outside the thread itself is not possible (and the method "FSyncProc" won't run in the thread's context afterall...).

Also the following additional note: next time please provide a full compileable example (includes project file, form's unit, LFM file and - in this explicit case - the LRS file). Most preferably would be a project that does not depend on Lazarus/the LCL at all.

Regards,
Sven
(0065475)
Sven Barth (manager)
2013-02-04 17:16

I've now tested the example on a Windows system with a recent 2.7.1. No deadlock appears, so I'd say that this problem is solved with the recent changes to TThread and I'll resolve this bug report without waiting for further feedback...

@Colin: if you don't agree, then feel free to reopen the issue.

Regards,
Sven

- Issue History
Date Modified Username Field Change
2009-08-25 15:32 Colin Haywood New Issue
2009-08-25 15:32 Colin Haywood File Added: threadtestform.pas
2009-08-25 22:48 Micha Nelissen Note Added: 0030150
2009-08-25 22:57 Micha Nelissen Note Added: 0030151
2009-08-26 03:17 Colin Haywood Note Added: 0030153
2009-08-26 07:33 Vincent Snijders Note Added: 0030156
2009-08-26 12:59 Colin Haywood Note Added: 0030160
2009-08-26 12:59 Colin Haywood Note Deleted: 0030160
2009-08-26 20:07 Micha Nelissen Note Added: 0030181
2009-08-26 23:30 Colin Haywood Note Added: 0030189
2009-08-27 13:23 Jonas Maebe Note Added: 0030196
2009-08-27 17:35 Colin Haywood Note Added: 0030208
2010-02-27 19:32 Marco van de Voort Note Added: 0034838
2010-03-18 16:11 Marco van de Voort Note Added: 0035767
2010-08-29 13:59 Marco van de Voort Note Added: 0040586
2010-08-29 14:01 Marco van de Voort Relationship added related to 0017297
2011-06-23 12:09 Marco van de Voort Note Added: 0049328
2011-06-25 13:59 Thaddy de Koning Note Added: 0049388
2011-06-25 15:01 Marco van de Voort Note Added: 0049392
2011-06-25 15:02 Marco van de Voort Note Edited: 0049392
2012-05-06 00:28 Marco van de Voort Note Added: 0059283
2013-01-30 21:06 Sven Barth Note Added: 0065327
2013-01-30 21:06 Sven Barth Status new => feedback
2013-02-04 17:16 Sven Barth Fixed in Revision => 23227
2013-02-04 17:16 Sven Barth Note Added: 0065475
2013-02-04 17:16 Sven Barth Status feedback => resolved
2013-02-04 17:16 Sven Barth Fixed in Version => 2.7.1
2013-02-04 17:16 Sven Barth Resolution open => fixed
2013-02-04 17:16 Sven Barth Assigned To => Sven Barth



MantisBT 1.2.12[^]
Copyright © 2000 - 2012 MantisBT Group
Powered by Mantis Bugtracker