I assumed that if a variable shared between threads has a native type, atomicity would take care of thread safety.
But the output of the code below shows that this is not the case, at least in Delphi.
Thread t1 is simply incrementing the counter 10M times.
At the same time, thread t2 is decrementing the counter 10M times.
So the expected counter value at the end is 0, but I read a different value each time.
What is the proper way of sharing a native variable between threads in Delphi without locking?
procedure TForm1.Button1Click(Sender: TObject);
var
t1, t2: TThread;
Counter: NativeInt;
begin
Counter := 0;
// first thread to increment shared counter
t1 := TThread.CreateAnonymousThread(
procedure ()
var
i: Integer;
begin
for i := 1 to 10000000 do
Inc(Counter);
end
);
// second thread to decrement shared counter
t2 := TThread.CreateAnonymousThread(
procedure ()
var
i: Integer;
begin
for i := 1 to 10000000 do
Dec(Counter);
end
);
t1.FreeOnTerminate := false;
t2.FreeOnTerminate := false;
// start threads
t1.Start;
t2.Start;
// wait for them to finish
t1.WaitFor;
t2.WaitFor;
t1.Free;
t2.Free;
// print the counter, expected counter is 0
Caption := IntToStr(Counter);
end;
Reads and writes of aligned variables are atomic. The problem is that Inc and Dec both read and write: each performs two memory accesses, so the compound operation is not atomic.
Use atomic increment functions instead: the TInterlocked class methods, or AtomicIncrement.
As far as what is native about NativeInt, that refers to its size. It is an integral type the same size as a pointer. So 32 bits in a 32 bit process, 64 bits in a 64 bit process. These types are seldom used for pure Delphi code, usually for interop with third party libraries which might declare handle types using pointer sized integers.
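As a minimal sketch of that suggestion, here is the counter test from the question rewritten as a console program that uses the AtomicIncrement and AtomicDecrement intrinsics. The program name and the RunDemo wrapper are only illustrative, and this assumes a Delphi version recent enough to provide these intrinsics.

```delphi
program AtomicCounterDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// RunDemo is a hypothetical wrapper, not part of the original question.
procedure RunDemo;
var
  t1, t2: TThread;
  Counter: NativeInt;
begin
  Counter := 0;

  // first thread increments the shared counter atomically
  t1 := TThread.CreateAnonymousThread(
    procedure
    var
      i: Integer;
    begin
      for i := 1 to 10000000 do
        AtomicIncrement(Counter); // one indivisible read-modify-write
    end);

  // second thread decrements the shared counter atomically
  t2 := TThread.CreateAnonymousThread(
    procedure
    var
      i: Integer;
    begin
      for i := 1 to 10000000 do
        AtomicDecrement(Counter);
    end);

  t1.FreeOnTerminate := False;
  t2.FreeOnTerminate := False;
  t1.Start;
  t2.Start;
  t1.WaitFor;
  t2.WaitFor;
  t1.Free;
  t2.Free;

  // every increment is matched by a decrement, so this prints 0
  Writeln(Counter);
end;

begin
  RunDemo;
end.
```

The atomic versions are slower than plain Inc and Dec because the two cores contend for the same cache line, so this fits counters that are updated occasionally rather than in a tight loop.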
Related
I have a number crunching application in a unit called Execution that is called from the main form. Very often the code in Execution.pas needs to run 10-15 times in a row, and I am looking for an efficient way to apply multi-threading so that the calculations in the Execution unit run in parallel. A simplified version of the code is as follows:
Main Form with one Button1 in it:
unit MainForm;
interface
uses
Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls, Execution;
type
TMainForm1 = class(TForm)
Button1: TButton;
procedure Button1Click(Sender: TObject);
end;
var
MainForm1: TMainForm1;
implementation
{$R *.dfm}
procedure TMainForm1.Button1Click(Sender: TObject);
var
ExecutionThread: TThread;
run_ID: integer;
begin
for run_ID := 0 to 2 do
begin
ExecutionThread := TThread.CreateAnonymousThread(
procedure
begin
Execution.CalculateSum;
end);
ExecutionThread.FreeOnTerminate := true;
ExecutionThread.Start;
end;
end;
end.
Execution.pas unit:
unit Execution;
interface
uses System.SysUtils, Vcl.Dialogs;
procedure CalculateSum;
procedure IncrementSum(var Sum: integer);
implementation
const
NoOfTimes = 100;
var
Sum: integer;
procedure CalculateSum;
var
i: integer;
begin
Sum := 0;
for i := 0 to Pred(NoofTimes) do
begin
IncrementSum(Sum);
end;
ShowMessage('Sum = ' + IntToStr(Sum));
end;
procedure IncrementSum(var Sum: integer);
begin
Inc(Sum);
Sleep(10);
end;
end.
If I execute the simplified code, I get three messages with sums close to 300, but not 300 (285, 287, 289), which is expected, since all the threads change the same global variable Sum at the same time and sometimes the incrementations overlap.
Is there an easy way to allocate different memory to each thread, so that I get three messages with sums 100? Or do I have to hard code it myself, allocating different Execution variables in each thread?
Note that my original application contains thousands of variables and records, hence I am looking for a fast way to do this without changing the code significantly (if there is one, of course).
Thank you in advance.
Don't write to the same variable from multiple threads without synchronization or you risk corrupting the data of that variable.
So instead of calling IncrementSum(Sum) directly in your thread, wrap the call in Synchronize, e.g. TThread.Synchronize(nil, procedure begin IncrementSum(Sum); end), since Synchronize expects a method or an anonymous procedure rather than the result of a call.
This forces procedure IncrementSum to be executed in the context of the main thread, which prevents several worker threads from modifying the variable at the same time and corrupting it.
You may also want to avoid updating this global variable on every loop iteration of every worker thread. If you have 10 worker threads all waiting for the main thread to update the variable, they may spend most of their time waiting for the main thread to process their Synchronize calls.
So perhaps you should consider updating the global variable only every 10 iterations or so. The right number of iterations between updates depends on how quickly each iteration finishes: the faster the iterations, the more of them it is worth waiting for before updating the global variable, which reduces the number of Synchronize calls and puts less load on the main thread.
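The batching idea could look roughly like the sketch below. Note that it swaps Synchronize for AtomicIncrement with an explicit increment, simply because that keeps the example a self-contained console program; the principle of accumulating locally and publishing every few iterations is the same. BatchSize, WorkerCount, Worker and RunDemo are hypothetical names chosen for illustration.

```delphi
program BatchedUpdateDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

const
  NoOfTimes = 100;
  BatchSize = 10;   // hypothetical batch size, tune it for your workload
  WorkerCount = 3;  // hypothetical number of worker threads

var
  Sum: Integer;     // shared total, only modified through AtomicIncrement

procedure Worker;
var
  i, Pending: Integer;
begin
  Pending := 0;
  for i := 0 to Pred(NoOfTimes) do
  begin
    Inc(Pending);                     // thread-local, needs no synchronization
    if Pending = BatchSize then
    begin
      AtomicIncrement(Sum, Pending);  // publish one whole batch at once
      Pending := 0;
    end;
  end;
  if Pending > 0 then
    AtomicIncrement(Sum, Pending);    // publish whatever is left over
end;

procedure RunDemo;
var
  Threads: array[0..WorkerCount - 1] of TThread;
  k: Integer;
begin
  Sum := 0;
  for k := Low(Threads) to High(Threads) do
  begin
    Threads[k] := TThread.CreateAnonymousThread(
      procedure
      begin
        Worker;
      end);
    Threads[k].FreeOnTerminate := False;
    Threads[k].Start;
  end;
  for k := Low(Threads) to High(Threads) do
  begin
    Threads[k].WaitFor;
    Threads[k].Free;
  end;
  Writeln('Sum = ', Sum); // 3 workers x 100 increments = 300
end;

begin
  RunDemo;
end.
```

With a batch size of 10, each worker touches the shared variable 10 times instead of 100, which is the load reduction described above.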
Is there an easy way to allocate different memory to each thread, so
that I get three messages with sums 100?
There is a really easy way to allocate different memory to each thread and yet use the same declaration: Use the threadvar keyword.
Here is the code changed:
unit Execution;
interface
uses
System.SysUtils, Vcl.Dialogs;
procedure CalculateSum;
procedure IncrementSum(var Sum: integer);
implementation
const
NoOfTimes = 100;
threadvar // <==== ONLY CHANGE IS HERE
Sum: integer;
procedure CalculateSum;
var
i: integer;
begin
Sum := 0;
for i := 0 to Pred(NoofTimes) do
begin
IncrementSum(Sum);
end;
ShowMessage('Sum = ' + IntToStr(Sum));
end;
procedure IncrementSum(var Sum: integer);
begin
Inc(Sum);
Sleep(10);
end;
end.
Have you thought about using TCriticalSection in System.SyncObjs ?
Only one thread can enter the TCriticalSection at a time, so if a thread enters, modifies shared variables, and leaves, then each thread will wait for the others.
If you are spending most of the time in the TCriticalSection this won't help you scale, because your threads will be mostly waiting for the object, but it does ensure that only one thread is modifying the data protected with the TCriticalSection at a time.
So if your code is continually updating shared variables this will stop them clashing but not really provide the benefit of separate threads. If, however, the threads are doing significant calculation without modifying variables then this allows that code to execute in parallel.
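A minimal sketch of that pattern, assuming a made-up ExpensiveCalculation as the part that runs in parallel and a shared Total as the protected data. All names here are hypothetical; only TCriticalSection and its Acquire/Release methods come from System.SyncObjs.

```delphi
program CriticalSectionDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.SyncObjs;

const
  WorkerCount = 4; // hypothetical number of worker threads

var
  Lock: TCriticalSection;
  Total: Int64;    // shared data, only modified while Lock is held

// Stand-in for the real number crunching; it touches no shared state.
function ExpensiveCalculation(Seed: Integer): Int64;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to 1000 do
    Inc(Result, (Seed + i) mod 7);
end;

// Returns a thread body with its own copy of Seed.
function MakeWorker(Seed: Integer): TProc;
begin
  Result :=
    procedure
    var
      LocalResult: Int64;
    begin
      LocalResult := ExpensiveCalculation(Seed); // parallel, no lock held
      Lock.Acquire;                              // short critical section
      try
        Inc(Total, LocalResult);
      finally
        Lock.Release;
      end;
    end;
end;

procedure RunDemo;
var
  Threads: array[0..WorkerCount - 1] of TThread;
  k: Integer;
begin
  for k := Low(Threads) to High(Threads) do
  begin
    Threads[k] := TThread.CreateAnonymousThread(MakeWorker(k));
    Threads[k].FreeOnTerminate := False;
    Threads[k].Start;
  end;
  for k := Low(Threads) to High(Threads) do
  begin
    Threads[k].WaitFor;
    Threads[k].Free;
  end;
  Writeln('Total = ', Total);
end;

begin
  Lock := TCriticalSection.Create;
  try
    Total := 0;
    RunDemo;
  finally
    Lock.Free;
  end;
end.
```

The lock is held only for the final addition, so the threads spend almost all of their time in the unlocked calculation.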
I have the following code running on Windows 10.
function SingleProcessorMask(const ProcessorIndex: Integer): DWORD_PTR;
begin
Result:= 1; Result:= Result shl (ProcessorIndex); //Make sure it works on processor 33 and up.
end;
procedure TForm2.BtnCreateLookup5x5to3x3UsingSpeculativeExplorationClick(Sender: TObject);
var
ThreadCount: integer;
Threads: TArray<TThread>;
CurrentProcessor: integer;
i,a: integer;
Done: boolean;
begin
ThreadCount:= System.CpuCount;
SetLength(Threads, ThreadCount);
CurrentProcessor:= GetCurrentProcessorNumber;
a:= 0;
for i:= 1 to ThreadCount-1 do begin
Threads[i]:= TThread.CreateAnonymousThread(procedure begin
CreateLookupUsingGridSolver(i, ThreadCount);
end);
Threads[i].FreeOnTerminate:= false;
if (CurrentProcessor = a) then Inc(a); //Skip the current processor.
Inc(a);
//if (SetThreadAffinityMask(Threads[i].handle, SingleProcessorMask(a))) = 0 then RaiseLastOSError; << fails here as well.
Threads[i].Start;
if (SetThreadAffinityMask(Threads[i].handle, SingleProcessorMask(a))) = 0 then RaiseLastOSError;
end; {for i}
CreateLookupUsingGridSolver(0, ThreadCount, NewLookup);
{Wait for all threads to finish}
.....
//Rest of the proc omitted to save space.
end;
I keep getting error 87, Incorrect parameter.
I'm fairly sure the SingleProcessorMask is correct.
Is there perhaps an issue with the TThread.Handle?
It does not matter whether I run this code as Administrator, on a laptop, or on an i9. The result is always the same.
And yes, I really do need to force the threads, otherwise they all bunch up on the same core.
UPDATE
Once I fix the process affinity to match the system affinity, there is no need to muck around with assigning each thread to a specific core. In that case the automatic handling works. This is done using:
GetProcessAffinityMask(GetCurrentProcess(), ProcessAffinityMask, SystemAffinityMask);
SetProcessAffinityMask(GetCurrentProcess(), SystemAffinityMask);
//Error checking omitted for brevity
It looks like you are trying to create a separate thread for every CPU other than the "current" CPU that is running your OnClick handler. But, you never use CPU 0 in your affinity masks, because you increment a too soon. But more importantly, a thread's affinity mask must be a subset of the process's affinity mask, which specifies the CPUs the process is allowed to run on:
A thread can only run on the processors its process can run on. Therefore, the thread affinity mask cannot specify a 1 bit for a processor when the process affinity mask specifies a 0 bit for that processor.
The process affinity mask is itself a subset of the system affinity mask, which specifies which CPUs are installed.
So, the likely cause of your error is that you are calculating thread affinity masks that the OS rejects as invalid for your process.
Try something more like this instead (note: this doesn't take CPU processor groups into account, if the OS has more than 64 CPUs installed):
procedure TForm2.BtnCreateLookup5x5to3x3UsingSpeculativeExplorationClick(Sender: TObject);
var
ThreadCount, MaxThreadCount: integer;
Threads: TArray<TThread>;
i, CurrentProcessor: integer;
ProcessAffinityMask, SystemAffinityMask, AllowedThreadMask, NewThreadMask: DWORD_PTR;
Thread: TThread;
...
begin
if not GetProcessAffinityMask(GetCurrentProcess(), ProcessAffinityMask, SystemAffinityMask) then RaiseLastOSError;
// optional: up the CPUs this process can run on, if needed...
{
if not SetProcessAffinityMask(GetCurrentProcess(), SystemAffinityMask) then RaiseLastOSError;
ProcessAffinityMask := SystemAffinityMask;
}
AllowedThreadMask := DWORD_PTR(-1) and ProcessAffinityMask;
CurrentProcessor := GetCurrentProcessorNumber;
ThreadCount := 0;
MaxThreadCount := System.CpuCount;
NewThreadMask := 1;
SetLength(Threads, MaxThreadCount);
try
for i := 0 to MaxThreadCount-1 do
begin
if (i <> CurrentProcessor) and //Skip the current processor.
((AllowedThreadMask and NewThreadMask) <> 0) then // is this CPU allowed?
begin
Thread := TThread.CreateAnonymousThread(
procedure
begin
CreateLookupUsingGridSolver(...);
end
);
try
Thread.FreeOnTerminate := false;
if SetThreadAffinityMask(Thread.Handle, NewThreadMask) = 0 then RaiseLastOSError; // returns the previous mask, or 0 on failure
Thread.Start;
except
Thread.Free;
raise;
end;
Threads[ThreadCount] := Thread;
Inc(ThreadCount);
end;
NewThreadMask := NewThreadMask shl 1;
end;
CreateLookupUsingGridSolver(...);
// Wait for all threads to finish...
// ...
finally
for i := 0 to ThreadCount-1 do
Threads[i].Free;
end;
end;
There are two arguments to SetThreadAffinityMask, the thread handle and the mask. It's pretty clear from the code that the thread handle is valid. Which leaves the mask. The documentation clearly states the following:
If the thread affinity mask requests a processor that is not selected for the process affinity mask, the last error code is ERROR_INVALID_PARAMETER.
It is rather hard to see what else could explain the behaviour that you report.
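One way to test this explanation is to check a candidate mask against the process affinity mask before calling SetThreadAffinityMask. The sketch below is Windows-only; IsThreadMaskAllowed is a hypothetical helper name, and like the code above it ignores processor groups, so it only looks at the first 64 logical CPUs.

```delphi
program AffinityCheckDemo;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils;

// Hypothetical helper: True if every bit set in ThreadMask is also set
// in the affinity mask of the current process.
function IsThreadMaskAllowed(ThreadMask: DWORD_PTR): Boolean;
var
  ProcessMask, SystemMask: DWORD_PTR;
begin
  if not GetProcessAffinityMask(GetCurrentProcess, ProcessMask, SystemMask) then
    RaiseLastOSError;
  Result := (ThreadMask <> 0) and ((ThreadMask and not ProcessMask) = 0);
end;

var
  Cpu, Last: Integer;
  Mask: DWORD_PTR;
begin
  // A DWORD_PTR mask has one bit per CPU, so stop at the pointer width.
  Last := System.CPUCount - 1;
  if Last > SizeOf(DWORD_PTR) * 8 - 1 then
    Last := SizeOf(DWORD_PTR) * 8 - 1;

  for Cpu := 0 to Last do
  begin
    Mask := DWORD_PTR(1) shl Cpu;
    Writeln(Format('CPU %d allowed: %s',
      [Cpu, BoolToStr(IsThreadMaskAllowed(Mask), True)]));
  end;
end.
```

If any CPU shows up as not allowed, a thread mask selecting it would be rejected with ERROR_INVALID_PARAMETER, which matches the error 87 in the question.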
I have an application that is 100% Delphi code. It is a 64-bit Windows console application with a workload manager and a fixed number of workers. Each worker is a thread created at startup. The thread does not die; it pulls work from its own queue, which the workload manager populates.
This appears to work just fine.
What I am finding, however, is that on a 16 core system I am seeing processing times of around 90 minutes (it has 2,000,000+ workloads, and each does database work). When I went from 16 to 32 cores, I saw the performance drop! There is no database contention. Essentially, the DB is waiting for things to do.
Each thread has its own DB connection. Each thread's queries use only that thread's connection.
I updated the Delphi MM to use ScaleMM2; which made a big improvement; but I am still at a loss as to why increasing cores reduces performance.
When app has 256 threads, on 32 cores, CPU total use at 80%.
When app has 256 threads, on 16 cores, CPU total use at 100% (which is why I wanted to add cores) -- and it got slower :-(
I have applied as much of the advice as I can understand to the code base.
That is: functions not returning strings, using const for arguments, protecting "shared" data with small critical sections (actually using Multi-Read Exclusive-Write). I currently do not assign processor affinity; I was reading conflicting advice on using it, so I am currently not (it would be trivial to add, it is just not there today).
Questions - slanted towards I "think" the issue is around thread contention ...
How do I confirm that thread contention is the issue? Are there tools available specifically for identifying this type of contention?
How can I determine what is using "heap" and what is not, to further reduce contention there?
Insights, guidance, pointers would be appreciated.
Can provide relevant code areas ... if I knew what was relevant.
Procedure TXETaskWorkloadExecuterThread.Enqueue(Const Workload: TXETaskWorkload);
Begin
// protect your own queue
FWorkloadQueue.Enter;
FWorkloads.Add(Workload);
FWorkloadQueue.Leave;
End;
Procedure TXETaskManager.Enqueue(Const Workload: TXETaskWorkload);
Begin
If FWorkloadCount >= FMaxQueueSize Then Begin
WaitForEmptyQueue;
FWorkloadCount := 0;
End;
FExecuters[FNextThread].Enqueue(Workload);
// round-robin the queue
Inc(FNextThread);
Inc(FWorkloadCount);
If FNextThread >= FWorkerThreads Then Begin
FNextThread := 0;
End;
End;
Function TXETaskWorkloadExecuterThread.Dequeue(Var Workload: TXETaskWorkload): Boolean;
Begin
Workload := Nil;
Result := False;
FWorkloadQueue.Enter;
Try
If FNextWorkload < FWorkloads.Count Then Begin
Workload := FWorkloads[FNextWorkload];
Inc(FNextWorkload);
If Workload Is TXETaskWorkLoadSynchronize Then Begin
FreeAndNil(Workload);
Exit;
End;
Result := True;
End Else Begin
FWorkloads.Clear;
FNextWorkload := 0;
FHaveWorkloadInQueue.ResetEvent;
FEmptyAndFinishedQueue.SetEvent;
End;
Finally
FWorkloadQueue.Leave;
End;
End;
EDIT ---
Thanks for all the comments. Clarifications.
This system/VM has nothing else on it. The executable in question is the only thing using the CPU. Single threaded performance means linear. I have simply made this a divide-and-conquer problem. If I have 5,000,000 cars to park, and I have 30 drivers with 30 different parking lots, then telling each driver to wait for the other drivers to finish parking will be slower than telling 30 drivers to park cars concurrently.
Profiling in single threaded mode shows there is nothing that is causing this. I have seen mention on this board of Delphi and multi-core performance "gotchas" (mostly related to string handling and LOCK).
The DB essentially is saying that it is bored, and waiting for things to do. I have checked with a copy of Intel's VTune. Generally speaking, it says ... locks. But I cannot find out where. What I have is pretty simple to my mind, and the current areas for locks are necessary and small. What I cannot see is locks that might be happening due to other things, like strings creating a lock, or thread 1 causing some issue on the main process by accessing that data (even though it is protected by a critical section).
Continuing to research. Thanks again for the feedback/ideas.
Your Workload Manager is deciding which thread gets which work item. If a given thread blocks (say the work is long, DB latency, etc), you are queuing more items to that thread even though they might not get processed for awhile, if at all.
Typically, work items should be stored in a single shared queue that multiple threads then pull from. When any given thread is ready, it pulls the next available work item. For example:
constructor TXETaskManager.Create;
var
I: Integer;
begin
FWorkloadQueue := TCriticalSection.Create;
FWorkloads := TList<TXETaskWorkload>.Create;
FEmptyQueue := TEvent.Create(nil, True, True, '');
FHaveWorkloadInQueue := TEvent.Create(nil, True, False, '');
FNotFullQueue := TEvent.Create(nil, True, True, '');
FTermEvent := TEvent.Create(nil, True, False, '');
...
FMaxQueueSize := ...;
FWorkerThreads := ...;
for I := 0 to FWorkerThreads-1 do
FExecuters[I] := TXETaskWorkloadExecuterThread.Create(Self);
end;
destructor TXETaskManager.Destroy;
var
I: Integer;
begin
for I := 0 to FWorkerThreads-1 do
FExecuters[I].Terminate;
FTermEvent.SetEvent;
for I := 0 to FWorkerThreads-1 do
begin
FExecuters[I].WaitFor;
FExecuters[I].Free;
end;
FWorkloadQueue.Free;
FWorkloads.Free;
FEmptyQueue.Free;
FHaveWorkloadInQueue.Free;
FNotFullQueue.Free;
FTermEvent.Free;
...
inherited;
end;
procedure TXETaskManager.Enqueue(Const Workload: TXETaskWorkload);
begin
FWorkloadQueue.Enter;
try
while FWorkloads.Count >= FMaxQueueSize do
begin
FWorkloadQueue.Leave;
FNotFullQueue.WaitFor(INFINITE);
FWorkloadQueue.Enter;
end;
FWorkloads.Add(Workload);
if FWorkloads.Count = 1 then
begin
FEmptyQueue.ResetEvent;
FHaveWorkloadInQueue.SetEvent;
end;
if FWorkloads.Count >= FMaxQueueSize then
FNotFullQueue.ResetEvent;
finally
FWorkloadQueue.Leave;
end;
end;
function TXETaskManager.Dequeue(var Workload: TXETaskWorkload): Boolean;
begin
Result := False;
Workload := nil;
FWorkloadQueue.Enter;
try
if FWorkloads.Count > 0 then
begin
Workload := FWorkloads[0];
FWorkloads.Delete(0);
Result := True;
if FWorkloads.Count = (FMaxQueueSize-1) then
FNotFullQueue.SetEvent;
if FWorkloads.Count = 0 then
begin
FHaveWorkloadInQueue.ResetEvent;
FEmptyQueue.SetEvent;
end;
end;
finally
FWorkloadQueue.Leave;
end;
end;
constructor TXETaskWorkloadExecuterThread.Create(ATaskManager: TXETaskManager);
begin
inherited Create(False);
FTaskManager := ATaskManager;
end;
procedure TXETaskWorkloadExecuterThread.Execute;
var
Arr: THandleObjectArray;
Event: THandleObject;
Workload: TXETaskWorkload;
begin
SetLength(Arr, 2);
Arr[0] := FTaskManager.FHaveWorkloadInQueue;
Arr[1] := FTaskManager.FTermEvent;
while not Terminated do
begin
case TEvent.WaitForMultiple(Arr, INFINITE, False, Event) of
wrSignaled:
begin
if Event = FTaskManager.FHaveWorkloadInQueue then
begin
if FTaskManager.Dequeue(Workload) then
try
// process Workload as needed...
finally
Workload.Free;
end;
end;
end;
wrError: begin
RaiseLastOSError;
end;
end;
end;
end;
If you find threads are not getting enough work, you can adjust your thread count as needed. You typically shouldn't be using very many more threads than you have CPU cores available.
I have an academic interest in how I can store a unique identifier in a dynamically created TThread.
I create something like this:
procedure TForm1.Button1Click(Sender: TObject);
var thrn:word;
begin
for thrn := 0 to 5 do//<--- this is a loop variable that should give the unique numbers
TThread.CreateAnonymousThread(
procedure()
var
i: longint;
r: double;
thrns:string;
begin
thrns:=inttostr(thrn);//in this thread's variable I try to store the ID as string
repeat
for i := 0 to 100000000 do
begin
r := random(high(i));//this loop gives some dummy job
r := sqr(r); //to the thread to slow it down
end;
TThread.Synchronize(nil,
procedure()
begin
memo1.Text:=memo1.Text+#13#10+
'done'+thrns;//it returns strange IDs including '6'
end);
until false;
end).Start;
end;
Can I pass a unique identifier to the dynamically created thread so that it could show it in its synchronize method?
This is a classic misunderstanding. We understand that anonymous methods capture, but what do they capture? The value or the variable?
The answer is the latter. They capture the variable. There is a single variable, thrn, that each of your six anonymous methods capture. Since there is one variable, there is only one value, at any one moment in time.
Of course, since you are executing code in threads, you have a data race on this variable. Hence my "at any one moment in time" proviso. And that's why you have unrepeatable, unpredictable results. And you are likely to access the loop variable after the loop has completed and the value then is undefined.
If you wish to have a different value for each anonymous method, you must make a new variable for each anonymous method. My answer to another question demonstrates that: Anonymous methods - variable capture versus value capture.
So, to illustrate in your context we need some more scaffolding.
function GetThreadProc(thrn: Integer): TProc;
begin
Result :=
procedure
begin
// thrn is passed by value, so a copy is made, i.e. a new variable
....
end;
end;
....
procedure TForm1.Button1Click(Sender: TObject);
var
thrn: Integer;
begin
for thrn := 0 to 5 do
TThread.CreateAnonymousThread(
GetThreadProc(thrn)).Start;
end;
You have to capture the value of your identifier. Here is an example how to do that.
procedure TForm1.Button1Click(Sender: TObject);
function GetAnonProc( ID: Word): TProc;
begin
Result :=
procedure
var
i: longint;
r: double;
thrns:string;
begin
thrns:= inttostr(ID);// Capture value
repeat
for i := 0 to 100000000 do
begin
r := random(high(i));//this loop gives some dummy job
r := sqr(r); //to the thread to slow it down
end;
TThread.Synchronize(nil,
procedure()
begin
memo1.Text:=memo1.Text+#13#10+
'done'+thrns;//shows the ID that was captured for this thread
end);
until false;
end;
end;
var
thrn:word;
p: TProc;
begin
for thrn := 0 to 5 do
begin
p := GetAnonProc(thrn); // Capture thrn value
TThread.CreateAnonymousThread(p).Start;
end;
end;
The code above captures 6 different references to a local ID variable. Each with a different value.
The code in the question captures a single variable reference. Since you cannot control when the threads are running, there is no way to predict what value they will retrieve from the variable reference. The value 6 you observe is because of the fact that a loop variable's value is undefined after the loop is completed.
To further understand how anonymous methods works and use variable binding, read Variable Binding Mechanism.
I am in a multi-threaded situation and I have a function that I want to be run from only one thread at a time. However, rather than serializing the function in the traditional manner, I want any threads that attempt to enter the function whilst the first thread is running it to return immediately. I do not want the second thread to wait for the first thread.
Here is my code:
function InitMutex(const Name:String; var Handle: THandle):Boolean;
begin
Handle := CreateMutexA(NIL, True, PAnsiChar(Name));
Result := not (GetLastError = ERROR_ALREADY_EXISTS);
end;
procedure TForm1.Button1Click(Sender: TObject);
var
mHandle: THandle;
begin
if not InitMutex(BalloonTipMutex, mHandle) then Exit;
MessageBox(0, 'Executing Code....', '', 0);
ReleaseMutex(mHandle);
CloseHandle(mHandle);
end;
This is just an example that shows the same problem, because I couldn't put together a test sample with threads.
The problem is: I click Button1 for the first time and the message box appears. While the message box is still displayed (suppose the function is still running), I press Button1 again and nothing is displayed, which is what is supposed to happen. But when I close the message box and press the button again, it still shows nothing, even though the function should run again since it is no longer running.
Try this instead:
procedure TForm1.Button1Click(Sender: TObject);
var mHandle: THandle;
begin
mHandle := 0;
if InitMutex(BalloonTipMutex, mHandle) then
begin
MessageBox(0, 'Executing Code....', '', 0);
ReleaseMutex(mHandle);
end;
if mHandle <> 0 then
CloseHandle(mHandle);
end;
Your problem is this: even if CreateMutex reports ERROR_ALREADY_EXISTS, it still "opens" the mutex. So when your first call exits, the mutex is not destroyed, because your 2nd call opened it but never closed it. When you call your function a 3rd time, it fails not because your first call kept the mutex open, but because your 2nd call did.
Also, I think InitMutex should return Result := (Handle <> 0) and not (GetLastError = ERROR_ALREADY_EXISTS)
EDIT: On a side note, this isn't really the way mutexes are meant to be used. The "traditional" way to use a mutex is to create it once, then have your thread try to take ownership of it when you want to execute the code it protects. I would expect CreateMutex to be quite a bit slower than just taking ownership of an existing mutex, and there may be other pitfalls to that technique.
Now that I finally understand the question, I believe that the most efficient solution is to use interlocked operations.
procedure OneAtATimeThroughHere;
//FLockCount is a properly aligned integer, shared between all threads
var
ThisLockCount: Integer;
begin
ThisLockCount := InterlockedIncrement(FLockCount);
try
if ThisLockCount=1 then//we won the race
begin
//do stuff
end;
finally
InterlockedDecrement(FLockCount);
end;
end;
This approach will not permit re-entrant calls. If you need to cater for re-entrant calls then the solution is to use TryEnterCriticalSection(). Critical sections are much easier to use than mutexes, and they are faster too. Delphi wraps up the critical section API in the TCriticalSection object in the SyncObjs unit.
So your code would look like this:
procedure OneAtATimeThroughHere;
//FLock is an instance of TCriticalSection shared between all threads
begin
if FLock.TryEnter then
begin
try
//do stuff
finally
FLock.Release;
end;
end;
end;
As an alternate solution, you could use the AddAtom(), FindAtom() and DeleteAtom() Windows API functions (see: http://msdn.microsoft.com/en-us/library/ms649056(v=vs.85).aspx). There are also global versions of these for use between processes.
Using atoms would allow you to maintain full control over the flow of your threads and contain the entire locking mechanism within the function (like you could with a critical section).
You should create the mutex once and hold on to it for as long as your threads are running, and then have the function use WaitForSingleObject() with a timeout of 0 milliseconds to try to acquire the mutex lock. If WaitForSingleObject() returns WAIT_OBJECT_0, then the function was not already running.
var
mHandle: THandle = 0;
procedure TForm1.FormCreate(Sender: TObject);
begin
mHandle := CreateMutex(nil, False, nil);
end;
procedure TForm1.FormDestroy(Sender: TObject);
begin
CloseHandle(mHandle);
end;
procedure TForm1.Button1Click(Sender: TObject);
begin
if WaitForSingleObject(mHandle, 0) = WAIT_OBJECT_0 then
begin
try
MessageBox(0, 'Executing Code....', '', 0);
finally
ReleaseMutex(mHandle);
end;
end;
end;