Questions about implementation of worker tasks controlled by single master task - multithreading

I would like to implement a master task controlling several instances of a worker task. Each worker task has three different phases:
Initialization
Do work
Report results
At the beginning the master task should initialize all worker tasks (concurrently). Each worker task then has s seconds to complete its initialization, but completion within s seconds is not guaranteed.
What efficient possibilities (signaling mechanisms) do I have to let the master task monitor the initialization state of all worker tasks? I thought of giving each worker task access to a worker-specific protected object with a procedure that sets a Boolean flag; each worker task would call it after successfully completing its initialization.
After the master task has triggered the initialization of all worker tasks, it could record the current time and enter a loop that periodically polls the workers' initialization states through a function declared in the protected type. The loop exits when all worker tasks have been initialized or s seconds have passed.
Do I have to use such a polling scheme, with a delay statement and an appropriate time value inside the monitor loop? I read about timeouts on entry calls. Could I use such timeouts to avoid the polling?
After a worker task has successfully completed its initialization, it should wait for a signal from the master task to execute one work package. So I think a worker task should have a Do_Work entry, and the master task should call this entry for all worker tasks in a loop, right?
The master task could use an appropriate mechanism to check whether all worker tasks have completed their work packages. After that, the worker tasks should report their results, but in a deterministic way (not concurrently). If I use a Report_Result entry in the worker tasks to wait for a signal from the master task, would calling these entries in a loop in the master task lead to a non-deterministic order of the reported results? Can these entries also be called in a blocking way (like a normal procedure call)?

You are correct that the master task can call the Do_Work entry for each worker task.
Similarly, the master task can call the Report_Result entry of all worker tasks.
A simple way to accomplish this is to create a task type for the worker tasks, and then an array of the worker tasks.
procedure Master is
   task type Workers is
      entry Do_Work;
      entry Report_Result;
   end Workers;
   Team : array(1..5) of Workers;
begin
   -- Initialization will occur automatically
   -- Signal workers to Do_Work
   for Worker of Team loop
      Worker.Do_Work;
   end loop;
   -- Create a loop to signal all reports
   -- While the workers may finish in a random order, the
   -- reporting will occur in the order of the array indices
   for Worker of Team loop
      Worker.Report_Result;
   end loop;
end Master;
This example is incomplete because it does not define the task body for the Workers task type. The important features of this program are:
Task initialization of the workers in the Team array begins when execution reaches the begin statement in Master.
The Master will wait for each element of Team to accept the entry call to Do_Work.
Each element of Team will wait at the accept statement for Master to call the Do_Work entry.
The master will wait for each element of Team to accept the Report_Result entry.
Each element of Team will wait at its accept for Report_Result for the master to call that entry.
The Ada Rendezvous mechanism neatly coordinates all communication between master and each of the workers.
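The question about avoiding a polling loop is not covered by the example above. As a rough illustration, here is a minimal Python sketch (not Ada, and not the answerer's code) in which the master blocks on one event per worker with a shared deadline instead of polling, then triggers the work and collects the results in index order. The names Worker, run_master, init_seconds and timeout_s are invented for this sketch; in Ada, a timed entry call (select ... or delay ... end select) plays a similar role.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical worker with three phases: initialize, work, report."""

    def __init__(self, index, init_seconds):
        super().__init__(daemon=True)  # daemon: a late worker cannot block exit
        self.index = index
        self.init_seconds = init_seconds
        self.initialized = threading.Event()  # set by the worker
        self.do_work = threading.Event()      # set by the master
        self.work_done = threading.Event()    # set by the worker
        self.result = None

    def run(self):
        time.sleep(self.init_seconds)          # stand-in for real initialization
        self.initialized.set()
        self.do_work.wait()                    # wait for the master's signal
        self.result = self.index * self.index  # stand-in for one work package
        self.work_done.set()


def run_master(init_times, timeout_s):
    workers = [Worker(i, t) for i, t in enumerate(init_times)]
    for w in workers:
        w.start()

    # One shared deadline; each wait blocks for the time left, so no polling.
    deadline = time.monotonic() + timeout_s
    ready = []
    for w in workers:
        remaining = max(0.0, deadline - time.monotonic())
        if w.initialized.wait(remaining):
            ready.append(w)

    for w in ready:
        w.do_work.set()

    # Collect in index order, so reporting is deterministic.
    reports = []
    for w in ready:
        w.work_done.wait()
        reports.append((w.index, w.result))
    return reports
```

With init_times of [0.01, 0.01, 1.0] and a timeout of 0.3 seconds, only workers 0 and 1 take part, and their results come back in array order regardless of which one finished first.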

One thing you can do, if you really want the workers to signal the manager that they are done, is pass an access to the Manager to the workers and provide an entry for them to call. You have to decide how the manager and workers interact when that signal happens.
As an example, I had a manager keep an array of Workers and two lists of accesses to those workers (since they are limited types, you have to use access variables). One list keeps track of all available workers and the other keeps track of the workers currently doing something. As workers finish their work, they signal the manager, which removes them from the busy list and puts them in the available list. When the client requests that the manager do more work, it pulls a worker from the available list, places it on the busy list, and starts the worker going. Here is an example compiled with GNAT 7.1.1:
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Containers.Bounded_Doubly_Linked_Lists;
procedure Hello is
   package Tasks is
      type Worker;
      type Worker_Access is access all Worker;
      package Lists is new Ada.Containers.Bounded_Doubly_Linked_Lists
        (Element_Type => Worker_Access);
      task type Manager is
         -- Called by client code
         entry Add_Work;
         entry Stop;
         -- Only called by workers to signal they are
         -- finished
         entry Signal(The_Position : in out Lists.Cursor);
      end Manager;
      task type Worker(Boss : not null access Manager) is
         entry Start(The_Position : Lists.Cursor);
      end Worker;
   end Tasks;
   package body Tasks is
      task body Worker is
         Position : Lists.Cursor := Lists.No_Element;
      begin
         loop
            select
               accept Start(The_Position : Lists.Cursor) do
                  Position := The_Position;
               end Start;
               -- Do stuff HERE
               delay 0.005;
               -- Finished so signal the manager
               Boss.Signal(Position);
               Position := Lists.No_Element;
            or
               terminate;
            end select;
         end loop;
      end Worker;
      Worker_Count : constant := 10;
      task body Manager is
         -- Worker Pool
         Workers : array(1..Worker_Count)
           of aliased Worker(Manager'Unchecked_Access);
         -- Use 2 lists to keep track of who can work and who
         -- is already tasked
         Bored : Lists.List(Worker_Count);
         Busy : Lists.List(Worker_Count);
         -- Gonna call a couple of times, so use a nested
         -- procedure. This procedure removes a worker
         -- from the Busy list and places it on the Bored
         -- list.
         procedure Handle_Signal(Position : in out Lists.Cursor) is
         begin
            Put_Line("Worker Completed Work");
            Bored.Append(Lists.Element(Position));
            Busy.Delete(Position);
         end Handle_Signal;
         use type Ada.Containers.Count_Type;
      begin
         -- Start off all workers as Bored
         for W of Workers loop
            Bored.Append(W'Unchecked_Access);
         end loop;
         -- Start working
         loop
            select
               when Bored.Length > 0 =>
                  accept Add_Work do
                     -- Take a worker from the Bored list, put it
                     -- on the busy list, and send it off to work.
                     -- It will signal when it is finished
                     Put_Line("Starting Worker");
                     Busy.Append(Bored.First_Element);
                     Bored.Delete_First;
                     Busy.Last_Element.Start(Busy.Last);
                  end Add_Work;
            or
               accept Stop;
               Put_Line("Received Stop Signal");
               -- Wait for all workers to finish
               while Busy.Length > 0 loop
                  accept Signal(The_Position : in out Lists.Cursor) do
                     Handle_Signal(The_Position);
                  end Signal;
               end loop;
               -- Break out of loop
               exit;
            or
               accept Signal(The_Position: in out Lists.Cursor) do
                  Handle_Signal(The_Position);
               end Signal;
            end select;
         end loop;
         -- Work finished!
         Put_Line("Manager is Finished");
      end Manager;
   end Tasks;
   Manager : Tasks.Manager;
begin
   for Count in 1 .. 20 loop
      Manager.Add_Work;
   end loop;
   Manager.Stop;
   -- Wait for task to finish
   loop
      exit when Manager'Terminated;
   end loop;
   Put_Line("Program is Done");
end Hello;
I use cursors to help the worker remember where in the busy list they were, so that they can tell the Manager, and it can quickly move things around.
Sample Output:
$gnatmake -o hello *.adb
gcc -c hello.adb
gnatbind -x hello.ali
gnatlink hello.ali -o hello
$hello
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Worker Completed Work
Worker Completed Work
Starting Worker
Worker Completed Work
Starting Worker
Starting Worker
Received Stop Signal
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Worker Completed Work
Manager is Finished
Program is Done
Note that you can pretty this up and hide a bunch of things; I just wanted to get a quick example out.

Related

python3 - thread is missing from enumerate result when it is sleeping

We have an API endpoint that starts a thread, and another endpoint to check the status of the thread (based on a thread ID returned by the first API call).
We use the threading module.
The function that the thread is executing may or may not sleep for a duration of time.
When we create the thread, we override the default name provided by the module and add the thread ID that was generated by us (so we can keep track).
The status endpoint gets the thread ID from the client request and simply loops over the results from threading.enumerate(). When the thread is running and not sleeping, we see that the thread is returned by the threading.enumerate() function. When it is sleeping, it is not.
The function we use to see if a thread is alive:
import threading
def thread_is_running(thread_id):
    # Collect the names of all live threads in this process
    all_threads = [t.name for t in threading.enumerate()]
    return any(str(thread_id) in item for item in all_threads)
When we run in debug and print the value of "all_threads", we only see the MainThread thread during our thread's sleep time.
As soon as the sleep is over, we see our thread in the value of "all_threads".
This is how we start the thread:
thread_id = random.randint(10000, 50000)
thread_name = f"{service_name}-{thread_id}"
threading.Thread(target=drain, args=(service_name, params,), name=thread_name).start()
Is there a way to get a list of all threads including idle threads? Is a sleeping thread marked as idle? Is there a better way to pause a thread?
We thought about making the thread update its state in a database, but due to an internal issue we currently have, we cannot fully rely on writing to our database, so we prefer checking the system for the thread's status.
It turns out the reason we did not see the thread was our use of gunicorn with multiple workers.
The thread was started in one of the 4 configured workers, while the status API call could be handled by any of the 4 workers. Only when it was handled by the worker that was also running the thread were we able to see it in the enumerate() output.
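A small check can confirm that sleeping is not the cause: within one process, threading.enumerate() lists a thread for as long as it is alive, including while it sleeps. The sketch below is only illustrative; demo and sleepy are hypothetical names, and the process ID is added to the thread name just to show which process owns the thread. Each gunicorn worker is a separate process with its own thread list, so a status check served by another worker cannot see the thread.

```python
import os
import threading
import time


def thread_is_running(thread_name):
    # Only threads of the *current* process are visible here.
    return any(t.name == thread_name for t in threading.enumerate())


def demo():
    name = f"drain-{os.getpid()}-12345"  # hypothetical naming scheme
    started = threading.Event()

    def sleepy():
        started.set()
        time.sleep(0.5)  # the thread is alive but sleeping

    t = threading.Thread(target=sleepy, name=name)
    t.start()
    started.wait()
    seen_while_sleeping = thread_is_running(name)
    t.join()
    seen_after_exit = thread_is_running(name)
    return seen_while_sleeping, seen_after_exit
```

Calling demo() returns (True, False): the thread is listed while it sleeps and disappears only after it has finished.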

How to stop a schedule ScheduledExecutorService

I am using ScheduledExecutorService and initialized it (ScheduledExecutorService scheduledThreadPool = Executors.newScheduledThreadPool(20);) through a singleton class so that I don't create new threads every time. I then schedule my task using executorService.schedule(new Runnable(), 20, TimeUnit.SECONDS);
I have 2 questions on this:
1. How do I shut down a thread once its job is over? If I call a shutdown method after the first execution, I get a java.util.concurrent.RejectedExecutionException (because the main executor is shut down).
2. How do I cancel a long-running thread after some time? Say a request is sent and a thread gets stuck during execution: how should I cancel it after some time?
The best way to end your long-running thread is to return from the Runnable#run function. Attempts to interrupt or cancel it may not always work.
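The "return from run" advice amounts to cooperative cancellation: the task checks a stop signal at safe points and returns on its own. Here is a minimal sketch of that pattern, written in Python rather than Java and using hypothetical names (long_running, run_with_timeout); in Java the task would typically check Thread.currentThread().isInterrupted() or a shared flag instead of an Event.

```python
import threading


def long_running(stop_requested, progress):
    # The task itself decides when to stop: it checks the flag between steps.
    while not stop_requested.is_set():
        progress.append(1)         # stand-in for one unit of real work
        stop_requested.wait(0.01)  # a pause that ends early if stop is requested


def run_with_timeout(timeout_s):
    stop = threading.Event()
    progress = []
    t = threading.Thread(target=long_running, args=(stop, progress))
    t.start()
    t.join(timeout_s)      # give the task its time budget
    if t.is_alive():
        stop.set()         # ask it to stop; it returns at its next check
        t.join()
    return len(progress), t.is_alive()
```

This only works if the task reaches a check regularly; a task blocked inside a call that never returns cannot be stopped this way.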

Test Action with Target All Threads

When adding a Test Action with the Stop or Stop Now action,
you have two Target options: All Threads and Current Thread.
When Current Thread is chosen in a multi-threaded test, the thread stops and does not continue past this sampler.
The problem is that when All Threads is chosen, some threads execute another sampler after the Test Action.
This is very confusing, because I expect the All Threads option to be stricter than Current Thread.
In the code I saw that the Current Thread option also stops the current thread with context.getThread().stop(); while the All Threads option does not.
Is it a bug or a feature (adding a grace period for stopping)?
For example Test Plan:
Thread Group (5 Threads)
Sampler1
Test Action Target: All Threads, Action Stop/Stop Now
Sampler2
Sampler2 is executed only when Target is All Threads, not when Target is Current Thread.
Note: choosing the action Go to next loop iteration (the Target field is then disabled) also prevents Sampler2 from being executed.
Stop and Stop Now are different:
Stop is a clean shutdown, meaning the currently running samples will complete. So it is OK if you see other samplers run even after the Test Action.
Stop Now is a hard shutdown, meaning the currently running samples will be interrupted. Again, it is OK if you see other samplers after the Test Action.
Current Thread stops only the current thread, not all threads, so it is expected that:
When Current Thread is chosen in a multi-threaded test, the thread stops and does not continue past this sampler.
All Threads will do action on all threads of test, in code we have:
if (action == STOP_NOW) {
    log.info("Stopping all threads now from element {}", getName());
    context.getEngine().stopTest();
} else {
    log.info("Stopping all threads from element {}", getName());
    context.getEngine().askThreadsToStop();
}
Regarding your particular case, here is what is happening:
When you select "Current Thread", JMeter stops the current thread immediately, because the stop takes effect right after the Test Action.
When you select "All Threads", JMeter asynchronously triggers a stop of all threads, which is why Sampler2 is still called.
You may consider this a bug, but I think the use case is different.
Still it is now fixed:
https://bz.apache.org/bugzilla/show_bug.cgi?id=61698

Multi-Producer Single-Consumer Lazy Task Execution

I am trying to model a system where there are multiple threads producing data, and a single thread consuming the data. The trick is that I don't want a dedicated thread to consume the data because all of the threads live in a pool. Instead, I want one of the producers to empty the queue when there is work, and yield if another producer is already clearing the queue.
The basic idea is that there is a queue of work, and a lock around the processing. Each producer pushes its payload onto the queue, and then attempts to enter the lock. The attempt is non-blocking and returns either true (the lock was acquired), or false (the lock is held by someone else).
If the lock is acquired, then that thread then processes all of the data in the queue until it is empty (including any new payloads introduced by other producers during processing). Once all of the work has been processed, the thread releases the lock and quits out.
The following is C++ code for the algorithm:
void Process(ITask *task) {
    // queue is a thread safe implementation of a regular queue
    queue.push(task);
    // crit_sec is some handle to a critical section like object
    // try_scoped_lock uses RAII to attempt to acquire the lock in the constructor
    // if the lock was acquired, it will release the lock in the
    // destructor
    try_scoped_lock lock(crit_sec);
    // See if this thread won the lottery. Prize is doing all of the dishes
    if (!lock.Acquired())
        return;
    // This thread got the lock, so it needs to do the work
    ITask *currTask;
    while (queue.try_pop(currTask)) {
        ... execute task ...
    }
}
In general this code works fine, and I have never actually witnessed the behavior I am about to describe below, but that implementation makes me feel uneasy. It stands to reason that a race condition is introduced between when the thread exits the while loop and when it releases the critical section.
The whole algorithm relies on the assumption that if the lock is being held, then a thread is servicing the queue.
I am essentially looking for enlightenment on 2 questions:
Am I correct that there is a race condition as described? (Bonus for other races.)
Is there a standard pattern for implementing this mechanism that is performant and doesn't introduce race conditions?
Yes, there is a race condition.
Thread A adds a task, gets the lock, processes its own task, then asks the queue for another task. The queue is empty, so try_pop returns false.
Thread B at this point adds a task to the queue. It then attempts to get the lock, and fails, because thread A has the lock. Thread B exits.
Thread A then exits, with the queue non-empty, and nobody processing the task on it.
This will be difficult to find, because that window is relatively narrow. To make it easier to reproduce, introduce a "sleep for 10 seconds" after the while loop. In the calling code, insert a task, wait 5 seconds, then insert a second task. After 10 more seconds, check that both insert calls have returned and that there is still a task waiting on the queue.
One way to fix this would be to change try_pop to try_pop_or_unlock, and pass in your lock to it. try_pop_or_unlock then atomically checks for an empty queue, and if so unlocks the lock and returns false.
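A minimal Python sketch of that idea, assuming a single mutex guards both the queue and a "draining" flag (the flag stands in for the try-lock from the question). LazyConsumer and its members are hypothetical names. The point is that the emptiness check and giving up the consumer role happen in the same critical section as a producer's push, so a task cannot slip in between them unnoticed.

```python
import threading
from collections import deque


class LazyConsumer:
    """Producers enqueue tasks; at most one of them drains the queue."""

    def __init__(self, handler):
        self._handler = handler
        self._mutex = threading.Lock()
        self._queue = deque()
        self._draining = False  # True while some thread is the consumer

    def process(self, task):
        with self._mutex:
            self._queue.append(task)
            if self._draining:
                # The active consumer re-checks the queue under this same
                # mutex before it leaves, so it is guaranteed to see the task.
                return
            self._draining = True  # this thread becomes the consumer

        while True:
            with self._mutex:
                if not self._queue:
                    # Empty check and release of the role are one atomic step.
                    self._draining = False
                    return
                current = self._queue.popleft()
            self._handler(current)  # run the task outside the mutex
```

The mutex is held only for queue operations, never while a task runs, so producers are not blocked by long tasks.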
Another approach is to improve the thread pool. Add a counting semaphore based "consume" task launcher to it.
semaphore_bool bTaskActive;
counting_semaphore counter;
when (counter || !bTaskActive)
    if (bTaskActive)
        return
    bTaskActive = true
    --counter
    launch_task( process_one_off_queue, when_done( [&]{ bTaskActive = false; } ) );
When the counting semaphore is active, or when poked by the finished consume task, it launches a consume task if there is no consume task active.
But that is just off the top of my head.

Thread Pool : how to spawn a child task from a running task?

A simple thread pool with a global shared queue of tasks (functors).
Each worker (thread) picks up one task from the queue and executes it. It won't execute the next task until this one is finished.
Let's imagine a big task that needs to spawn child tasks to produce some data and then continue with its evaluation (for example, to sort a big array before saving it to disk).
pseudo code of the task code:
do some stuff
generate a list of child tasks
threadpool.spawn (child tasks)
wait until they were executed
continue my task
The problem is that the worker will deadlock: the task is waiting for the child tasks, and the thread pool is waiting for the parent task to end before running the children.
One idea is to run the child task inside the spawn code:
threadpool.spawn pseudo code:
threadpool.push (tasks)
while (not all incoming tasks were executed)
    t = threadpool.pop()
    t.run()
return (and continue executing parent task)
But how can I know, in an efficient way, that all the tasks were executed?
Another idea is to split the parent task.. something like this:
task pseudo code:
l = generate a list of child tasks
threadpool.push ( l , high priority )
t = create a task to work with generated data
threadpool.push (t , lo priority )
But I found this quite intrusive...
Any opinions?
P.S. Merry Christmas!
P.P.S. Edited some bad names.
You can have a mechanism for the child tasks to signal back to the parent task whenever they are done so it can proceed. In Java, Callable tasks submitted to an ExecutorService thread pool return their results as Future objects. Another approach would be to maintain a separate completion signal, something similar to a CountDownLatch, which serves as a common countdown that is updated every time a child task completes.
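One caveat: if the parent blocks inside a fixed-size pool while it waits for that signal, the deadlock from the question can still occur. The Python sketch below combines the countdown idea with the "split the parent" variant from the question: the parent returns after submitting its children, and whichever child finishes last runs the continuation. CountdownLatch here is a hand-written helper with an on_zero callback, not a library class, and run_demo and the sample data are made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountdownLatch:
    """Calls on_zero once, in the thread that performs the last count_down()."""

    def __init__(self, count, on_zero):
        self._lock = threading.Lock()
        self._remaining = count
        self._on_zero = on_zero

    def count_down(self):
        with self._lock:
            self._remaining -= 1
            last = self._remaining == 0
        if last:
            self._on_zero()


def run_demo(pool_size=1):
    chunks = [[5, 3], [9, 1], [4, 8]]
    sorted_chunks = [None] * len(chunks)
    result = []
    finished = threading.Event()

    def continuation():
        # Second half of the parent task: runs after every child is done.
        result.extend(sorted(x for chunk in sorted_chunks for x in chunk))
        finished.set()

    latch = CountdownLatch(len(chunks), continuation)

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        def child(i):
            sorted_chunks[i] = sorted(chunks[i])
            latch.count_down()

        def parent():
            for i in range(len(chunks)):
                pool.submit(child, i)
            # Return without waiting, so this worker is free for the children.

        pool.submit(parent)
        finished.wait(timeout=5)  # the main thread, not a pool worker, waits
    return result
```

Because no pool thread ever blocks on another task, this completes even with a single worker thread (pool_size=1).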
