design pattern for concurrent task execution with constraints


I have 3 classes of task (I, D, U) which come in on a queue; tasks of the same class must be processed in order. I want tasks to run as concurrently as possible, but there are some constraints:
U and D cannot run concurrently
U and I cannot run concurrently
I(n) requires that U(n) has completed
Q: What design pattern(s) would fit this class of problem?
I have two approaches I am considering:
Approach 1:
Use one thread per task class, each with its own queue. Each thread has a synchronized start phase where it checks the start conditions, then runs the task, then has a synchronized stop phase. It is easy to see that this provides good concurrency, but I am unsure whether it correctly implements my constraints and doesn't deadlock.
D_Thread { ...
    while ((task = D_Queue.take()) != null) { // take() blocks until a task arrives
        synchronized (State) { // start phase
            waitForU();
            State.setRunning(D, true);
        }
        run(task); // run phase
        synchronized (State) { // stop phase
            State.setRunning(D, false);
        }
    }
}
Approach 2: Alternatively, a single dispatch thread manages execution state, and schedules tasks in a ThreadPool, waiting if necessary for currently scheduled tasks to complete.

The Objective-C Foundation framework includes classes NSOperationQueue and NSOperation that satisfy some of these requirements. NSOperationQueue represents a queue of NSOperations. The queue runs a configurable maximum number of operations concurrently. Operations have a priority and a set of dependencies; all of the operations that an operation depends on must be completed before the queue will start running the operation. The operations are scheduled to run on a dynamically-sized pool of threads.
What you need is a somewhat smarter version of NSOperationQueue that applies the constraints you have expressed, but NSOperationQueue and company show roughly how your problem has been solved in a production framework, and that solution resembles your second approach of a dispatch thread running tasks on a thread pool.
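The constraints from the question can also be written down fairly directly in Java. The following is only a rough sketch under stated assumptions, not a port of NSOperationQueue and not the dispatch-thread design itself: the class name ConstrainedScheduler and its methods are hypothetical, every I(n) is assumed to have a matching U(n) that eventually arrives, and task bodies are assumed not to throw. U takes the write side of a read-write lock (so it excludes D and I), D and I share the read side (so they may overlap), and one single-threaded executor per class preserves same-class order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: names and structure are illustrative assumptions.
class ConstrainedScheduler {
    // One single-threaded executor per task class keeps same-class tasks in order.
    private final ExecutorService iExec = Executors.newSingleThreadExecutor();
    private final ExecutorService dExec = Executors.newSingleThreadExecutor();
    private final ExecutorService uExec = Executors.newSingleThreadExecutor();
    // U holds the write lock (exclusive); I and D hold the read lock (shared).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Completed once U(n) has finished, so that I(n) can wait for it.
    // Entries are never removed in this sketch.
    private final Map<Integer, CompletableFuture<Void>> uDone = new ConcurrentHashMap<>();

    private CompletableFuture<Void> doneSignal(int n) {
        return uDone.computeIfAbsent(n, k -> new CompletableFuture<>());
    }

    private static void runLocked(Lock l, Runnable body) {
        l.lock();
        try {
            body.run();
        } finally {
            l.unlock();
        }
    }

    void submitU(int n, Runnable body) {
        uExec.execute(() -> {
            runLocked(lock.writeLock(), body);
            // Not reached if body throws; I(n) would then wait forever.
            doneSignal(n).complete(null);
        });
    }

    void submitD(Runnable body) {
        dExec.execute(() -> runLocked(lock.readLock(), body));
    }

    void submitI(int n, Runnable body) {
        iExec.execute(() -> {
            doneSignal(n).join(); // wait for U(n) before taking any lock
            runLocked(lock.readLock(), body);
        });
    }

    /** Lets queued tasks finish; returns false on timeout or interruption. */
    boolean shutdownAndWait() {
        List<ExecutorService> all = Arrays.asList(iExec, dExec, uExec);
        all.forEach(ExecutorService::shutdown);
        boolean ok = true;
        try {
            for (ExecutorService e : all) {
                ok &= e.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
        return ok;
    }
}
```

Because I(n) waits for U(n) before it acquires the read lock, it never holds a lock while waiting, which is what keeps this arrangement from deadlocking against the U thread.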

Actually, this turns out to be simpler than it seemed; a mutex is mostly all that is needed:
IThread(int k) {
    synchronized (u_mutex) {
        if (previousUSet.contains(k)) U(k);
    }
    I(k);
}
DThread(int k) {
    synchronized (u_mutex) {
        D(k);
        previousUSet.add(k);
    }
}


what would be the right way to go for my scenario, thread array, thread pool or tasks?

I am working on a small microfinance application that processes financial transactions. The frequency of these transactions is quite high, which is why I am planning to make it a multi-threaded application that can process multiple transactions in parallel.
I have already designed all the workers, and they are thread-safe.
What I need help with is how to manage these threads. Here are some of my options:
1. Create a specified number of thread pool threads at startup and keep them running in an infinite loop, where they keep looking for new transactions and start processing any that are found.
example code:
void Start_Job()
{
    for (int l_ThreadId = 0; l_ThreadId < PaymentNoOfWorkerThread; l_ThreadId++)
    {
        // Each worker receives its index as its tracking id.
        ThreadPool.QueueUserWorkItem(Execute, (object)l_ThreadId);
    }
}
void Execute(object l_TrackingId)
{
    while (true)
    {
        var new_txns = Get_New_Txns(); // returns a queue of new txns, possibly empty
        while (new_txns.Count > 0)
        {
            process_txn(new_txns.Dequeue());
        }
        Thread.Sleep(some_time);
    }
}
2. Look for new transactions and assign a thread pool thread to each transaction (my understanding is that these threads would be reused for new txns after their execution is complete).
example code:
void Start_Job()
{
    while (true)
    {
        var new_txns = Get_New_Txns(); // returns a queue of new txns, possibly empty
        // Queue one work item per pending transaction.
        while (new_txns.Count > 0)
        {
            ThreadPool.QueueUserWorkItem(Execute, (object)new_txns.Dequeue());
        }
        Thread.Sleep(some_time); // pause before polling again
    }
}
void Execute(object txn)
{
    process_txn(txn);
}
3. Do the above, but with tasks.
Which option would be most efficient and best suited to my application?
Thanks in advance :)

ThreadPool.QueueUserWorkItem is an older API and you shouldn't be using it directly anymore. Tasks are the way to go, and the thread pool is managed automatically for you.
What suits your application depends on what happens in process_txn and is somewhat subjective, so this is only a very generic guideline:
If process_txn is compute bound (for example, it performs only CPU-bound calculations), then look at the Task Parallel Library. It will help you use the CPU cores more efficiently.
If process_txn is more IO bound than CPU bound (it reads or writes files or a database, or connects to some remote service), then look at asynchronous programming and make sure your IO operations are all asynchronous, so that your threads are never blocked on IO. This will help your service scale. Also, depending on what your queue is, see if you can await the queue asynchronously, so that none of your application threads are blocked just waiting on it.

Kotlin coroutines multithread dispatcher and thread-safety for local variables

Let's consider this simple code with coroutines
import kotlinx.coroutines.*
import java.util.concurrent.Executors
fun main() {
    runBlocking {
        launch (Executors.newFixedThreadPool(10).asCoroutineDispatcher()) {
            var x = 0
            val threads = mutableSetOf<Thread>()
            for (i in 0 until 100000) {
                x++
                threads.add(Thread.currentThread())
                yield()
            }
            println("Result: $x")
            println("Threads: $threads")
        }
    }
}
As far as I understand, this is legitimate coroutine code, and it actually produces the expected results:
Result: 100000
Threads: [Thread[pool-1-thread-1,5,main], Thread[pool-1-thread-2,5,main], Thread[pool-1-thread-3,5,main], Thread[pool-1-thread-4,5,main], Thread[pool-1-thread-5,5,main], Thread[pool-1-thread-6,5,main], Thread[pool-1-thread-7,5,main], Thread[pool-1-thread-8,5,main], Thread[pool-1-thread-9,5,main], Thread[pool-1-thread-10,5,main]]
The question is what makes these modifications of local variables thread-safe (or are they thread-safe at all?). I understand that this loop is actually executed sequentially, but it can change the running thread on every iteration. The changes made by the thread in the first iteration should still be visible to the thread that picks up the loop in the second iteration. Which code guarantees this visibility? I tried to decompile this code to Java and dig around the coroutines implementation with a debugger, but did not find a clue.

Your question is completely analogous to the realization that the OS can suspend a thread at any point in its execution and reschedule it to another CPU core. That works not because the code in question is "multicore-safe", but because it is a guarantee of the environment that a single thread behaves according to its program-order semantics.
Kotlin's coroutine execution environment likewise guarantees the safety of your sequential code. You are supposed to program to this guarantee without any worry about how it is maintained.
If you want to descend into the details of "how" out of curiosity, the answer becomes "it depends". Every coroutine dispatcher can choose its own mechanism to achieve it.
As an instructive example, we can focus on the specific dispatcher you use in your posted code: a fixed thread pool created by the JDK's Executors.newFixedThreadPool. You can submit arbitrary tasks to this executor, and it will execute each one of them on a single (arbitrary) thread, but many tasks submitted together will execute in parallel on different threads.
Furthermore, the executor service provides the guarantee that the code leading up to executor.execute(task) happens-before the code within the task, and the code within the task happens-before another thread's observing its completion (future.get(), future.isDone(), getting an event from the associated CompletionService).
Kotlin's coroutine dispatcher drives the coroutine through its lifecycle of suspension and resumption by relying on these primitives from the executor service, and thus you get the "sequential execution" guarantee for the entire coroutine. A single task submitted to the executor ends whenever the coroutine suspends, and the dispatcher submits a new task when the coroutine is ready to resume (when the user code calls continuation.resume(result)).
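The mechanism described above can be imitated in plain Java. This is only an illustrative sketch, not Kotlin's actual dispatcher code: the class HoppingCounter and its State holder are hypothetical stand-ins for the coroutine and its local variable x. Each increment runs as its own task on a fixed thread pool, and each task resubmits itself, so the work hops between threads while the non-volatile counter stays correct because every submission happens-before the execution of the submitted task.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical imitation of a coroutine that is resumed as a fresh executor task.
class HoppingCounter {
    /** Plain, non-volatile state playing the role of the coroutine's local variable x. */
    static final class State {
        int x = 0;
    }

    /** Performs `steps` increments, one per submitted task, and returns the final count. */
    static int run(int steps, int poolSize, Set<String> threadsSeen) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        State state = new State();
        CountDownLatch finished = new CountDownLatch(1);
        Runnable step = new Runnable() {
            @Override
            public void run() {
                state.x++; // unsynchronized on purpose
                threadsSeen.add(Thread.currentThread().getName());
                if (state.x < steps) {
                    pool.execute(this); // "suspend" here, "resume" as a new task
                } else {
                    finished.countDown();
                }
            }
        };
        pool.execute(step);
        try {
            finished.await(); // countDown() happens-before await() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return state.x;
    }

    public static void main(String[] args) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        System.out.println("Result: " + run(100_000, 10, seen));
        System.out.println("Threads used: " + seen.size());
    }
}
```

Only one step is ever in flight at a time, so there is no concurrent access to the counter; the executor's happens-before guarantees are what carry each write over to whichever thread runs the next step.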

Understanding Threads Swift

I sort of understand threads, correct me if I'm wrong.
Is a single thread allocated to a piece of code until that code has completed?
Are the threads prioritised to whichever piece of code is run first?
What is the difference between main queue and thread?
My most important question:
Can threads run at the same time? If so how can I specify which parts of my code should run at a selected thread?

Let me start this way. Unless you are writing a special kind of application (and you will know if you are), forget about threads. Working with threads is complex and tricky. Use dispatch queues… it's simpler and easier.
Dispatch queues run tasks. Tasks are closures (blocks) or functions. When you need to run a task off the main dispatch queue, you call one of the dispatch_ functions, the primary one being dispatch_async(). When you call dispatch_async(), you need to specify which queue to run the task on. To get a queue, you call dispatch_queue_create() or one of the dispatch_get_ functions, the primary one being dispatch_get_global_queue().
NOTE: Swift 3 changed this from a function model to an object model. The dispatch_ functions became instance methods of DispatchQueue, and the dispatch_get_ functions became class methods/properties of DispatchQueue.
// Swift 3
DispatchQueue.global(qos: .background).async {
    let calculation = arc4random()
}
// Swift 2
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0)) {
    let calculation = arc4random()
}
The trouble here is any and all tasks which update the UI must be run on the main thread. This is usually done by calling dispatch_async() on the main queue (dispatch_get_main_queue()).
// Swift 3
DispatchQueue.global(qos: .background).async {
    let calculation = arc4random()
    DispatchQueue.main.async {
        print("\(calculation)")
    }
}
// Swift 2
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0)) {
    let calculation = arc4random()
    dispatch_async(dispatch_get_main_queue()) {
        print("\(calculation)")
    }
}
The gory details are messy. To keep it simple, dispatch queues manage thread pools. It is up to the dispatch queue to create, run, and eventually dispose of threads. The main queue is a special queue which has only 1 thread. The operating system is tasked with assigning threads to a processor and executing the task running on the thread.
With all that out of the way, now I will answer your questions.
Is a single thread allocated to a piece of code until that code has completed?
A task will run in a single thread.
Are the threads prioritised to whichever piece of code is run first?
Tasks are assigned to a thread. A task will not change which thread it runs on. If a task needs to run in another thread, then it creates a new task and assigns that new task to the other thread.
What is the difference between main queue and thread?
The main queue is a dispatch queue which has 1 thread. This single thread is also known as the main thread.
Can threads run at the same time?
Threads are assigned to execute on processors by the operating system. If your device has multiple processor cores (nearly all do nowadays), then multiple threads can execute at the same time.
If so how can I specify which parts of my code should run at a selected thread?
Break your code into tasks. Dispatch the tasks on a dispatch queue.

Is there a way to run delayed or scheduled task with GPars?

I'm building my concurrent application on top of GPars library.
It contains a thread pool under the hood, so I would like to solve all concurrency-related tasks by means of this pool.
I need to run a task with a certain delay (e.g. 30 seconds). Also I want to run some tasks periodically.
Are there any ways to implement these things with GPars?

What about Thread.sleep for delaying and Quartz for scheduling? I know these are the obvious choices, but I don't see anything wrong with using them.
What I mean is mixing GPars with a bit of higher-order closures, e.g.:
@Grab(group='org.codehaus.gpars', module='gpars', version='1.2.1')
// Wraps a closure so that it sleeps for `delay` milliseconds before running.
def delayDecorator = { closure, delay ->
    return { params ->
        Thread.sleep(delay)
        closure.call(params)
    }
}
groovyx.gpars.GParsPool.withPool() {
    def closures = [{ println it }, { println it + 1 }], delay = 1000
    closures.collect(delayDecorator.rcurry(delay)).eachParallel { it(1) }
}
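Since GPars runs on the JVM, another option is the JDK's own ScheduledExecutorService, which covers both the delayed and the periodic case without parking a pool thread in Thread.sleep. The sketch below is plain JDK code, not a GPars API, and whether it can share a pool with GPars is not addressed here; the class name DelayedAndPeriodic and its demo method are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; only the java.util.concurrent calls are real API.
class DelayedAndPeriodic {
    /** Schedules one delayed task and one periodic task; returns how many periodic ticks ran. */
    static int demo(long delayMs, long periodMs, int ticksWanted) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        CountDownLatch delayedRan = new CountDownLatch(1);
        CountDownLatch ticks = new CountDownLatch(ticksWanted);
        AtomicInteger tickCount = new AtomicInteger();

        // One-shot task: runs once, delayMs from now.
        scheduler.schedule(() -> delayedRan.countDown(), delayMs, TimeUnit.MILLISECONDS);

        // Periodic task: first run immediately, then once every periodMs.
        ScheduledFuture<?> periodic = scheduler.scheduleAtFixedRate(() -> {
            tickCount.incrementAndGet();
            ticks.countDown();
        }, 0, periodMs, TimeUnit.MILLISECONDS);

        try {
            delayedRan.await();
            ticks.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            periodic.cancel(false); // stop the periodic task
            scheduler.shutdownNow();
        }
        return tickCount.get();
    }

    public static void main(String[] args) {
        System.out.println("Periodic ticks observed: " + demo(500, 100, 3));
    }
}
```

For the 30-second delay from the question, the call would be scheduler.schedule(task, 30, TimeUnit.SECONDS).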

Limit number of concurrent thread in a thread pool

In my code I have a loop, and inside this loop I send several requests to a remote web service. The WS providers said: "The webservice can host at most n threads", so I need to cap my code since I can't send n+1 threads.
If I have to send m threads, I would like the first n threads to be executed immediately, and as soon as one of these completes, a new thread (one of the remaining m-n threads) to be executed, and so on, until all m threads are executed.
I have thought of a thread pool with the maximum thread number explicitly set to n. Is this enough?

For this I would avoid using multiple threads and instead wrap the entire loop so that it runs on a single thread. However, if you do want to launch multiple threads using a thread pool, then I would use the Semaphore class to enforce the required thread limit; here's how...
A semaphore is like a mean night club bouncer: it has been given a club capacity and is not allowed to exceed this limit. Once the club is full, no one else can enter and a queue builds up outside. Then, as one person leaves, another can enter (analogy thanks to J. Albahari).
A Semaphore with a value of one is equivalent to a Mutex or Lock except that the Semaphore has no owner so that it is thread ignorant. Any thread can call Release on a Semaphore whereas with a Mutex/Lock only the thread that obtained the Mutex/Lock can release it.
Now, for your case we are able to use Semaphores to limit concurrency and prevent too many threads from executing a particular piece of code at once. In the following example five threads try to enter a night club that only allows entry to three...
using System;
using System.Threading;

class BadAssClub
{
    static SemaphoreSlim sem = new SemaphoreSlim(3); // capacity of three
    static void Main()
    {
        for (int i = 1; i <= 5; i++)
            new Thread(Enter).Start(i);
    }
    // Enforce only three threads running the guarded section at once.
    static void Enter(object id)
    {
        Console.WriteLine(id + " wants to enter.");
        sem.Wait(); // acquire before the try, so finally never releases an unheld slot
        try
        {
            Console.WriteLine(id + " is in!");
            Thread.Sleep(1000 * (int)id);
            Console.WriteLine(id + " is leaving...");
        }
        finally
        {
            sem.Release();
        }
    }
}
I hope this helps.
Edit: You can also use the ThreadPool.SetMaxThreads method. This method restricts the number of threads allowed to run in the thread pool, but it does so globally for the thread pool itself. This means that if you are running SQL queries or other library methods that your application uses, new threads will not be spun up for them because of this limit. This may not be relevant to you, in which case use the SetMaxThreads method. If you want to limit a particular method, however, it is safer to use semaphores.
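The same cap can be applied when the work is handed to a thread pool rather than to raw threads. As a language-neutral illustration, here is a hedged Java sketch using java.util.concurrent.Semaphore; the class BoundedCalls is hypothetical and the 20 ms sleep merely stands in for the web service call. It submits m requests, lets at most n run at once, and reports the highest concurrency it observed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the request itself is simulated with a short sleep.
class BoundedCalls {
    /** Runs m simulated requests with at most n in flight; returns the peak concurrency seen. */
    static int run(int m, int n) {
        Semaphore permits = new Semaphore(n);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        for (int i = 0; i < m; i++) {
            pool.execute(() -> {
                try {
                    permits.acquire(); // waits while n requests are already in flight
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(20); // stand-in for the web service call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    permits.release(); // reached only after a successful acquire
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("Peak concurrency: " + run(10, 3));
    }
}
```

The semaphore limits only the guarded section, so the pool itself can stay unbounded and be shared with unrelated work, which is the same trade-off described above for SetMaxThreads versus a semaphore.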
