Multiple QtConcurrent::map execution order - multithreading

Let's say I have a vector of 100 elements and two functions, Func1 and Func2. In the single-threaded version, Func1 processes the vector elements, and only when Func1 finishes must Func2 start a different process on the elements.
I'm curious to know: if I call QtConcurrent::map in the following order, in which order will Func1 and Func2 actually execute?
QFuture<void> future;
future = QtConcurrent::map(vector, Func1);
future = QtConcurrent::map(vector, Func2);
I must mention that using future.waitForFinished(), as below, will block my application's main thread, which I don't want.
future = QtConcurrent::map(vector, Func1);
future.waitForFinished();
future = QtConcurrent::map(vector, Func2);
Also, I don't want to execute those QtConcurrent::map calls in a secondary thread and call future.waitForFinished() there, because with that approach I will lose one of the threads in my thread pool.
So, the question is do tasks added by QtConcurrent::map execute in order?
EDIT
In both single threaded and multi-threaded approaches Func2 must run only after Func1 finishes processing all elements.

Since you want all calls to Func1 to complete before any calls to Func2 are made you can't make the second call to QtConcurrent::map before the first is known to have finished.
However, rather than calling future.waitForFinished() you can use a QFutureWatcher...
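QFutureWatcher<void> watcher;
auto future = QtConcurrent::map(vector, Func1);
// Note: watcher, vector and future are captured by reference, so they
// must outlive the asynchronous work (e.g. make them class members).
QObject::connect(&watcher, &QFutureWatcher<void>::finished,
                 [&]()
                 {
                     /*
                      * All calls to Func1 have finished so it's
                      * safe to invoke Func2.
                      */
                     future = QtConcurrent::map(vector, Func2);
                 });
watcher.setFuture(future);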
The above is untested but hopefully gives you some idea of what's required.

Sadly, I wasn't able to find an answer in the documentation, so I tested this code:
future = QtConcurrent::map(vector, Func1);
future = QtConcurrent::map(vector, Func2);
The result is that, for some elements of the vector, Func2 starts before Func1 has run. Also, the vector elements are not processed in sequential order, which is natural when using QtConcurrent::map.
To solve my problem, I moved the object with the slot that calls both QtConcurrent::map functions to another thread and waited there for the first call to finish, at the cost of losing one thread in my thread pool.

Related

How can I stop async/await from bubbling up in functions?

Let's say I have a function A that uses a function B, which uses C, etc.:
A -> B -> C -> D and E
Now assume that function D has to use async/await. This means I have to use async/await on the call to function C, then on the call to function B, and so on. I understand that this is because they depend on each other, and if one of them is waiting for a function to resolve, then transitively they all have to. What alternatives do I have to make this cleaner?
There is a way to do this, but you'll lose the benefits of async-await.
One of the reasons for async-await is that if your thread has to wait for another process to complete, like a read or write to the hard disk, a database query, or fetching some information from the internet, your thread can do other useful work instead of waiting idly for that process to complete.
This is done with the keyword await. Once your thread sees the await, it doesn't really wait idly. Instead, it remembers the context (think of variable values, parts of the call stack, etc.) and goes up the call stack to see if the caller is not awaiting. If not, it starts executing the caller's statements until it sees an await. Then it goes up the call stack again to see if the caller is not awaiting, and so on.
Once this other process is completed the thread (or maybe another thread from the thread pool that acts as if it is the original thread) continues with the statements after the await until the procedure is finished, or until it sees another await.
To be able to do this, your procedure must know, that when it sees an await, the context needs to be saved and the thread must act like described above. That is why you declare a method async.
That is why typical async functions are functions that order other processes to do something lengthy: disk access, database access, internet communications. Those are typical functions where you'll find a ReadAsync / WriteAsync, next to the standard Read / Write functions. You'll also find them in classes that are typically designed to call these processes, like StreamReaders, TextWriters etc.
If you have a non-async class that calls an async function and waits until the async function completes before returning, the go-up-the-call-stack-to-see-if-the-caller-is-not-awaiting stops here: your program acts as if it is not using async-await.
Almost!
If you start an awaitable task without waiting for it to finish, and do something else before you wait for the result, then this something else is executed instead of the idle wait that the thread would have done if you had used the non-async version.
How to call async function from non-async function
ICollection<string> ReadData(...)
{
    // Call the async function, but don't await yet: you have far more interesting things to do.
    var taskReadLines = myReader.ReadLinesAsync(...);
    DoSomethingInteresting();
    // Now you need the data from the read task.
    // However, because this method is not async, you can't await.
    // This Wait will really be an idle wait.
    taskReadLines.Wait();
    ICollection<string> readLines = taskReadLines.Result;
    return readLines;
}
Your callers won't benefit from async-await, however your thread will be able to do something interesting while the lines have not been read yet.

Synchronize multiple pthreads in a

I'm discovering the pthread library (in C) and I'm having some trouble understanding a few things.
First of all, I understand what a mutex is, I understand how it works, ok, I also understand the concept of the cond, but I can't manage to use it properly (I don't really get how to combine the mutex and the cond)
This is, in pseudo-code, what I want to do :
thread :
loop :
// do something
end loop
end thread
So there are n threads, but each thread uses the same function. I want the inside of the loop to be executed in parallel by all the threads, BUT each thread must be in the same iteration of the loop. That is, I don't care in what order the instructions inside the loop are executed between threads, but before any thread starts iteration 2, all the other threads must have finished iteration 1 (and so on).
So my question is: how do you do that? Not in a specific example, but in theory.
EDIT
I managed to do it. I don't know if it's the proper way, but it's working:
global nbOfThreads
global nbOfIterations
thread :
    lock(mutex0)
    unlock(mutex0)
    loop :
        // Do something
        lock(mutex1)
        nbOfIterations++
        if (nbOfIterations == nbOfThreads) :
            nbOfIterations = 0
            broadcast(cond)
            unlock(mutex1)
            continue
        end if
        wait(cond, mutex1)
        unlock(mutex1)
    end loop
end thread
main (n) :
    nbOfThreads = n
    nbOfIterations = 0
    lock(mutex0)
    do nbOfThreads times : create(thread)
    unlock(mutex0)
end main
I obviously tried to understand this myself, but there are some things I don't understand:
The main one: WHY does a cond need to be paired with a mutex?
In some examples I saw something like this :
// thread A :
while (!condition)
wait(&cond)
// thread B :
if (condition)
signal(&cond)
Well, I really don't get the point of this while loop. I thought wait paused the thread until the condition is true (until the other thread sends the signal). I mean, I would get it if it were an if instead of a while.
Thank you
WHY does a cond need.... because the (!condition) you reference almost certainly depends upon some bits of the object not changing while you reference them. Correspondingly, modifying the state of the object should be done in such a way as to appear atomic to any observer; thus a mutex. While you could rely on too-clever-by-half hackery like atomic types, there is also the problem of ‘what if it was modified just after you checked it’ -- a race condition. Thus the idiomatic lock(); while (!cond) { wait(); }.
The point of the while... The signal+wait is not a handoff of control; after the signal, any number of things could happen to the object before a particular thread returns from wait. Even though the condition might have been in the correct state, by the time thread A examines it, it may no longer be. At the point of exiting the while loop, thread A knows: The condition is in the state I desire, and I have exclusive access to the object.
Condition variables can have spurious wake-ups. The condition might not actually be true when the wait function returns.
Depending on your task, a different synchronization primitive, such as a barrier (see pthread_barrier_init) or a semaphore (sem_init) might be easier to use.
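To make the two points above concrete, here is a minimal sketch of a reusable barrier built from one mutex and one condition variable. It is written in C++ with std::mutex and std::condition_variable rather than the pthread C API, and the names CyclicBarrier, arriveAndWait and runLockstep are made up for illustration. The generation counter is the shared state that the mutex protects, and the while loop re-checks it after every wake-up:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier: one mutex, one condition variable.
class CyclicBarrier {
public:
    explicit CyclicBarrier(std::size_t parties) : parties_(parties) {
        assert(parties > 0);
    }

    // Blocks until `parties_` threads have arrived for the current round.
    void arriveAndWait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t myGeneration = generation_;
        if (++arrived_ == parties_) {
            // Last arrival: reset the counter, open the next round, wake everyone.
            arrived_ = 0;
            ++generation_;
            cond_.notify_all();
            return;
        }
        // A while, not an if: a wake-up may be spurious, so the shared state
        // is re-checked (under the mutex) every time wait() returns.
        while (generation_ == myGeneration) {
            cond_.wait(lock);
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::size_t parties_;
    std::size_t arrived_ = 0;
    std::size_t generation_ = 0;
};

// Runs `threads` workers for `iterations` rounds and returns how often a
// worker saw another worker outside the allowed one-iteration window.
int runLockstep(std::size_t threads, int iterations) {
    CyclicBarrier barrier(threads);
    std::vector<int> done(threads, 0);  // iterations started per worker
    std::mutex doneMutex;
    int violations = 0;

    auto worker = [&](std::size_t id) {
        for (int i = 0; i < iterations; ++i) {
            {
                std::lock_guard<std::mutex> guard(doneMutex);
                for (int other : done) {
                    if (other < i || other > i + 1) ++violations;
                }
                done[id] = i + 1;
            }
            barrier.arriveAndWait();
        }
    };

    std::vector<std::thread> pool;
    for (std::size_t id = 0; id < threads; ++id) pool.emplace_back(worker, id);
    for (auto& t : pool) t.join();
    return violations;
}
```

Calling runLockstep(4, 50) from a main function should report zero violations, since no worker can start iteration i + 1 until every worker has reached the barrier for iteration i.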

How does this deadlock happen in Scala Future?

This snippet is excerpted from the Monix documentation.
It's an example of how to get into a deadlock in Scala.
import java.util.concurrent.Executors
import scala.concurrent._
import scala.concurrent.duration.Duration
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1))
def addOne(x: Int) = Future(x + 1)
def multiply(x: Int, y: Int) = Future {
val a = addOne(x)
val b = addOne(y)
val result = for (r1 <- a; r2 <- b) yield r1 * r2
// This can dead-lock due to the limited size of our thread-pool!
Await.result(result, Duration.Inf)
}
I understand what the code does, but not how it is executed.
Why is it the line Await.result(result, Duration.Inf) that causes the deadlock? (Yes, I tested it.)
Isn't it that the outermost Future in the multiply function occupies the whole thread pool (the single thread) and thus deadlocks (because the addOne Future is forever blocked waiting for a thread)?
Yes, sort of.
When you call val a = addOne(x), you create a new Future that starts waiting for a thread. However, as you noted, the only thread is currently in use by the outermost Future. That wouldn't be a problem without await, since Futures are able to handle this condition. However, this line:
Await.result(result, Duration.Inf)
causes the outer Future to wait for the result Future, which can't run because the outer Future is still using the only available thread. (And, of course, it also can't run because the a and b Futures can't run, again due to the outer Future.)
Here's a simpler example that also deadlocks without creating so many Futures:
def addTwo(x: Int) = Future {
Await.result(addOne(x + 1), Duration.Inf)
}
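The same starvation can be sketched outside Scala. The following C++ illustration is a toy under stated assumptions: SingleThreadPool and innerTaskStarves are hypothetical names, not part of any library, and a bounded wait_for stands in for Await.result(..., Duration.Inf) so that the demonstration returns instead of hanging:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical pool with exactly one worker thread, like newFixedThreadPool(1).
class SingleThreadPool {
public:
    SingleThreadPool() : worker_([this] { run(); }) {}

    ~SingleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cond_.notify_one();
        worker_.join();  // the worker drains the queue before exiting
    }

    // Queues a job and returns a future for its result.
    std::future<int> submit(std::function<int()> job) {
        assert(job && "job must be callable");
        auto task = std::make_shared<std::packaged_task<int()>>(std::move(job));
        std::future<int> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push([task] { (*task)(); });
        }
        cond_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cond_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist first
};

// Returns true if the inner task could not finish while the outer task,
// which occupies the only worker, was waiting for it.
bool innerTaskStarves() {
    SingleThreadPool pool;
    std::future<int> outer = pool.submit([&pool] {
        std::future<int> inner = pool.submit([] { return 42; });
        // An unbounded wait here would never return; the timeout only
        // exists so that this demonstration terminates.
        auto status = inner.wait_for(std::chrono::milliseconds(200));
        return status == std::future_status::timeout ? 1 : 0;
    });
    return outer.get() == 1;
}
```

The inner job is only picked up after the outer job returns, which is exactly what an unbounded wait inside the outer job would prevent.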
First of all, I would say this code can simulate a deadlock; it's not guaranteed that it will always end up in one.
What is happening in the above code? We have only a single thread in the thread pool. As soon as we call the multiply function, its body, being a Future, has to run on a pool thread, so the single thread we have is assigned to it.
The function addOne also returns a Future, so it is submitted to the same pool. The outer code does not wait for a = addOne(x) to complete and moves on to the next line, b = addOne(y). The value of a is never calculated: that Future is not complete and never will be, because the only thread is busy. The same goes for b = addOne(y): control does not wait for that Future to complete and moves on to the for comprehension, which is also asynchronous in Scala, so it is not evaluated either. Control then reaches the last line, the Await, which waits for an infinite amount of time for the previous Futures to complete.
The necessary conditions for a deadlock are:
Mutual Exclusion Condition
Hold and Wait Condition
No-Preemption Condition
Circular Wait Condition
Here we have only one thread, so the tasks cannot run at the same time: the thread is a mutually exclusive resource.
The thread is executing the outer block, and because the inner blocks are Futures, it does not wait for them: it goes ahead and reaches the Await statement. The thread is held there while all the other Futures, which are not complete, wait for the thread.
Once the thread is allocated to the Await, it can't be preempted, which is why the remaining incomplete Futures can't be executed.
And there is a circular wait, because the Await is waiting for the incomplete Futures to complete while those Futures are waiting for the Await call to complete.
Put simply, control goes straight to the Await statement and starts waiting for incomplete Futures that can never complete, because we have only one thread in our thread pool.
Await.result(result, Duration.Inf)
When you use Await, you are waiting for a Future to complete, and you have given it infinite time. So if the Future is never able to complete, the calling thread waits forever.

transforming a thread blocking to non thread blocking in f#

I retrieve data from the Bloomberg API, and am quite surprised by the slowness.
My computation is I/O bound because of this.
Therefore I decided to use some async monad builder to unthrottle it.
Upon running it, the results are not so much better, which was obvious as I make a call to a function, NextEvent, which is thread blocking.
let outerloop args dic =
    ...
    let rec innerloop continuetoloop =
        let eventObj = session.NextEvent() // This blocks
        ...
let seqtable = reader.ReadFile(@"C:\homeware\sector.csv", ";".[0], true)
let dic = ConcurrentDictionary<_,_>()
let wf = seqtable |> Seq.mapi (fun i item -> async { outerloop item dic })
wf |> Async.Parallel
   |> Async.RunSynchronously
   |> ignore
printfn "%A" ret
Is there a good way to wrap that blocking call to a nonblocking call ?
Also, why is the async framework not creating as many threads as I have requests (like 200)? when I inspect the threads from which I receive values I see only 4-5 that are used..
UPDATE
I found a compelling reason why it will never be possible.
An async operation takes whatever comes after the async instruction and schedules it somewhere in the thread pool.
For all that matters, as long as async functions are used correctly, that is, always returning to the thread pool they originated from, we can consider that we are executing on a single thread.
Being on a single thread means all that scheduling will always be executed somewhere later, and a blocking instruction has no way to avoid the fact that, eventually, once it runs, it will block the workflow at some point in the future.
Is there a good way to wrap that blocking call to a nonblocking call ?
No. You can never wrap blocking calls to make them non-blocking. If you could, async would just be a design pattern rather than a fundamental paradigm shift.
Also, why is the async framework not creating as many threads as I have requests (like 200)?
Async is built upon the thread pool which is designed not to create threads aggressively because they are so expensive. The whole point of (real) async is that it does not require as many threads as there are connections. You can handle 10,000 simultaneous connections with ~30 threads.
You seem to have a complete misunderstanding of what async is and what it is for. I'd suggest buying any of the F# books that cover this topic and reading up on it. In particular, your solution is not asynchronous because you just call your blocking StartGetFieldsValue member from inside your async workflow, defeating the purpose of async. You might as well just do Array.Parallel.map getFieldsValue instead.
Also, you want to use a purely functional API when doing things in parallel rather than mutating a ConcurrentDictionary in-place. So replace req.StartGetFieldsValue ret with
let! result = req.StartGetFieldsValue()
...
return result
and replace ignore with dict.
Here is a solution I made that seems to be working.
Of course, it does not use only async (lowercase a), but Async as well.
I define a simple type that has one event, finished, and one method, asyncstart, which takes the method to run as an argument. It launches the method, then fires the finished event at the appropriate place (in my case I had to capture the synchronization context, etc.).
Then in the consumer side, I use simply
let! completion = Async.AwaitEvent wrapper.finished |> Async.StartAsChild
let! completed = completion
While running this code, on the consumer side, I use only async calls, making my code non blocking. Of course, there has to be some thread somewhere which is blocked, but this happens outside of my main serving loop which remains reactive and fit.

Wait without blocking thread? - How?

This is probably one of the most elementary things in F#, but I've just realized I've got no idea what's going on behind the scenes.
let testMe() =
async { printfn "before!"
do! myAsyncFunction() // Waits without blocking thread
printfn "after!" }
testMe() |> Async.RunSynchronously
What's happening at do! myAsyncFunction()? I'm aware that it waits for myAsyncFunction to finish before moving on. But how can it do that without blocking the thread?
My best guess is that everything after do! myAsyncFunction() is passed along as a continuation, which gets executed on the same thread myAsyncFunction() was scheduled on, once myAsyncFunction() has finished executing.. but then again, that's just a guess.
As you correctly pointed out, the myAsyncFunction is passed a continuation and it calls it to resume the rest of the asynchronous workflow when it completes.
You can understand it better by looking at the desugared version of the code:
let testMe() =
async.Delay(fun () ->
printfn "before!"
async.Bind(myAsyncFunction(), fun () ->
printfn "after!"
async.Zero()))
The key thing is that the asynchronous workflow created by myAsyncFunction is given to the Bind operation, which starts it and gives it the second argument (a continuation) as a function to call when the workflow completes. If you simplify a lot, then an asynchronous workflow could be defined like this:
type MyAsync<'T> = (('T -> unit) * (exn -> unit)) -> unit
So, an asynchronous workflow is just a function that takes some continuations as arguments. When it gets the continuations, it does something (e.g. creates a timer or starts I/O) and then eventually calls these continuations. The question "On which thread are the continuations called?" is an interesting one. In a simple model, it depends on the MyAsync that you're starting: it may decide to run them anywhere it wants (e.g. Async.SwitchToNewThread runs them on a new thread). The F# library includes some additional handling that makes GUI programming using workflows easier.
Your example uses Async.RunSynchronously, which blocks the current thread, but you could also use Async.Start, which just starts the workflow and ignores the result when it is produced. The implementation of Async.Start could look like this:
let Start (async:MyAsync<unit>) = async (ignore, ignore)
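As a rough illustration of this continuation model in another language, the sketch below mirrors the simplified MyAsync type in C++. It is not how the F# library is implemented: the error continuation is left out, the result type is fixed to int, and the names Async, unit, later, andThen, pending and traceWorkflow are invented for the example. A plain vector of callbacks stands in for the timer or I/O completion that would normally call the continuation:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Continuation = std::function<void(int)>;
// An asynchronous workflow is just a function that takes a continuation.
using Async = std::function<void(Continuation)>;

// Hypothetical stand-in for a timer or I/O completion queue.
static std::vector<std::function<void()>> pending;

// Workflow that completes immediately by calling its continuation.
Async unit(int value) {
    return [value](Continuation k) {
        assert(k && "a continuation is required");
        k(value);
    };
}

// Workflow that completes later: it registers the continuation and returns
// at once, so no thread sits blocked while the "operation" is in flight.
Async later(int value) {
    return [value](Continuation k) {
        pending.push_back([k, value] { k(value); });
    };
}

// Rough analogue of async.Bind: start `first`, and when it completes pass
// its result to `next`, whose workflow receives the original continuation.
Async andThen(Async first, std::function<Async(int)> next) {
    return [first, next](Continuation k) {
        first([next, k](int value) { next(value)(k); });
    };
}

// Builds and starts a two-step workflow, then plays the role of the event
// loop. Returns the order in which things happened.
std::vector<std::string> traceWorkflow() {
    std::vector<std::string> log;
    pending.clear();

    Async workflow = andThen(later(1), [&log](int v) {
        log.push_back("after!");
        return unit(v + 1);
    });

    log.push_back("before!");
    workflow([&log](int result) {
        log.push_back("result " + std::to_string(result));
    });
    log.push_back("start returned");  // starting the workflow did not block

    // Event loop: deliver completions, which resume the continuations.
    while (!pending.empty()) {
        std::function<void()> callback = std::move(pending.back());
        pending.pop_back();
        callback();
    }
    return log;
}
```

In this sketch the caller gets control back right after starting the workflow, and the code after the bind point runs only when the completion callback is delivered.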
