Scala: joining / waiting for growing queue of futures - multithreading

I launch several async processes which, in turn, can launch more processes if it's needed (think traversing directory structure or something like that). Each process returns something, and in the end I want to wait for completion of all of them and schedule a function that will do something with resulting collection.
Naïve attempt
My solution attempt used a mutable ListBuffer (to which I keep adding futures that I spawn), and Future.sequence to schedule some function to run on completion of all these futures listed in this buffer.
I've prepared a minimal example that illustrates the issue:
import scala.collection.mutable.ListBuffer
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
object FuturesTest extends App {
  var queue = ListBuffer[Future[Int]]()
  val f1 = Future {
    Thread.sleep(1000)
    val f3 = Future {
      Thread.sleep(2000)
      Console.println(s"f3: 1+2=3 sec; queue = $queue")
      3
    }
    queue += f3
    Console.println(s"f1: 1 sec; queue = $queue")
    1
  }
  val f2 = Future {
    Thread.sleep(2000)
    Console.println(s"f2: 2 sec; queue = $queue")
    2
  }
  queue += f1
  queue += f2
  Console.println(s"starting; queue = $queue")
  Future.sequence(queue).foreach(
    (all) => Console.println(s"Future.sequence finished with $all")
  )
  Thread.sleep(5000) // simulates app being alive later
}
It schedules the f1 and f2 futures first; f3 is then scheduled from inside f1 one second later, and f3 itself resolves 2 seconds after that. Thus, what I expect to get is the following:
starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
Future.sequence finished with ListBuffer(1, 2, 3)
However, I actually get:
starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
Future.sequence finished with ListBuffer(1, 2)
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
... which is most likely due to the fact that a list of futures that we wait for is fixed during the initial call of Future.sequence and won't change later.
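The suspicion above can be checked in isolation. This is a minimal sketch, assuming Scala 2.12+ on the JVM; the names SequenceSnapshotDemo, first and late are made up for illustration, and Await is used only to make the demo deterministic (it is not available on Scala.js).

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceSnapshotDemo {
  def demo(): List[Int] = {
    val first = Promise[Int]()
    val late = Promise[Int]()             // never completed in this demo
    val queue = ListBuffer(first.future)
    val combined = Future.sequence(queue) // walks the buffer once, right now
    queue += late.future                  // appended after the walk, so it is not awaited
    first.success(1)
    // JVM-only: block briefly to read the combined result
    Await.result(combined, 1.second).toList
  }
}
```

Calling SequenceSnapshotDemo.demo() returns List(1) even though late never completes, which is consistent with Future.sequence only seeing the futures that were in the buffer at the time of the call.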
Working, but ugly attempt
Ultimately, I've made it act as I wanted with this code:
waitForSequence(queue, (all: ListBuffer[Int]) => Console.println(s"finished with $all"))
def waitForSequence[T](queue: ListBuffer[Future[T]], act: (ListBuffer[T] => Unit)): Unit = {
  val seq = Future.sequence(queue)
  seq.onComplete {
    case Success(res) =>
      if (res.size < queue.size) {
        Console.println("... still waiting for tasks")
        waitForSequence(queue, act)
      } else {
        act(res)
      }
    case Failure(exc) =>
      throw exc
  }
}
This works as intended, getting all 3 futures in the end:
starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
... still waiting for tasks
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
finished with ListBuffer(1, 2, 3)
But it's still very ugly. It just restarts the Future.sequence wait if it sees that, at the time of completion, the queue is longer than the number of results, hoping that the situation will be better the next time it completes. Of course, this is bad because it exhausts the stack, and it might be error-prone if this check triggers in the tiny window between the creation of a future and appending it to the queue.
Is it possible to do this without rewriting everything with Akka or resorting to Await.result (which I can't actually use because my code is compiled for Scala.js)?

Like Justin mentioned, you must not lose the references to the futures spawned inside other futures; you should use map and flatMap to chain them.
val f1 = Future {
  Thread.sleep(1000)
  val f3 = Future {
    Thread.sleep(2000)
    Console.println(s"f3: 1+2=3 sec")
    3
  }
  f3.map { r =>
    Console.println(s"f1: 1 sec;")
    Seq(1, r)
  }
}.flatMap(identity)
val f2 = Future {
  Thread.sleep(2000)
  Console.println(s"f2: 2 sec;")
  Seq(2)
}
val futures = Seq(f1, f2)
Future.sequence(futures).foreach(
  (all) => Console.println(s"Future.sequence finished with ${all.flatten}")
)
Thread.sleep(5000) // simulates app being alive later
This works on the minimal example, but I am not sure whether it will work for your real use case. The result is:
f2: 2 sec;
f3: 1+2=3 sec
f1: 1 sec;
Future.sequence finished with List(1, 3, 2)

The right way to do this is probably to compose your Futures. Specifically, f1 shouldn't just kick off f3, it should probably flatMap over it -- that is, the Future of f1 doesn't resolve until f3 resolves.
Keep in mind, Future.sequence is kind of a fallback option, to use only when the Futures are all really disconnected. In a case like the one you're describing, where there are real dependencies, those are best represented in the Futures you're actually returning. When using Futures, flatMap is your friend, and should be one of the first tools you reach for. (Often, but not always, in the form of for comprehensions.)
It's probably safe to say that, if you ever want a mutable queue of Futures, the code isn't structured correctly and there's a better way to do it. Specifically in Scala.js (which is where much of my code lies, and which is very Future-heavy), I use for comprehensions over those Futures constantly -- I think it's the only sane way to operate...
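To make the composition idea concrete, here is a minimal sketch of a recursive traversal in which every task returns a Future that already includes its children, so no shared queue is needed. Node, visit and demo are hypothetical names standing in for the real directory walk; the demo helper uses Await only so the result can be inspected on the JVM, and on Scala.js you would attach foreach or onComplete to the Future returned by visit instead.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TreeTraversalSketch {
  // Hypothetical stand-in for a directory: a value plus child nodes.
  final case class Node(value: Int, children: List[Node] = Nil)

  // The returned Future completes only after this node and its whole subtree are done.
  def visit(node: Node): Future[List[Int]] =
    for {
      own  <- Future(node.value)                        // async work for this node
      subs <- Future.sequence(node.children.map(visit)) // spawn children and keep their futures
    } yield own :: subs.flatten

  // JVM-only helper for trying the sketch out; Await is not available on Scala.js.
  def demo(): List[Int] = {
    val tree = Node(1, List(Node(2, List(Node(3))), Node(4)))
    Await.result(visit(tree), 5.seconds)
  }
}
```

Here Future.sequence is only ever applied to a list that is complete at the moment of the call (the children of one node), and the flatMap hidden in the for comprehension ties each parent to its subtree. For the sample tree, demo() yields List(1, 2, 3, 4).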

I would not involve Future.sequence: it combines futures that are already running in parallel, and you seem to be looking for sequential async execution. Also, you probably don't need the futures to start right away when they are defined. The composition should look something like this:
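import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
def run[T](queue: List[() => Future[T]]): Future[List[T]] =
  queue.foldLeft(Future.successful(List.empty[T])) { (acc, next) =>
    // start the next task only after all earlier ones have completed
    // (the fold body was incomplete in the original; this is one way to fill it in)
    acc.flatMap(results => next().map(r => results :+ r))
  }
// assumed helper: "now" was not defined in the original snippet
def now: Long = System.currentTimeMillis
val t0 = now
def f(n: Int): () => Future[String] = () => {
  println(s"starting $n")
  Future[String] {
    Thread.sleep(100 * n)
    s"<<$n/${now - t0}>>"
  }
}
println(Await.result(run(f(7) :: f(10) :: f(20) :: f(3) :: Nil), 20.seconds))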
The trick is not to launch the futures prematurely; that's why we have f(n) that won't start until we call it with ().

Related

Implementing a multithreading function for running "foreach, map and reduce" parallel

I am quite new to Scala but I am learning about Threads and Multithreading.
As the title says, I am trying to implement a way to divide the problem onto different threads of variable count.
We are given this code:
/** Executes the provided function for each entry in the input sequence in parallel.
  *
  * @param input the input sequence
  * @param parallelism the number of threads to use
  * @param f the function to run
  */
def parallelForeach[A](input: IndexedSeq[A], parallelism: Int, f: A => Unit): Unit = ???
I tried implementing it like this:
def parallelForeach[A](input: IndexedSeq[A], parallelism: Int, f: A => Unit): Unit = {
  if (parallelism < 1) {
    throw new IllegalArgumentException("a degree of parallelism < 1 is not allowed for parallel foreach")
  }
  val threads = (0 until parallelism).map { threadId =>
    val startIndex = threadId * input.size / parallelism
    val endIndex = (threadId + 1) * input.size / parallelism
    val task: Runnable = () => {
      (startIndex until endIndex).foreach { A =>
        val key = input.grouped(input.size / parallelism)
        val x: Unit = input.foreach(A => f(A))
        x
      }
    }
    new Thread(task)
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
}
for this test:
test("parallel foreach should perform the given function once for each element in the sequence") {
  val counter = new AtomicLong(0L)
  parallelForeach((1 to 100), 16, counter.addAndGet(_))
  assert(counter.get() == 5050)
}
But, as you can guess, it doesn't work this way as my result isn't 5050 but 505000.
Now here is my question. How do I implement a way to use multithreading efficiently, so there are for example 16 different threads working at the same time?
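Check your test input: 1 to 100.
In your code, the inner input.foreach(...) walks all 100 elements once for every index in a thread's range, instead of processing only the element at that index. This is why your result is 100 times too large.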
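One possible fix, as a minimal sketch that keeps the chunking from the question: each thread touches only the elements whose indices fall inside its own range. The wrapper object ParallelForeachSketch is a made-up name, and require is used in place of the explicit throw (it also raises IllegalArgumentException).

```scala
import java.util.concurrent.atomic.AtomicLong

object ParallelForeachSketch {
  def parallelForeach[A](input: IndexedSeq[A], parallelism: Int, f: A => Unit): Unit = {
    require(parallelism >= 1, "a degree of parallelism < 1 is not allowed for parallel foreach")
    val threads = (0 until parallelism).map { threadId =>
      // Consecutive ranges share their boundaries, so every index is covered exactly once.
      val startIndex = threadId * input.size / parallelism
      val endIndex = (threadId + 1) * input.size / parallelism
      val task: Runnable = () => (startIndex until endIndex).foreach(i => f(input(i)))
      new Thread(task)
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
  }

  // Mirrors the test from the question: sums 1 to 100 on 16 threads.
  def sumOneToHundred(): Long = {
    val counter = new AtomicLong(0L)
    parallelForeach(1 to 100, 16, (n: Int) => { counter.addAndGet(n.toLong); () })
    counter.get()
  }
}
```

With this version the counter from the test ends at 5050, because f runs exactly once per element.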

Can I determine the result of a data race without reading the value?

I'm trying to better understand lock-free programming:
Suppose we have two threads in a data race:
// Thread 1
x = 1
// Thread 2
x = 2
Is there a lock-free way a third thread can know the result of the race without being able to read x?
Suppose thread 3 consumes a lock-free queue, and the code is:
// Thread 1
x = 1
queue.push(1)
// Thread 2
x = 2
queue.push(2)
Then the operations could be ordered as:
x = 1
x = 2
queue.push(1)
queue.push(2)
or
x = 1
x = 2
queue.push(2)
queue.push(1)
So having a lock-free queue alone would not suffice for thread 3 to know the value of x after the race.
If you know the value of x before the race began, the following code using atomic Read-Modify-Write operations should do the job.
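// Notes:
// x == 0 initially
// x and winner are both atomic
// t is local to each thread
// atomic_swap swaps the content of two variables atomically,
// meaning that no other thread can interfere with this operation
// winner ends up naming the thread whose write landed last, i.e. the final value of x
// thread-1:
t = 1;
atomic_swap(x, t);   // t now holds the previous value of x
if (t != 0) {
    // x was non-zero when thread-1 called the swap operation
    // --> thread-2 was faster, so thread-1's value is the one left in x
    winner = 1;
}
// thread-2:
t = 2;
atomic_swap(x, t);   // t now holds the previous value of x
if (t != 0) {
    // x was non-zero when thread-2 called the swap operation
    // --> thread-1 was faster, so thread-2's value is the one left in x
    winner = 2;
}
// thread-3:
while (winner == 0) {}   // spin until the second writer has published the result
print("Winner is " + winner);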
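The same idea can be sketched on the JVM with java.util.concurrent.atomic.AtomicInteger, whose getAndSet plays the role of atomic_swap. This is only an illustration under the same assumption as above (x is known to be 0 before the race); RaceResultSketch, race and last are made-up names, and last corresponds to winner in the pseudocode.

```scala
import java.util.concurrent.atomic.AtomicInteger

object RaceResultSketch {
  // Returns (value published in `last`, value actually left in x) after both writers finish.
  def race(): (Int, Int) = {
    val x = new AtomicInteger(0)    // 0 means "not written yet"
    val last = new AtomicInteger(0) // id of the thread whose write landed second
    def writer(id: Int): Thread = {
      val task: Runnable = () => {
        val previous = x.getAndSet(id)  // atomic swap: store id, get the old value back
        if (previous != 0) last.set(id) // someone wrote before us, so our value is the final one
      }
      new Thread(task)
    }
    val threads = Seq(writer(1), writer(2))
    threads.foreach(_.start())
    threads.foreach(_.join())
    (last.get(), x.get())
  }
}
```

A third thread only needs to read last (not x) to learn the outcome. The pair returned by race() always has two equal components, either (1, 1) or (2, 2), depending on which writer ran second.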

Sleep a long running process

I have an app that iterates over tens of thousands of records using various enumerators (such as directory enumerators).
I am seeing OS X report my process as "Caught burning CPU" since it's taking a large amount of CPU in doing so.
What I would like to do is build in a "pressure valve" such as a
[NSThread sleepForTimeInterval:cpuDelay];
that does not block other processes/threads on things like a dual core machine.
My processing is happening on a separate thread, but I can't break out of and re-enter the enumerator loop and use NSTimers to allow the machine to "breathe"
Any suggestions - should [NSThread sleepForTimeInterval:cpuDelay]; be working?
I run this stuff inside a dispatch queue:
if(!primaryTask)primaryTask=dispatch_queue_create( "com.me.app.task1",backgroundPriorityAttr);
dispatch_async(primaryTask,^{
[self doSync];
});
Try wrapping your processing in an NSOperation and setting a lower QoS priority. Here is a little more information:
http://nshipster.com/nsoperation/
Here is a code example I made up. The operation is triggered in the view load event:
let processingQueue = NSOperationQueue()
override func viewDidLoad() {
    let backgroundOperation = NSBlockOperation {
        // My very long and intensive processing
        let nPoints = 100000000000
        var nPointsInside = 0
        for _ in 1...nPoints {
            let (x, y) = (drand48() * 2 - 1, drand48() * 2 - 1)
            if x * x + y * y <= 1 {
                nPointsInside += 1
            }
        }
        let _ = 4.0 * Double(nPointsInside) / Double(nPoints)
    }
    backgroundOperation.queuePriority = .Low
    backgroundOperation.qualityOfService = .Background
    processingQueue.addOperation(backgroundOperation)
}

Use a MailboxProcessor with reply-channel to create limited agents that return values in order

Basically, I want to change the following into a limited threading solution, because in my situation the list of calculations is too large and spawns too many threads, and I'd like to experiment and measure performance with fewer threads.
// the trivial approach (and largely my current situation)
let doWork() =
    [1 .. 10]
    |> List.map (fun i -> async {
        do! Async.Sleep (100 * i)   // longest thread will run 1 sec
        return i * i                // some complex calculation returning a certain type
        })
    |> Async.Parallel
    |> Async.RunSynchronously       // works, total wall time 1s
My new approach, this code is borrowed/inspired by this online snippet from Tomas Petricek (which I tested, it works, but I need it to return a value, not unit).
type LimitAgentMessage =
    | Start of Async<int> * AsyncReplyChannel<int>
    | Finished
let threadingLimitAgent limit = MailboxProcessor.Start(fun inbox -> async {
    let queue = System.Collections.Generic.Queue<_>()
    let count = ref 0
    while true do
        let! msg = inbox.Receive()
        match msg with
        | Start (work, reply) -> queue.Enqueue((work, reply))
        | Finished -> decr count
        if count.Value < limit && queue.Count > 0 then
            incr count
            let work, reply = queue.Dequeue()
            // Start it in a thread pool (on background)
            Async.Start(async {
                let! x = work
                do! async {reply.Reply x }
                inbox.Post(Finished)
            })
  })
// given a synchronous list of tasks, run each task asynchronously,
// return calculated values in original order
let worker lst =
    // this doesn't work as expected, it waits for each reply
    let agent = threadingLimitAgent 10
    lst
    |> List.map(fun x ->
        agent.PostAndReply(
            fun replyChannel -> Start(x, replyChannel)))
Now, with this in place, the original code would become:
let doWork() =
    [1 .. 10]
    |> List.map (fun i -> async {
        do! Async.Sleep (100 * i)   // longest thread will run 1 sec
        return i * i                // some complex calculation returning a certain type
        })
    |> worker                       // worker is not working (correct output, runs 5.5s)
All in all, the output is correct (it does calculate and propagate back the replies), but it does not do so in the (limited set) of threads.
I've been playing around a bit, but think I'm missing the obvious (and besides, who knows, someone may like the idea of a limited-threads mailbox processor that returns its calculations in order).
The problem is the call to agent.PostAndReply. PostAndReply will block until the work has finished. Calling this inside List.map will cause the work to be executed sequentially. One solution is to use PostAndAsyncReply which does not block and also returns you an async handle for getting the result back.
let worker lst =
    let agent = threadingLimitAgent 10
    lst
    |> List.map(fun x ->
        agent.PostAndAsyncReply(
            fun replyChannel -> Start(x, replyChannel)))
    |> Async.Parallel
let doWork() =
    [1 .. 10]
    |> List.map (fun i -> async {
        do! Async.Sleep (100 * i)
        return i * i
        })
    |> worker
    |> Async.RunSynchronously
That's of course only one possible solution (getting all async handles back and awaiting them in parallel).

Scala Future/Promise fast-fail pipeline

I want to launch two or more Futures/Promises in parallel and fail as soon as any one of them fails, without waiting for the rest to complete.
What is the most idiomatic way to compose this pipeline in Scala?
EDIT: more contextual information.
I have to launch two external processes, one writing to a FIFO file and another reading from it. If the writer process fails, the reader thread might hang forever waiting for input from the file. So I want to launch both processes in parallel and fail fast as soon as either Future/Promise fails, without waiting for the other to complete.
Below is sample code to be more precise. The commands are not exactly cat and tail; I have used them for brevity.
val future1 = Future { executeShellCommand("cat file.txt > fifo.pipe") }
val future2 = Future { executeShellCommand("tail fifo.pipe") }
If I understand the question correctly, what we are looking for is a fail-fast sequence implementation, which is akin to a failure-biased version of firstCompletedOf.
Here, we eagerly register a failure callback in case one of the futures fails early on, ensuring that we fail as soon as any of the futures fail.
import scala.concurrent.{Future, Promise}
import scala.util.{Success, Failure}
import scala.concurrent.ExecutionContext.Implicits.global
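def failFast[T](futures: Seq[Future[T]]): Future[Seq[T]] = {
  val promise = Promise[Seq[T]]()
  // f.failed replaces onFailure (removed in Scala 2.13); tryFailure avoids an
  // IllegalStateException when the promise has already been completed
  futures.foreach(f => f.failed.foreach(ex => promise.tryFailure(ex)))
  val res = Future.sequence(futures)
  promise.completeWith(res).future
}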
In contrast to Future.sequence, this implementation will fail as soon as any of the futures fail, regardless of ordering.
Let's show that with an example:
import scala.util.Try
// helper method to measure time
def resilientTime[T](t: => T): (Try[T], Long) = {
  val t0 = System.currentTimeMillis
  val res = Try(t)
  (res, System.currentTimeMillis - t0)
}
import scala.concurrent.duration._
import scala.concurrent.Await
First future will fail (failure in 2 seconds)
val f1 = Future[Int]{Thread.sleep(2000); throw new Exception("boom")}
val f2 = Future[Int]{Thread.sleep(5000); 42}
val f3 = Future[Int]{Thread.sleep(10000); 101}
val res = failFast(Seq(f1,f2,f3))
resilientTime(Await.result(res, 10.seconds))
// res: (scala.util.Try[Seq[Int]], Long) = (Failure(java.lang.Exception: boom),1998)
Last future will fail. Failure also in 2 seconds. (note the order in the sequence construction)
val f1 = Future[Int]{Thread.sleep(2000); throw new Exception("boom")}
val f2 = Future[Int]{Thread.sleep(5000); 42}
val f3 = Future[Int]{Thread.sleep(10000); 101}
val res = failFast(Seq(f3,f2,f1))
resilientTime(Await.result(res, 10.seconds))
// res: (scala.util.Try[Seq[Int]], Long) = (Failure(java.lang.Exception: boom),1998)
Comparing with Future.sequence where failure depends on the ordering (failure in 10 seconds):
val f1 = Future[Int]{Thread.sleep(2000); throw new Exception("boom")}
val f2 = Future[Int]{Thread.sleep(5000); 42}
val f3 = Future[Int]{Thread.sleep(10000); 101}
val seq = Seq(f3,f2,f1)
resilientTime(Await.result(Future.sequence(seq), 10.seconds))
//res: (scala.util.Try[Seq[Int]], Long) = (Failure(java.lang.Exception: boom),10000)
Use Future.sequence:
val both = Future.sequence(Seq(
firstFuture,
secondFuture));
This is the correct way to aggregate two or more futures where the failure of one fails the aggregated future and the aggregated future completes when all inner futures complete. An older version of this answer suggested a for-comprehension which, while very common, would not reject immediately if one of the futures rejects but would rather wait for it.
Zip the futures
val f1 = Future { doSomething() }
val f2 = Future { doSomething() }
val resultF = f1 zip f2
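The resultF future fails if either f1 or f2 fails.
When f1 is the one that fails, resultF fails as soon as f1 does, without waiting for f2; whether a failure in f2 is reported before f1 completes depends on the Scala version. In the session below, f fails first (after 10 seconds), so x fails with f's exception: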
scala> import scala.util._
import scala.util._
scala> import scala.concurrent._
import scala.concurrent._
scala> import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.ExecutionContext.Implicits.global
scala> val f = Future { Thread.sleep(10000); throw new Exception("f") }
f: scala.concurrent.Future[Nothing] = scala.concurrent.impl.Promise$DefaultPromise@da1f03e
scala> val g = Future { Thread.sleep(20000); throw new Exception("g") }
g: scala.concurrent.Future[Nothing] = scala.concurrent.impl.Promise$DefaultPromise@634a98e3
scala> val x = f zip g
x: scala.concurrent.Future[(Nothing, Nothing)] = scala.concurrent.impl.Promise$DefaultPromise@3447e854
scala> x onComplete { case Success(x) => println(x) case Failure(th) => println(th)}
result: java.lang.Exception: f
