Scala Future/Promise fast-fail pipeline - multithreading

I want to launch two or more Futures/Promises in parallel and fail as soon as any one of them fails, without waiting for the rest to complete.
What is the most idiomatic way to compose this pipeline in Scala?
EDIT: more contextual information.
I have to launch two external processes, one writing to a FIFO file and another reading from it. If the writer process fails, the reader thread might hang forever waiting for input from the file. So I want to launch both processes in parallel and fail fast as soon as either Future/Promise fails, without waiting for the other to complete.
Below is sample code to be more precise. The commands are not exactly cat and tail; I have used them for brevity.
val future1 = Future { executeShellCommand("cat file.txt > fifo.pipe") }
val future2 = Future { executeShellCommand("tail fifo.pipe") }

If I understand the question correctly, what we are looking for is a fail-fast sequence implementation, akin to a failure-biased version of firstCompletedOf.
Here, we eagerly register a failure callback in case one of the futures fails early on, ensuring that we fail as soon as any of the futures fail.
import scala.concurrent.{Future, Promise}
import scala.util.{Success, Failure}
import scala.concurrent.ExecutionContext.Implicits.global
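def failFast[T](futures: Seq[Future[T]]): Future[Seq[T]] = {
  val promise = Promise[Seq[T]]()
  // f.failed.foreach replaces onFailure, which is deprecated since Scala 2.12.
  // tryFailure (rather than failure) does not throw if the promise is already completed,
  // e.g. when a second future fails or Future.sequence has already failed it.
  futures.foreach { f => f.failed.foreach(ex => promise.tryFailure(ex)) }
  val res = Future.sequence(futures)
  promise.completeWith(res).future
}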
In contrast to Future.sequence, this implementation will fail as soon as any of the futures fail, regardless of ordering.
Let's show that with an example:
import scala.util.Try
// helper method to measure time
def resilientTime[T](t: =>T):(Try[T], Long) = {
val t0 = System.currentTimeMillis
val res = Try(t)
(res, System.currentTimeMillis-t0)
}
import scala.concurrent.duration._
import scala.concurrent.Await
First future will fail (failure in 2 seconds)
val f1 = Future[Int]{Thread.sleep(2000); throw new Exception("boom")}
val f2 = Future[Int]{Thread.sleep(5000); 42}
val f3 = Future[Int]{Thread.sleep(10000); 101}
val res = failFast(Seq(f1,f2,f3))
resilientTime(Await.result(res, 10.seconds))
// res: (scala.util.Try[Seq[Int]], Long) = (Failure(java.lang.Exception: boom),1998)
Last future will fail. Failure also in 2 seconds. (note the order in the sequence construction)
val f1 = Future[Int]{Thread.sleep(2000); throw new Exception("boom")}
val f2 = Future[Int]{Thread.sleep(5000); 42}
val f3 = Future[Int]{Thread.sleep(10000); 101}
val res = failFast(Seq(f3,f2,f1))
resilientTime(Await.result(res, 10.seconds))
// res: (scala.util.Try[Seq[Int]], Long) = (Failure(java.lang.Exception: boom),1998)
Comparing with Future.sequence where failure depends on the ordering (failure in 10 seconds):
val f1 = Future[Int]{Thread.sleep(2000); throw new Exception("boom")}
val f2 = Future[Int]{Thread.sleep(5000); 42}
val f3 = Future[Int]{Thread.sleep(10000); 101}
val seq = Seq(f3,f2,f1)
resilientTime(Await.result(Future.sequence(seq), 10.seconds))
//res: (scala.util.Try[Seq[Int]], Long) = (Failure(java.lang.Exception: boom),10000)
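The same behavior can be checked without relying on sleep timings. The following is a minimal sketch, not part of the original answer: the names FailFastDemo, firstFailure, hungReader and writer are made up for illustration, and a Promise that is never completed stands in for a process that hangs. failFast is written with failed.foreach and tryFailure, on the assumption that it should also compile on Scala versions where onFailure is deprecated.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object FailFastDemo {
  // Fail as soon as any input fails; otherwise behave like Future.sequence.
  def failFast[T](futures: Seq[Future[T]]): Future[Seq[T]] = {
    val promise = Promise[Seq[T]]()
    // tryFailure is a no-op if the promise has already been completed
    futures.foreach(f => f.failed.foreach(ex => promise.tryFailure(ex)))
    val res = Future.sequence(futures)
    promise.completeWith(res)
    promise.future
  }

  // Hypothetical scenario: a "reader" that hangs forever and a "writer" that fails.
  def firstFailure(): Try[Seq[Int]] = {
    val hungReader = Promise[Int]() // never completed
    val writer     = Promise[Int]()
    val combined   = failFast(Seq(hungReader.future, writer.future))
    writer.failure(new Exception("boom"))
    // Returns with the writer's failure even though hungReader never completes
    Try(Await.result(combined, 2.seconds))
  }
}
```

Calling FailFastDemo.firstFailure() should yield a Failure carrying the "boom" exception rather than a TimeoutException, even though the hung future comes first in the sequence.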

Use Future.sequence:
val both = Future.sequence(Seq(
firstFuture,
secondFuture));
This is the correct way to aggregate two or more futures so that the failure of one fails the aggregated future, and the aggregated future completes when all inner futures complete. An older version of this answer suggested a for-comprehension which, while very common, would not reject immediately if one of the futures rejects but would rather wait for it.

Zip the futures
val f1 = Future { doSomething() }
val f2 = Future { doSomething() }
val resultF = f1 zip f2
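The resultF future fails if either f1 or f2 fails.
The time taken is not always min(f1time, f2time), though: in Scala 2.12 and earlier zip inspects f2 only after f1 has completed, so an early failure in f2 is not reported until then, and when both succeed the result is ready only after the slower of the two. In the session below f fails first, so x fails after about 10 seconds without waiting for g.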
scala> import scala.util._
import scala.util._
scala> import scala.concurrent._
import scala.concurrent._
scala> import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.ExecutionContext.Implicits.global
scala> val f = Future { Thread.sleep(10000); throw new Exception("f") }
f: scala.concurrent.Future[Nothing] = scala.concurrent.impl.Promise$DefaultPromise@da1f03e
scala> val g = Future { Thread.sleep(20000); throw new Exception("g") }
g: scala.concurrent.Future[Nothing] = scala.concurrent.impl.Promise$DefaultPromise@634a98e3
scala> val x = f zip g
x: scala.concurrent.Future[(Nothing, Nothing)] = scala.concurrent.impl.Promise$DefaultPromise@3447e854
scala> x onComplete { case Success(x) => println(x) case Failure(th) => println(th)}
result: java.lang.Exception: f

Related

Implicit class holding mutable variable in multithreaded environment

I need to implement a parallel method, which takes two computation blocks, a and b, and starts each of them in a new thread. The method must return a tuple with the result values of both the computations. It should have the following signature:
def parallel[A, B](a: => A, b: => B): (A, B)
I managed to solve the exercise using a straightforward Java-like approach. Then I decided to come up with a solution using an implicit class. Here it is:
object ParallelApp extends App {
implicit class ParallelOps[A](a: => A) {
var result: A = _
def spawn(): Unit = {
val thread = new Thread {
override def run(): Unit = {
result = a
}
}
thread.start()
thread.join()
}
}
def parallel[A, B](a: => A, b: => B): (A, B) = {
a.spawn()
b.spawn()
(a.result, b.result)
}
println(parallel(1 + 2, "a" + "b"))
}
For some unknown reason, I receive the output (null,null). Could you please point out where the problem is?
Spoiler alert: It's not complicated. It's funny, like a magic trick (if you consider reading the documentation about Java Memory Model "funny", that is). If you haven't figured it out yet, I would highly recommend to try to figure it out, otherwise it won't be funny. Someone should make a "division-by-zero proves 2 = 4"-riddle out of it.
Consider the following shorter example:
implicit class Foo[A](a: A) {
var result: String = "not initialized"
def computeResult(): Unit = result = "Yay, result!"
}
val a = "a string"
a.computeResult()
println(a.result)
When run, it prints
not initialized
despite the fact that we invoked computeResult() and set result to "Yay, result!". The problem is that the two invocations a.computeResult() and a.result belong to two completely independent instances of Foo. The implicit conversion is performed twice, and the second implicitly created object doesn't know anything about the changes in the first implicitly created object. It has nothing to do with threads or JMM at all.
By the way: your code is not parallel. Calling join right after calling start doesn't bring you anything, your main thread will simply go idle and wait until another thread finishes. At no point will there be two threads that do any useful work concurrently.
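To make both points concrete, here is a minimal sketch of a parallel method that avoids the implicit wrapper altogether and starts both threads before joining either. The object name ParallelThreads and the variable names are assumptions made for this illustration. It also assumes that neither block throws: if one does, its result stays empty and the final get fails with a NoSuchElementException.

```scala
object ParallelThreads {
  def parallel[A, B](a: => A, b: => B): (A, B) = {
    // Results live in local variables, so they are written and read in one place
    var resultA: Option[A] = None
    var resultB: Option[B] = None
    val threadA = new Thread { override def run(): Unit = { resultA = Some(a) } }
    val threadB = new Thread { override def run(): Unit = { resultB = Some(b) } }
    // Start both threads before joining either, so the two blocks can overlap
    threadA.start()
    threadB.start()
    // join() establishes a happens-before edge, so the writes are visible here
    threadA.join()
    threadB.join()
    (resultA.get, resultB.get)
  }
}
```

With this version, println(ParallelThreads.parallel(1 + 2, "a" + "b")) prints (3,ab).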
EDIT: Fixed a bug pointed out by Andrey Tyukin
One way to solve your problem is to use Scala Futures
Documentation. Tutorial.
Useful Klang Blog.
You'll typically need some combination of these imports:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.util.{Failure, Success}
import scala.concurrent.duration._
An asynchronous example:
def parallelAsync[A,B](a: => A, b: => B): Future[(A,B)] = {
// as per Andrey Tyukin's comments, this line runs
// the two futures sequentially and we do not get
// any benefit from it. I will leave this line here
// so others will not fall in my trap
//for {i <- Future(a); j <- Future(b) } yield (i,j)
Future(a) zip Future(b)
}
parallelAsync(1 + 2, "a" + "b").onComplete {
case Success(x) => println(x)
case Failure(e) => e.printStackTrace()
}
If you must block until both are complete, you can use this:
def parallelSync[A,B](a: => A, b: => B): (A,B) = {
// see comment above
//val f = for { i <- Future(a); j <- Future(b) } yield (i,j)
val tuple = Future(a) zip Future(b)
Await.result(tuple, 5.seconds)
}
println(parallelSync(3 + 4, "c" + "d"))
When running these little examples, don't forget to sleep a little bit at the end so the program won't end before the results come back
Thread.sleep(3000)
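The for-comprehension from the commented-out line can still be used if both futures are created before it, because a Future starts running as soon as it is constructed. This is a small sketch under that assumption; StartThenCombine and parallelFor are hypothetical names, not part of the original answer.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StartThenCombine {
  def parallelFor[A, B](a: => A, b: => B): Future[(A, B)] = {
    // Both futures are created (and therefore started) before the for-comprehension,
    // so the comprehension only combines results instead of sequencing the work
    val fa = Future(a)
    val fb = Future(b)
    for { i <- fa; j <- fb } yield (i, j)
  }
}
```

For example, Await.result(StartThenCombine.parallelFor(1 + 2, "a" + "b"), 5.seconds) evaluates to (3,ab).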

How to pass function output in futures and then those futures to a new function?

My Scenario is like below:
Step1: x =def sum(a,b)
Step2: Thread.sleep(1s)
Step3: y =def subtract(a,b)
Step4: Thread.sleep(2s)
Step5: On successful completion of the above steps, perform z = multiply(x,y)
I need to implement this scenario using futures in Scala. Please help.
I tried this code but it is not working.
import scala.util.{Failure, Success}
def sum(a:Int ,b:Int) = a+b
def sub(c:Int, d:Int) = c-d
def mul(e: Int, f: Int) = e*f
val Sum1= Future {sum(2,3); Thread.sleep(1000)}
val SumFinal=Sum1.onComplete({
case Success(result) => println(result)
case Failure(e) => println("failed: " + e)
})
val Subt1 = Future {sub(5,3);Thread.sleep(2000)}
val SubtFinal = Subt1.onComplete({
case Success(result) => result
case Failure(e) => println("failed: " + e)
})
val Mul1= mul(SumFinal,SubtFinal)
println(Mul1)
The problem with your approach is that onComplete returns Unit. That's why you don't get any result: SumFinal and SubtFinal have nothing in them.
scala> def sum(a: Int, b: Int) = Future { a + b }
sum: (a: Int, b: Int)scala.concurrent.Future[Int]
scala> def sub(a: Int, b: Int) = Future { a - b }
sub: (a: Int, b: Int)scala.concurrent.Future[Int]
scala> def mul(a: Int, b: Int) = Future { a * b }
mul: (a: Int, b: Int)scala.concurrent.Future[Int]
scala> for {
| a <- sum(2,3)
| b <- sub(10, 7)
| c <- mul(a, b)
| } yield c
res0: scala.concurrent.Future[Int] = Future(<not completed>)
scala> res0
res1: scala.concurrent.Future[Int] = Future(Success(15))
Problem 1:
The result of e.g. Future {sub(5,3);Thread.sleep(2000)} is the value returned by Thread.sleep, which is () in Scala. Just change the order: Future {Thread.sleep(2000); sub(5,3)} will finish with the result 2 after 2 seconds. If you really want to put sleep after the calculation, just store the result in a variable:
Future {
val res = sub(5,3)
Thread.sleep(2000)
res
}
Problem 2:
SumFinal and SubtFinal are again () because that's what onComplete returns. Instead you can combine two futures (or more, or modify one, etc. etc.) and get a future back. One way would be (after fixing problem 1)
val Mul1 = Sum1.zipWith(Subt1)(mul)
Mul1.onComplete {
...
}
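Putting both fixes together, the whole scenario might look like the sketch below. It keeps the question's sum, sub and mul, and adds a hypothetical helper thenSleep so that each future yields the computed value rather than the result of Thread.sleep. The object name SumSubMul is also an assumption, and zipWith requires Scala 2.12 or later.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SumSubMul {
  def sum(a: Int, b: Int): Int = a + b
  def sub(c: Int, d: Int): Int = c - d
  def mul(e: Int, f: Int): Int = e * f

  // Evaluates the computation, sleeps, then returns the computed value
  private def thenSleep[T](millis: Long)(value: => T): T = {
    val res = value
    Thread.sleep(millis)
    res
  }

  def compute(): Future[Int] = {
    val sumF = Future(thenSleep(1000)(sum(2, 3))) // x = 5, ready after about 1 second
    val subF = Future(thenSleep(2000)(sub(5, 3))) // y = 2, ready after about 2 seconds
    // Multiply only once both results are available
    sumF.zipWith(subF)((x, y) => mul(x, y))
  }
}
```

Await.result(SumSubMul.compute(), 10.seconds) returns 10. Because both futures are started before they are combined, the total wait is roughly 2 seconds rather than 3.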

How to do parallel execution of function calls using Futures in scala [closed]

I want to call a func() with different parameters and calling to this function should happen in parallel.
for(i <- 0 to count) {
Future {
for(j <- 0 to n_count) {
Future {
func(a,b, c(i), d(j))
}
}
}
}
Is it the right way?
Also, my function does not return anything. So how can I tell whether execution is running in parallel, or how many threads have been created? Please provide full code for such a scenario in Scala.
It's quite simple: Scala gives you Future.sequence and Future.traverse for this purpose.
They both work roughly the same way, turning a collection of futures, M[Future[T]], into a single future of a collection, Future[M[T]], but they are used differently.
val actions = List(Future(5), Future(6), Future(7))
val executed = Future.sequence(actions).map(l => println(l.mkString(", "))) // prints 5, 6, 7
Now Future.traverse is semantically different because you feed it an M[A], or a collection of elements, and then a function that converts those elements to a future. For instance:
val userIds = List(1, 2, 3, 4 , 5)
// let's pretend this calls an SQL DB
def userById(id: Int): Future[Option[User]]
Future.traverse(userIds)(id => userById(id))
In terms of execution semantics, both constructs execute all futures in parallel and fail if any of the futures fail. There's no guarantee the futures are executed in order.
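As a runnable variant of the traverse example, here is a sketch in which the database is replaced by an in-memory map. The User case class, the sample data and the names TraverseDemo and lookupAll are all hypothetical stand-ins.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TraverseDemo {
  final case class User(id: Int, name: String) // hypothetical model

  // Stand-in for the SQL database: an in-memory map
  private val users = Map(1 -> User(1, "ann"), 2 -> User(2, "bob"))

  def userById(id: Int): Future[Option[User]] = Future(users.get(id))

  // One future per id is created by traverse; results keep the input order
  def lookupAll(ids: List[Int]): Future[List[Option[User]]] =
    Future.traverse(ids)(id => userById(id))
}
```

Here Await.result(TraverseDemo.lookupAll(List(1, 3)), 5.seconds) gives List(Some(User(1,ann)), None).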
Off topic, but fortunately it's pretty easy to write "one at a time" logic yourself. This works by moving the function that produces the Future from an element "inside" the for block. Futures in Scala are always created in an already started state, which makes them harder to reason about.
def sequencedTraverse[
A,
B,
M[X] <: TraversableOnce[X]
](in: M[A])(fn: A => Future[B])(implicit
executor: ExecutionContextExecutor,
cbf: CanBuildFrom[M[A], B, M[B]]
): Future[M[B]] = {
in.foldLeft(Future.successful(cbf(in))) { (fr, a) =>
for (r <- fr; b <- fn(a)) yield r += b
}.map(_.result())
}
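The version above relies on CanBuildFrom, which belongs to the collections library of Scala 2.12 and earlier. As a simplified sketch of the same idea that avoids it, the variant below is restricted to List and accumulates into a Vector. The name OneAtATime and the restriction to List are assumptions made for brevity.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object OneAtATime {
  def sequencedTraverse[A, B](in: List[A])(fn: A => Future[B])(
      implicit ec: ExecutionContext): Future[List[B]] =
    in.foldLeft(Future.successful(Vector.empty[B])) { (acc, a) =>
      // fn(a) is only invoked once the accumulated future has completed,
      // so at most one element is being processed at any time
      for (done <- acc; b <- fn(a)) yield done :+ b
    }.map(_.toList)
}
```

The design choice is the same as in the original: the function that creates the Future is called inside the for block, so each future is not started until the previous one has finished.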
Building on the excellent answer of @flavian:
scala> val a = "a"
a: String = a
scala> val b = "b"
b: String = b
scala> val c = List("x","y","z")
c: List[String] = List(x, y, z)
scala> val d = List("u","v","w")
d: List[String] = List(u, v, w)
scala> def func(a: String, b: String, c: String, d: String) = a + b + c + d
func: (a: String, b: String, c: String, d: String)String
scala> val futures = for{i <- c; j <- d} yield Future(func(a,b,i,j))
futures: List[scala.concurrent.Future[String]] = List(Future(Success(abxu)), Future(Success(abxv)), Future(Success(abxw)), Future(Success(abyu)), Future(Success(abyv)), Future(Success(abyw)), Future(Success(abzu)), Future(Success(abzv)), Future(Success(abzw)))
scala> Future.sequence(futures)
res0: scala.concurrent.Future[List[String]] = Future(<not completed>)
scala> res0
res1: scala.concurrent.Future[List[String]] = Future(Success(List(abxu, abxv, abxw, abyu, abyv, abyw, abzu, abzv, abzw)))

What does Queue() function do in Chisel?

I was reading source code of rocket chip, in rocc.scala file in rocket/src/main/scala/ there is an example AccumulatorExample for using rocc. At first part of the code there is a function Queue() that I couldn't figure out what it's doing?
val n = 4
val regfile = Mem(UInt(width = params(XprLen)), n)
val busy = Vec.fill(n){Reg(init=Bool(false))}
val cmd = Queue(io.cmd)
val funct = cmd.bits.inst.funct
val addr = cmd.bits.inst.rs2(log2Up(n)-1,0)
val doWrite = funct === UInt(0)
val doRead = funct === UInt(1)
val doLoad = funct === UInt(2)
val doAccum = funct === UInt(3)
val memRespTag = io.mem.resp.bits.tag(log2Up(n)-1,0)
Thanks
Queue is a Module providing a hardware queue. That is a circular definition, I know, but it's the best I can give. Your code looks like it is setting io.cmd as the source of the queue. Hope this helps!
Constructor:
Queue(enq: DecoupledIO, entries: Int)
  enq      DecoupledIO source for the queue
  entries  size of the queue
Interface:
  .io.enq    DecoupledIO | source (flipped)
  .io.deq    DecoupledIO | sink
  .io.count  UInt        | count of elements in the queue

Synchronizing on function parameter for multithreaded memoization

My core question is: how can I implement synchronization in a method on the combination of the object instance and the method parameter?
Here are the details of my situation. I'm using the following code to implement memoization, adapted from this answer:
/**
* Memoizes a unary function
* @param f the function to memoize
* @tparam T the argument type
* @tparam R the result type
*/
class Memoized[-T, +R](f: T => R) extends (T => R) {
import scala.collection.mutable
private[this] val cache = mutable.Map.empty[T, R]
def apply(x: T): R = cache.getOrElse(x, {
val y = f(x)
cache += ((x, y))
y
})
}
In my project, I'm memoizing Futures to deduplicate asynchronous API calls. This worked fine when using for...yield to map over the resulting futures, created with the standard ExecutionContext, but then I upgraded to Scala Async for nicer handling of these futures. I realized that the multithreading that library uses allowed multiple threads to enter apply, defeating memoization, because the async blocks all executed in parallel, entering the "orElse" thunk before cache could be updated with a new Future.
To work around this, I put the main apply function in a this.synchronized block:
def apply(x: T): R = this.synchronized {
cache.getOrElse(x, {
val y = f(x)
cache += ((x, y))
y
})
}
This restored the memoized behavior. The drawback is that this will block calls with different params, at least until the Future is created. I'm wondering if there is a way to set up finer grained synchronization on the combination of the Memoized instance and the value of the x parameter to apply. That way, only calls that would be deduplicated will be blocked.
As a side note, I'm not sure this is truly performance critical, because the synchronized block will release once the Future is created and returned (I think?). But if there are any concerns with this that I'm not thinking of, I would also like to know.
Akka actors combined with futures provide a powerful way to wrap over mutable state without blocking. Here is a simple example of how to use an Actor for memoization:
import akka.actor._
import akka.util.Timeout
import akka.pattern.ask
import scala.concurrent._
import scala.concurrent.duration._
class Memoize(system: ActorSystem) {
class CacheActor(f: Any => Future[Any]) extends Actor {
private[this] val cache = scala.collection.mutable.Map.empty[Any, Future[Any]]
def receive = {
case x => sender ! cache.getOrElseUpdate(x, f(x))
}
}
def apply[K, V](f: K => Future[V]): K => Future[V] = {
val fCast = f.asInstanceOf[Any => Future[Any]]
val actorRef = system.actorOf(Props(new CacheActor(fCast)))
implicit val timeout = Timeout(5.seconds)
import system.dispatcher
x => actorRef.ask(x).asInstanceOf[Future[Future[V]]].flatMap(identity)
}
}
We can use it like:
val system = ActorSystem()
val memoize = new Memoize(system)
val f = memoize { x: Int =>
println("Computing for " + x)
scala.concurrent.Future.successful {
Thread.sleep(1000)
x + 1
}
}
import system.dispatcher
f(5).foreach(println)
f(5).foreach(println)
And "Computing for 5" will only print a single time, but "6" will print twice.
There are some scary looking asInstanceOf calls, but it is perfectly type-safe.
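For comparison, a lighter-weight alternative that comes closer to per-argument locking is java.util.concurrent.ConcurrentHashMap.computeIfAbsent, which runs the mapping function at most once per key. The sketch below is not the approach from the answer above, and the class name ConcurrentMemoized is made up. It assumes f never returns null and never calls back into the same memoized function, since computeIfAbsent does not support either.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Invariant in T and R for simplicity, unlike Memoized[-T, +R] in the question
class ConcurrentMemoized[T, R](f: T => R) extends (T => R) {
  private val cache = new ConcurrentHashMap[T, R]()

  // Explicit java.util.function.Function wrapper around the Scala function
  private val compute = new JFunction[T, R] {
    override def apply(x: T): R = f(x)
  }

  // computeIfAbsent invokes f at most once per key. Callers using other keys
  // are generally not blocked, although keys sharing a hash bin may contend.
  def apply(x: T): R = cache.computeIfAbsent(x, compute)
}
```

When R is a Future, the map's internal lock is held only while the Future is being created, not while it runs, so duplicate calls for the same argument receive the same Future and calls with other arguments should rarely wait.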

Resources