ForkJoinPool for parallel processing - multithreading

I am trying to run some code 1 million times. I initially wrote it using Threads, but this seemed clunky. I started doing some more reading and came across ForkJoin. This seemed like exactly what I needed, but I can't figure out how to translate what I have below into "Scala style". Can someone explain the best way to use ForkJoin in my code?
val l = (1 to 1000000) map {_.toLong}
println("running......be patient")
l.foreach{ x =>
  if(x % 10000 == 0) println("got to: "+x)
  val thread = new Thread {
    override def run {
      //my code (API calls) here. writes to file if call success
    }
  }
}

The easiest way is to use par (it uses a ForkJoinPool automatically):
val l = ((1 to 1000000) map {_.toLong}).toList
l.par.foreach { x =>
  if(x % 10000 == 0) println("got to: " + x) // runs in parallel across elements
  // your code (API calls) here; it also runs in parallel, and for a given x it runs on the same thread as the println above
}
Another way is to use Future:
import scala.concurrent._
import ExecutionContext.Implicits.global // the global context is backed by a ForkJoinPool
val l = (1 to 1000000) map {_.toLong}
println("running......be patient")
l.foreach { x =>
  if(x % 10000 == 0) println("got to: "+x)
  Future {
    //your code (API calls) here. writes to file if call success
  }
}
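Note that the loop above only starts the futures; nothing waits for them, and the global context runs on daemon threads, so a short program can exit before the calls finish. A minimal sketch of one way to wait for everything, assuming a hypothetical callApi function standing in for the real API call (the object and method names are also made up for illustration):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureBatchSketch {
  // Hypothetical stand-in for the real API call.
  def callApi(x: Long): Long = x * 2

  def runAll(n: Int): Long = {
    val ids = (1 to n).map(_.toLong)
    // One Future per element; traverse gathers them into a single Future.
    val all = Future.traverse(ids)(x => Future(callApi(x)))
    // Block the caller until every call has finished (or the timeout hits).
    Await.result(all, 1.minute).sum
  }
}
```

Calling FutureBatchSketch.runAll(1000) returns only after all 1000 calls have completed. The one-minute timeout is an arbitrary choice for the sketch.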
If you rely on work stealing, you should mark blocking code with scala.concurrent.blocking:
Future {
  scala.concurrent.blocking {
    //blocking API call here
  }
}
This tells the ForkJoinPool to compensate for the blocked thread with a new one, so you can avoid thread starvation (but there are some disadvantages).
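To see the effect of blocking, here is a small sketch (the object and method names are invented for illustration). Every task counts down a latch and then waits for all the others, so the run can only succeed if all tasks are in flight at the same time, even when there are more tasks than cores. It assumes the default global context, which is allowed to add threads for code wrapped in blocking.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BlockingSketch {
  def allRanTogether(tasks: Int): Boolean = {
    val latch = new CountDownLatch(tasks)
    val futures = (1 to tasks).map { _ =>
      Future {
        blocking {
          latch.countDown()
          // true only if every other task also reached the latch in time
          latch.await(5, TimeUnit.SECONDS)
        }
      }
    }
    Await.result(Future.sequence(futures), 30.seconds).forall(identity)
  }
}
```

Without the blocking wrapper, a task count above the pool's parallelism would leave some tasks waiting in the queue until the latch timed out.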

In Scala, you can use Future and Promise:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val l = (1 to 1000000) map {
  _.toLong
}
println("running......be patient")
l.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x)
  Future {
    println(x)
  }
}

Related

Implementing a multithreading function for running "foreach, map and reduce" parallel

I am quite new to Scala but I am learning about Threads and Multithreading.
As the title says, I am trying to implement a way to split the work across a configurable number of threads.
We are given this code:
/** Executes the provided function for each entry in the input sequence in parallel.
  *
  * @param input the input sequence
  * @param parallelism the number of threads to use
  * @param f the function to run
  */
def parallelForeach[A](input: IndexedSeq[A], parallelism: Int, f: A => Unit): Unit = ???
I tried implementing it like this:
def parallelForeach[A](input: IndexedSeq[A], parallelism: Int, f: A => Unit): Unit = {
  if (parallelism < 1) {
    throw new IllegalArgumentException("a degree of parallelism < 1 is not allowed for parallel foreach")
  }
  val threads = (0 until parallelism).map { threadId =>
    val startIndex = threadId * input.size / parallelism
    val endIndex = (threadId + 1) * input.size / parallelism
    val task: Runnable = () => {
      (startIndex until endIndex).foreach { A =>
        val key = input.grouped(input.size / parallelism)
        val x: Unit = input.foreach(A => f(A))
        x
      }
    }
    new Thread(task)
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
}
For this test:
test("parallel foreach should perform the given function once for each element in the sequence") {
  val counter = AtomicLong(0L)
  parallelForeach((1 to 100), 16, counter.addAndGet(_))
  assert(counter.get() == 5050)
}
But, as you can guess, it doesn't work this way as my result isn't 5050 but 505000.
Now here is my question. How do I implement a way to use multithreading efficiently, so there are for example 16 different threads working at the same time?
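Check your test: "1 to 100".
In your code, every index iteration calls input.foreach, which applies f to all 100 elements. Across all threads there are 100 index iterations in total, so the sum is 100 * 5050 = 505000, which is why your result is 100 times too large.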
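A minimal sketch of one possible fix, keeping the index arithmetic from the question and having each thread touch only the elements in its own range. The wrapper object and the demo method are names made up for this example:

```scala
import java.util.concurrent.atomic.AtomicLong

object ParallelForeachSketch {
  def parallelForeach[A](input: IndexedSeq[A], parallelism: Int, f: A => Unit): Unit = {
    require(parallelism >= 1, "parallelism must be at least 1")
    val threads = (0 until parallelism).map { threadId =>
      // Each thread gets its own half-open slice [startIndex, endIndex).
      val startIndex = threadId * input.size / parallelism
      val endIndex = (threadId + 1) * input.size / parallelism
      val task: Runnable = () => (startIndex until endIndex).foreach(i => f(input(i)))
      new Thread(task)
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
  }

  // Mirrors the test from the question: sums 1 to 100 on 16 threads.
  def demo(): Long = {
    val counter = new AtomicLong(0L)
    parallelForeach(1 to 100, 16, (x: Int) => { counter.addAndGet(x); () })
    counter.get()
  }
}
```

The slices cover every index exactly once because the end index of one thread is the start index of the next, so the demo sums each element a single time.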

Implicit class holding mutable variable in multithreaded environment

I need to implement a parallel method, which takes two computation blocks, a and b, and starts each of them in a new thread. The method must return a tuple with the result values of both the computations. It should have the following signature:
def parallel[A, B](a: => A, b: => B): (A, B)
I managed to solve the exercise using a straightforward Java-like approach. Then I decided to come up with a solution using an implicit class. Here it is:
object ParallelApp extends App {
  implicit class ParallelOps[A](a: => A) {
    var result: A = _
    def spawn(): Unit = {
      val thread = new Thread {
        override def run(): Unit = {
          result = a
        }
      }
      thread.start()
      thread.join()
    }
  }
  def parallel[A, B](a: => A, b: => B): (A, B) = {
    a.spawn()
    b.spawn()
    (a.result, b.result)
  }
  println(parallel(1 + 2, "a" + "b"))
}
For some unknown reason, I receive the output (null,null). Could you please point out where the problem is?
Spoiler alert: It's not complicated. It's funny, like a magic trick (if you consider reading the documentation about the Java Memory Model "funny", that is). If you haven't figured it out yet, I would highly recommend trying to figure it out, otherwise it won't be funny. Someone should make a "division-by-zero proves 2 = 4" riddle out of it.
Consider the following shorter example:
implicit class Foo[A](a: A) {
  var result: String = "not initialized"
  def computeResult(): Unit = result = "Yay, result!"
}
val a = "a string"
a.computeResult()
println(a.result)
When run, it prints
not initialized
despite the fact that we invoked computeResult() and set result to "Yay, result!". The problem is that the two invocations a.computeResult() and a.result belong to two completely independent instances of Foo. The implicit conversion is performed twice, and the second implicitly created object doesn't know anything about the changes in the first implicitly created object. It has nothing to do with threads or JMM at all.
By the way: your code is not parallel. Calling join right after calling start doesn't bring you anything, your main thread will simply go idle and wait until another thread finishes. At no point will there be two threads that do any useful work concurrently.
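For comparison, a sketch of a version that does overlap the two computations: start both threads first and join afterwards. It uses local variables instead of the implicit class. The object name is only illustrative, and a computation that throws would surface here as a NoSuchElementException from get.

```scala
object ParallelSketch {
  def parallel[A, B](a: => A, b: => B): (A, B) = {
    var resultA: Option[A] = None
    var resultB: Option[B] = None
    val threadA = new Thread(() => resultA = Some(a))
    val threadB = new Thread(() => resultB = Some(b))
    // Start both before joining either, so the two blocks can run at the same time.
    threadA.start()
    threadB.start()
    // join() also makes the writes from each thread visible to the caller.
    threadA.join()
    threadB.join()
    (resultA.get, resultB.get)
  }
}
```

With this version, ParallelSketch.parallel(1 + 2, "a" + "b") returns the tuple of both results instead of nulls.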
EDIT: Fixed a bug pointed out by Andrey Tyukin
One way to solve your problem is to use Scala Futures
Documentation. Tutorial.
Useful Klang Blog.
You'll typically need some combination of these imports:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.util.{Failure, Success}
import scala.concurrent.duration._
An asynchronous example:
def parallelAsync[A,B](a: => A, b: => B): Future[(A,B)] = {
  // as per Andrey Tyukin's comments, this line runs
  // the two futures sequentially and we do not get
  // any benefit from it. I will leave this line here
  // so others will not fall in my trap
  //for {i <- Future(a); j <- Future(b) } yield (i,j)
  Future(a) zip Future(b)
}
parallelAsync(1 + 2, "a" + "b").onComplete {
  case Success(x) => println(x)
  case Failure(e) => e.printStackTrace()
}
If you must block until both are complete, you can use this:
def parallelSync[A,B](a: => A, b: => B): (A,B) = {
  // see comment above
  //val f = for { i <- Future(a); j <- Future(b) } yield (i,j)
  val tuple = Future(a) zip Future(b)
  Await.result(tuple, 5.seconds)
}
println(parallelSync(3 + 4, "c" + "d"))
When running these little examples, don't forget to sleep a little bit at the end so the program won't end before the results come back:
Thread.sleep(3000)
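The comment about the for-comprehension running the two futures one after the other can be checked with a small experiment. This is only a sketch with invented names: each task counts down a shared latch and waits briefly for its partner, so it returns true only if the other task was running at the same time.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ZipVersusFor {
  // Returns true only if both tasks reached the latch within the timeout.
  private def rendezvous(latch: CountDownLatch): Boolean = blocking {
    latch.countDown()
    latch.await(500, TimeUnit.MILLISECONDS)
  }

  def withZip(): (Boolean, Boolean) = {
    val latch = new CountDownLatch(2)
    // Both futures are created before zip combines them, so they overlap.
    Await.result(Future(rendezvous(latch)) zip Future(rendezvous(latch)), 5.seconds)
  }

  def withFor(): (Boolean, Boolean) = {
    val latch = new CountDownLatch(2)
    // The second Future is only created after the first one has completed.
    val f = for { i <- Future(rendezvous(latch)); j <- Future(rendezvous(latch)) } yield (i, j)
    Await.result(f, 5.seconds)
  }
}
```

In withFor the first task times out waiting for a partner that has not been started yet, which is the sequential behaviour the comment warns about. Creating both futures as vals before the for-comprehension would also let them overlap.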

Future in Scala can't run asynchronously when the number of futures is larger than the number of CPU cores

I used Future to implement a multithreaded function in Scala. But when the number of futures is larger than the number of CPU cores, the futures are split into groups: the futures in one group complete, and only then do the futures in the next group start. My code and output are listed below. Is there something wrong in my code, and how can I fix it?
import scala.collection.mutable._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._
import scala.language.postfixOps
object FutrueTest {
  def main(args: Array[String]) {
    val threads=10
    def ft(): Future[String] = Future {
      for (i <- 1 until 3) {
        Thread.sleep(1000)
        println(Thread.currentThread().getName + "\t" + i)
      }
      Thread.currentThread().getName + " end..."
    }
    var fs = Set[Future[String]]()
    for (j <- 1 until threads) {
      val f = ft
      f.onComplete {
        case _ => "Thread :" + j + " complete"
      }
      fs += f
    }
    fs.foreach(f => {
      Await.ready(f, Duration.Inf)
    })
  }
}
output in terminal
ForkJoinPool-1-worker-13 1
ForkJoinPool-1-worker-15 1
ForkJoinPool-1-worker-11 1
ForkJoinPool-1-worker-1 1
ForkJoinPool-1-worker-3 1
ForkJoinPool-1-worker-7 1
ForkJoinPool-1-worker-9 1
ForkJoinPool-1-worker-5 1
ForkJoinPool-1-worker-1 2
ForkJoinPool-1-worker-15 2
ForkJoinPool-1-worker-9 2
ForkJoinPool-1-worker-3 2
ForkJoinPool-1-worker-11 2
ForkJoinPool-1-worker-13 2
ForkJoinPool-1-worker-7 2
ForkJoinPool-1-worker-5 2
ForkJoinPool-1-worker-15 1
ForkJoinPool-1-worker-15 2
Process finished with exit code 0
You can create your own execution context.
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
object CustomExecutionContext {
  private val availableProcessors = Runtime.getRuntime.availableProcessors()
  // N is a multiplier you choose; the pool gets availableProcessors * N threads
  implicit val nDExecutionContext: ExecutionContext = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(availableProcessors * N))
}
Another solution: control how many futures execute in parallel using a fixed thread pool. With the pool below, at most 10 futures run at once, and the rest start as threads become free.
implicit val ec = ExecutionContext.fromExecutor(java.util.concurrent.Executors.newFixedThreadPool(10))
Third solution: use a throttled execution context instead of the global execution context. Note that ThrottledExecutionContext is not part of the Scala standard library, so it has to come from your own code or a third-party snippet.
implicit val ec = ThrottledExecutionContext(maxConcurrents = 4)(scala.concurrent.ExecutionContext.global)
It will limit the parallelism.
Fourth solution: you can use Akka FSM to throttle.
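The fixed-thread-pool option can be sketched end to end as follows. The names are invented for illustration, and the pool is sized to the number of tasks purely so that every task can run at once. Each task waits on a latch for all the others, so the result is all true only when nothing had to wait for a free thread.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FixedPoolSketch {
  def runAll(tasks: Int): Seq[Boolean] = {
    val pool = Executors.newFixedThreadPool(tasks)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    val latch = new CountDownLatch(tasks)
    try {
      val futures = (1 to tasks).map { _ =>
        Future {
          latch.countDown()
          // true only if all tasks were running at the same time
          latch.await(2, TimeUnit.SECONDS)
        }
      }
      Await.result(Future.sequence(futures), 10.seconds)
    } finally pool.shutdown() // non-daemon pool threads would otherwise keep the JVM alive
  }
}
```

Unlike the global context, a pool created this way has to be shut down explicitly, which is why the sketch does so in a finally block.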

Sleep a long running process

I have an app that iterates over tens of thousands of records using various enumerators (such as directory enumerators)
I am seeing OS X report that my process was "Caught burning CPU", since it's taking a large amount of CPU in doing so.
What I would like to do is build in a "pressure valve" such as a
[NSThread sleepForTimeInterval:cpuDelay];
that does not block other processes/threads on things like a dual core machine.
My processing is happening on a separate thread, but I can't break out of and re-enter the enumerator loop and use NSTimers to allow the machine to "breathe"
Any suggestions - should [NSThread sleepForTimeInterval:cpuDelay]; be working?
I run this stuff inside a dispatch queue:
if(!primaryTask)primaryTask=dispatch_queue_create( "com.me.app.task1",backgroundPriorityAttr);
dispatch_async(primaryTask,^{
[self doSync];
});
Try wrapping your processing in an NSOperation and setting a lower QoS priority. Here is a little more information:
http://nshipster.com/nsoperation/
Here is a code example I made up. The operation is triggered in the view load event:
let processingQueue = NSOperationQueue()
override func viewDidLoad() {
    let backgroundOperation = NSBlockOperation {
        //My very long and intensive processing
        let nPoints = 100000000000
        var nPointsInside = 0
        for _ in 1...nPoints {
            let (x, y) = (drand48() * 2 - 1, drand48() * 2 - 1)
            if x * x + y * y <= 1 {
                nPointsInside += 1
            }
        }
        let _ = 4.0 * Double(nPointsInside) / Double(nPoints)
    }
    backgroundOperation.queuePriority = .Low
    backgroundOperation.qualityOfService = .Background
    processingQueue.addOperation(backgroundOperation)
}

Synchronizing on function parameter for multithreaded memoization

My core question is: how can I implement synchronization in a method on the combination of the object instance and the method parameter?
Here are the details of my situation. I'm using the following code to implement memoization, adapted from this answer:
/**
 * Memoizes a unary function
 * @param f the function to memoize
 * @tparam T the argument type
 * @tparam R the result type
 */
class Memoized[-T, +R](f: T => R) extends (T => R) {
  import scala.collection.mutable
  private[this] val cache = mutable.Map.empty[T, R]
  def apply(x: T): R = cache.getOrElse(x, {
    val y = f(x)
    cache += ((x, y))
    y
  })
}
In my project, I'm memoizing Futures to deduplicate asynchronous API calls. This worked fine when using for...yield to map over the resulting futures, created with the standard ExecutionContext, but then I upgraded to Scala Async for nicer handling of these futures. I realized that the multithreading that library uses allowed multiple threads to enter apply, defeating memoization, because the async blocks all executed in parallel, entering the "orElse" thunk before cache could be updated with a new Future.
To work around this, I put the main apply function in a this.synchronized block:
def apply(x: T): R = this.synchronized {
  cache.getOrElse(x, {
    val y = f(x)
    cache += ((x, y))
    y
  })
}
This restored the memoized behavior. The drawback is that this will block calls with different params, at least until the Future is created. I'm wondering if there is a way to set up finer grained synchronization on the combination of the Memoized instance and the value of the x parameter to apply. That way, only calls that would be deduplicated will be blocked.
As a side note, I'm not sure this is truly performance critical, because the synchronized block will release once the Future is created and returned (I think?). But if there are any concerns with this that I'm not thinking of, I would also like to know.
Akka actors combined with futures provide a powerful way to wrap over mutable state without blocking. Here is a simple example of how to use an Actor for memoization:
import akka.actor._
import akka.util.Timeout
import akka.pattern.ask
import scala.concurrent._
import scala.concurrent.duration._
class Memoize(system: ActorSystem) {
  class CacheActor(f: Any => Future[Any]) extends Actor {
    private[this] val cache = scala.collection.mutable.Map.empty[Any, Future[Any]]
    def receive = {
      case x => sender ! cache.getOrElseUpdate(x, f(x))
    }
  }
  def apply[K, V](f: K => Future[V]): K => Future[V] = {
    val fCast = f.asInstanceOf[Any => Future[Any]]
    val actorRef = system.actorOf(Props(new CacheActor(fCast)))
    implicit val timeout = Timeout(5.seconds)
    import system.dispatcher
    x => actorRef.ask(x).asInstanceOf[Future[Future[V]]].flatMap(identity)
  }
}
We can use it like:
val system = ActorSystem()
val memoize = new Memoize(system)
val f = memoize { x: Int =>
  println("Computing for " + x)
  scala.concurrent.Future.successful {
    Thread.sleep(1000)
    x + 1
  }
}
import system.dispatcher
f(5).foreach(println)
f(5).foreach(println)
And "Computing for 5" will only print a single time, but "6" will print twice.
There are some scary looking asInstanceOf calls, but it is perfectly type-safe.
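If pulling in Akka is not an option, the per-key locking asked about in the question can also be approximated with java.util.concurrent.ConcurrentHashMap, whose computeIfAbsent applies the function at most once per key while leaving other keys largely unaffected. This is a different technique from the actor above, shown only as a sketch. The class and object names are invented, the variance annotations from the original Memoized are dropped for simplicity, and f must not return null or call back into the same cache.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

class ConcurrentMemoized[T, R](f: T => R) extends (T => R) {
  private[this] val cache = new ConcurrentHashMap[T, R]()
  // Atomic per key: concurrent callers with the same x share one computation.
  def apply(x: T): R = cache.computeIfAbsent(x, (k: T) => f(k))
}

object MemoDemo {
  // Returns the memoized value and how many times the function actually ran.
  def demo(): (Int, Int) = {
    val calls = new AtomicInteger(0)
    val slowSquare = new ConcurrentMemoized[Int, Int](x => {
      calls.incrementAndGet()
      Thread.sleep(50)
      x * x
    })
    val threads = (1 to 8).map(_ => new Thread(() => { slowSquare(7); () }))
    threads.foreach(_.start())
    threads.foreach(_.join())
    (slowSquare(7), calls.get())
  }
}
```

When the memoized function returns a Future, as in the question, the map is only held for as long as it takes to create the Future, not for as long as the Future takes to complete.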

Resources