Implicit class holding mutable variable in multithreaded environment

I need to implement a parallel method, which takes two computation blocks, a and b, and starts each of them in a new thread. The method must return a tuple with the result values of both computations. It should have the following signature:
def parallel[A, B](a: => A, b: => B): (A, B)
I managed to solve the exercise using a straightforward Java-like approach. Then I decided to come up with a solution using an implicit class. Here it is:
object ParallelApp extends App {
  implicit class ParallelOps[A](a: => A) {
    var result: A = _

    def spawn(): Unit = {
      val thread = new Thread {
        override def run(): Unit = {
          result = a
        }
      }
      thread.start()
      thread.join()
    }
  }

  def parallel[A, B](a: => A, b: => B): (A, B) = {
    a.spawn()
    b.spawn()
    (a.result, b.result)
  }

  println(parallel(1 + 2, "a" + "b"))
}
For some unknown reason, I receive the output (null,null). Could you please point out where the problem is?

Spoiler alert: It's not complicated. It's funny, like a magic trick (if you consider reading the documentation about the Java Memory Model "funny", that is). If you haven't figured it out yet, I would highly recommend trying to figure it out yourself, otherwise it won't be funny. Someone should make a "division-by-zero proves 2 = 4"-style riddle out of it.
Consider the following shorter example:
implicit class Foo[A](a: A) {
  var result: String = "not initialized"
  def computeResult(): Unit = result = "Yay, result!"
}

val a = "a string"
a.computeResult()
println(a.result)
When run, it prints
not initialized
despite the fact that we invoked computeResult() and set result to "Yay, result!". The problem is that the two invocations a.computeResult() and a.result belong to two completely independent instances of Foo. The implicit conversion is performed twice, and the second implicitly created object doesn't know anything about the changes in the first implicitly created object. It has nothing to do with threads or JMM at all.
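A minimal sketch of that diagnosis: bind the wrapper to a val, so that both calls go through the same Foo instance, and the mutation becomes visible (names as in the example above):
val foo = new Foo("a string") // one explicit instance instead of two implicit conversions
foo.computeResult()
println(foo.result)           // prints: Yay, result!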
By the way: your code is not parallel. Calling join right after calling start doesn't bring you anything, your main thread will simply go idle and wait until another thread finishes. At no point will there be two threads that do any useful work concurrently.
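For completeness, a minimal sketch of a thread-based parallel that actually runs both computations concurrently: start both threads before joining either, and let join's happens-before guarantee make the writes visible:
def parallel[A, B](a: => A, b: => B): (A, B) = {
  var resA: Option[A] = None
  var resB: Option[B] = None
  val ta = new Thread { override def run(): Unit = resA = Some(a) }
  val tb = new Thread { override def run(): Unit = resB = Some(b) }
  ta.start(); tb.start() // both threads now run concurrently
  ta.join(); tb.join()   // join establishes happens-before, so the writes are visible
  (resA.get, resB.get)
}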

EDIT: Fixed a bug pointed out by Andrey Tyukin
One way to solve your problem is to use Scala Futures
Documentation. Tutorial.
Useful Klang Blog.
You'll typically need some combination of these libraries:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.util.{Failure, Success}
import scala.concurrent.duration._
An asynchronous example:
def parallelAsync[A, B](a: => A, b: => B): Future[(A, B)] = {
  // As per Andrey Tyukin's comments, the following line runs the two
  // futures sequentially, so we get no benefit from it. I will leave it
  // here so others will not fall into my trap:
  //   for { i <- Future(a); j <- Future(b) } yield (i, j)
  Future(a) zip Future(b)
}
parallelAsync(1 + 2, "a" + "b").onComplete {
  case Success(x) => println(x)
  case Failure(e) => e.printStackTrace()
}
If you must block until both are complete, you can use this:
def parallelSync[A, B](a: => A, b: => B): (A, B) = {
  // See the comment above about the sequential for-comprehension:
  //   val f = for { i <- Future(a); j <- Future(b) } yield (i, j)
  val tuple = Future(a) zip Future(b)
  Await.result(tuple, 5.seconds)
}
println(parallelSync(3 + 4, "c" + "d"))
When running these little examples, don't forget to sleep a little at the end, so the program won't exit before the results come back:
Thread.sleep(3000)
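Alternatively (a small sketch), you can block on the future itself instead of sleeping for a fixed time:
val pending = parallelAsync(1 + 2, "a" + "b")
Await.ready(pending, 5.seconds) // returns once the tuple is ready, or throws on timeout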

Related

Parallel data fetches that are batched

Is this pattern of batching a subset of a collection for parallel processing OK? Is there a better way to do this that I am missing?
When given a collection of entity ids that need to be fetched from a service which returns a Scala Future, instead of making all the requests at once we batch them, because the service can only handle a certain number of requests at a time. In a way it is a primitive throttling mechanism to avoid overwhelming the data store. It looks like a code smell.
import scala.collection.generic.CanBuildFrom
import scala.concurrent.{ExecutionContext, Future}

object FutureHelper {
  def batchSerially[A, B, M[a] <: TraversableOnce[a]](l: M[A])(dbFetch: A => Future[B])(
      implicit ctx: ExecutionContext, buildFrom: CanBuildFrom[M[A], B, M[B]]): Future[M[B]] =
    l.foldLeft(Future.successful(buildFrom(l))) {
      case (accF, curr) => for {
        acc <- accF          // wait for the previous fetch to finish...
        b   <- dbFetch(curr) // ...before kicking off the next one
      } yield acc += b
    }.map(s => s.result())
}

object FutureBatching extends App {
  implicit val e: ExecutionContext = scala.concurrent.ExecutionContext.Implicits.global

  val entityIds = List(1, 2, 3, 4, 5, 6)
  val batchSize = 2

  val listOfFetchedResults =
    FutureHelper.batchSerially(entityIds.grouped(batchSize)) { groupedByBatchSize =>
      Future.sequence {
        groupedByBatchSize.map(i => Future.successful(i))
      }
    }.map(_.flatten.toList)
}
I believe by default a scala.Future starts executing as soon as it is created, so the invocations of dbFetch() would kick off the connections right away. Since the foldLeft transforms all the suspended A => Future[B] functions into actual Future objects, I didn't believe the batching would happen the way you want.
Yes, I believe that code works correctly (see comments): dbFetch(curr) is only invoked inside the for-comprehension, i.e. inside the function passed to flatMap, which runs only after the accumulated future has completed, so the batches do run serially.
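A minimal sketch of why that deferral matters (slowFetch is a hypothetical stand-in for dbFetch):
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

def slowFetch(id: Int): Future[Int] = Future { println(s"fetching $id"); id }

slowFetch(1)                            // eager: starts running as soon as it is created
slowFetch(1).flatMap(_ => slowFetch(2)) // deferred: slowFetch(2) is only called after
                                        // the first future has completed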
Another way is to let the pool define the level of parallelism, but that doesn't always work, depending on your execution environment.
I've had some success doing batching with the parallel collections. If you create a collection where the number of elements represents the number of concurrent activities, you can use .par. For instance:
import scala.collection.immutable.VectorBuilder
import scala.collection.parallel.ParSeq

// Partition xs into numBatches Set elements, and invoke processBatch on each Set in parallel.
def batch[A, B](xs: Iterable[A], numBatches: Int)
               (processBatch: Set[A] => Set[B]): ParSeq[B] =
  split(xs, numBatches).par.flatMap(processBatch)

// Split the input iterable into numBatches sub-sets.
// For example, split(Seq(1,2,3,4,5,6), 3) = Seq(Set(1, 4), Set(2, 5), Set(3, 6))
def split[A](xs: Iterable[A], numBatches: Int): Seq[Set[A]] = {
  val buffers: Vector[VectorBuilder[A]] = Vector.fill(numBatches)(new VectorBuilder[A]())
  val elems = xs.toIndexedSeq
  for (i <- 0 until elems.length) {
    buffers(i % numBatches) += elems(i)
  }
  buffers.map(_.result.toSet)
}
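A hypothetical usage sketch, with a stand-in fetchBatch that blocks on one batch of ids:
// Hypothetical blocking service call for one batch of ids.
def fetchBatch(ids: Set[Int]): Set[String] = ids.map(id => s"entity-$id")

// Six ids split into three batches, with up to three batches in flight at once.
val results = batch(Seq(1, 2, 3, 4, 5, 6), numBatches = 3)(fetchBatch)
println(results.toList.sorted)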

Using MetaProgramming to Add collectWithIndex and injectWithIndex similar to eachWithIndex

Please help with a metaprogramming configuration such that I can add collection methods called collectWithIndex and injectWithIndex that work in a similar manner to eachWithIndex but of course include the base functionality of collect and inject. The new methods would accept a two-argument closure (three arguments with maps), just like eachWithIndex. I would like to be able to use these methods across many different scripts.
Use case:
List one = [1, 2, 3]
List two = [10, 20, 30]
assert [10, 40, 90] == one.collectWithIndex { value, index ->
    value * two[index]
}
Once the method is developed, how would it be made available to scripts? I suspect that a jar file would be created with special extension information and then added to the classpath.
Many thanks in advance
I'm still sure it's not a proper SO question, but I'll give you an example of how you can enrich the metaclass for your multiple scripts.
The idea is based on a base script that adds the required method to List's metaClass in its constructor. You have to implement the collect logic yourself, though it's pretty easy; alternatively, you can use a wrapping proxy closure, as shown further below.
import org.codehaus.groovy.control.CompilerConfiguration

class WithIndexInjector extends Script {
    WithIndexInjector() {
        println("Adding collectWithIndex to List")
        List.metaClass.collectWithIndex {
            int i = 0
            def result = []
            for (o in delegate)      // delegate is a reference holding the initial list
                result << it(o, i++) // it is the closure given to the method
            result
        }
    }

    @Override Object run() {
        return null
    }
}

def configuration = new CompilerConfiguration()
configuration.scriptBaseClass = WithIndexInjector.name

new GroovyShell(configuration).evaluate('''
    println(['a', 'b'].collectWithIndex { it, id -> "[$id]:$it" })
''')
// will print [[0]:a, [1]:b]
If you'd like to do it in a more functional way, without repeating the collect logic, you may use that wrapping proxy closure. I expect it to be slower, but maybe that's not a problem. Just replace collectWithIndex with the following implementation:
List.metaClass.collectWithIndex {
    def wrappingProxyClosure = { Closure collectClosure, int startIndex = 0 ->
        int i = startIndex
        return {
            // Keep a hold on the outer collectClosure and i, and call the
            // former with one extra argument; "it" here is the list element
            // provided by the default collect method.
            collectClosure(it, i++)
        }
    }
    delegate.collect(wrappingProxyClosure(it))
}
Off-topic: in the SO community, your current question will only attract downvotes, not answers.

Synchronizing on function parameter for multithreaded memoization

My core question is: how can I implement synchronization in a method on the combination of the object instance and the method parameter?
Here are the details of my situation. I'm using the following code to implement memoization, adapted from this answer:
/**
 * Memoizes a unary function
 * @param f the function to memoize
 * @tparam T the argument type
 * @tparam R the result type
 */
class Memoized[-T, +R](f: T => R) extends (T => R) {
  import scala.collection.mutable
  private[this] val cache = mutable.Map.empty[T, R]
  def apply(x: T): R = cache.getOrElse(x, {
    val y = f(x)
    cache += ((x, y))
    y
  })
}
In my project, I'm memoizing Futures to deduplicate asynchronous API calls. This worked fine when using for...yield to map over the resulting futures, created with the standard ExecutionContext, but it broke when I upgraded to Scala Async for nicer handling of these futures: the multithreading that library uses allowed multiple threads to enter apply, defeating memoization, because the async blocks all executed in parallel, entering the getOrElse thunk before the cache could be updated with a new Future.
To work around this, I put the main apply function in a this.synchronized block:
def apply(x: T): R = this.synchronized {
  cache.getOrElse(x, {
    val y = f(x)
    cache += ((x, y))
    y
  })
}
This restored the memoized behavior. The drawback is that this will block calls with different params, at least until the Future is created. I'm wondering if there is a way to set up finer-grained synchronization on the combination of the Memoized instance and the value of the x parameter to apply. That way, only calls that would be deduplicated would be blocked.
As a side note, I'm not sure this is truly performance-critical, because the synchronized block will release once the Future is created and returned (I think?). But if there are any concerns with this that I'm not thinking of, I would also like to know.
Akka actors combined with futures provide a powerful way to wrap over mutable state without blocking. Here is a simple example of how to use an Actor for memoization:
import akka.actor._
import akka.util.Timeout
import akka.pattern.ask
import scala.concurrent._
import scala.concurrent.duration._

class Memoize(system: ActorSystem) {
  // An actor processes one message at a time, so cache access is
  // serialized without explicit locking.
  class CacheActor(f: Any => Future[Any]) extends Actor {
    private[this] val cache = scala.collection.mutable.Map.empty[Any, Future[Any]]
    def receive = {
      case x => sender ! cache.getOrElseUpdate(x, f(x))
    }
  }

  def apply[K, V](f: K => Future[V]): K => Future[V] = {
    val fCast = f.asInstanceOf[Any => Future[Any]]
    val actorRef = system.actorOf(Props(new CacheActor(fCast)))
    implicit val timeout = Timeout(5.seconds)
    import system.dispatcher
    x => actorRef.ask(x).asInstanceOf[Future[Future[V]]].flatMap(identity)
  }
}
We can use it like:
val system = ActorSystem()
val memoize = new Memoize(system)
val f = memoize { x: Int =>
  println("Computing for " + x)
  scala.concurrent.Future.successful {
    Thread.sleep(1000)
    x + 1
  }
}

import system.dispatcher
f(5).foreach(println)
f(5).foreach(println)
And "Computing for 5" will only print a single time, but "6" will print twice.
There are some scary-looking asInstanceOf calls, but it is perfectly type-safe.
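If you would rather not bring in Akka, here is a sketch of the finer-grained, per-key synchronization the question asks about, using a java.util.concurrent.ConcurrentHashMap (assuming Scala 2.12+, so the lambda converts to a java.util.function.Function):
import java.util.concurrent.ConcurrentHashMap

class MemoizedPerKey[T, R](f: T => R) extends (T => R) {
  private[this] val cache = new ConcurrentHashMap[T, R]()
  // computeIfAbsent locks only the bin holding x, so calls with different
  // arguments proceed in parallel, while concurrent calls with the same
  // argument compute f(x) exactly once.
  def apply(x: T): R = cache.computeIfAbsent(x, (k: T) => f(k))
}
Since f here would only create and return a Future, the bin lock is held just long enough to start the computation.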

Performance difference in toString.map and toString.toArray.map

While coding Euler problems, I ran across something I think is bizarre:
The method toString.map is slower than toString.toArray.map.
Here's an example:
def main(args: Array[String]) {
  def toDigit(num: Int) = num.toString.map(_ - 48)             // 2137 ms
  def toDigitFast(num: Int) = num.toString.toArray.map(_ - 48) //  592 ms

  val startTime = System.currentTimeMillis
  (1 to 1200000).map(toDigit)
  println(System.currentTimeMillis - startTime)
}
Shouldn't the method map on String fall back to a map over the array? Why is there such a noticeable difference? (Note that increasing the number even causes a stack overflow in the non-array case.)
Original
Could be because toString.map uses the WrappedString implicit, while toString.toArray.map uses the WrappedArray implicit to resolve map.
Let's see map, as defined in TraversableLike:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
  val b = bf(repr)
  b.sizeHint(this)
  for (x <- this) b += f(x)
  b.result
}
WrappedString uses a StringBuilder as builder:
def +=(x: Char): this.type = { append(x); this }

def append(x: Any): StringBuilder = {
  underlying append String.valueOf(x)
  this
}
The String.valueOf call for Any uses Java's Object.toString on the Char instances, which may get boxed first. These extra operations might be the cause of the speed difference, versus the supposedly shorter code paths of the Array builder.
This is a guess, though; one would have to measure.
Edit
After revising, the general point still stands, but I referred to the wrong implicits, since the toDigit methods return an Int sequence (or the like), not a transformed string as I misread.
toDigit uses LowPriorityImplicits.fallbackStringCanBuildFrom[T]: CanBuildFrom[String, T, immutable.IndexedSeq[T]], with T = Int, which just defers to a general IndexedSeq builder.
toDigitFast uses a direct Array implicit of type CanBuildFrom[Array[_], T, Array[T]], which is unarguably faster.
Passing the following CBF for toDigit explicitly makes the two methods on par:
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.ArrayBuilder

object FastStringToArrayBuild {
  def canBuildFrom[T: ClassManifest] = new CanBuildFrom[String, T, Array[T]] {
    private def newBuilder = ArrayBuilder.make[T]()
    def apply(from: String) = newBuilder
    def apply() = newBuilder
  }
}
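A usage sketch, passing the CBF explicitly (assuming the object above is in scope):
def toDigitExplicit(num: Int): Array[Int] =
  num.toString.map(_ - 48)(FastStringToArrayBuild.canBuildFrom[Int])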
You're being fooled by running out of memory. The toDigit version does create more intermediate objects, but if you have plenty of memory then the GC won't be heavily impacted (and it'll all run faster). For example, if instead of creating 1.2 million numbers, I create 12k 100x in a row, I get approximately equal times for the two methods. If I create 1.2k 5-digit numbers 1000x in a row, I find that toDigit is about 5% faster.
Given that the toDigit method produces an immutable collection, which is better when all else is equal since it is easier to reason about, and given that all else is equal for all but highly demanding tasks, I think the library is as it should be.
When trying to improve performance, of course one needs to keep all sorts of tricks in mind; one of these is that arrays have better memory characteristics for collections of known length than do the fancy collections in the Scala library. Also, one needs to know that map isn't the fastest way to get things done; if you really wanted this to be fast, you could do something like this:
final def toDigitReallyFast(num: Int, accum: Long = 0L, iter: Int = 0): Array[Byte] = {
  if (num == 0) {
    val ans = new Array[Byte](math.max(1, iter))
    var i = 0
    var ac = accum
    while (i < ans.length) {
      // The low nibble holds the most significant digit (it was packed last),
      // so fill the array front to back to get digits in the usual order.
      ans(i) = (ac & 0xF).toByte
      ac >>= 4
      i += 1
    }
    ans
  }
  else {
    val next = num / 10
    toDigitReallyFast(next, (accum << 4) | (num - 10 * next), iter + 1)
  }
}
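For example, with the digit order handled as above:
toDigitReallyFast(1234).toList // List(1, 2, 3, 4), matching "1234".map(_ - 48)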
which on my machine is at 4x faster than either of the others. And you can get almost 3x faster yet again if you leave everything in a Long and pack the results in an array instead of using 1 to N:
final def toDigitExtremelyFast(num: Int, accum: Long = 0L, iter: Int = 0): Long = {
  if (num == 0) accum | (iter.toLong << 48) // digit count goes in the top bits
  else {
    val next = num / 10
    toDigitExtremelyFast(next, accum | ((num - 10 * next).toLong << (4 * iter)), iter + 1)
  }
}

// A loop, instead of a (1 to N).map, for the 1.2k-number case:
{
  var i = 10000
  val a = new Array[Long](1201)
  while (i <= 11200) {
    a(i - 10000) = toDigitExtremelyFast(i)
    i += 1
  }
  a
}
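A small sketch of how to read the packed Long back: the digit count sits in the bits from 48 up, and each digit occupies one nibble, least significant digit lowest:
def unpackDigits(packed: Long): Seq[Int] = {
  val count = (packed >>> 48).toInt
  // Nibble 0 holds the least significant digit, so walk from the top
  // nibble down to recover the digits most-significant-first.
  (count - 1 to 0 by -1).map(i => ((packed >> (4 * i)) & 0xF).toInt)
}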
As with many things, performance tuning is highly dependent on exactly what you want to do. In contrast, library design has to balance many different concerns. I do think it's worth noticing where the library is sub-optimal with respect to performance, but this isn't really one of those cases IMO; the flexibility is worth it for the common use cases.

print the closure definition/source in Groovy

Does anyone know how to print the source of a closure in Groovy?
For example, I have this closure (bound to a):
def a = { it.twice() }
I would like to get the String "it.twice()" or "{ it.twice() }".
Just a simple toString of course won't work:
a.toString() // results in: Script1$_run_closure1_closure4_closure6@12f1bf0
The short answer is: you can't. The long answer:
Depending on what you need the code for, you could perhaps get away with
// file: example1.groovy
def a = { it.twice() }
println a.metaClass.classNode.getDeclaredMethods("doCall")[0].code.text
// prints: { return it.twice() }
BUT
you will need the source code of the script available in the classpath AT RUNTIME, as explained in groovy.lang.MetaClass#getClassNode():
"Obtains a reference to the original AST for the MetaClass if it is available at runtime.
@return The original AST or null if it cannot be returned"
AND
the text trick does not really return the same code, just a code-like representation of the AST, as can be seen in this script:
// file: example2.groovy
def b = {p-> p.twice() * "p"}
println b.metaClass.classNode.getDeclaredMethods("doCall")[0].code.text
// prints: { return (p.twice() * p) }
Still, it might be useful as it is if you just want to take a quick look.
AND, if you have too much time on your hands and don't know what to do, you could write your own org.codehaus.groovy.ast.GroovyCodeVisitor to pretty-print it,
OR just reuse an existing one like groovy.inspect.swingui.AstNodeToScriptVisitor:
// file: example3.groovy
def c = { w ->
    [1, 2, 3].each {
        println "$it"
        (1..it).each { x ->
            println 'this seems' << ' somewhat closer' << ''' to the
original''' << " $x"
        }
    }
}

def node = c.metaClass.classNode.getDeclaredMethods("doCall")[0].code
def writer = new StringWriter()
node.visit new groovy.inspect.swingui.AstNodeToScriptVisitor(writer)
println writer
// prints: return [1, 2, 3].each({
//     this.println("$it")
//     return (1.. it ).each({ java.lang.Object x ->
//         return this.println('this seems' << ' somewhat closer' << ' to the \n original' << " $x")
//     })
// })
Now, if you want the original, exact, runnable code... you are out of luck.
I mean, you could use the source line information, but last time I checked, it wasn't really getting them right:
// file: example1.groovy
....
def code = a.metaClass.classNode.getDeclaredMethods("doCall")[0].code
println "$code.lineNumber $code.columnNumber $code.lastLineNumber $code.lastColumnNumber"
new File('example1.groovy').readLines()
... etc etc, you get the idea.
Line numbers should at least be near the original code, though.
That isn't possible in Groovy. Even when a Groovy script is run directly, without compiling it first, the script is converted into JVM bytecode. Closures aren't treated any differently; they are compiled like regular methods. By the time the code is run, the source code isn't available any more.
