How to concurrently create Future from Try?

The Scala Future has a fromTry method which "Creates an already completed Future with the specified result or exception."
The problem is that the newly created Future is already completed. Is it possible to have the evaluation of the Try done concurrently?
As an example, given a function that returns a Try:
def foo() : Try[Int] = {
  Thread.sleep(1000)
  Success(43)
}
How can the evaluation of foo be done concurrently?
A cursory approach would be to simply wrap a Future around the function:
val f : Future[Try[Int]] = Future { foo() }
But the desired return type would be a Future[Int]
val f : Future[Int] = ???
Effectively, how can the Try be flattened within a Future similar to the fromTry method?
There is a similar question; however, the question and answer there wait for the evaluation of the Try before constructing the completed Future.

Least ceremony is probably:
Future.unit.transform(_ => foo())
Special mention goes to @Dima for the suggestion of Future(foo().get), which is a bit shorter but might be slightly less readable.
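For context, here is a minimal self-contained sketch of the transform approach; the global execution context and the Await call are only there to make the demo observable and are not part of the answer.
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Success, Try}
def foo() : Try[Int] = { Thread.sleep(1000); Success(43) }
// Future.unit is already completed, but transform schedules the supplied
// function on the execution context, so foo() is evaluated off the calling thread.
val f : Future[Int] = Future.unit.transform(_ => foo())
println(Await.result(f, 2.seconds)) // 43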

Based on the comments,
Scala >= 2.13:
val f : Future[Int] = Future.delegate(Future.fromTry(foo()))
Scala < 2.13:
val f : Future[Int] = Future(foo()).flatMap(Future.fromTry)

Related

Scala - Executing every element until they all have finished

I cannot figure out why my function invokeAll does not give out the correct output/work properly. Any solutions? (No futures or parallel collections allowed and the return type needs to be Seq[Int])
def invokeAll(work: Seq[() => Int]): Seq[Int] = {
  //this is what we should return as an output "return res.toSeq"
  //res cannot be changed!
  val res = new Array[Int](work.length)
  var list = mutable.Set[Int]()
  var n = res.size
  val procedure = (0 until n).map(work =>
    new Runnable {
      def run {
        //add the finished element/Int to list
        list += work
      }
    }
  )
  val threads = procedure.map(new Thread(_))
  threads.foreach(x => x.start())
  threads.foreach(x => (x.join()))
  res ++ list
  //this should be the final output ("return res.toSeq")
  return res.toSeq
}
OMG, I know a Java programmer when I see one :)
Don't do this, it's not Java!
val results: Future[Seq[Int]] = Future.traverse(work)(w => Future(w()))
This is how you do it in Scala.
This gives you a Future with the results of all executions, that will be satisfied when all work is finished. You can use .map, .flatMap etc. to access and transform those results. For example
val sumOfAll: Future[Int] = results.map(_.sum)
Or (in the worst case, when you want to just give the result back to imperative code), you could block and wait on the future to get ahold of the actual result (don't do this unless you are absolutely desperate): Await.result(results, 1 year)
If you want the results as array, results.map(_.toArray) will do that ... but you really should not: arrays aren't really a good choice for the vast majority of use cases in scala. Just stick with Seq.
The main problem in your code is that you are using a fixed-size array and trying to append elements to it with the ++ (concatenation) operator: res ++ list. That expression produces a new sequence, but you never store it in a val, so it is simply discarded.
If you removed the last line return res.toSeq, the value of res ++ list would become the return value: your work.length array of zeros followed by whatever ended up in list. It's worth reading more about Scala collections; most of them are immutable, and it is good practice to prefer immutable data structures. Arrays in Scala are fixed-size and do not accumulate values when used as the left operand of ++.
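To make the diagnosis concrete, here is a minimal sketch (my own illustration, not the answerer's recommendation) of how the thread-based version could be repaired under the question's no-futures constraint: each Runnable writes its result into its own slot of res, so nothing has to be concatenated afterwards.
def invokeAll(work: Seq[() => Int]): Seq[Int] = {
  val res = new Array[Int](work.length)
  // one thread per task; each thread writes only its own index, so no shared growable collection is needed
  val threads = work.zipWithIndex.map { case (w, i) =>
    new Thread(new Runnable { def run(): Unit = res(i) = w() })
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
  res.toSeq
}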

takeRightWhile() method in scala

I might be missing something, but recently I came across a task to get the last characters of a string according to some condition. For example I have a string: "this_is_separated_values_5". Now I want to extract 5 as an Int.
Note: the number of parts separated by _ is not defined.
If I had a method takeRightWhile(f: Char => Boolean) on a string it would be trivial: takeRightWhile(ch => ch != '_'). Moreover it would be efficient: a straightforward implementation would involve finding the last index of _ and taking a substring, while this method would skip the first step and provide better average time complexity.
UPDATE: Guys, all the variations of str.reverse.takeWhile(_!='_').reverse are quite inefficient, as you actually use additional O(n) space. If you wanted to implement takeRightWhile efficiently, you could iterate starting from the right, accumulate the result in a string builder or whatever else, and return the result. I am asking about this kind of method, not the implementation that was already described and declined in the question itself.
Question: Does this kind of method exist in scala standard library? If no, is there method combination from the standard library to achieve the same in minimum amount of lines?
Thanks in advance.
Possible solution:
str.reverse.takeWhile(_!='_').reverse
Update
You can go from right to left with following expression using foldRight:
str.toList.foldRight(List.empty[Char]) {
  case (item, acc) => item :: acc
}
Here you need to check condition and stop adding items after condition met. For this you can pass a flag to accumulated value:
val (_, list) = str.toList.foldRight((false, List.empty[Char])) {
  case (item, (false, list)) if item != '_' => (false, item :: list)
  case (_, (_, list)) => (true, list)
}
val res = list.mkString.toInt
This solution is even more inefficient than the one with the double reverse:
The implementation of foldRight on List uses a combination of reverse and foldLeft
You cannot break out of foldRight, so you need the flag to skip all remaining items after the condition is met
I'd go with this:
val s = "string_with_following_number_42"
s.split("_").reverse.head
// res:String = 42
This is a naive attempt and by no means optimized. It splits the String into an Array of Strings, reverses it, and takes the first element. Note that, because the reversing happens after the splitting, the order of the characters is correct.
I am not exactly sure about the problem you are facing. My understanding is that you have a string of the format xxx_xxx_xx_...._xxx_123 and you want to extract the part at the end as an Int.
import scala.util.Try
val yourStr = "xxx_xxx_xxx_xx...x_xxxxx_123"
val yourInt = yourStr.split('_').last.toInt
// But remember that the above is unsafe so you may want to take it as Option
val yourIntOpt = Try(yourStr.split('_').last.toInt).toOption
Or... let's say your requirement is to collect a right suffix for as long as some boolean condition remains true.
import scala.util.Try
val yourStr = "xxx_xxx_xxx_xx...x_xxxxx_123"
val rightSuffix = yourStr.reverse.takeWhile(c => c != '_').reverse
val yourInt = rightSuffix.toInt
// but above is unsafe so
val yourIntOpt = Try(rightSuffix.toInt).toOption
Comment if your requirement is different from this.
You can use StringBuilder and lastIndexWhere to locate the last separator and take everything after it:
val str = "this_is_separated_values_5"
val sb = new StringBuilder(str)
val lastSep = sb.lastIndexWhere(ch => ch == '_')
val suffix = str.substring(lastSep + 1) // "5"
val value = suffix.toInt
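For reference, here is a minimal sketch of the extension method the UPDATE in the question describes, scanning from the right without building a reversed copy (the takeRightWhile name and the enclosing object are my own, not standard library code):
object StringRightOps {
  implicit class RightTakeOps(val s: String) extends AnyVal {
    // Walk backwards while the predicate holds, then take the suffix in a single substring call.
    def takeRightWhile(p: Char => Boolean): String = {
      var i = s.length - 1
      while (i >= 0 && p(s.charAt(i))) i -= 1
      s.substring(i + 1)
    }
  }
}
import StringRightOps._
"this_is_separated_values_5".takeRightWhile(_ != '_').toInt // 5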

Means for performing background operations in Scala

I'd like to do some time consuming task in background. So I need to start computation in a different thread, be able to check if it is completed (maybe failed) and be able to abort the computation when it becomes unnecessary. After computation is ended it should call synchronized callback function to store computed value.
It may be programmed as a wrapper over the Thread class, but I suppose this basic functionality is already implemented in some Scala library. I've tried to search, but found only Akka, which is too much for my simple task. scala.concurrent.ExecutionContext has a useful execute method, but it returns no object with which to check the status of the computation or abort it on demand.
What library contains already described functionality?
I've checked scala.concurrent.Future. It lacks the ability to abort a computation, which is crucial. I use the following strategy: compute some expensive function in the background and provide a reasonable default. If the arguments to the function change, I drop the original computation and start a new one. I could not imagine how to rewrite this strategy in terms of Future.flatMap.
I'll give a demonstration of how use futures with Twitter's implementation, since you asked for cancellation:
import com.twitter.util.{ Await, Future, FuturePool }
def computeFast(i: Int) = { Thread.sleep(1000); i + 1 }
def computeSlow(i: Int) = { Thread.sleep(1000000); i + 1 }
val fastComputation = FuturePool.unboundedPool(computeFast(1))
val slowComputation = FuturePool.unboundedPool(computeSlow(1))
Now you can poll for a result:
scala> fastComputation.poll
res0: Option[com.twitter.util.Try[Int]] = Some(Return(2))
scala> slowComputation.poll
res1: Option[com.twitter.util.Try[Int]] = None
Or set callbacks:
fastComputation.onSuccess(println)
slowComputation.onFailure(println)
Most of the time it's better to use map and flatMap to describe how to compose computations, though.
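As a small illustration (reusing the computeFast and fastComputation values defined above), composing instead of polling might look like this:
// flatMap sequences the second computation after the first; map combines the two results.
val combined: com.twitter.util.Future[Int] =
  fastComputation.flatMap { a =>
    FuturePool.unboundedPool(computeFast(a)).map(b => a + b)
  }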
Cancellation is a little more complicated (this is just a demo—you'll want to provide your own cancellation logic):
import com.twitter.util.Promise
def cancellableComputation(i: Int): Future[Int] = {
  val p = Promise[Int]
  p.setInterruptHandler {
    case t =>
      println("Cancelling the computation")
      p.setException(t)
  }
  FuturePool.unboundedPool(computeSlow(i)).onSuccess(p.setValue)
  p
}
And then:
scala> val myFuture = cancellableComputation(10)
myFuture: com.twitter.util.Future[Int] = Promise@129588027(state=Interruptible(List(),<function1>))
scala> myFuture.poll
res4: Option[com.twitter.util.Try[Int]] = None
scala> myFuture.raise(new Exception("Stop this thing"))
Cancelling the computation
scala> myFuture.poll
res6: Option[com.twitter.util.Try[Int]] = Some(Throw(java.lang.Exception: Stop this thing))
You could probably do something similar with the standard library's futures.
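As a rough idea of what that could look like with scala.concurrent alone: the cancellable helper below, its name and its flag-polling contract are assumptions of this sketch rather than a standard library API, and the running body has to check the flag itself because the JVM cannot forcibly stop it.
import java.util.concurrent.CancellationException
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.Try
def cancellable[T](body: AtomicBoolean => T): (Future[T], () => Unit) = {
  val cancelled = new AtomicBoolean(false)
  val p = Promise[T]()
  // Run the body on the execution context; whichever completion happens first
  // (normal result or cancellation) wins the promise.
  Future { p.tryComplete(Try(body(cancelled))) }
  val cancel = () => {
    cancelled.set(true)
    p.tryFailure(new CancellationException("computation cancelled"))
    ()
  }
  (p.future, cancel)
}
val (result, cancel) = cancellable { flag =>
  var i = 0
  while (!flag.get && i < 1000) { Thread.sleep(10); i += 1 }
  i
}
cancel() // result now fails with CancellationException; the loop observes the flag and exits early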

Lazy evaluation of chained functional methods in Groovy

What I've seen in Java
Java 8 allows lazy evaluation of chained functions in order to avoid performance penalties.
For instance, I can have a list of values and process it like this:
someList.stream()
.filter( v -> v > 0)
.map( v -> v * 4)
.filter( v -> v < 100)
.findFirst();
I pass a number of closures to the methods called on a stream to process the values in a collection and then only grab the first one.
This looks as if the code had to iterate over the entire collection, filter it, then iterate over the entire result and apply some logic, then filter the whole result again and finally grab just a single element.
In reality, the compiler handles this in a smarter way and optimizes the number of iterations required.
This is possible because no actual processing is done until findFirst is called. This way the compiler knows what I want to achieve and it can figure out how to do it in an efficient manner.
Take a look at this video of a presentation by Venkat Subramaniam for a longer explanation.
What I'd like to do in Groovy
While answering a question about Groovy here on StackOverflow I figured out a way to perform the task the OP was trying to achieve in a more readable manner. I refrained from suggesting it because it meant a performance decrease.
Here's the example:
collectionOfSomeStrings.inject([]) { list, conf -> if (conf.contains('homepage')) { list } else { list << conf.trim() } }
Semantically, this could be rewritten as
collectionOfSomeStrings.grep{ !it.contains('homepage')}.collect{ it.trim() }
I find it easier to understand but the readability comes at a price. This code requires a pass of the original collection and another iteration over the result of grep. This is less than ideal.
It doesn't look like the GDK's grep, collect and findAll methods are lazily evaluated like the methods in Java 8's streams API. Is there any way to have them behave like this? Is there any alternative library in Groovy that I could use?
I imagine it might be possible to use Java 8 somehow in Groovy and have this functionality. I'd welcome an explanation on the details but ideally, I'd like to be able to do that with older versions of Java.
I found a way to combine closures but it's not really what I want to do. I'd like to chain not only closures themselves but also the functions I pass them to.
Googling for Groovy and Streams mostly yields I/O related results. I haven't found anything of interest by searching for lazy evaluation, functional and Groovy as well.
Adding the suggestion as an answer taking cfrick's comment as an example:
@Grab( 'com.bloidonia:groovy-stream:0.8.1' )
import groovy.stream.Stream
List integers = [ -1, 1, 2, 3, 4 ]
//.first() or .last() whatever is needed
Stream.from integers filter{ it > 0 } map{ it * 4 } filter{ it < 15 }.collect()
Tim, I still know what you did a few summers ago. ;-)
Groovy 2.3 supports JDK 8 (groovy.codehaus.org/Groovy+2.3+release+notes). Your example works fine using Groovy closures:
[-1,1,2,3,4].stream().filter{it>0}.map{it*4}.filter{it < 100}.findFirst().get()
If you can't use jdk8, you can follow the suggestion from the other answer or achieve "the same" using RxJava/RxGroovy:
@Grab('com.netflix.rxjava:rxjava-groovy:0.20.7')
import rx.Observable
Observable.from( [-1, 1, 2, 3, 4, 666] )
.filter { println "f1 $it"; it > 0 }
.map { println "m1 $it"; it * 4 }
.filter { println "f2 $it"; it < 100 }
.subscribe { println "result $it" }

asyncio with map&reduce flavor and without flooding the event loop

I am trying to use asyncio in real applications and it doesn't go that easily; the help of asyncio gurus is badly needed.
Tasks that spawn other tasks without flooding the event loop (Success!)
Consider a task like crawling the web starting from some "seeding" web pages. Each web page leads to the generation of new downloading tasks in exponential(!) progression. However, we want neither to flood the event loop nor to overload our network. We'd like to control the task flow. This is what I achieve well with a modification of Maxime's nice solution proposed here:
https://mail.python.org/pipermail/python-list/2014-July/687823.html
map & reduce (Fail)
Well, I'd also need a very natural thing, a kind of map() & reduce() or functools.reduce() if we are on Python 3 already. That is, I'd need to call a "summarizing" function for all the downloading tasks completed on links from a page. This is where I fail :(
I'd propose an oversimplified but still nice test to model the use case: let's use a Fibonacci function implemented in its inefficient form. That is, let coro_sum() be what we apply in reduce() and coro_fib be what we apply with map(). Something like this:
@asyncio.coroutine
def coro_sum(x):
    return sum(x)

@asyncio.coroutine
def coro_fib(x):
    if x < 2:
        return 1
    res_coro = executor_pool.spawn_task_when_arg_list_of_coros_ready(
        coro=coro_sum,
        arg_coro_list=[coro_fib(x - 1), coro_fib(x - 2)])
    return res_coro
So that we could run the following tests.
Test #1 on one worker:
executor_pool = ExecutorPool(workers=1)
executor_pool.as_completed( coro_fib(x) for x in range(20) )
Test #2 on two workers:
executor_pool = ExecutorPool(workers=2)
executor_pool.as_completed( coro_fib(x) for x in range(20) )
It is very important that each coro_fib() and coro_sum() invocation is done via a Task on some worker, not just spawned implicitly and unmanaged!
It would be cool to find asyncio gurus interested in this very natural goal.
Your help and ideas would be very much appreciated.
best regards
Valery
There are multiple ways to compute the Fibonacci series asynchronously. First, check that the explosive variant actually fails in your case:
@asyncio.coroutine
def coro_sum(summands):
    return sum(summands)

@asyncio.coroutine
def coro_fib(n):
    if n == 0:
        s = 0
    elif n == 1:
        s = 1
    else:
        summands, _ = yield from asyncio.wait([coro_fib(n-2), coro_fib(n-1)])
        s = yield from coro_sum(f.result() for f in summands)
    return s
You could replace summands with:
a = yield from coro_fib(n-2) # don't return until it's ready
b = yield from coro_fib(n-1)
s = yield from coro_sum([a, b])
In general, to prevent the exponential growth you could use the asyncio.Queue (synchronization via communication) or asyncio.Semaphore (synchronization using a mutex) primitives.
