Weird Behavior of Scala Future and Thread.sleep - multithreading

I'm currently writing code to extend the Future companion object. One function I want to implement is any:
//returns the future that computes the first value computed from the list. If the first one fails, fail.
def any[T](fs: List[Future[T]]): Future[T] = {
  val p = Promise[T]()
  fs foreach { f =>
    f onComplete {
      case Success(v) => p trySuccess v
      case Failure(e) => p tryFailure e
    }
  }
  p.future
}
I tried to test my code with
test("A list of Futures return only the first computed value") {
  val nums = (0 until 10).toList
  val futures =
    nums map { n => Future { Thread.sleep(n*1000); n } }
  val v = Await.result(Future.any(futures), Duration.Inf)
  assert(v === 0)
}
But the returned value is 1, not 0. When I switched the sleeping time from n*1000 to (n+1)*1000, it works fine (returns 0).
Is there any special effect when sleep is called with 0?

Thread.sleep is a blocking operation in your Future but you are not signaling to the ExecutionContext that you are doing so, so the behavior will vary depending on what ExecutionContext you use and how many processors your machine has. Your code works as expected with ExecutionContext.global if you add blocking:
nums map { n => Future { blocking { Thread.sleep(n*1000); n } } }
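For reference, here is a minimal self-contained sketch of the same idea. It assumes ExecutionContext.global and defines any as a standalone function instead of extending the Future companion object. The names AnyDemo, firstOf and stepMillis are made up for illustration, and the sleep step is shortened so the example finishes quickly.

```scala
import scala.concurrent.{Await, Future, Promise, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object AnyDemo {
  // Completes with the result (success or failure) of whichever future finishes first.
  def any[T](fs: List[Future[T]]): Future[T] = {
    val p = Promise[T]()
    fs.foreach(f => f.onComplete(t => p.tryComplete(t)))
    p.future
  }

  // Hypothetical helper: starts `count` futures where future n sleeps n * stepMillis ms.
  // `blocking` tells the global pool that the thread is about to block,
  // so the pool may add threads instead of starving the other futures.
  def firstOf(count: Int, stepMillis: Long): Int = {
    val futures = (0 until count).toList.map { n =>
      Future { blocking { Thread.sleep(n * stepMillis); n } }
    }
    Await.result(any(futures), Duration.Inf)
  }
}
```

Calling AnyDemo.firstOf(10, 100L) should give 0 here, since the future that sleeps for 0 ms has nothing to wait for.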

Since the function is named any, I think you implemented it the right way: it completes with whichever future finishes first.
But if you want the result of the first future in the list, just take the first element of the fs argument and complete the promise with it.

Related

How to find first desired result from kotlin coroutines Deferred<> (server)

I've built a sharding library and I'm trying to add coroutine functionality to it. The following snippet returns the first true result that it finds:
override fun emailExists(email: String): Boolean {
    return runBlocking {
        shards
            .asyncAll { userDao.emailExists(email) }
            .map { it.await() }
            .firstOrNull { it }
    } ?: false
}
The shards.asyncAll method is:
fun <T> async(
    shardId: Long,
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> T
): Deferred<T> {
    return scope.async(context, start) {
        selectShard(shardId)
        block()
    }
}
fun <T> asyncAll(
    shardIds: Collection<Long> = this.shardIds,
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> T
): List<Deferred<T>> {
    return shardIds.map { async(it, context, start, block) }
}
This works, but it consults the shards in order. If the first shard takes a very long time to return and doesn't return true, while the second shard returns true immediately, we still wait as long as the first shard took. Is there a better way to wait on a collection of Deferred<> values and process them in the order they complete, so that I can exit as early as possible?
Even if you were to get your answer early, runBlocking would still wait for all the coroutines you started to complete before returning.
In order to run the kind of coroutine race you're looking for:
When the first task completes with true, it needs to store that result and cancel the parent job of all the other tasks; and
The other tasks should properly abort when cancelled.
Unfortunately, I'm pretty sure Kotlin doesn't include a function that does this, so you have to do it yourself. The easiest way is probably to have each task throw an exception that indicates a true result. You can then use awaitAll on the group, catch the exception, and extract the result.

How to safely select across channels where some may get concurrently closed?

While answering a question I attempted to implement a setup where the main thread joins the efforts of the CommonPool to execute a number of independent tasks in parallel (this is how java.util.streams operates).
I create as many actors as there are CommonPool threads, plus a channel for the main thread. The actors use rendezvous channels:
val resultChannel = Channel<Double>(UNLIMITED)
val poolComputeChannels = (1..commonPool().parallelism).map {
    actor<Task>(CommonPool) {
        for (task in channel) {
            task.execute().also { resultChannel.send(it) }
        }
    }
}
val mainComputeChannel = Channel<Task>()
val allComputeChannels = poolComputeChannels + mainComputeChannel
This allows me to distribute the load by using a select expression to find an idle actor for each task:
select {
    allComputeChannels.forEach { chan ->
        chan.onSend(task) {}
    }
}
So I send all the tasks and close the channels:
launch(CommonPool) {
    jobs.forEach { task ->
        select {
            allComputeChannels.forEach { chan ->
                chan.onSend(task) {}
            }
        }
    }
    allComputeChannels.forEach { it.close() }
}
Now I have to write the code for the main thread. Here I decided to serve both the mainComputeChannel, executing the tasks submitted to the main thread, and the resultChannel, accumulating the individual results into the final sum:
return runBlocking {
    var completedCount = 0
    var sum = 0.0
    while (completedCount < NUM_TASKS) {
        select<Unit> {
            mainComputeChannel.onReceive { task ->
                task.execute().also { resultChannel.send(it) }
            }
            resultChannel.onReceive { result ->
                sum += result
                completedCount++
            }
        }
    }
    resultChannel.close()
    sum
}
This gives rise to the situation where mainComputeChannel may be closed from a CommonPool thread, but the resultChannel still needs serving. If the channel is closed, onReceive will throw an exception and onReceiveOrNull will immediately select with null. Neither option is acceptable. I didn't find a way to avoid registering the mainComputeChannel if it's closed, either. If I use if (!mainComputeChannel.isClosedForReceive), it will not be atomic with the registration call.
This leads me to my question: what would be a good idiom to select over channels where some may get closed by another thread while others are still live?
The kotlinx.coroutines library is currently missing a primitive that would make this convenient. The outstanding proposal is to add a receiveOrClosed function and an onReceiveOrClosed clause for select, which would make writing code like this possible.
However, you will still have to manually track the fact that your mainComputeChannel was closed and stop selecting on it when it was. So, using a proposed onReceiveOrClosed clause you'll write something like this:
// outside of loop
var mainComputeChannelClosed = false
// inside loop
select<Unit> {
    if (!mainComputeChannelClosed) {
        mainComputeChannel.onReceiveOrClosed {
            if (it.isClosed) mainComputeChannelClosed = true
            else { /* do something with it */ }
        }
    }
    // more clauses
}
See https://github.com/Kotlin/kotlinx.coroutines/issues/330 for details.
There are no proposals on the table to further simplify this kind of pattern.

Scala future execution

I have two futures. I want to execute them in order. For example:
val ec: ExecutionContextExecutor = ExecutionContext.Implicits.global
val first=Future.successful(...)
val second=Future.successful(...)
When first is completed then second should be executed. The problem is that second should return Future[Object] not Future[Unit] so
I cannot use the completed, andThen, etc. functions.
I cannot block the process using Await or Thread.sleep(...).
I cannot use a for loop, since the execution context is defined like this.
first.flatMap(_ => second) will not execute them in order.
How can I do that?
As soon as you assign a Future to a val, that Future is scheduled and will execute as soon as possible. To prevent this you have two options:
Define the Future in a def
Define the Future where you want to use it.
Here's an example of #1:
def first: Future[Int] = Future { Thread.sleep(5000); 1 }
def second(i: Int): Future[Unit] = Future { println(i) }
first.flatMap(i => second(i))
And here's an example of #2:
for (
  i <- Future { Thread.sleep(5000); 1 };
  _ <- Future { println(i) }
) yield ()
Both examples will wait for 5 seconds and then print 1.
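Since the question needs the second step to return a value instead of Unit, here is a small sketch of option #1 where second returns a Future[String]. The names SequencingDemo, run and record are made up for illustration, and the log only exists to make the ordering visible. It assumes the global execution context.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequencingDemo {
  def run(): List[String] = {
    val log = ListBuffer.empty[String]
    // Hypothetical helper: thread-safe append so both futures can write to the log.
    def record(s: String): Unit = log.synchronized { log += s; () }

    // Both are defs, so nothing is scheduled until they are called.
    def first: Future[Int] = Future { Thread.sleep(200); record("first"); 1 }
    def second(i: Int): Future[String] = Future { record("second"); s"got $i" }

    // second is only created (and therefore only started) after first has completed.
    val result = Await.result(first.flatMap(i => second(i)), 5.seconds)
    record(result)
    log.synchronized { log.toList }
  }
}
```

Await.result is used here only so the sketch can return the log; in real code you would keep the resulting Future and continue with map or flatMap.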

Returning a list of computations from a method that uses a sequence of Futures

I want to return the list of the computations from a method that uses a list of Futures:
def foo: List[Long] = {
  val res = List(1, 2, 3) map {
    x => Future { someCalculation(x) }
  }
  Future.sequence(res)
  // what to do next?
}
def someCalculation(a: Int): Long = //....
How can I do this?
There is a key point to understand when it comes to futures: to go from Future[T] to T you need to await the result of the operation, which blocks the calling thread, so you should avoid it where you can to keep your program performant. The correct approach is to keep working with asynchronous abstractions as much as you can, and move blocking up the call stack.
The Future class has a number of methods you can use to chain other async operations, such as map, onComplete, onSuccess (deprecated since Scala 2.12), etc.
If you really need to wait for the result, then there is Await.result:
val listOfFutures: List[Future[Long]] = List(1, 2, 3) map {
  x => Future { someCalculation(x) }
}
// now we have a Future[List[Long]]
val futureList: Future[List[Long]] = Future.sequence(listOfFutures)
// Keep being async here and compute the results asynchronously. Remember that map on a Future lets you pass an f: A => B to a Future[A] and obtain a Future[B]. Here we call the sum method on the list.
val yourOperationAsync: Future[Long] = futureList.map { _.sum }
// Do this only when you need to get the result (needs import scala.concurrent.duration._)
val result: Long = Await.result(yourOperationAsync, 1.second)
Well, the whole point of using Future is to make it asynchronous, i.e.:
def foo: Future[List[Long]] = {
  val res = List(1, 2, 3) map {
    x => Future { someCalculation(x) }
  }
  Future.sequence(res)
}
This would be the ideal solution. But if you wish to wait, you can block for the result and then return it:
val ans = Future.sequence(res)
Await.result(ans, Duration.Inf)
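Putting both variants side by side, a self-contained sketch might look like the following. The body of someCalculation is elided in the question, so the one below is only a stand-in, and SequenceDemo, fooAsync and total are names invented for this example. It assumes the global execution context.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceDemo {
  // Stand-in for the elided someCalculation from the question (an assumption).
  def someCalculation(a: Int): Long = a * 10L

  // Preferred: stay asynchronous and return Future[List[Long]].
  def fooAsync: Future[List[Long]] =
    Future.sequence(List(1, 2, 3).map(x => Future(someCalculation(x))))

  // Further processing can also stay inside the Future.
  def total: Future[Long] = fooAsync.map(_.sum)

  // Only if the caller really needs a plain List[Long]: block at the edge.
  def foo: List[Long] = Await.result(fooAsync, 5.seconds)
}
```

Future.sequence keeps the order of the input list, so foo returns the results in the same order as List(1, 2, 3), whichever future finishes first.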

How does one return from a groovy closure and stop its execution?

I would like to return from a closure, like one would if using a break statement in a loop.
For example:
largeListOfElements.each { element ->
    if (element == specificElement) {
        // do some work
        return // but this will only leave this iteration and start the next
    }
}
In the above if statement I would like to stop iterating through the list and leave the closure to avoid unnecessary iterations.
I've seen a solution where an exception is thrown within the closure and caught outside, but I'm not too fond of that solution.
Are there any solutions to this, other than changing the code to avoid this kind of algorithm?
I think you want to use find instead of each (at least for the specified example). Closures don't directly support break.
Under the covers, Groovy doesn't actually use a closure for find either; it uses a for loop.
Alternatively, you could write your own enhanced version of find/each iterator that takes a conditional test closure, and another closure to call if a match is found, having it break if a match is met.
Here's an example:
Object.metaClass.eachBreak = { ifClosure, workClosure ->
    for (Iterator iter = delegate.iterator(); iter.hasNext();) {
        def value = iter.next()
        if (ifClosure.call(value)) {
            workClosure.call(value)
            break
        }
    }
}
def a = ["foo", "bar", "baz", "qux"]
a.eachBreak( { it.startsWith("b") } ) {
    println "working on $it"
}
// prints "working on bar"
I think you're working on the wrong level of abstraction. The .each block does exactly what it says: it executes the closure once for each element. What you probably want instead is to use List.indexOf to find the right specificElement, and then do the work you need to do on it.
If you want to process all elements until a specific one was found you could also do something like this:
largeListOfElements.find { element ->
    // do some work
    element == specificElement
}
Although you can use this with any kind of "break condition".
I just used this to process the first n elements of a collection by returning
counter++ >= n
at the end of the closure.
As I understand Groovy, the way to shortcut these kinds of loops would be to throw a user-defined exception. I don't know what the exact syntax would be (I'm not a Groovy programmer), but Groovy runs on the JVM, so it would be something like:
class ThisOne extends Exception {Object foo; ThisOne(Object foo) {this.foo=foo;}}
try { x.each{ if(it.isOk()) throw new ThisOne(it); false} }
catch(ThisOne x) { print x.foo + " is ok"; }
After paulmurray's answer I wasn't sure myself what would happen with an Exception thrown from within a closure, so I whipped up a JUnit Test Case that is easy to think about:
class TestCaseForThrowingExceptionFromInsideClosure {
    @Test
    void testEarlyReturnViaException() {
        try {
            [ 'a', 'b', 'c', 'd' ].each {
                System.out.println(it)
                if (it == 'c') {
                    throw new Exception("Found c")
                }
            }
        }
        catch (Exception exe) {
            System.out.println(exe.message)
        }
    }
}
The output of the above is:
a
b
c
Found c
But remember that "one should NOT use Exceptions for flow control", see in particular this Stack Overflow question: Why not use exceptions as regular flow of control?
So the above solution is less than ideal in any case. Just use:
class TestCaseForThrowingExceptionFromInsideClosure {
    @Test
    void testEarlyReturnViaFind() {
        def curSolution
        [ 'a', 'b', 'c', 'd' ].find {
            System.out.println(it)
            curSolution = it
            return (it == 'c') // if true is returned, find() stops
        }
        System.out.println("Found ${curSolution}")
    }
}
The output of the above is also:
a
b
c
Found c
Today I faced a similar problem while working with an each closure. I wanted to break the flow of execution based on my condition but couldn't do it.
The easiest way to do this in Groovy is to use any() on a list instead of each if you wish to return a boolean based on some condition.
The good old for loop still works in Groovy for your use case:
for (element in largeListOfElements) {
    if (element == specificElement) {
        // do some work
        return
    }
}
