Bridge channel to a sequence

Bridge channel to a sequence - multithreading

This code is based on Coroutines guide example: Fan-out
val inputProducer = produce<String>(CommonPool) {
(0..inputArray.size).forEach {
send(inputArray[it])
}
}
val resultChannel = Channel<Result>(10)
repeat(threadCount) {
launch(CommonPool) {
inputProducer.consumeEach {
resultChannel.send(getResultFromData(it))
}
}
}
What is the right way to create a Sequence<Result> that will provide results?

You can get the channel .iterator() from the ReceiveChannel and then wrap that channel iterator into a Sequence<T>, implementing its normal Iterator<T> that blocks waiting for the result on each request:
fun <T> ReceiveChannel<T>.asSequence(context: CoroutineContext) =
Sequence {
val iterator = iterator()
object : AbstractIterator<T>() {
override fun computeNext() = runBlocking(context) {
if (!iterator.hasNext())
done() else
setNext(iterator.next())
}
}
}
val resultSequence = resultChannel.asSequence(CommonPool)

I had the same problem, and in the end I came up with this rather unusual/convoluted solution:
fun Channel<T>.asSequence() : Sequence<T> {
val itr = this.iterator()
return sequence<Int> {
while ( runBlocking {itr.hasNext()} ) yield( runBlocking<T> { itr.next() } )
}
}
I do not think it is particular efficient (go with the one provided by #hotkey), but it has a certain appeal to me at least.

Related

Alternative way to check if a string contains multiple strings in Kotlin?

I currently have the following code
fun main() {
// Build.Fingerprint sample strings
val debug_fingerprint: String? = "Company/device/device:11/3526.4353/0064504902000:userdebug/com-d,dev-keys"
val dev_fingerprint: String? = "Company/device/device:11/526.4353/0064504902000:user/com-d,dev-keys"
val user_fingerprint: String? = "Company/device/device:11/526.4353/0064504902000:user/com-p,release-keys"
//Testing
val osVarient: String = user_fingerprint?.let {
when {
check(listOf("userdebug", "dev-keys"), it) -> "Userdebug"
check(listOf("user", "dev-keys"), it) -> "Userdevsigned"
check(listOf("user", "release-keys"), it) -> "User"
else -> "Unknown variant"
}
} ?: run {
"Unknown variant null"
}
print(osVarient)
}
fun check(args: List<String>,fingerprint: String): Boolean {
for(arg in args) {
if(!fingerprint.contains(arg)){
return false
}
}
return true
}
The above code works but I'm wondering if there is more elegant way of writing the code.
Are there any alternatives in Kotlin to compare multiple substrings to a string?

You may use a dictionary for OS variants.
Iterable<T>.all is a builtin kotlin stdlib function to check a predicate on all elements.
fun main() {
val userFingerprint: String? = "Company/device/device:11/526.4353/0064504902000:user/com-p,release-keys"
val variantMap = mapOf(
"Userdebug" to listOf("userdebug", "dev-keys"),
"Userdevsigned" to listOf("user", "dev-keys"),
"User" to listOf("user", "release-keys")
)
val osVariant = userFingerprint?.let { fingerprint ->
variantMap.entries
.firstOrNull { check(it.value, fingerprint) }
?.key
} ?: "Unknown variant null"
println(osVariant)
}
fun check(fingerprintTypes: List<String>, fingerprint: String): Boolean {
return fingerprintTypes.all { it in fingerprint }
// return fingerprintTypes.all { fingerprint.contains(it) }
}

Update list with the results of different threads in Kotlin

I want to update a list with the result of different threads.
mainFunction(): List<A> {
var x: List<A> = listOf<A>()
val job = ArrayList<Job>()
val ans = mainScope.async {
var i = 0
for (j in (0..5)) {
job.add(
launch {
val res = async {
func1()
}
x += res.await()
}
)
}
job.joinAll()
}
ans.await()
return x
}
fun func1(): List<A> {
//Perform some operation to get list abc
var abc: List<A> = listOf<A>()
delay(1000)
return abc
}
The list "x" is not getting updated properly.
Sometimes, it appends the "res".. sometimes it does not.
Is there a thread-safe way to modify lists like this?
New implementation:
mainFunction(): List<A> {
var x: List<A> = listOf<A>()
val ans = mainScope.async {
List(6) {
async{
func1()
}
}.awaitAll()
}
print(ans)
for (item in ans) {
x+= item as List<A> // item is of type Kotlin.Unit
}
}

Short answer
Here is a simpler version of what you're doing, which avoids the synchronization problems you might be running into:
suspend fun mainFunction(): List<A> {
return coroutineScope {
List(6) { async { func1() } }.awaitAll()
}
}
You can read the long answer if you want to unpack this. I will explain the different things in the original code that are not really idiomatic and could be replaced.
Long answer
There are multiple non-idiomatic things in the code in the question, so I'll try to address each of them.
Indexed for loop for 0-based range
If you just want to repeat an operation several times, it's simpler to just use repeat(6) instead of for (j in 0..5). It's easier to read, especially when you don't need the index variable:
suspend fun mainFunction(): List<A> {
var x: List<A> = listOf<A>()
val job = ArrayList<Job>()
val ans = mainScope.async {
repeat(6) {
job.add(
launch {
val res = async {
func1()
}
x += res.await()
}
)
}
job.joinAll()
}
ans.await()
return x
}
Creating lists with a loop
If what you want is to create a list out of that loop, you can also use List(size) { computeElement() } instead of repeat (or for), which makes use of the List factory function:
suspend fun mainFunction(): List<A> {
var x: List<A> = listOf<A>()
val ans = mainScope.async {
val jobs = List(6) {
launch {
val res = async {
func1()
}
x += res.await()
}
}
jobs.joinAll()
}
ans.await()
return x
}
Extra async
There is no need to wrap your launches here with an extra async, you can just use your scope on the launches directly:
suspend fun mainFunction(): List<A> {
var x: List<A> = listOf<A>()
val jobs = List(6) {
mainScope.launch {
val res = async {
func1()
}
x += res.await()
}
}
jobs.joinAll()
return x
}
async + immediate await
Using async { someFun() } and then immediately await-ing this Deferred result is equivalent to just calling someFun() directly (unless you're using a different scope or context, which you aren't doing here for the inner most logic).
So you can replace the inner-most part:
val res = async {
func1()
}
x += res.await()
By just x += func1(), which gives:
suspend fun mainFunction(): List<A> {
var x: List<A> = listOf<A>()
val jobs = List(6) {
mainScope.launch {
x += func1()
}
}
jobs.joinAll()
return x
}
launch vs async
If you want results, it is usually more practical to use async instead of launch. When you use launch, you have to store the result somewhere manually (which makes you run into synchronization problems like you have now). With async, you get a Deferred<T> value which you can then await(), and when you have a list of Deferred there is no synchronization issues when you await them all.
So the general idea of the previous code is bad practice and might bite you because it requires manual synchronization. You can replace it by:
suspend fun mainFunction(): List<A> {
val deferredValues = List(6) {
mainScope.async {
func1()
}
}
val x = deferredValues.awaitAll()
return x
}
Or simpler:
suspend fun mainFunction(): List<A> {
return List(6) {
mainScope.async {
func1()
}
}.awaitAll()
}
Manual joins vs coroutineScope
It is usually a smell to join() jobs manually. If you want to wait for some coroutines to finish, it is more idiomatic to launch all those coroutines within a coroutineScope { ... } block, which will suspend until all child coroutines finish.
Here we have already replaced all launch that we join() with async calls that we await, so this doesn't really apply anymore, because we still need to await() the deferred values in order to get the results. However, since we are in a suspend function already, we can still use coroutineScope instead of an external scope like mainScope to ensure that we don't leak any coroutines:
suspend fun mainFunction(): List<A> {
return coroutineScope {
List(6) { async { func1() } }.awaitAll()
}
}

If you just want to perform some tasks concurrently and get a list of all the finished results at the end, you could do this:
val jobs = (0..5).map { mainScope.async { func1() } }
val results = jobs.awaitAll()

Get return value from thread, is this Kotlin code thread safe?

I would like to run some treads, wait till all of them are finished and get the results.
Possible way to do that would be in the code below. Is it thread safe though?
import kotlin.concurrent.thread
sealed class Errorneous<R>
data class Success<R>(val result: R) : Errorneous<R>()
data class Fail<R>(val error: Exception) : Errorneous<R>()
fun <R> thread_with_result(fn: () -> R): (() -> Errorneous<R>) {
var r: Errorneous<R>? = null
val t = thread {
r = try { Success(fn()) } catch (e: Exception) { Fail(e) }
}
return {
t.join()
r!!
}
}
fun main() {
val tasks = listOf({ 2 * 2 }, { 3 * 3 })
val results = tasks
.map{ thread_with_result(it) }
.map{ it() }
println(results)
}
P.S.
Are there better built-in tools in Kotlin to do that? Like process 10000 tasks with pool of 10 threads?
It should be threads, not coroutines, as it will be used with legacy code and I don't know if it works well with coroutines.

Seems like Java has Executors that doing exactly that
fun <R> execute_in_parallel(tasks: List<() -> R>, threads: Int): List<Errorneous<R>> {
val executor = Executors.newFixedThreadPool(threads)
val fresults = executor.invokeAll(tasks.map { task ->
Callable<Errorneous<R>> {
try { Success(task()) } catch (e: Exception) { Fail(e) }
}
})
return fresults.map { future -> future.get() }
}

how to cap kotlin coroutines maximum concurrency

I've got a Sequence (from File.walkTopDown) and I need to run a long-running operation on each of them. I'd like to use Kotlin best practices / coroutines, but I either get no parallelism, or way too much parallelism and hit a "too many open files" IO error.
File("/Users/me/Pictures/").walkTopDown()
.onFail { file, ex -> println("ERROR: $file caused $ex") }
.filter { ... only big images... }
.map { file ->
async { // I *think* I want async and not "launch"...
ImageProcessor.fromFile(file)
}
}
This doesn't seem to run it in parallel, and my multi-core CPU never goes above 1 CPU's worth. Is there a way with coroutines to run "NumberOfCores parallel operations" worth of Deferred jobs?
I looked at Multithreading using Kotlin Coroutines which first creates ALL the jobs then joins them, but that means completing the Sequence/file tree walk completly bfore the heavy processing join step, and that seems... iffy! Splitting it into a collect and a process step means the collection could run way ahead of the processing.
val jobs = ... the Sequence above...
.toSet()
println("Found ${jobs.size}")
jobs.forEach { it.await() }

This isn't specific to your problem, but it does answer the question of, "how to cap kotlin coroutines maximum concurrency".
EDIT: As of kotlinx.coroutines 1.6.0 (https://github.com/Kotlin/kotlinx.coroutines/issues/2919), you can use limitedParallelism, e.g. Dispatchers.IO.limitedParallelism(123).
Old solution: I thought to use newFixedThreadPoolContext at first, but 1) it's deprecated and 2) it would use threads and I don't think that's necessary or desirable (same with Executors.newFixedThreadPool().asCoroutineDispatcher()). This solution might have flaws I'm not aware of by using Semaphore, but it's very simple:
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit
/**
* Maps the inputs using [transform] at most [maxConcurrency] at a time until all Jobs are done.
*/
suspend fun <TInput, TOutput> Iterable<TInput>.mapConcurrently(
maxConcurrency: Int,
transform: suspend (TInput) -> TOutput,
) = coroutineScope {
val gate = Semaphore(maxConcurrency)
this#mapConcurrently.map {
async {
gate.withPermit {
transform(it)
}
}
}.awaitAll()
}
Tests (apologies, it uses Spek, hamcrest, and kotlin test):
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.test.TestCoroutineDispatcher
import org.hamcrest.MatcherAssert.assertThat
import org.hamcrest.Matchers.greaterThanOrEqualTo
import org.hamcrest.Matchers.lessThanOrEqualTo
import org.spekframework.spek2.Spek
import org.spekframework.spek2.style.specification.describe
import java.util.concurrent.atomic.AtomicInteger
import kotlin.test.assertEquals
#OptIn(ExperimentalCoroutinesApi::class)
object AsyncHelpersKtTest : Spek({
val actionDelay: Long = 1_000 // arbitrary; obvious if non-test dispatcher is used on accident
val testDispatcher = TestCoroutineDispatcher()
afterEachTest {
// Clean up the TestCoroutineDispatcher to make sure no other work is running.
testDispatcher.cleanupTestCoroutines()
}
describe("mapConcurrently") {
it("should run all inputs concurrently if maxConcurrency >= size") {
val concurrentJobCounter = AtomicInteger(0)
val inputs = IntRange(1, 2).toList()
val maxConcurrency = inputs.size
// https://github.com/Kotlin/kotlinx.coroutines/issues/1266 has useful info & examples
runBlocking(testDispatcher) {
print("start runBlocking $coroutineContext\n")
// We have to run this async so that the code afterwards can advance the virtual clock
val job = launch {
testDispatcher.pauseDispatcher {
val result = inputs.mapConcurrently(maxConcurrency) {
print("action $it $coroutineContext\n")
// Sanity check that we never run more in parallel than max
assertThat(concurrentJobCounter.addAndGet(1), lessThanOrEqualTo(maxConcurrency))
// Allow for virtual clock adjustment
delay(actionDelay)
// Sanity check that we never run more in parallel than max
assertThat(concurrentJobCounter.getAndAdd(-1), lessThanOrEqualTo(maxConcurrency))
print("action $it after delay $coroutineContext\n")
it
}
// Order is not guaranteed, thus a Set
assertEquals(inputs.toSet(), result.toSet())
print("end mapConcurrently $coroutineContext\n")
}
}
print("before advanceTime $coroutineContext\n")
// Start the coroutines
testDispatcher.advanceTimeBy(0)
assertEquals(inputs.size, concurrentJobCounter.get(), "All jobs should have been started")
testDispatcher.advanceTimeBy(actionDelay)
print("after advanceTime $coroutineContext\n")
assertEquals(0, concurrentJobCounter.get(), "All jobs should have finished")
job.join()
}
}
it("should run one at a time if maxConcurrency = 1") {
val concurrentJobCounter = AtomicInteger(0)
val inputs = IntRange(1, 2).toList()
val maxConcurrency = 1
runBlocking(testDispatcher) {
val job = launch {
testDispatcher.pauseDispatcher {
inputs.mapConcurrently(maxConcurrency) {
assertThat(concurrentJobCounter.addAndGet(1), lessThanOrEqualTo(maxConcurrency))
delay(actionDelay)
assertThat(concurrentJobCounter.getAndAdd(-1), lessThanOrEqualTo(maxConcurrency))
it
}
}
}
testDispatcher.advanceTimeBy(0)
assertEquals(1, concurrentJobCounter.get(), "Only one job should have started")
val elapsedTime = testDispatcher.advanceUntilIdle()
print("elapsedTime=$elapsedTime")
assertThat(
"Virtual time should be at least as long as if all jobs ran sequentially",
elapsedTime,
greaterThanOrEqualTo(actionDelay * inputs.size)
)
job.join()
}
}
it("should handle cancellation") {
val jobCounter = AtomicInteger(0)
val inputs = IntRange(1, 2).toList()
val maxConcurrency = 1
runBlocking(testDispatcher) {
val job = launch {
testDispatcher.pauseDispatcher {
inputs.mapConcurrently(maxConcurrency) {
jobCounter.addAndGet(1)
delay(actionDelay)
it
}
}
}
testDispatcher.advanceTimeBy(0)
assertEquals(1, jobCounter.get(), "Only one job should have started")
job.cancel()
testDispatcher.advanceUntilIdle()
assertEquals(1, jobCounter.get(), "Only one job should have run")
job.join()
}
}
}
})
Per https://play.kotlinlang.org/hands-on/Introduction%20to%20Coroutines%20and%20Channels/09_Testing, you may also need to adjust compiler args for the tests to run:
compileTestKotlin {
kotlinOptions {
// Needed for runBlocking test coroutine dispatcher?
freeCompilerArgs += "-Xuse-experimental=kotlin.Experimental"
freeCompilerArgs += "-Xopt-in=kotlin.RequiresOptIn"
}
}
testImplementation 'org.jetbrains.kotlinx:kotlinx-coroutines-test:1.4.1'

The problem with your first snippet is that it doesn't run at all - remember, Sequence is lazy, and you have to use a terminal operation such as toSet() or forEach(). Additionally, you need to limit the number of threads that can be used for that task via constructing a newFixedThreadPoolContext context and using it in async:
val pictureContext = newFixedThreadPoolContext(nThreads = 10, name = "reading pictures in parallel")
File("/Users/me/Pictures/").walkTopDown()
.onFail { file, ex -> println("ERROR: $file caused $ex") }
.filter { ... only big images... }
.map { file ->
async(pictureContext) {
ImageProcessor.fromFile(file)
}
}
.toList()
.forEach { it.await() }
Edit:
You have to use a terminal operator (toList) befor awaiting the results

I got it working with a Channel. But maybe I'm being redundant with your way?
val pipe = ArrayChannel<Deferred<ImageFile>>(20)
launch {
while (!(pipe.isEmpty && pipe.isClosedForSend)) {
imageFiles.add(pipe.receive().await())
}
println("pipe closed")
}
File("/Users/me/").walkTopDown()
.onFail { file, ex -> println("ERROR: $file caused $ex") }
.forEach { pipe.send(async { ImageFile.fromFile(it) }) }
pipe.close()

This doesn't preserve the order of the projection but otherwise limits the throughput to at most maxDegreeOfParallelism. Expand and extend as you see fit.
suspend fun <TInput, TOutput> (Collection<TInput>).inParallel(
maxDegreeOfParallelism: Int,
action: suspend CoroutineScope.(input: TInput) -> TOutput
): Iterable<TOutput> = coroutineScope {
val list = this#inParallel
if (list.isEmpty())
return#coroutineScope listOf<TOutput>()
val brake = Channel<Unit>(maxDegreeOfParallelism)
val output = Channel<TOutput>()
val counter = AtomicInteger(0)
this.launch {
repeat(maxDegreeOfParallelism) {
brake.send(Unit)
}
for (input in list) {
val task = this.async {
action(input)
}
this.launch {
val result = task.await()
output.send(result)
val completed = counter.incrementAndGet()
if (completed == list.size) {
output.close()
} else brake.send(Unit)
}
brake.receive()
}
}
val results = mutableListOf<TOutput>()
for (item in output) {
results.add(item)
}
return#coroutineScope results
}
Example usage:
val output = listOf(1, 2, 3).inParallel(2) {
it + 1
} // Note that output may not be in same order as list.

Why not use the asFlow() operator and then use flatMapMerge?
someCoroutineScope.launch(Dispatchers.Default) {
File("/Users/me/Pictures/").walkTopDown()
.asFlow()
.filter { ... only big images... }
.flatMapMerge(concurrencyLimit) { file ->
flow {
emit(runInterruptable { ImageProcessor.fromFile(file) })
}
}.catch { ... }
.collect()
}
Then you can limit the simultaneous open files while still processing them concurrently.

To limit the parallelism to some value there is limitedParallelism function starting from the 1.6.0 version of the kotlinx.coroutines library. It can be called on CoroutineDispatcher object. So to limit threads for parallel execution we can write something like:
val parallelismLimit = Runtime.getRuntime().availableProcessors()
val limitedDispatcher = Dispatchers.Default.limitedParallelism(parallelismLimit)
val scope = CoroutineScope(limitedDispatcher) // we can set limitedDispatcher for the whole scope
scope.launch { // or we can set limitedDispatcher for a coroutine launch(limitedDispatcher)
File("/Users/me/Pictures/").walkTopDown()
.onFail { file, ex -> println("ERROR: $file caused $ex") }
.filter { ... only big images... }
.map { file ->
async {
ImageProcessor.fromFile(file)
}
}.toList().awaitAll()
}
ImageProcessor.fromFile(file) will be executed in parallel using parallelismLimit number of threads.

This will cap coroutines to workers. I'd recommend watching https://www.youtube.com/watch?v=3WGM-_MnPQA
package com.example.workers
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.ReceiveChannel
import kotlinx.coroutines.channels.produce
import kotlin.system.measureTimeMillis
class ChannellibgradleApplication
fun main(args: Array<String>) {
var myList = mutableListOf<Int>(3000,1200,1400,3000,1200,1400,3000)
runBlocking {
var myChannel = produce(CoroutineName("MyInts")) {
myList.forEach { send(it) }
}
println("Starting coroutineScope ")
var time = measureTimeMillis {
coroutineScope {
var workers = 2
repeat(workers)
{
launch(CoroutineName("Sleep 1")) { theHardWork(myChannel) }
}
}
}
println("Ending coroutineScope $time ms")
}
}
suspend fun theHardWork(channel : ReceiveChannel<Int>)
{
for(m in channel) {
println("Starting Sleep $m")
delay(m.toLong())
println("Ending Sleep $m")
}
}

Is it possible to mock accessors by Mockito in Kotlin?

Is it possible to mock getter and setter of the property by Mockito? Something like this:
#Test
fun three() {
val m = mock<Ddd>() {
// on { getQq() }.doReturn("mocked!")
}
assertEquals("mocked!", m.qq)
}
open class Ddd {
var qq : String = "start"
set(value) {
field = value + " by setter"
}
get() {
return field + " by getter"
}
}

To mock getter just write:
val m = mock<Ddd>()
`when`(m.qq).thenReturn("42")
also i suggest to use mockito-kotlin, to use useful extensions and functions like whenever:
val m = mock<Ddd>()
whenever(m.qq).thenReturn("42")

Complementing IRus' answer, you could also use the following syntax:
val mockedObj = mock<SomeClass> {
on { funA() } doReturn "valA"
on { funB() } doReturn "valB"
}
or
val mockedObj = mock<SomeClass> {
on(it.funA()).thenReturn("valA")
on(it.funB()).thenReturn("valB")
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Bridge channel to a sequence - multithreading

Related

Alternative way to check if a string contains multiple strings in Kotlin?

Update list with the results of different threads in Kotlin

Get return value from thread, is this Kotlin code thread safe?

how to cap kotlin coroutines maximum concurrency

Is it possible to mock accessors by Mockito in Kotlin?

Categories

Resources