Reactive streams map operator never getting executed - groovy

For the below code
Mono<String> input =
Mono.just("input")
.map {
println "inside map"
it + "added"
}
.transform {
Mono.just("hello")
}
input.subscribe {println it}
The console looks like as below.
16:11:49.056 [main] DEBUG reactor.util.Loggers$LoggerFactory - Using Slf4j logging framework
hello
The code inside the map function was never executed. I understand that transform method executes at assembly time rather than the subscription.
Why did Reactor just decide to not process my upstream map operator. Did it intelligently decide that since I am not in anyway referring to the output of the map operator that it need not execute map at all ?
Is this behaviour configurable ?

The reason is that transform does not automatically subscribe to your original Mono. It's your responsibility to chain your logic onto it. Since nothing subscribes to it, it will never get triggered.
As the example you sent is dummy, it's difficult to say what would be the right thing to do. It depends on your use case.
A few thing you can do, though:
Get rid of transform and just simply use then operator:
Mono<String> input =
Mono.just("input")
.map {
println "inside map"
it + "added"
}
.then(Mono.just("hello"))
If for some reason you need transform, then chain your logic onto your original Mono:
Mono<String> input =
Mono.just("input")
.map {
println "inside map"
it + "added"
}
.transform {
it.then(Mono.just("hello"))
}

Related

Coordinating emission and subscription in Kotlin coroutines with hot flows

I am trying to design an observable task-like entity which would have the following properties:
Reports its current state changes reactively
Shares state and result events: new subscribers will also be notified if the change happens after they've subscribed
Has a lifecycle (backed by CoroutineScope)
Doesn't have suspend functions in the interface (because it has a lifecycle)
The very basic code is something like this:
class Worker {
enum class State { Running, Idle }
private val state = MutableStateFlow(State.Idle)
private val results = MutableSharedFlow<String>()
private val scope = CoroutineScope(Dispatchers.Default)
private suspend fun doWork(): String {
println("doing work")
return "Result of the work"
}
fun start() {
scope.launch {
state.value = State.Running
results.emit(doWork())
state.value = State.Idle
}
}
fun state(): Flow<State> = state
fun results(): Flow<String> = results
}
The problems with this arise when I want to "start the work after I'm subscribed". There's no clear way to do that. The simplest thing doesn't work (understandably):
fun main() {
runBlocking {
val worker = Worker()
// subscriber 1
launch {
worker.results().collect { println("received result $it") }
}
worker.start()
// subscriber 2 can also be created "later" and watch
// for state()/result() changes
}
}
This prints only "doing work" and never prints a result. I understand why this happens (because collect and start are in separate coroutines, not synchronized in any way).
Adding a delay(300) to coroutine inside doWork "fixes" things, results are printed, but I'd like this to work without artificial delays.
Another "solution" is to create a SharedFlow from results() and use its onSubscription to call start(), but that didn't work either last time I've tried.
My questions are:
Can this be turned into something that works or is this design initially flawed?
If it is flawed, can I take some other approach which would still hit all the goals I have specified in the beginning of the post?
Your problem is that your SharedFlow has no buffer set up, so it is emitting results to its (initially zero) current collectors and immediately forgetting them. The MutableSharedFlow() function has a replay parameter you can use to determine how many previous results it should store and replay to new collectors. You will need to decide what replay amount to use based on your use case for this class. For simply displaying latest results in a UI, a common choice is a replay of 1.
Depending on your use case, you may want to give your CoroutineScope a SupervisorJob() in its context so it isn't destroyed by any child job failing.
Side note, your state() and results() functions should be properties by Kotlin convention, since they do nothing but return references. Personally, I would also have them return read-only StateFlow/SharedFlow instead of just Flow to clarify that they are not cold.

In Kotlin Native, how to keep an object around in a separate thread, and mutate its state from any other thead without using C pointers?

I'm exploring Kotlin Native and have a program with a bunch of Workers doing concurrent stuff
(running on Windows, but this is a general question).
Now, I wanted to add simple logging. A component that simply logs strings by appending them as new lines to a file that is kept open in 'append' mode.
(Ideally, I'd just have a "global" function...
fun log(text:String) {...} ]
...that I would be able to call from anywhere, including from "inside" other workers and that would just work. The implication here is that it's not trivial to do this because of Kotlin Native's rules regarding passing objects between threads (TLDR: you shouldn't pass mutable objects around. See: https://github.com/JetBrains/kotlin-native/blob/master/CONCURRENCY.md#object-transfer-and-freezing ).
Also, my log function would ideally accept any frozen object. )
What I've come up with are solutions using DetachedObjectGraph:
First, I create a detached logger object
val loggerGraph = DetachedObjectGraph { FileLogger("/foo/mylogfile.txt")}
and then use loggerGraph.asCPointer() ( asCPointer() ) to get a COpaquePointer to the detached graph:
val myPointer = loggerGraph.asCPointer()
Now I can pass this pointer into the workers ( via the producer lambda of the Worker's execute function ), and use it there. Or I can store the pointer in a #ThreadLocal global var.
For the code that writes to the file, whenever I want to log a line, I have to create a DetachedObjectGraph object from the pointer again,
and attach() it in order to get a reference to my fileLogger object:
val fileLogger = DetachedObjectGraph(myPointer).attach()
Now I can call a log function on the logger:
fileLogger.log("My log message")
This is what I've come up with looking at the APIs that are available (as of Kotlin 1.3.61) for concurrency in Kotlin Native,
but I'm left wondering what a better approach would be ( using Kotlin, not resorting to C ). Clearly it's bad to create a DetachedObjectGraph object for every line written.
One could pose this question in a more general way: How to keep a mutable resource open in a separate thread ( or worker ), and send messages to it.
Side comment: Having Coroutines that truly use threads would solve this problem, but the question is about how to solve this task with the APIs currently ( Kotlin 1.3.61 ) available.
You definitely shouldn't use DetachedObjectGraph in the way presented in the question. There's nothing to prevent you from trying to attach on multiple threads, or if you pass the same pointer, trying to attach to an invalid one after another thread as attached to it.
As Dominic mentioned, you can keep the DetachedObjectGraph in an AtomicReference. However, if you're going to keep DetachedObjectGraph in an AtomicReference, make sure the type is AtomicRef<DetachedObjectGraph?> and busy-loop while the DetachedObjectGraph is null. That will prevent the same DetachedObjectGraph from being used by multiple threads. Make sure to set it to null, and repopulate it, in an atomic way.
However, does FileLogger need to be mutable at all? If you're writing to a file, it doesn't seem so. Even if so, I'd isolate the mutable object to a separate worker and send log messages to it rather than doing a DetachedObjectGraph inside an AtomicRef.
In my experience, DetachedObjectGraph is super uncommon in production code. We don't use it anywhere at the moment.
To isolate mutable state to a Worker, something like this:
class MutableThing<T:Any>(private val worker:Worker = Worker.start(), producer:()->T){
private val arStable = AtomicReference<StableRef<T>?>(null)
init {
worker.execute(TransferMode.SAFE, {Pair(arStable, producer).freeze()}){
it.first.value = StableRef.create(it.second()).freeze()
}
}
fun <R> access(block:(T)->R):R{
return worker.execute(TransferMode.SAFE, {Pair(arStable, block).freeze()}){
it.second(it.first.value!!.get())
}.result
}
}
object Log{
private val fileLogger = MutableThing { FileLogger() }
fun log(s:String){
fileLogger.access { fl -> fl.log(s) }
}
}
class FileLogger{
fun log(s:String){}
}
The MutableThing uses StableRef internally. producer makes the mutable state you want to isolate. To log something, call Log.log, which will wind up calling the mutable FileLogger.
To see a basic example of MutableThing, run the following test:
#Test
fun goIso(){
val mt = MutableThing { mutableListOf("a", "b")}
val workers = Array(4){Worker.start()}
val futures = mutableListOf<Future<*>>()
repeat(1000) { rcount ->
val future = workers[rcount % workers.size].execute(
TransferMode.SAFE,
{ Pair(mt, rcount).freeze() }
) { pair ->
pair.first.access {
val element = "ttt ${pair.second}"
println(element)
it.add(element)
}
}
futures.add(future)
}
futures.forEach { it.result }
workers.forEach { it.requestTermination() }
mt.access {
println("size: ${it.size}")
}
}
The approach you've taken is pretty much correct and the way it's supposed to be done.
The thing I would add is, instead of passing around a pointer around. You should pass around a frozen FileLogger, which will internally hold a reference to a AtomicRef<DetachedObjectGraph>, the the attaching and detaching should be done internally. Especially since DetachedObjectGraphs are invalid once attached.

"this" argument in boost bind

I am writing multi-threaded server that handles async read from many tcp sockets. Here is the section of code that bothers me.
void data_recv (void) {
socket.async_read_some (
boost::asio::buffer(rawDataW, size_t(648*2)),
boost::bind ( &RPC::on_data_recv, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
} // RPC::data_recvW
void on_data_recv (boost::system::error_code ec, std::size_t bytesRx) {
if ( rawDataW[bytesRx-1] == ENDMARKER { // <-- this code is fine
process_and_write_rawdata_to_file
}
else {
read_socket_until_endmarker // <-- HELP REQUIRED!!
process_and_write_rawadata_to_file
}
}
Nearly always the async_read_some reads in data including the endmarker, so it works fine. Rarely, the endmarker's arrival is delayed in the stream and that's when my program fails. I think it fails because I have not understood how boost bind works.
My first question:
I am confused with this boost totorial example , in which "this" does not appear in the handler declaration. ( Please see code of start_accept() in the example.) How does this work? Does compiler ignore the "this" ?
my second question:
In the on_data_recv() method, how do I read data from the same socket that was read in the on_data() method? In other words, how do I pass the socket as argument from calling method to the handler? when the handler is executed in another thread? Any help in form of a few lines of code that can fit into my "read_socket_until_endmarker" will be appreciated.
My first question: I am confused with this boost totorial example , in which "this" does not appear in the handler declaration. ( Please see code of start_accept() in the example.) How does this work? Does compiler ignore the "this" ?
In the example (and I'm assuming this holds for your functions as well) the start_accept() is a member function. The bind function is conveniently designed such that when you use & in front of its first argument, it interprets it as a member function that is applied to its second argument.
So while a code like this:
void foo(int x) { ... }
bind(foo, 3)();
Is equivalent to just calling foo(3)
Code like this:
struct Bar { void foo(int x); }
Bar bar;
bind(&foo, &bar, 3)(); // <--- notice the & before foo
Would be equivalent to calling bar.foo(3).
And thus as per your example
boost::bind ( &RPC::on_data_recv, this, // <--- notice & again
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred)
When this object is invoked inside Asio it shall be equivalent to calling this->on_data_recv(error, size). Checkout this link for more info.
For the second part, it is not clear to me how you're working with multiple threads, do you run io_service.run() from more than one thread (possible but I think is beyond your experience level)? It might be the case that you're confusing async IO with multithreading. I'm gonna assume that is the case and if you correct me I'll change my answer.
The usual and preferred starting point is to have just one thread running the io_service.run() function. Don't worry, this will allow you to handle many sockets asynchronously.
If that is the case, your two functions could easily be modified as such:
void data_recv (size_t startPos = 0) {
socket.async_read_some (
boost::asio::buffer(rawDataW, size_t(648*2)) + startPos,
boost::bind ( &RPC::on_data_recv, this,
startPos,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
} // RPC::data_recvW
void on_data_recv (size_t startPos,
boost::system::error_code ec,
std::size_t bytesRx) {
// TODO: Check ec
if (rawDataW[startPos + bytesRx-1] == ENDMARKER) {
process_and_write_rawdata_to_file
}
else {
// TODO: Error if startPos + bytesRx == 648*2
data_recv(startPos + bytesRx);
}
}
Notice though that the above code still has problems, the main one being that if the other side sent two messages quickly one after another, we could receive (in one async_read_some call) the full first message + part of the second message, and thus missing the ENDMARKER from the first one. Thus it is not enough to only test whether the last received byte is == to the ENDMARKER.
I could go on and modify this function further (I think you might get the idea on how), but you'd be better off using async_read_until which is meant exactly for this purpose.

Workaround for lack of generators/yield keyword in Groovy

Wondering if there is a way I can use sql.eachRow like a generator, to use it in a DSL context where a Collection or Iterator is expected. The use case I'm trying to go for is streaming JSON generation - what I'm trying to do is something like:
def generator = { sql.eachRow { yield it } }
jsonBuilder.root {
status "OK"
rows generator()
}
You would need continuation support (or similiar) for this to work to some extend. Groovy does not have continuations, the JVM also not. Normally continuation passing style works, but then the method eachRow would have to support that, which it of course does not. So the only way I see is a makeshift solution using threads or something like that. So maybe something like that would work for you:
def sync = new java.util.concurrent.SynchronousQueue()
Thread.start { sql.eachRow { sync.put(it) } }
jsonBuilder.root {
status "OK"
rows sync.take()
}
I am not stating, that this is a good solution, just a random consumer-producer-work-around for your problem.

How do I print a Groovy stack trace?

How do I print a Groovy stack trace? The Java method, Thread.currentThread().getStackTrace() produces a huge stack trace, including a lot of the Groovy internals. I'm seeing a function called twice from a StreamingMarkupBuilder that looks like it should only be called once and I would like to see why Groovy thinks it should be calling it twice.
Solution:
org.codehaus.groovy.runtime.StackTraceUtils.sanitize(new Exception()).printStackTrace()
Original answer:
A Google search returns the following information:
Apparently, there is a method in org.codehaus.groovy.runtime.StackTraceUtils called printSanitizedStackTrace. There isn't much documentation for the method, though there is a method called sanitize which is described as
remove all apparently groovy-internal
trace entries from the exception
instance This modifies the original
instance and returns it, it does not
clone
So I would try org.codehaus.groovy.runtime.StackTraceUtils.printSanitizedStackTrace(Throwable t) (it is static)
and see if that works for you.
I found this questions when searching for "spock print full stack trace".
My unit tests are written in Groovy, using the Spock testing framework and they're run in the context of a Gradle build.
The fix for me was as simple as adding exceptionFormat = 'full' to my Gradle test task specification:
test {
testLogging {
exceptionFormat = 'full'
}
}
I have designed this simple code for stack trace printing, based on artificial simulation of a NullPointerException.
This code produces the same output in both modes: from a Jenkinsfile (Pipeline) and from a normal .groovy script in a command line.
def getStackTrace() {
try {
null.class.toString() // simulate NPE
} catch (NullPointerException e) {
return e.getStackTrace()
}
return null
}
def printStackTrace() {
def stackTraceStr = ""
def callingFuncFound = false
for (StackTraceElement ste : getStackTrace()) {
if (callingFuncFound) {
stackTraceStr += ste.toString() + '\n'
}
if (!callingFuncFound && ste.toString().startsWith(this.class.name + '.printStackTrace(')) {
callingFuncFound = true
}
}
println(stackTraceStr)
}
Some explanations:
The output is concatenated into a single string to avoid being mixed with "[Pipeline] echo" message prefix of Jenkins Pipeline's println()).
The number of "unnecessary" upper stack trace elements related to the NPE is different in Jenkinsfile and in a normal command line. This is why I calculate callingFuncFound and don't use just something like e.getStackTrace()[2..-1] to skip them.

Resources