RxScala: How to keep the thread doing Observable.interval alive? - multithreading

I am trying to write a simple RxScala program:
import rx.lang.scala.Observable
import scala.concurrent.duration.DurationInt
import scala.language.{implicitConversions, postfixOps}
object Main {
  def main(args: Array[String]): Unit = {
    val o = Observable.interval(1 second)
    o.subscribe(println(_))
  }
}
When I run this program, I do not see anything printed out. I suspect this is because the thread producing the numbers in Observable.interval dies. I noticed a call to waitFor(o) in the RxScalaDemo, but I can't figure out where that is imported from.
How do I keep this program running forever, printing the number sequence?

Here is one way to block the main thread from exiting:
import java.util.concurrent.CountDownLatch

val o = Observable.interval(1 second)
val latch = new CountDownLatch(1)
o.subscribe(i => {
  print(i)
  if (i >= 5) latch.countDown()
})
latch.await()
This is a fairly common pattern: use CountDownLatch.await to block the main thread, then countDown the latch when you are done with whatever you are doing, thus releasing the main thread.

You're not seeing anything because your main method exits immediately after you subscribe to the Observable. At that point, your program is done.
A common trick for test programs like this is to read a byte from stdin once you've subscribed.

Related

How do I check something on an interval asynchronously in python3?

Let's say I have a file that I know is going to be updated sometime in the next minute. I want a Python function to fire (roughly) as soon as that happens.
I set up something rudimentary to await that input:
import os, time

def wait_for_update(file):
    # remember the modification time at the moment we start waiting
    modification_time = os.path.getmtime(file)
    modified = False
    while not modified:
        if modification_time == os.path.getmtime(file):
            time.sleep(1)
        else:
            modified = True
    with open(file) as f:
        file_contents = f.read()
    return file_contents
Theoretically, that should work fine.
However, I'm coming from a JavaScript background. In JavaScript, a single-threaded language, the equivalent of that function would stop the entire programme until the while loop finished, and precious seconds that could be spent processing other input would be wasted.
In JavaScript, there's a handy window.setTimeout() function that queues up a function to be fired at some point in the future, once the main thread is clear. It's not truly asynchronous; it just defers that operation a little into the future, leaving other stuff to happen in the meantime.
So that same process written in recursive JS might look something like this:
function waitForUpdate(file) {
  var modified = compareFileModificationDates();
  if (modified) {
    // do other things
  } else {
    // pass a callback; calling waitForUpdate(file) here would run it immediately
    setTimeout(function() { waitForUpdate(file); }, 1000);
  }
}
Do I need to do something similar in Python? Or is time.sleep fine?
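For comparison, here is a minimal sketch (added for illustration, not part of the original post) of what the non-blocking version looks like in Python with asyncio; await asyncio.sleep() suspends only this coroutine and hands control back to the event loop, much like setTimeout hands control back in JavaScript:

import asyncio, os

async def wait_for_update(path):
    original = os.path.getmtime(path)
    while os.path.getmtime(path) == original:
        await asyncio.sleep(1)   # frees the event loop for other tasks
    with open(path) as f:
        return f.read()

# asyncio.run(wait_for_update('watched_file.txt'))  # placeholder path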

Bug in Python? threading.Thread.start() does not always return

I have a tiny Python script which (in my eyes) makes threading.Thread.start() behave unexpectedly since it does not return immediately.
Inside a thread I want to call a method from a boost::python based object which will not return immediately.
To do so I wrap the object/method like this:
import threading
import time
import my_boostpython_lib

my_cpp_object = my_boostpython_lib.my_cpp_class()

def some_fn():
    # has to be here - otherwise .start() does not return
    # time.sleep(1)
    my_cpp_object.non_terminating_fn()  # blocks
    print("%x: 1" % threading.get_ident())

threading.Thread(target=some_fn).start()
print("%x: 2" % threading.get_ident())  # will not always be called!!
And everything works fine as long as I run some code before my_cpp_object.non_terminating_fn(). If I don't, .start() will block the same way as calling .run() directly would.
Printing just a line before calling the boost::python function is not enough, but e.g. printing two lines or calling time.sleep() makes start() return immediately as expected.
Can you explain this behavior? How would I avoid this (apart from calling sleep() before calling a boost::python function)?
This behavior is (as in most cases where you suspect a bug in an interpreter or compiler) not a bug in Python but a race condition combined with the behavior you have to expect because of the Python GIL (also discussed here).
As soon as the non-Python function my_cpp_object.non_terminating_fn() has been started, the GIL is not released until it returns, and this keeps the interpreter from executing any other Python code.
So time.sleep(1) doesn't help here anyway, because the code following my_cpp_object.non_terminating_fn() would not be executed until the GIL gets released.
In the case of boost::python, and of course only if you can modify the C/C++ part, you can release the GIL manually as described here.
A small example (from the link above) could look like this (in the boost::python wrapper code):
class scoped_gil_release {
public:
    inline scoped_gil_release() {
        m_thread_state = PyEval_SaveThread();
    }

    inline ~scoped_gil_release() {
        PyEval_RestoreThread(m_thread_state);
        m_thread_state = NULL;
    }

private:
    PyThreadState * m_thread_state;
};

int non_terminating_fn_wrapper() {
    scoped_gil_release scoped;
    return non_terminating_fn();
}
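Assuming the module then exports this GIL-releasing wrapper in place of non_terminating_fn (the boost::python module definition isn't shown, so that part is an assumption), the original Python snippet behaves as expected:

# Sketch under that assumption: with the GIL released inside the C++ call,
# Thread.start() returns immediately and the second print always runs,
# without needing the time.sleep(1) workaround.
import threading
import my_boostpython_lib

my_cpp_object = my_boostpython_lib.my_cpp_class()

threading.Thread(target=my_cpp_object.non_terminating_fn).start()
print("%x: 2" % threading.get_ident())  # now always reached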

Why does scala hang evaluating a by-name parameter in a Future?

The below (contrived) code attempts to print a by-name String parameter within a future, and return when the printing is complete.
import scala.concurrent._
import concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

class PrintValueAndWait {
  def printIt(param: => String): Unit = {
    val printingComplete = future {
      println(param); // why does this hang?
    }
    Await.result(printingComplete, Duration.Inf)
  }
}

object Go {
  val str = "Rabbits"
  new PrintValueAndWait().printIt(str)
}

object RunMe extends App {
  Go
}
However, when running RunMe, it simply hangs while trying to evaluate param. Changing printIt to take in its parameter by-value makes the application return as expected. Alternatively, changing printIt to simply print the value and return synchronously (in the same thread) seems to work fine also.
What's happening exactly here? Is this somehow related to the Go object not having been fully constructed yet, and so the str field not being visible yet to the thread attempting to print it? Is hanging the expected behaviour here?
I've tested with Scala 2.10.3 on both Mac OS Mavericks and Windows 7, on Java 1.7.
Your code is deadlocking on the initialization of the Go object. This is a known issue; see e.g. SI-7646 and this SO question.
Objects in Scala are lazily initialized, and a lock is taken during this time to prevent two threads from racing to initialize the object. However, if two threads simultaneously try to initialize an object and one depends on the other to complete, there will be a circular dependency and a deadlock.
In this particular case, the initialization of the Go object can only complete once new PrintValueAndWait().printIt(str) has completed. However, when param is a by name argument, essentially a code block gets passed in which is evaluated when it is used. In this case the str argument in new PrintValueAndWait().printIt(str) is shorthand for Go.str, so when the thread the future runs on tries to evaluate param it is essentially calling Go.str. But since Go hasn't completed initialization yet, it will try to initialize the Go object too. The other thread initializing Go has a lock on its initialization, so the future thread blocks. So the first thread is waiting on the future to complete before it finishes initializing, and the future thread is waiting for the first thread to finish initializing: deadlock.
In the by value case, the string value of str is passed in directly, so the future thread doesn't try to initialize Go and there is no deadlock.
Similarly, if you leave param as by name, but change Go as follows:
object Go {
  val str = "Rabbits"

  {
    val s = str
    new PrintValueAndWait().printIt(s)
  }
}
it won't deadlock, since the already evaluated local string value s is passed in, instead of Go.str, so the future thread won't try and initialize Go.

How to terminate a Python3 thread correctly while it's reading a stream

I'm using a thread to read strings from a stream (/dev/tty1) while processing other things in the main loop. I would like the thread to terminate together with the main program when I press CTRL-C.
from threading import Thread

class Reader(Thread):
    def run(self):
        with open('/dev/tty1', encoding='ascii') as myStream:
            for myString in myStream:
                print(myString)

    def quit(self):
        pass  # stop reading, close stream, terminate the thread

myReader = Reader()
myReader.start()

while True:
    try:
        pass  # do lots of stuff
    except KeyboardInterrupt:
        myReader.quit()
        raise
The usual solution - a boolean variable inside the run() loop - doesn't work here. What's the recommended way to deal with this?
I can just set the Daemon flag, but then I won't be able to use a quit() method which might prove valuable later (to do some clean-up). Any ideas?
AFAIK, there is no built-in mechanism for that in Python 3 (just as in Python 2). Have you tried the proven Python 2 approach with PyThreadState_SetAsyncExc, documented here and here, or the alternative tracing approach here?
Here's a slightly modified version of the PyThreadState_SetAsyncExc approach from above:
import threading
import inspect
import ctypes

def _async_raise(tid, exctype):
    """raises the exception, performs cleanup if needed"""
    if not inspect.isclass(exctype):
        exctype = type(exctype)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(tid), ctypes.py_object(exctype))
    if res == 0:
        raise ValueError("invalid thread id")
    elif res != 1:
        # "if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect"
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

def stop_thread(thread):
    _async_raise(thread.ident, SystemExit)
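Applied to the reader from the question, a usage sketch might look like this; note that the exception is only delivered the next time the thread executes Python bytecode, so it will not interrupt a read() that is blocked inside C code:

myReader = Reader()
myReader.start()
try:
    while True:
        pass  # do lots of stuff
except KeyboardInterrupt:
    stop_thread(myReader)  # raises SystemExit inside the reader thread
    raise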
Make your thread a daemon thread. When all non-daemon threads have exited, the program exits. So when Ctrl-C is passed to your program and the main thread exits, there's no need to explicitly kill the reader.
myReader = Reader()
myReader.daemon = True
myReader.start()

Linux - Named pipes - losing data

I am using named pipes for IPC. At times the data sent between the processes can be large and frequent. During these times I see lots of data loss. Are there any obvious problems in the code below that could cause this?
Thanks
#!/usr/bin/env groovy
import java.io.FileOutputStream;

def bytes = new File('/etc/passwd').bytes
def pipe = new File('/home/mohadib/pipe')

1000.times {
    def fos = new FileOutputStream(pipe)
    fos.write(bytes)
    fos.flush()
    fos.close()
}
#!/usr/bin/env groovy
import java.io.FileInputStream;
import java.io.ByteArrayOutputStream;

def pipe = new File('/home/mohadib/pipe')
def bos = new ByteArrayOutputStream()
def len = -1
byte[] buff = new byte[8192]
def i = 0

while (true) {
    def fis = new FileInputStream(pipe)
    while ((len = fis.read(buff)) != -1) bos.write(buff, 0, len)
    fis.close()
    bos.reset()
    i++
    println i
}
Named pipes lose their contents when the last process closes them. In your example, this can happen if the writer process does another iteration while the reader process is about to do fis.close(). No error is reported in this case.
A possible fix is to arrange that the reader process never closes the fifo. To get rid of the EOF condition when the last writer disconnects, open the fifo for writing, close the read end, reopen the read end and close the temporary write end.
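A sketch of the same idea in Python rather than Groovy (hypothetical reader code, not the poster's): the reader simply holds its own write descriptor on the FIFO, so the pipe never sees its last writer disconnect, read() never reports EOF between the writer's iterations, and buffered data is never discarded:

import os

pipe_path = '/home/mohadib/pipe'                 # path taken from the question
read_fd = os.open(pipe_path, os.O_RDONLY)        # blocks until the first writer opens
dummy_writer = os.open(pipe_path, os.O_WRONLY)   # never written to, just held open

while True:
    chunk = os.read(read_fd, 8192)               # blocks for data instead of hitting EOF
    if chunk:
        print('%d bytes' % len(chunk))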
This section gives me worries:
1000.times {
    def fos = new FileOutputStream(pipe)
    fos.write(bytes)
    fos.flush()
    fos.close()
}
I know that the underlying Unix write() system call does not always write the requested number of bytes. You have to check the return value to see what number was actually written.
I checked the docs for Java and it appears fos.write() has no return value, it just throws an IOException if anything goes wrong. What does Groovy do with exceptions? Are there any exceptions happening?
If you can, run this under strace and view the results of the read and write system calls. It's possible that the Java VM isn't doing the right thing with the write() system call. I know this can happen because I caught glibc's fwrite implementation doing that (ignoring the return value) two years ago.
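To make the partial-write point concrete, here is a small Python sketch of the general principle (illustrative only, not the Groovy/Java code above): a raw write may transfer fewer bytes than requested, so a correct writer loops on the return value:

import os

def write_all(fd, data):
    # os.write() wraps write(2) and returns the number of bytes actually
    # written, which can be less than len(data); keep writing the remainder.
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]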
