Sharing file descriptors in Python multiprocessing - python-3.x

I am trying to use Python's multiprocessing module to spawn a server to receive UDP messages, modify them a little, and pass them on to a grep process started with the subprocess module. Since the stdin for a Popen subprocess accepts a file descriptor, that's what I would like to pass to it.
The issue I am having is getting a file descriptor that communicates with the server process and that I can pass to the grep subprocess. I have done this in the past using plain os.fork() and os.pipe(), but I would like to use multiprocessing with the spawn start method now. I have tried taking a write descriptor from os.pipe, making it inheritable, and passing it as an argument to a new process via multiprocessing.Process. When I try to open it in the other process with os.fdopen(fd, 'wb'), I get an OSError for a bad file descriptor. Here is a snippet of the code I have tested.
import os
import multiprocessing as mp

def _listen_syslog(ip_address, port, write_pipe):
    f = os.fdopen(write_pipe, 'wb')
    # do stuff like write to the file

def listen_syslog(ip_address, port):
    r, w = os.pipe()
    os.set_inheritable(w, True)
    proc = mp.Process(target=_listen_syslog, args=(ip_address, port, w))
    proc.start()
    # this process doesn't need to write, so close it
    os.close(w)
    # this is the descriptor I want to pass to a grep subprocess stdin
    # a similar scenario has worked before using os.fork()
    return r
Finally, if it's not possible to do this with a pipe created via os.pipe(), can I use multiprocessing.Pipe() and use the file descriptors from the connection objects' fileno() functions directly? More importantly, is that safe to do as long as I don't use the connection objects for anything else?

I found a solution. I haven't figured out how to use os.pipe(), but if I use multiprocessing.Pipe(), I can use the file descriptor from each connection object by calling its fileno() function. Another thing I found is that if you want to use the file descriptors after the connection objects are no longer referenced, you have to call os.dup() on each file descriptor, or else they will close and you will get a bad file descriptor error when the connection objects get garbage collected.
import multiprocessing as mp
import os

def _listen_syslog(ip_address, port, write_pipe):
    f = os.fdopen(write_pipe.fileno(), 'wb')
    # do stuff

def listen_syslog(ip_address, port):
    r, w = mp.Pipe(False)
    proc = mp.Process(target=_listen_syslog, args=(ip_address, port, w))
    proc.start()
    return os.dup(r.fileno())
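For completeness, here is a minimal sketch of how the duplicated descriptor could be handed to a grep subprocess, as described in the question; the address, port, and pattern are placeholders, not values from the original code:

import os
import subprocess

read_fd = listen_syslog('0.0.0.0', 514)
# Popen accepts a raw file descriptor for stdin, so grep reads whatever
# the listener process writes into the pipe.
grep_proc = subprocess.Popen(['grep', 'some-pattern'], stdin=read_fd)
# The parent no longer needs its own copy of the descriptor once grep has it.
os.close(read_fd)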

Related

How do I correctly pass a shared list to a Process or Thread?

I want to start a new process that executes some .py file. The new process gets information from the parent process via stdin. The new process sends information (its results) back to the parent process via stdout. The results should be saved in a list (or similar) in the parent process. Sounds quite easy to accomplish, but I'm stuck...
Here is my code:
import threading
from multiprocessing import Process
from subprocess import Popen, PIPE

class MultiProcessTest:
    results: list = []
    path = 'xxx.py'

    # This method will be called in main
    def do_it(self):
        # The self.results list gets passed to the newly created process
        proc = Process(target=self._control, args=(self.results, self.path))
        proc.start()
        proc.join()
        # At this point the newly created process should be done with processing and all
        # generated results should be stored in the self.results list
        # But the list is empty!
        print(self.results)

    def _control(self, results: list, path):
        with Popen(path, stdout=PIPE, stderr=PIPE, stdin=PIPE) as proc:
            # I'm starting a thread for error handling
            err_thread = threading.Thread(target=_read_errors, args=(proc,), daemon=True)
            err_thread.start()
            # This thread gets the results list passed as an argument and will be waiting for
            # results of the newly created process
            receive_thread = threading.Thread(target=_receive, args=(results, proc.stdout))
            receive_thread.start()
            # In the real code I send some instructions at this point to the newly created
            # process via stdin
            receive_thread.join()
The receive function, which runs in its own thread, looks like this:
from typing import IO

def _receive(results: list, pipe: IO):
    line = pipe.readline().decode('utf-8')
    while line != '':
        results.append(line)
        line = pipe.readline().decode('utf-8')
I checked if the receive function is actually called, which it is. The receive function writes results to the list as expected. Unfortunately the list in the parent process remains empty.
I guess it has something to do with call-by-value/call-by-reference or with shared versus non-shared memory.
So...can anyone explain to me why the list stays empty?
Thanks in advance, your help is highly appreciated!
I finally found the problem.
With this line I'm spawning the process that will execute some worker program:
with Popen(path, stdout=PIPE, stderr=PIPE, stdin=PIPE) as proc:
While I do want the worker program to run in a new process, I don't want to spawn a new process at this point:
proc = Process(target=self._control, args=(self.results, self.path))
If I call self._control directly everything is working fine :)
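For context on why the list stayed empty: a child started via multiprocessing works on its own copy of self.results, so appends made there never reach the parent. If a separate process really were required, a managed list is one way to share it; a minimal sketch under that assumption, not taken from the original post:

from multiprocessing import Manager, Process

def worker(shared_results):
    # Appends to a Manager list proxy are forwarded back to the parent.
    shared_results.append('some result')

if __name__ == '__main__':
    with Manager() as manager:
        results = manager.list()
        proc = Process(target=worker, args=(results,))
        proc.start()
        proc.join()
        print(list(results))  # ['some result']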

How to redirect the stdout of a multiprocessing.Process

I'm using Python 3.7.4 and I have created two functions: the first one executes a callable using multiprocessing.Process, and the second one just prints "Hello World". Everything seems to work fine until I try redirecting stdout; doing so prevents me from getting any printed values during the process execution. I have simplified the example as much as possible, and this is the code that currently reproduces the problem.
These are my functions:
import io
import multiprocessing
from contextlib import redirect_stdout
def call_function(func: callable):
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=lambda: queue.put(func()))
    process.start()
    while True:
        if not queue.empty():
            return queue.get()

def print_hello_world():
    print("Hello World")
This works:
call_function(print_hello_world)
The previous code works and successfully prints "Hello World"
This does not work:
with redirect_stdout(io.StringIO()) as out:
    call_function(print_hello_world)
print(out.getvalue())
With the previous code I do not get anything printed in the console.
Any suggestion would be very much appreciated. I have been able to narrow the problem down to this point, and I think it is related to the process ending after the io.StringIO() is already closed, but I have no idea how to test my hypothesis, let alone how to implement a solution.
This is the workaround I found. It seems that if I use a file instead of a StringIO object I can get things to work.
with open("./tmp_stdout.txt", "w") as tmp_stdout_file:
with redirect_stdout(tmp_stdout_file):
call_function(print_hello_world)
stdout_str = ""
for line in tmp_stdout_file.readlines():
stdout_str += line
stdout_str = stdout_str.strip()
print(stdout_str) # This variable will have the captured stdout of the process
Another thing that might be important to know is that stdout in the spawned process is buffered, meaning that the prints only get displayed after the function has executed or failed. To solve this, you can force stdout to flush when needed within the function that is being called, in this case inside print_hello_world (I actually had to do this for a daemon process that needed to be terminated if it ran for more than a specified time):
sys.stdout.flush() # This will force the stdout to be printed
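For example, a minimal sketch of flushing inside the called function; print's flush parameter achieves the same thing:

import sys

def print_hello_world():
    print("Hello World")
    sys.stdout.flush()  # force the buffered output out immediately
    # equivalently: print("Hello World", flush=True)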

Printing from other thread when waiting for input()

I am trying to write a shell that needs to run socket connections on a separate thread. In my testing, when print() is used while cmd.Cmd.cmdloop() is waiting for input, the printed output is displayed incorrectly.
from core.shell import Shell
import time
import threading
def test(shell):
    time.sleep(2)
    shell.write('Doing test')

if __name__ == '__main__':
    shell = Shell(None, None)
    testThrd = threading.Thread(target=test, args=(shell,))
    testThrd.start()
    shell.cmdloop()
When the above command runs, here is what happens:
python test.py
Welcome to Test shell. Type help or ? to list commands.
>>asd
*** Unknown syntax: asd
>>[17:59:25] Doing test
As you can see, printing from another thread adds output after the >> prompt, not on a new line. How can I make it appear on a new line, with the prompt shown again after it?
What you can do is redirect stdout from your core.shell.Shell to a file-like object such as StringIO. You would also redirect the output from your thread into a different file-like object.
Now, you can have some third thread read both of these objects and print them out in whatever fashion you want.
You said core.shell.Shell inherits from cmd.Cmd, which allows redirection as a parameter to the constructor:
import io
import time
import threading
from core.shell import Shell
def test(output_obj):
    time.sleep(2)
    print('Doing test', file=output_obj)
cmd_output = io.StringIO()
thr_output = io.StringIO()
shell = Shell(stdout=cmd_output)
testThrd = threading.Thread(target=test, args=(thr_output,))
testThrd.start()
# in some other process/thread
cmd_line = cmd_output.readline()
thr_line = thr_output.readline()
That's quite difficult. Both your threads share the same stdout, so the output from each of them is sent concurrently to the stdout buffer, where it is printed in some arbitrary order.
What you need to do is coordinate the output from both threads, and that's a tough nut to crack. Even bash doesn't do that!
That said, maybe you can try using a lock to make sure your threads access stdout in a controlled manner. Check out: http://effbot.org/zone/thread-synchronization.htm
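A minimal sketch of that lock idea (the names are illustrative, not from the original code): guard every write to stdout with a shared lock so lines from different threads cannot interleave. This keeps each line intact, though redrawing the cmd prompt afterwards would still be up to the shell.

import sys
import threading

print_lock = threading.Lock()

def safe_print(message):
    # Only one thread at a time may write, so each line comes out whole.
    with print_lock:
        sys.stdout.write(message + '\n')
        sys.stdout.flush()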

understanding of subprocess, POPEN and PIPE

I am new to Python and programming and I am trying to understand this code. I have spent the past few hours reading documentation and watching videos on the subprocess module, but I am still confused (I added snippets of information I found online as comments in the code, as best I could).
Here are some questions I have pertaining to the code below:
when is subprocess used?
when should I use Popen versus the more convenient helpers in subprocess?
what does PIPE do?
what does close_fds do?
basically I need this line of code explained
my_process=Popen(['player', my_video_file_path], stdin=PIPE, close_fds=True)
full code here:
# to run UNIX commands we need to create a subprocess
from subprocess import Popen, PIPE
import os
import time
import RPi.GPIO as GPIO

my_video_file_path = '/home/pi/green1.mp4'

# stdin listens for information
# PIPE connects the stdin with stdout
# pipe (like a pipe sending info through a tunnel from one place to another)
# STDIN (channel 0):
# Where your command draws the input from. If you don't specify anything special this will be your keyboard input.
# STDOUT (channel 1):
# Where your command's output is sent to. If you don't specify anything special the output is displayed in your shell.
# to send data to the process's stdin, you need to create the Popen object with stdin=PIPE.
# Popen interface can be used directly with subprocess
# with pipe the return value is an open file object connected to the pipe, which can be read or written depending on whether mode is 'r' (default) or 'w'.
# If we pass everything as a string, then our command is passed to the shell;
# "On Unix, if args is a string, the string is interpreted as the name or path of the program to execute."
my_process = Popen(['player', my_video_file_path], stdin=PIPE, close_fds=True)

GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.setup(22, GPIO.IN, pull_up_down=GPIO.PUD_UP)

while True:
    button_state = GPIO.input(17)
    button_state1 = GPIO.input(22)
    if button_state == False:
        print("quit video")
        my_process.stdin.write("q")
        time.sleep(.09)
    if button_state1 == False:
        print("full video")
        my_process.stdin.write("fs")
        time.sleep(5)
Regarding the difference between subprocess and Popen, here is a line from the python 3.5 docs:
For more advanced use cases, the underlying Popen interface can be used directly. (compared to using subprocess)
So the convenience functions in subprocess (such as subprocess.call and subprocess.run) are wrappers around the Popen interface.
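A small illustration of that distinction (the commands are just examples, not from the question): the convenience function is enough when you only want to run something and wait for it, while Popen suits cases like this one, where the process keeps running and you feed its stdin later.

import subprocess

# High-level helper: start the program, wait for it, check the return code.
subprocess.run(['ls', '-l'], check=True)

# Low-level Popen: the process stays alive while the script talks to it.
proc = subprocess.Popen(['cat'], stdin=subprocess.PIPE)
proc.stdin.write(b"some input\n")
proc.stdin.close()
proc.wait()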
PIPE is used so that your Python script communicates with the subprocess via standard IO operations (you can think of print in Python as writing to stdout).
So, when you do my_process.stdin.write("fs") you are sending this text (piping it) to the standard input of your subprocess. Your subprocess then reads this text and does whatever processing it needs to do.
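As a minimal, self-contained illustration of the same idea (cat is used only because it echoes its stdin back; it is not part of the original code):

from subprocess import Popen, PIPE

# stdin=PIPE and stdout=PIPE give this script file-like handles to the child.
proc = Popen(['cat'], stdin=PIPE, stdout=PIPE)

# Pipes carry bytes, so the input is a bytes literal.
out, _ = proc.communicate(input=b"hello from the parent\n")
print(out.decode())  # the child received our bytes and wrote them back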
To further understand subprocesses, try reading standard input in a Python program to see how it works. You can follow the question How do you read from stdin in Python? to work through this exercise.
Also, it might be worthwhile to learn about piping in the more general linux style. Try to read through this article.

Closing Pipe in Python

import multiprocessing as mp
import time

"""
1. Send items via a pipe.
2. Receive them on the other end with a generator.
3. If the pipe is closed on the sending side, retrieve
   all items left and then quit.
"""

def foo(conn):
    for i in range(7):
        time.sleep(.3)
        conn.send(i)
    conn.close()

def bar(conn):
    while True:
        try:
            yield conn.recv()
        except EOFError:
            break

if __name__ == '__main__':
    """Choose which start method is used"""
    recv_conn, send_conn = mp.Pipe(False)
    p = mp.Process(target=foo, args=(send_conn,))  # foo can only send messages
    p.start()
    # send_conn.close()
    for i in bar(recv_conn):
        print(i)
I'm using Python 3.4.1 on Ubuntu 14.04 and the code is not working. At the end of the program there is no EOFError, which should terminate the loop, even though the pipe has been closed. Closing the pipe inside a function does not seem to close it. Why is this the case?
Uncomment your send_conn.close() line. You should be closing pipe ends in processes that don't need them. The issue is that once you launch the subprocess, the kernel is tracking two open references to the send connection of the pipe: one in the parent process, one in your subprocess.
The send connection object is only being closed in your subprocess, leaving it open in the parent process, so your conn.recv() call won't raise EOFError. The pipe is still open.
This answer may be useful to you as well.
I verified that this code works in Python 2.7.6 if you uncomment the send_conn.close() call.
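For reference, reusing foo and bar from the question, the corrected main block would look like this; only the previously commented-out line changes:

if __name__ == '__main__':
    recv_conn, send_conn = mp.Pipe(False)
    p = mp.Process(target=foo, args=(send_conn,))
    p.start()
    send_conn.close()  # drop the parent's reference so only the child holds the write end
    for i in bar(recv_conn):
        print(i)  # prints 0..6, then bar() sees EOFError and stops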
