Closing Pipe in Python - python-3.x

import multiprocessing as mp
import time
"""
1. send item via pipe
2. Receive on the other end by a generator
3. if the pipe is closed on the sending side, retrieve
all item left and then quit.
"""
def foo(conn):
    for i in range(7):
        time.sleep(.3)
        conn.send(i)
    conn.close()
def bar(conn):
    while True:
        try:
            yield conn.recv()
        except EOFError:
            break
if __name__ == '__main__':
    """Choose which start method is used"""
    recv_conn, send_conn = mp.Pipe(False)
    p = mp.Process(target = foo, args = (send_conn,)) # f can only send msg.
    p.start()
    # send_conn.close()
    for i in bar(recv_conn):
        print(i)
I'm using Python 3.4.1 on Ubuntu 14.04 and the code is not working. At the end of the program no EOFError is raised, although the Pipe has been closed, so the loop never terminates. Closing the Pipe inside a function does not seem to close the Pipe. Why is this the case?

Uncomment your send_conn.close() line. You should be closing pipe ends in processes that don't need them. The issue is that once you launch the subprocess, the kernel is tracking two open references to the send connection of the pipe: one in the parent process, one in your subprocess.
The send connection object is only being closed in your subprocess, leaving it open in the parent process, so your conn.recv() call won't raise EOFError. The pipe is still open.
This answer may be useful to you as well.
I verified that this code works in Python 2.7.6 if you uncomment the send_conn.close() call.
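A minimal sketch of the fix described above, with the parent closing its own copy of the sending end right after the child starts. The names producer, drain and run are illustrative and are not from the original post; the shorter sleep is only there to keep the demo quick.

```python
import multiprocessing as mp
import time


def producer(conn):
    """Send a few items, then close this process's copy of the sending end."""
    for i in range(7):
        time.sleep(0.05)
        conn.send(i)
    conn.close()


def drain(conn):
    """Yield items until every copy of the sending end has been closed."""
    while True:
        try:
            yield conn.recv()
        except EOFError:
            break


def run():
    recv_conn, send_conn = mp.Pipe(False)  # one-way pipe: (receive end, send end)
    p = mp.Process(target=producer, args=(send_conn,))
    p.start()
    # The parent never sends, so it drops its copy of the sending end.
    # Without this line, recv() keeps waiting after the last item.
    send_conn.close()
    items = list(drain(recv_conn))
    p.join()
    return items


if __name__ == '__main__':
    print(run())
```

The receiver only sees end-of-file once both the child and the parent have closed the sending end, which is why the single close() in the child was not enough.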

Related

I can implement Python multiprocessing with Spyder Windows PC, but why?

I'm curious about this and need some advice on how this can happen. Yesterday I tried to use multiprocessing in a Python script running in Spyder on a Windows PC. Here is the code I first tried.
import multiprocessing
import time
start = time.perf_counter()
def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done sleeping')
p1 = multiprocessing.Process(target=do_something)
p2 = multiprocessing.Process(target=do_something)
p1.start()
p2.start()
p1.join()
p2.join()
finish = time.perf_counter()
print(f'Finished in {round(finish-start,2)} second(s)')
It returns an error:
AttributeError: Can't get attribute 'do_something' on <module '__main__' (built-in)
Then I searched for a way out of this problem, as did my boss, and found this suggestion:
Python's multiprocessing doesn't work in Spyder IDE
So I followed it, installed PyCharm, and tried to run the code there. It seemed to work in that I didn't get the AttributeError, but I got this new error instead:
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.
    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:
        if __name__ == '__main__':
            freeze_support()
            ...
    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
I googled again and finally found this:
RuntimeError on windows trying python multiprocessing
What I have to do is add this one line
if __name__ == '__main__':
before starting the processes.
import multiprocessing
import time
start = time.perf_counter()
def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done sleeping')
if __name__ == '__main__':
    p1 = multiprocessing.Process(target=do_something)
    p2 = multiprocessing.Process(target=do_something)
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    finish = time.perf_counter()
    print(f'Finished in {round(finish-start,2)} second(s)')
It works now. Moreover, it works not only in PyCharm: I can now run this code in Spyder too. That is why I am so curious. How come it also works in Spyder? The behaviour is consistent, because I also ran this code on my other PC, which runs Windows Server 2016 with Spyder, and it works there as well.
Can anyone explain what is happening here and why it works?
Thank you.
There's a lot to unpack here, so I'll just give a brief overview. There's also some missing information, like how you have Spyder/PyCharm configured and what operating system you use, so I'll have to make some assumptions...
Based on the error messages you are probably using macOS or Windows, which means the default way Python creates a child process is called spawn. This means it will start a completely new process from the Python executable ("python.exe" on Windows, for example). It will then send a message to the new process telling it what function to execute (target), and optionally what arguments to call that function with. The new process has to import the main file to have access to that function, however, so if you are running the Python interpreter in interactive mode, there is no "main" file to import, and you get the first error message: AttributeError.
The second error is also related to the importing of the "main" file. When you import a file, it basically just runs the file like any other Python script. If you were to create a new child process during import, that child would then also create a new child when it imports the same file. You would end up recursively creating child processes until the computer crashed, so Python disallows creating additional child processes during the import phase of a child process, hence the RuntimeError.

Python start multiprocessing without print/logging statements from processes

I am starting two processes via multiprocessing and this is working fine. The only problems which I have are the print and debug statements from these two processes.
The hope is, to use the REPL and start the processes, like in the background. However, I do not get this to run. I always get the debug statements and therefore can't use the REPL anymore. This is how I call the processes:
processes = [
    Process(target=start_viewer, args=()),
    Process(target=start_server, args=(live, amount, fg))
]
for p in processes:
    p.start()
Any idea on how to "mute" the process, or get them in the background?
If I understand you correctly, you want to hide the printed output of one of the processes.
You can achieve this by redirecting the output of the Python interpreter.
Add sys.stdout = open("/dev/null", 'w') to the process which you want to "mute".
A full working example is below.
from multiprocessing import Process
from time import sleep
import sys
def start_viewer():
    sys.stdout = open("/dev/null", 'w')
    while True:
        print("start_viewer")
        sleep(1)
def start_server():
    while True:
        print("start_server")
        sleep(1)
if __name__ == '__main__':
    processes = [
        Process(target=start_viewer, args=()),
        Process(target=start_server, args=())
    ]
    for p in processes:
        p.start()
Be aware that writing to /dev/null discards the output. If you want to keep it, you can redirect to a text file instead. To support multiple operating systems, use os.devnull rather than the hard-coded path.
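A portable variant of the same idea might look like the sketch below. run_muted is a hypothetical helper, and the tick counts exist only so the demo terminates. It uses os.devnull and contextlib.redirect_stdout, so it only silences Python-level writes to sys.stdout; anything sent to stderr (where the logging module writes by default) or written by C extensions would still show up.

```python
import os
import time
from contextlib import redirect_stdout
from multiprocessing import Process


def run_muted(func, *args):
    """Call func(*args) with sys.stdout pointed at the null device."""
    with open(os.devnull, 'w') as sink, redirect_stdout(sink):
        return func(*args)


def start_viewer(ticks):
    for _ in range(ticks):
        print('start_viewer')  # swallowed when called through run_muted
        time.sleep(0.1)


def start_server(ticks):
    for _ in range(ticks):
        print('start_server')  # still visible
        time.sleep(0.1)


if __name__ == '__main__':
    processes = [
        Process(target=run_muted, args=(start_viewer, 3)),
        Process(target=start_server, args=(3,)),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```

Wrapping the target rather than editing it keeps the muting decision in the code that launches the processes.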

How to redirect the stdout of a multiprocessing.Process

I'm using Python 3.7.4 and I have created two functions, the first one executes a callable using multiprocessing.Process and the second one just prints "Hello World". Everything seems to work fine until I try redirecting the stdout, doing so prevents me from getting any printed values during the process execution. I have simplified the example to the maximum and this is the current code I have of the problem.
These are my functions:
import io
import multiprocessing
from contextlib import redirect_stdout
def call_function(func: callable):
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=lambda:queue.put(func()))
    process.start()
    while True:
        if not queue.empty():
            return queue.get()
def print_hello_world():
    print("Hello World")
This works:
call_function(print_hello_world)
The previous code works and successfully prints "Hello World"
This does not work:
with redirect_stdout(io.StringIO()) as out:
    call_function(print_hello_world)
print(out.getvalue())
With the previous code I do not get anything printed in the console.
Any suggestion would be very much appreciated. I have been able to narrow the problem down to this point, and I think it is related to the process ending after the io.StringIO() is already closed, but I have no idea how to test my hypothesis, and even less how to implement a solution.
This is the workaround I found. It seems that if I use a file instead of a StringIO object, I can get things to work.
with open("./tmp_stdout.txt", "w+") as tmp_stdout_file:  # "w+" so the file can be read back
    with redirect_stdout(tmp_stdout_file):
        call_function(print_hello_world)
    tmp_stdout_file.seek(0)  # rewind before reading what the child wrote
    stdout_str = tmp_stdout_file.read().strip()
print(stdout_str) # This variable will have the captured stdout of the process
Another thing that might be important to know is that the output of the child process is buffered, so prints may only be displayed after the function has finished or failed. To work around this, you can flush stdout where needed inside the function being called, which in this case would be print_hello_world. (I actually had to do this for a daemon process that needed to be terminated if it ran for longer than a specified time.)
sys.stdout.flush() # This will force the stdout to be printed
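An alternative that avoids the temporary file is to capture inside the child and send the text back through the queue. The child works on its own copy of any StringIO created in the parent, so whatever it writes there never reaches the parent's buffer. The sketch below assumes that approach; capture_call, _worker and call_function_captured are hypothetical names, and the target is a module-level function rather than a lambda so it can also be pickled under the spawn start method.

```python
import io
import multiprocessing
from contextlib import redirect_stdout


def capture_call(func):
    """Run func() and return (result, everything it printed to sys.stdout)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = func()
    return result, buffer.getvalue()


def _worker(queue, func):
    # Runs in the child: capture there, then ship plain data to the parent.
    queue.put(capture_call(func))


def call_function_captured(func):
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=_worker, args=(queue, func))
    process.start()
    result, printed = queue.get()  # blocks until the child has answered
    process.join()
    return result, printed


def print_hello_world():
    print('Hello World')


if __name__ == '__main__':
    result, printed = call_function_captured(print_hello_world)
    print(repr(printed))
```

Blocking on queue.get() also replaces the busy-wait loop on queue.empty(), and because the text travels as data there is no dependence on when the child flushes its stdout.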

Printing from other thread when waiting for input()

I am trying to write a shell that needs to run socket connections on a separate thread. In my tests, when print() is called while cmd.Cmd.cmdloop() is waiting for input, the output is displayed in the wrong place.
from core.shell import Shell
import time
import threading
def test(shell):
    time.sleep(2)
    shell.write('Doing test')
if __name__ == '__main__':
    shell = Shell(None, None)
    testThrd = threading.Thread(target=test, args=(shell,))
    testThrd.start()
    shell.cmdloop()
When the above command runs, here is what happens:
python test.py
Welcome to Test shell. Type help or ? to list commands.
>>asd
*** Unknown syntax: asd
>>[17:59:25] Doing test
As you can see, printing from another thread adds the output after the prompt >> instead of on a new line. How can I make the message appear on its own line, followed by a fresh prompt?
What you can do is redirect stdout from your core.shell.Shell to a file-like object such as StringIO. You would also redirect the output from your thread into a different file-like object.
Now you can have some third thread read both of these objects and print them out in whatever fashion you want.
You said core.shell.Shell inherits from cmd.Cmd, which allows redirection as a parameter to the constructor:
import io
import time
import threading
from core.shell import Shell
def test(output_obj):
    time.sleep(2)
    print('Doing test', file=output_obj)
cmd_output = io.StringIO()
thr_output = io.StringIO()
shell = Shell(stdout=cmd_output)
testThrd = threading.Thread(target=test, args=(thr_output,))
testThrd.start()
# in some other process/thread
cmd_line = cmd_output.readline()
thr_line = thr_output.readline()
That's quite difficult. Both your threads are sharing the same stdout. So the output from each of those threads are concurrently sent to your stdout buffer where they are printed in some arbitrary order.
What you need to do is coordinate the output from both threads, and that's a tough nut to crack. Even bash doesn't do that!
That said, maybe you can try using a lock to make sure your threads access stdout in a controlled manner. Check out: http://effbot.org/zone/thread-synchronization.htm
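A minimal sketch of the lock idea, assuming a terminal that understands ANSI escape codes. notify is a hypothetical helper and not part of cmd.Cmd. It lets one thread write at a time, clears the current line, prints the message on its own line, and then redraws the prompt.

```python
import sys
import threading

_output_lock = threading.Lock()
CLEAR_LINE = '\r\x1b[2K'  # carriage return + ANSI "erase entire line"


def notify(message, prompt='>>', stream=None):
    """Print message on its own line, then redraw the prompt."""
    if stream is None:
        stream = sys.stdout
    with _output_lock:  # one writer at a time
        stream.write(f'{CLEAR_LINE}{message}\n{prompt}')
        stream.flush()


if __name__ == '__main__':
    sys.stdout.write('>>')
    sys.stdout.flush()
    workers = [
        threading.Thread(target=notify, args=(f'message {n}',))
        for n in range(3)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print()
```

This keeps messages from interleaving, but it does not restore text the user had already typed at the prompt; that part would need cooperation from the input layer.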

Sharing file descriptors in Python multiprocessing

I am trying to use Python's multiprocessing module to spawn a server to receive UDP messages, modify them a little, and pass them on to a grep process started with the subprocess module. Since the stdin for a Popen subprocess accepts a file descriptor, that's what I would like to pass it.
The issue I am having is getting a file descriptor that communicates with the server process and that I can pass to the grep subprocess. I have done this in the past using plain os.fork() and os.pipe(), but I would like to use multiprocessing with the spawn start method now. I have tried taking a write descriptor from os.pipe(), making it inheritable, and passing it as an argument to a new process via multiprocessing.Process. When I try to open it in the other process with os.fdopen(fd, 'wb'), I get an OSError for a bad file descriptor. Here is a snippet of the code I have tested.
import os
import multiprocessing as mp
def _listen_syslog(ip_address, port, write_pipe):
    f = os.fdopen(write_pipe, 'wb')
    #do stuff like write to the file
def listen_syslog(ip_address, port):
    r, w = os.pipe()
    os.set_inheritable(w, True)
    proc = mp.Process(target=_listen_syslog, args=(ip_address, port, w))
    proc.start()
    #this process doesn't need to write, so close it
    os.close(w)
    #this is the descriptor I want to pass to a grep subprocess stdin
    #a similar scenario has worked before using os.fork()
    return r
Finally, if it's not possible to do this with a pipe created via os.pipe(), can I use multiprocessing.Pipe() and take the file descriptors from the connection objects' fileno() method to use directly? More importantly, is that safe to do as long as I don't use the connection objects for anything else?
I found a solution. I haven't figured out how to use os.pipe(), but if I use multiprocessing.Pipe(), I can use the file descriptor from each connection object by calling their fileno() functions. Another thing I found is if you want to use the file descriptors after the connection objects are no longer referenced, you have to call os.dup() on each file descriptor, or else they will close and you will get a bad file descriptor error when the connection objects get garbage collected.
import multiprocessing as mp
import os
def _listen_syslog(ip_address, port, write_pipe):
    f = os.fdopen(write_pipe.fileno(), 'wb')
    #do stuff
def listen_syslog(ip_address, port):
    r, w = mp.Pipe(False)
    proc = mp.Process(target=_listen_syslog, args=(ip_address, port, w))
    proc.start()
    return os.dup(r.fileno())
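The os.dup() point can be checked in a single process. The sketch below assumes a Unix system, where the ends of a one-way multiprocessing.Pipe are ordinary file descriptors; detach_fd is a hypothetical helper name.

```python
import multiprocessing as mp
import os


def detach_fd(conn):
    """Return a duplicate descriptor that outlives the Connection object."""
    fd = os.dup(conn.fileno())
    conn.close()  # closes only the original descriptor, not the duplicate
    return fd


if __name__ == '__main__':
    r, w = mp.Pipe(False)
    read_fd = detach_fd(r)
    write_fd = detach_fd(w)
    os.write(write_fd, b'hello\n')  # raw bytes, bypassing Connection.send()
    os.close(write_fd)
    print(os.read(read_fd, 1024))
    os.close(read_fd)
```

A descriptor obtained this way is a plain integer, so it could be handed to subprocess.Popen as stdin. The caller then owns it and is responsible for closing it.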
