Python subprocess hangs as Popen when piping output - multithreading

I've been through dozens of the "Python subprocess hangs" articles here and think I've addressed all of the issues presented in the various articles in the code below.
My code intermittently hangs at the Popen command. I am running 4 threads using multiprocessing.dummy.apply_async; each of those threads starts a subprocess and then reads the output line by line, printing a modified version of it to stdout.
def my_subproc():
    exec_command = ['stdbuf', '-i0', '-o0', '-e0',
                    sys.executable, '-u',
                    os.path.dirname(os.path.realpath(__file__)) + '/myscript.py']
    proc = subprocess.Popen(exec_command, env=env, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, bufsize=1)
    print "DEBUG1", device
    for line in iter(proc.stdout.readline, b''):
        with print_lock:
            for l in textwrap.wrap(line.rstrip(), LINE_WRAP_DEFAULT):
                print l  # assumed: print each wrapped line
The code above is run from apply_async:
pool = multiprocessing.dummy.Pool(4)
for i in range(0, 4):
    pool.apply_async(my_subproc)
Intermittently the subprocess will hang at subprocess.Popen and the "DEBUG1" statement is not printed. Sometimes all threads will work, sometimes as few as 1 of the 4 will work.
I'm not aware that this exhibits any of the known deadlock situations for Popen. Am I wrong?

There is an insidious bug in subprocess.Popen() caused by I/O buffering of stdout (and possibly stderr). There is a limit of around 65536 bytes in the child process's pipe buffer. If the child process writes enough output, it will "hang" waiting for the buffer to be drained - a deadlock situation. The authors of subprocess.py seem to believe this is a problem caused by the child, even though a way to flush the pipe from the parent would be welcome. Anders Pearson,
https://thraxil.org/users/anders/posts/2008/03/13/Subprocess-Hanging-PIPE-is-your-enemy/ has a simple solution, but you have to pay attention. As he says, "tempfile.TemporaryFile() is your friend." In my case I am running an application in a loop to batch process a bunch of files; the code for the solution is:
with tempfile.TemporaryFile() as fout:
    sp.run(['gmat', '-m', '-ns', '-x', '-r', str(gmat_args)],
           timeout=cpto, check=True, stdout=fout, stderr=fout)
The fix above still deadlocks after processing about 20 files. It is an improvement, but not good enough, since I need to process hundreds of files in a batch. I came up with the "crowbar" approach below.
# Run GMAT for each file in the batch.
# Arguments:
#   -m:  Start GMAT with a minimized interface.
#   -ns: Start GMAT without the splash screen showing.
#   -x:  Exit GMAT after running the specified script.
#   -r:  Automatically run the specified script after loading.
# Note: the OS pipe buffer is typically about 65536 bytes. If the child
# fills it, the child hangs with a write pending until the buffer is read.
# https://thraxil.org/users/anders/posts/2008/03/13/Subprocess-Hanging-PIPE-is-your-enemy/
proc = sp.Popen(['gmat', '-m', '-ns', '-x', '-r', str(gmat_args)],
                stdout=sp.PIPE, stderr=sp.STDOUT)
try:
    # Time out after cpto seconds if the process does not complete.
    (outs, errors) = proc.communicate(timeout=cpto)
except sp.TimeoutExpired as e:
    logging.error('GMAT timed out in child process. Time allowed was %s secs, continuing', str(cpto))
    logging.info("Process %s being terminated.", str(proc.pid))
    # The child process is not killed by the system, so kill it ourselves,
    # then flush the stdout buffer with a final communicate().
    proc.kill()
    (outs, errors) = proc.communicate()
The basic idea is to kill the process and flush the buffer on each timeout. I moved the TimeoutExpired handler into the batch-processing loop so that, after killing the process, it continues with the next file. This is harmless if the timeout value is long enough for gmat to complete (albeit slower). I find that the code will process anywhere from 3 to 20 files before it times out.
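For illustration, a rough sketch of that batch loop (names like batch_files are illustrative, not the actual production code):

import logging
import subprocess as sp

def run_batch(batch_files, cpto):
    # Run GMAT on each script, killing and skipping any run that times out.
    for script in batch_files:
        proc = sp.Popen(['gmat', '-m', '-ns', '-x', '-r', str(script)],
                        stdout=sp.PIPE, stderr=sp.STDOUT)
        try:
            outs, errors = proc.communicate(timeout=cpto)
        except sp.TimeoutExpired:
            logging.error('GMAT timed out after %s secs on %s, continuing', cpto, script)
            proc.kill()
            # A final communicate() drains the pipe so the dead child can be reaped.
            outs, errors = proc.communicate()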
This really seems like a bug in subprocess.

This appears to be a bad interaction with multiprocessing.dummy. When I use multiprocessing (not the .dummy threading interface) I'm unable to reproduce the error.
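For reference, a minimal sketch of that swap, reusing my_subproc from the question (the close/join calls are additions for completeness):

import multiprocessing

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)   # process-backed pool instead of multiprocessing.dummy
    for i in range(4):
        pool.apply_async(my_subproc)
    pool.close()
    pool.join()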

Related

How can I stop a python subprocess after a certain amount of time?

I am grading python assignments in an entry level class and I run their programs with the same input as my sample program and compare the outputs:
proc = subprocess.Popen([sys.executable, arg_path],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        universal_newlines=True,
                        bufsize=0)
for the_arg in arg_list:
    proc.stdin.write(f"{the_arg}\n")
sample_output = ""
for line in proc.stdout:
    sample_output += line
The problem is that if they really mess up and their program waits for more input than I am sending, the program will hang forever when I try to read from proc.stdout. I cannot find how, using subprocess.Popen, to set a time limit (like 10 seconds) after which I can kill the process. Since I have a wrapper that runs each student's program, it's a real problem if one of them hangs on me.
TIA
I have tried proc.communicate(timeout=10) and that just kills the process no matter what, even if everything is correct.
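For context, a minimal sketch of one common pattern (not a verified fix for this exact grader): let communicate() write all the input, close stdin, and enforce the timeout; TimeoutExpired is raised only if the child is still running after 10 seconds, in which case it can be killed and its remaining output collected. arg_path and arg_list are the same names as in the question.

import subprocess
import sys

proc = subprocess.Popen([sys.executable, arg_path],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        universal_newlines=True)
try:
    # communicate() writes the input, closes stdin, and reads until EOF or timeout.
    sample_output, errors = proc.communicate(
        input="".join(f"{arg}\n" for arg in arg_list), timeout=10)
except subprocess.TimeoutExpired:
    proc.kill()                        # the student's program hung waiting for more input
    sample_output, errors = proc.communicate()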

Get realtime output from a long-running executable using python

It's my first time asking a question on here so bear with me.
I'm trying to make a python3 program that runs executable files for x amount of time and creates a log of all output in a text file. For some reason the code I have so far works only with some executables. I'm new to python and especially subprocess so any help is appreciated.
import time
import subprocess

def CreateLog(executable, timeout=5):
    time_start = time.time()
    process = subprocess.Popen(executable, stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL, text=True)
    f = open("log.txt", "w")
    while process.poll() is None:
        output = process.stdout.readline()
        if output:
            f.write(output)
        if time.time() > time_start + timeout:
            process.kill()
            break
I was recently experimenting with crypto mining and came across nanominer, I tried using this python code on nanominer and the log file was empty. I am aware that nanominer already logs its output, but the point is why does the python code fail.
You are interacting through .poll() (R U dead yet?) and .readline().
It's not clear you want to do that.
There seem to be two cases for your long-lived child:
it runs "too long" silently
it runs forever, regularly producing output text at e.g. one-second intervals
The 2nd case is the easy one.
Just use for line in process.stdout:, consume the line,
peek at the clock, and maybe send a .kill() just as you're already doing.
No need for .poll(), as the child exiting will produce EOF on that pipe.
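A minimal sketch of that loop, assuming the same arguments as the CreateLog function in the question:

import time
import subprocess

def create_log(executable, timeout=5):
    deadline = time.time() + timeout
    process = subprocess.Popen(executable, stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL, text=True)
    with open("log.txt", "w") as f:
        for line in process.stdout:   # ends at EOF when the child exits
            f.write(line)
            if time.time() > deadline:
                process.kill()
                break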
For the 1st case, you will want to set an alarm.
See https://docs.python.org/3/library/signal.html#example
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
After "too long", five seconds, your handler will run.
It can do anything you desire.
You'll want it to have access to the process handle,
which will let you send a .kill().
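A sketch of that, assuming executable is the same argument as in the question and a module-level process handle (SIGALRM is Unix-only):

import signal
import subprocess

def handler(signum, frame):
    # "Too long" has elapsed; kill the silent child so the read below sees EOF.
    process.kill()

signal.signal(signal.SIGALRM, handler)

process = subprocess.Popen(executable, stdout=subprocess.PIPE,
                           stderr=subprocess.DEVNULL, text=True)
signal.alarm(5)                 # deliver SIGALRM after 5 seconds
with open("log.txt", "w") as f:
    for line in process.stdout:
        f.write(line)
signal.alarm(0)                 # cancel the alarm if the child finished in time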

Python: running many subprocesses from different threads is slow

I have a program with 1 process that starts a lot of threads.
Each thread might use subprocess.Popen to run some command.
I see that the time to run the command increases with the number of threads.
Example:
>>> def foo():
...     s = time.time()
...     subprocess.Popen('ip link show'.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True).communicate()
...     print(time.time() - s)
...
>>> foo()
0.028950929641723633
>>> [threading.Thread(target=foo).start() for _ in range(10)]
0.058995723724365234
0.07323050498962402
0.09158825874328613
0.11541390419006348 # !!!
0.08147192001342773
0.05238771438598633
0.0950784683227539
0.10175108909606934 # !!!
0.09703755378723145
0.06497764587402344
Is there another way of executing a lot of commands from a single process in parallel that doesn't degrade performance?
Python's threads are, of course, concurrent, but they do not really run in parallel because of the GIL. Therefore, they are not suitable for CPU-bound applications. If you need to truly parallelize something and allow it to run on all CPU cores, you will need to use multiple processes. Here is a nice answer discussing this in more detail: What are the differences between the threading and multiprocessing modules?.
For the above example, multiprocessing.pool may be a good choice (note that there is also a ThreadPool available in this module).
from multiprocessing.pool import Pool
import subprocess
import time

def foo(*args):
    s = time.time()
    subprocess.Popen('ip link show'.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True).communicate()
    return time.time() - s

if __name__ == "__main__":
    with Pool(10) as p:
        result = p.map(foo, range(10))
        print(result)
        # [0.018695592880249023, 0.009021520614624023, 0.01150059700012207, 0.02113938331604004, 0.014114856719970703, 0.01342153549194336, 0.011168956756591797, 0.014746427536010742, 0.013572454452514648, 0.008752584457397461]
        result = p.map_async(foo, range(10))
        print(result.get())
        # [0.00636744499206543, 0.011589527130126953, 0.010645389556884766, 0.0070612430572509766, 0.013571739196777344, 0.009610414505004883, 0.007040739059448242, 0.010993719100952148, 0.012415409088134766, 0.0070383548736572266]
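Since the ThreadPool mentioned above exposes the same interface, it can be swapped in with a one-line change (a sketch, reusing the same foo as in the example):

from multiprocessing.pool import ThreadPool

if __name__ == "__main__":
    with ThreadPool(10) as p:      # threads instead of processes, same Pool API
        print(p.map(foo, range(10)))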
However, if your function is similar to the example in that it mostly just launches other processes and doesn't do a lot of calculation, I doubt parallelizing it will make much of a difference, because the subprocesses can already run in parallel. Perhaps the slowdown occurs because your whole system gets overwhelmed for a moment by all those processes (CPU usage might spike, or too many disk reads/writes might be attempted within a short time). I would suggest taking a close look at system resources (Task Manager etc.) while running the program.
Maybe it has nothing to do with Python: opening a new shell means opening new files, since basically everything is a file on Linux.
Check your limit on open files with this command (the default is often 1024):
ulimit -n
and try raising it to see if your code gets faster:
ulimit -n 2048

python3 multiprocessing.Pool with maxtasksperchild=1 does not terminate

When using multiprocessing.Pool in python 3.6 or 3.7 with maxtasksperchild=1, I noticed that some processes spawned by the pool are hanging and do not quit, even though the callback to their tasks was already executed. As a result, Pool.join() will block forever, even though all tasks are finished. In the process tree, running but idle child processes can be seen. The problem does not occur if maxtasksperchild=None.
The problem seems to be related to what the callback precisely does. The docs point out that the callback "should return immediately", as it will block other threads managing the pool.
A minimal example to reproduce this behavior on my machine is as follows: (Give it a few tries or increase the number of tasks when it does not block forever.)
from multiprocessing import Pool
from os import getpid
from random import random
from time import sleep

def do_stuff():
    pass

def cb(arg):
    sleep(random())  # can be replaced with print('foo')

p = Pool(maxtasksperchild=1)
number_of_tasks = 100  # a value may depend on your machine -- for mine 20 is sufficient to trigger the behavior
for i in range(number_of_tasks):
    p.apply_async(do_stuff, callback=cb)
p.close()
print("joining ... (this should take just seconds)")
print("use the following command to watch the process tree:")
print("    watch -n .2 pstree -at -p %i" % getpid())
p.join()
Contrary to what I expected, p.join() in the last line will block forever even though do_stuff and cb were both called 100 times.
I am aware that sleep(random()) is in violation of the docs, but is print() also taking 'too long'? The way the docs are written suggests that a non-blocking callback function is only a matter of performance and efficiency, and does not make clear that a 'slow' callback function will break the pool entirely.
Is print() forbidden in any multiprocessing.Pool callback function? (How to replace that functionality? What is "returning immediately", what is not?)
If yes, should the python documentation be updated to make this clear?
If yes, is it good python practice to rely on "fast" execution of python threads? Does this violate the rule that one should not make assumptions on execution order of threads?
Should I report this to the python bug tracker?
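For context, one way to keep a callback "returning immediately" is to have it only hand the result off and do the slow work outside the pool machinery. A sketch under that assumption, reusing do_stuff from the example above:

from multiprocessing import Pool
from queue import Queue

results = Queue()

def cb(arg):
    results.put(arg)            # just hand the result off; no sleeping or printing here

if __name__ == "__main__":
    p = Pool(maxtasksperchild=1)
    for i in range(100):
        p.apply_async(do_stuff, callback=cb)
    p.close()
    p.join()
    while not results.empty():
        print(results.get())    # the slow work happens after the pool has shut down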

Preventing threaded subprocess.popen from terminating my main script when child is killed?

Python 2.7.3 on Solaris 10
Questions
When my subprocess has an internal segmentation fault (core) or a user externally kills it from the shell with SIGTERM or SIGKILL, my main program's signal handler handles a SIGTERM (-15) and my parent program exits. Is this real? Or is it a bad Python build?
Background and Code
I have a python script that first spawns a worker management thread. The worker management thread then spawns one or more worker threads. I have other stuff going on in my main thread that I cannot block. My management thread stuff and worker threads are rock-solid. My services run for years without restarts but then we have this subprocess.Popen scenario:
In the run method of the worker thread, I am using:
class workerThread(threading.Thread):
    def __init__(self):
        super(workerThread, self).__init__()
        ...

    def run(self):
        ...
        atempfile = tempfile.NamedTemporaryFile(delete=False)
        myprocess = subprocess.Popen(['third-party-cmd', 'with', 'arguments'],
                                     shell=False, stdin=subprocess.PIPE,
                                     stdout=atempfile, stderr=subprocess.STDOUT,
                                     close_fds=True)
        ...
I need to use myprocess.poll() to check for process termination because I need to scan the atempfile until I find relevant information (the file may be > 1 GiB) and I need to terminate the process because of user request or because the process has been running too long. Once I find what I am looking for, I will stop checking the stdout temp file. I will clean it up after the external process is dead and before the worker thread terminates. I need the stdin PIPE in case I need to inject a response to something interactive in the child's stdin stream.
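A rough sketch of how that poll-and-scan loop might look (the marker string, timeout, and helper name are illustrative, not the actual worker code):

import time

def scan_for_marker(myprocess, atempfile, marker, max_runtime):
    # Read the child's stdout temp file until the marker appears,
    # the child exits, or it has been running too long.
    deadline = time.time() + max_runtime
    with open(atempfile.name) as log:
        while myprocess.poll() is None:
            line = log.readline()
            if line and marker in line:
                return line                    # found the relevant information
            if time.time() > deadline:
                myprocess.terminate()          # user request or running too long
                break
            if not line:
                time.sleep(0.5)                # nothing new yet; avoid busy-waiting
    return None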
In my main program, I set SIGINT and SIGTERM handlers so I can perform cleanup if my main Python program is terminated with SIGTERM or SIGINT (Ctrl-C) when running from the shell.
Does anyone have a solid 2.x recipe for child signal handling in threads?
ctypes sigprocmask, etc.
Any help would be very appreciated. I am just looking for an 'official' recipe or the BEST hack, if one even exists.
Notes
I am using a restricted build of Python. I must use 2.7.3. Third-party-cmd is a program I do not have source for - modifying it is not possible.
There are many things in your description that look strange. First: you have a couple of different threads and processes. Who is crashing, who is receiving SIGTERM, who is receiving SIGKILL, and due to which operations?
Second: why does your parent receive SIGTERM? It can't be sent implicitly. Someone is calling kill on your parent process, either directly or indirectly (for example, by killing the whole parent group).
Third: how is your program terminating when you're handling SIGTERM? By definition, the program terminates if the signal is not handled. If it's handled, it's not terminated. What's really happening?
Suggestions:
$ cat crsh.c
#include <stdio.h>

int main(void)
{
    int *f = 0x0;

    puts("Crashing");
    *f = 0;
    puts("Crashed");
    return 0;
}
$ cat a.py
import subprocess, sys
print('begin')
p = subprocess.Popen('./crsh')
a = raw_input()
print(a)
p.wait()
print('end')
$ python a.py
begin
Crashing
abcd
abcd
end
This works. No signal is delivered to the parent. Did you isolate the problem in your program?
If the problem is a signal sent to multiple processes: can you use setpgid to set up a separate process group for the child (sketched below)?
Is there any reason for creating the temporary file? That's a file of more than 1 GiB being created in your temporary directory. Why not pipe stdout?
If you're really sure you need to handle signals in your parent program (why not try/except KeyboardInterrupt, for example?): could signal()'s unspecified behavior with multithreaded programs be causing those problems (for example, dispatching a signal to a thread that does not handle signals)?
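A minimal sketch of that process-group suggestion (an illustration, not the original code): starting the child in its own process group means a signal aimed at the child's group does not also reach the parent, and vice versa.

import os
import subprocess
import tempfile

atempfile = tempfile.NamedTemporaryFile(delete=False)
myprocess = subprocess.Popen(['third-party-cmd', 'with', 'arguments'],
                             shell=False,
                             stdin=subprocess.PIPE,
                             stdout=atempfile,
                             stderr=subprocess.STDOUT,
                             close_fds=True,
                             preexec_fn=os.setpgrp)   # child runs in its own process group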
NOTES
The effects of signal() in a multithreaded process are unspecified.
Anyway, try to explain more precisely what the threads and processes of your program are, what they do, how the signal handlers were set up and why, who is sending signals, and who is receiving them.
