I'm trying to follow an example of code that I found here, and I've modified it a bit so it looks like this in my main application:
import subprocess

def send_to_printer(pdffile):
    acrobat = r'C:\Program Files (x86)\Adobe\Acrobat 11.0\Acrobat\Acrobat.exe'
    # printer_name is assumed to be defined elsewhere in the application.
    # The command is built as a single string so we control the quoting ourselves:
    # if we passed a list, subprocess would use list2cmdline internally,
    # which escapes the double quotes, and Adobe Reader doesn't like that.
    cmd = '"{}" /N /T "{}" "{}"'.format(acrobat, pdffile, printer_name)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    exit_code = proc.wait()
If I run this bit of code all on its own and hand it the PDF I'm trying to print, it works quite well. It's when I try to call it from my main application that it causes problems. Basically, what I am doing is collecting a bunch of individual PDFs, assembling them together, and then printing them so they are double-sided.
The code that is calling this looks like this.
from PyPDF2 import PdfFileReader, PdfFileWriter  # imports assumed, as in the original example

output1 = PdfFileWriter()
for pdf in args[:len(args)//2]:
    page = PdfFileReader(pdf).getPage(0)
    output1.addPage(page)

outputStream1 = open('front_pages_to_print.pdf', 'wb')
output1.write(outputStream1)
outputStream1.close()

send_to_printer('front_pages_to_print.pdf')
When I run the above code before sending it to the printer, it prints the first pages and then hangs. I've also tried just calling the individual files on their own, but it results in the same behavior: it prints the first page and hangs. I read up a bit on it, and supposedly using proc.wait() can cause a deadlock if you don't use communicate(), as mentioned here. However, the code that I am following has the line stdout, stderr = proc.communicate(), which I am assuming handles this? I have to be honest though, I'm trying to understand the code and don't quite get it. Anyone have any suggestions on this?
Thanks
Edit - This is on Windows 10. I forgot to mention that.
So after fooling around with the debugger, it turned out the script was waiting on
stdout, stderr = proc.communicate()
exit_code = proc.wait()
Both of these ended up never getting a response, I guess? I don't quite understand why, but when I took them out it started working. It could be that Adobe opened, sent the document to the printer, and then closed before proc.communicate() could be called.
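For reference, a minimal sketch of what the working version might look like (same assumptions as above: the Acrobat path and a printer_name defined elsewhere), with the pipe capture and the communicate()/wait() calls removed so nothing blocks waiting on Acrobat:

import subprocess

def send_to_printer(pdffile):
    acrobat = r'C:\Program Files (x86)\Adobe\Acrobat 11.0\Acrobat\Acrobat.exe'
    # printer_name is assumed to be defined elsewhere, as in the original snippet.
    cmd = '"{}" /N /T "{}" "{}"'.format(acrobat, pdffile, printer_name)
    # Fire and forget: don't capture stdout/stderr and don't wait on the process,
    # so the call can't deadlock if Acrobat stays open or exits early.
    subprocess.Popen(cmd)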
Background: The purpose of this script is to take eight very large (~7GB) FASTQ files, subsample each, and concatenate each subsample into one "master" FASTQ file. The resulting file is about 60GB. Each file is subsampled to 120,000,000 lines.
The issue: The basic purpose of this script is to output a huge file. I have print statements & time stamps in my code so I know that it goes through the entire script, processes the input files and creates the output files. After I see the final print statement, I go to my directory and see that the output file has been generated, it's the correct size, and it was last modified a while ago, despite the fact that the script is still running. At this point, however, the code has still not finished running, and it will actually stall there for about 2-3 hours before I can enter anything into my terminal again.
My code is behaving like it gets stuck on the last line of the script even after it's finished creating the output file.
I'm hoping someone might be able to identify what's causing this weird behavior. Below is a dummy version of what my script does:
import random
import itertools

infile1 = "sample1_1.fastq"
inFile2 = "sample1_2.fastq"

with open(infile1, 'r') as file_1:
    f1 = file_1.read()
with open(inFile2, 'r') as file_2:
    f2 = file_2.read()

fastq1 = f1.split('\n')
fastq2 = f2.split('\n')

def subsampleFASTQ(compile1, compile2):
    random.seed(42)
    random_1 = random.sample(compile1, 30000000)
    random.seed(42)
    random_2 = random.sample(compile2, 30000000)
    return random_1, random_2

combo1, combo2 = subsampleFASTQ(fastq1, fastq2)

with open('sampleout_1.fastq', 'w') as out1:
    out1.write('\n'.join(str(i) for i in combo1))
with open('sampleout_2.fastq', 'w') as out2:
    out2.write('\n'.join(str(i) for i in combo2))
My ideas of what it could be:
File size is causing some slowness
There is some background process running in this script that won't let it finish (but I have no idea how to debug that -- any resources would be appreciated; one rough check is sketched below)
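As a rough way to narrow it down (a debugging sketch, not part of the original script), one option is to timestamp the very end of the script and explicitly drop the big in-memory lists before the interpreter exits, to see whether the stall happens while that data is being released:

import time

print(time.strftime("%H:%M:%S"), "output files written")

# Explicitly release the large in-memory objects and see how long that takes.
del f1, f2, fastq1, fastq2, combo1, combo2
print(time.strftime("%H:%M:%S"), "large objects released, script ending")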
I am trying to automate a long-running job, and I want to be able to upload all console output to another log, e.g. CloudWatch Logs. For the most part this can be done by making and using a custom function instead of print. But there are functions in machine learning libraries, like Model.summary() or the progress bars shown during training, that write to stdout on their own.
I can get all console output at the very end, via an internal console log. But what I need is real-time uploading of stdout as it is written by whomever, so that progress can be checked by looking at the logs on CloudWatch instead of having to log into the machine and check the internal console logs.
Basically what I need is:
From: call_to_stdout -> Console (and probably other stuff)
To: call_to_stdout -> uploadLog() -> Console (and probably other stuff)
Pseudocode of what I need:
import sys

class stdout_PassThru:
    def __init__(self, in_old_stdout):
        self.old_stdout = in_old_stdout

    def write(self, msg):
        self.old_stdout.write(msg)
        uploadLogToCloudwatch(msg)

def uploadLogToCloudwatch(msg):
    # Botocore stuff to upload to CloudWatch
    ...

myPassThru = stdout_PassThru(sys.stdout)
sys.stdout = myPassThru
I've tried googling this, but the best I ever get is StringIO stuff, where I can capture stdout but can't do anything with it until the function I called ends and I can insert code again. I would like to run my upload-log code every time stdout is used.
Is this even possible?
Please and thank you.
EDIT: Someone suggested redirecting output to a file. The problem is that that just streams/writes to the file as things are output. I need to call a function that does work on each call to stdout, which a plain stream doesn't give me. If the hook only fired every time stdout flushes itself, that would be good enough too.
I solved my problem. Sort of hidden in some other answers.
The initial problem I had with this solution is that when it is tested within a Jupyter notebook, the sys.stdout = myClass(sys.stdout) causes Jupyter to... wait? Not sure, but it never finishes processing the cell.
But when I put it into a python file and ran with python test.py it ran perfectly and as expected.
This allows me, in a sense, to pass through calls to print while executing my own function on every call to print.
import sys

def addLog(message):
    # my boto function to upload CloudWatch logs
    ...

class sendToLog:
    def __init__(self, stream):
        self.stream = stream

    def write(self, o):
        self.stream.write(o)
        addLog(o)
        self.stream.flush()

    def writelines(self, o):
        self.stream.writelines(o)
        addLog(o)
        self.stream.flush()

    def __getattr__(self, attr):
        return getattr(self.stream, attr)

sys.stdout = sendToLog(sys.stdout)
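If the redirection ever needs to be undone at the end of the job, one small variation (a sketch, not part of the original answer) is to keep a reference to the real stream and restore it when finished:

original_stdout = sys.stdout
sys.stdout = sendToLog(original_stdout)
try:
    print("training started")  # goes to the console and through addLog()
finally:
    sys.stdout = original_stdout  # put the real stdout back when done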
So I have this GUI that I made with tkinter, and everything works well. What it does is connect to servers and send commands for both Linux and Windows. I went ahead and used pyinstaller to create a windowed GUI without a console, and when I try to use a specific function for sending Windows commands it fails. If I create the GUI with a console that pops up before the GUI, it works like a charm. What I'm trying to figure out is how to get my GUI to work with the console being invisible to the user.
The part of my code that has the issue revolves around subprocess. To spare you all from the 400+ lines of code I wrote, I'm providing the specific code that has issues. Here is the snippet:
def rcmd_in(server):
    import subprocess as sp
    for i in command_list:
        result = sp.run(['C:/"Path to executable"/rcmd.exe', '\\\\' + server, i],
                        universal_newlines=True, stdout=sp.PIPE, stderr=sp.STDOUT)
        print(result.stdout)
The argument 'server' is passed in from another function that calls 'rcmd_in', and 'command_list' is a mutable list created at the top level of the code, accessible to all functions.
Now, I have done my due diligence. I scoured multiple searches and came up with an edit that attempts to run my code with the console invisible, using info from this link: recipe-subprocess. Here is what the edit looks like:
def rcmd_in(server):
    import subprocess as sp
    import os, os.path
    si = sp.STARTUPINFO()
    si.dwFlags |= sp.STARTF_USESHOWWINDOW
    for i in command_list:
        result = sp.run(['C:/"Path to executable"/rcmd.exe', '\\\\' + server, i],
                        universal_newlines=True, stdin=sp.PIPE, stdout=sp.PIPE,
                        stderr=sp.STDOUT, startupinfo=si, env=os.environ)
        print(result.stdout)
The problem I have now is that when it runs, an error of "Error:8 - Internal error -109" pops up. Let me add that I tried the functions 'call()', 'Popen()', and others, but only 'run()' seems to work.
I've reached a point where my brain hurts and I could use some help. Any suggestions? As always, I am forever grateful for anyone's help. Thanks in advance!
I figured it out and it only took me 5 days! :D
Looks like the reason the function would fail comes down to how Windows handles stdin. I found a post that helped me edit my code to work with pyinstaller -w (--noconsole). Here is the updated code:
def rcmd_in(server):
    import subprocess as sp
    si = sp.STARTUPINFO()
    si.dwFlags |= sp.STARTF_USESHOWWINDOW
    for i in command_list:
        result = sp.Popen(['C:/"Path to executable"/rcmd.exe', '\\\\' + server, i],
                          universal_newlines=True, stdin=sp.PIPE, stdout=sp.PIPE,
                          stderr=sp.PIPE, startupinfo=si)
        print(result.stdout.read())
Note the change from 'run()' to 'Popen()'. The 'run()' function will not work with the print statement at the end. Also, for those of you who are curious, the 'si' variable I created prevents 'subprocess' from opening a console window when run from the GUI. I hope this will be useful to someone struggling with this. Cheers
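For context, a minimal usage sketch (the server name and commands here are hypothetical placeholders; command_list is the module-level list described above):

command_list = ['hostname', 'ipconfig /all']  # hypothetical commands

rcmd_in('SERVER01')  # hypothetical server name; runs each command remotely and prints its output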
I'm using Python 3.7.4, and I have created two functions: the first one executes a callable using multiprocessing.Process, and the second one just prints "Hello World". Everything seems to work fine until I try redirecting stdout; doing so prevents me from getting any printed values during the process execution. I have simplified the example as much as possible, and this is the current code I have for the problem.
These are my functions:
import io
import multiprocessing
from contextlib import redirect_stdout

def call_function(func: callable):
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=lambda: queue.put(func()))
    process.start()
    while True:
        if not queue.empty():
            return queue.get()

def print_hello_world():
    print("Hello World")
This works:
call_function(print_hello_world)
The previous code works and successfully prints "Hello World"
This does not work:
with redirect_stdout(io.StringIO()) as out:
    call_function(print_hello_world)
print(out.getvalue())
With the previous code I do not get anything printed in the console.
Any suggestion would be very much appreciated. I have been able to narrow the problem down to this point, and I think it is related to the process ending after the io.StringIO() is already closed, but I have no idea how to test my hypothesis, and even less how to implement a solution.
This is the workaround I found. It seems that if I use a file instead of a StringIO object I can get things to work.
with open("./tmp_stdout.txt", "w") as tmp_stdout_file:
with redirect_stdout(tmp_stdout_file):
call_function(print_hello_world)
stdout_str = ""
for line in tmp_stdout_file.readlines():
stdout_str += line
stdout_str = stdout_str.strip()
print(stdout_str) # This variable will have the captured stdout of the process
Another thing that might be important to know is that the multiprocessing library buffers stdout, meaning that the prints only get displayed after the function has executed or failed. To solve this, you can force stdout to flush when needed within the function that is being called; in this case, that would be inside print_hello_world. (I actually had to do this for a daemon process that needed to be terminated if it ran for more than a specified time.)
sys.stdout.flush() # This will force the stdout to be printed
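For example, a minimal sketch of where that flush might go (the placement is an assumption, not part of the original answer):

import sys

def print_hello_world():
    print("Hello World")
    sys.stdout.flush()  # push the buffered output out immediately instead of at process exit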
I am trying to read a large data file (millions of rows, in a very specific format) using a pre-built (in C) routine. I want to then yield the results of this, line by line, via a generator function.
I can read the file OK, but whereas just running:
<command> <filename>
directly in Linux will print the results line by line as it finds them, I've had no luck trying to replicate this within my generator function. It seems to output the entire lot as a single string that I need to split on newlines, and of course everything then needs reading before I can yield line 1.
This code will read the file, no problem:
import subprocess
import config

file_cmd = '<command> <filename>'

for rec in subprocess.check_output([file_cmd], shell=True).decode(config.ENCODING).split('\n'):
    yield rec
(ENCODING is set in config.py to iso-8859-1 - it's a Swedish site)
The code I have works, in that it gives me the data, but in doing so, it tries to hold the whole lot in memory. I have larger files than this to process which are likely to blow the available memory, so this isn't an option.
I've played around with bufsize on Popen, but not had any success (and also, I can't decode or split after the Popen, though I guess the fact I need to split right now is actually my problem!).
I think I have this working now, so will answer my own question in the event somebody else is looking for this later ...
# (requires "import shlex" in addition to the subprocess and config imports above)
proc = subprocess.Popen(shlex.split(file_cmd), stdout=subprocess.PIPE)
while True:
    output = proc.stdout.readline()
    if output == b'' and proc.poll() is not None:
        break
    if output:
        yield output.decode(config.ENCODING).strip()
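To show how it all fits together, here's a minimal sketch of the complete generator and how it might be consumed (the function name read_records and the process_record call are placeholders, not from the original post):

import shlex
import subprocess

import config  # provides ENCODING (iso-8859-1)

def read_records(file_cmd):
    # Yield the command's output line by line instead of buffering it all in memory.
    proc = subprocess.Popen(shlex.split(file_cmd), stdout=subprocess.PIPE)
    while True:
        output = proc.stdout.readline()
        if output == b'' and proc.poll() is not None:
            break
        if output:
            yield output.decode(config.ENCODING).strip()

for rec in read_records('<command> <filename>'):
    process_record(rec)  # placeholder for whatever is done with each record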