Writing/Reading stream of output from external command in realtime - python-3.x

I was asked to run a Python script in a Jenkins job, which calls external commands via the subprocess package. In the Jenkins console, the output of the external commands should be printed in realtime and/or written to a log file.
I found an SO post about printing in realtime, with something like this:
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=1)
while True:
    line = p.stdout.readline()
    print(line)
    if not line and p.poll() is not None:
        break
So I created a function for executing external commands:
from subprocess import PIPE, Popen

def execute_cmd(command, splitlines=True, timeout=None, output_stream=None):
    pipe = Popen(command, stdout=PIPE, stderr=PIPE)
    utf8_output = []
    while True:
        # readline() returns bytes; decode so the lines can be joined as str
        line = pipe.stdout.readline().decode('utf-8')
        utf8_output.append(line)
        # Add to stream here
        if output_stream:
            output_stream.write(line)
        if not line and pipe.poll() is not None:
            break
    return ''.join(utf8_output)
As you can see, I didn't use print. That's because a requirement says I must use a stream object, which could either stream to a file or output everything to the Jenkins console in realtime.
So if I want to print the external output to the jenkins console, I wanted to do something like this in my job:
from io import TextIOBase
my_output_stream = TextIOBase()
func1(output_stream=my_output_stream)
Where func1 is a function, that calls an external command via the execute_cmd method.
In my understanding, this should write the output of the external command to my_output_stream.
Now my question is: how can I output the data written to that stream in realtime? Don't I need some kind of asynchronous execution? I can't just add a loop that reads lines from the stream after the call to func1, as that would run only after the function has finished, not in realtime.
Sorry for the kind of weird description of my problem, if something is still unclear, please comment and I will update my question with further explanation.
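
A minimal sketch of one way to get realtime behaviour (my addition, not from the original post): instead of buffering through a TextIOBase, pump the pipe from a background thread and write each line to the console and a log file as it arrives. The helper names pump and tee_cmd are hypothetical, and the sketch assumes the command emits UTF-8 text:

import subprocess
import sys
import threading

def pump(pipe, *sinks):
    # forward each line from the subprocess pipe to every sink as it arrives
    for raw in iter(pipe.readline, b''):
        line = raw.decode('utf-8')
        for sink in sinks:
            sink.write(line)
            sink.flush()  # flush so the Jenkins console sees the line immediately
    pipe.close()

def tee_cmd(command, log_path):
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    with open(log_path, 'w', encoding='utf-8') as log:
        # the reader thread keeps draining the pipe while we wait
        reader = threading.Thread(target=pump, args=(proc.stdout, sys.stdout, log))
        reader.start()
        proc.wait()
        reader.join()
    return proc.returncode

# tee_cmd(['ping', '-c', '3', 'localhost'], 'build.log')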

Related

How can I pass and receive information dynamically within a subprocess?

I'm developing Python code that can run two applications and exchange information between them during their run time.
The basic scheme is something like:
start a subprocess with the 1st application
start a subprocess with the 2nd application
1st application performs some calculation, writes a file A, and waits for input
2nd application reads file A, performs some calculation, writes a file B, and waits for input
1st application reads file B, performs some calculation, writes a file C, and waits for input
...and so on until some condition is met
I know how to start one Python subprocess, and now I'm learning how to pass/receive information during run time.
I'm testing my Python code using a super-simple application that just reads a file, makes a plot, closes the plot, and returns 0.
I was able to pass an input to a subprocess using subprocess.communicate(), and I could tell that the subprocess used that information (the plot opens and closes), but here the problems started.
I can only send an input string once: after the first subprocess.communicate() in my code below, the subprocess hangs. I suspect I have to use subprocess.stdin.write() instead, since I read that subprocess.communicate() waits for the end of the stream, whereas I want to send different inputs several times during the application run. But I also read that the use of stdin.write() and stdout.read() is discouraged. I tried this second alternative (see #alternative in the code below), but in this case the application doesn't seem to receive the inputs, i.e. it doesn't do anything, and the code ends.
Debugging is complicated because I haven't found a neat way to output what the subprocess is receiving as input and giving as output. (I tried to implement the solutions described here, but I must have done something wrong: Python: How to read stdout of subprocess in a nonblocking way, A non-blocking read on a subprocess.PIPE in Python)
Here is my working example. Any help is appreciated!
import os
import subprocess
from subprocess import PIPE

# Set application name
app_folder = 'my_folder_path'
full_name_app = os.path.join(app_folder, 'test_subprocess.exe')

# Start process
out_app = subprocess.Popen([full_name_app], stdin=PIPE, stdout=PIPE)

# Pass argument to process
N = 5
for n in range(N):
    str_to_communicate = f'{{\'test_{n+1}.mat\', {{\'t\', \'y\'}}}}'  # funny looking string - but this is how it needs to be passed
    bytes_to_communicate = str_to_communicate.encode()
    # communicate() closes stdin and waits for the process to exit,
    # so a second call in this loop fails - this is the problem described above
    output_communication = out_app.communicate(bytes_to_communicate)
    # output_communication = out_app.stdin.write(bytes_to_communicate)  # alternative
    print(f'Communication command #{n+1} sent')

# Terminate process
out_app.terminate()
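
Since communicate() closes stdin and waits for the process to exit, it can only be called once. A hedged sketch of the stdin.write() route instead (not from the original post): write one request at a time, flush, and read the reply back. This assumes the application consumes one line per request and answers with exactly one line; if it replies with more, you need a framing convention or a non-blocking reader:

import subprocess
from subprocess import PIPE

proc = subprocess.Popen(['test_subprocess.exe'], stdin=PIPE, stdout=PIPE)

for n in range(5):
    request = f'{{\'test_{n+1}.mat\', {{\'t\', \'y\'}}}}\n'
    proc.stdin.write(request.encode())
    proc.stdin.flush()                  # without the flush the child may never see the line
    reply = proc.stdout.readline()      # blocks until the child prints one line
    print(f'Reply #{n+1}:', reply.decode().rstrip())

proc.stdin.close()  # signal end of input
proc.wait()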

How do I pass through or wrap the print command (stdout) so that print also calls a function on every call?

I am trying to automate a long-running job, and I want to upload all console output to another log, like CloudWatch Logs. For the most part this can be done by writing and using a custom function instead of print. But there are functions in machine learning libraries, like Model.summary(), or progress bars during training, that write to stdout on their own.
I can get all console output at the very end via an internal console log. But what I need is real-time uploading of stdout as it's written, by whomever, so that one can check the progress by looking at the logs on CloudWatch instead of having to log into the machine and check the internal console logs.
Basically what I need is:
From: call_to_stdout -> Console(and probably other stuff)
To: call_to_stdout -> uploadLog() -> Console(and probably other stuff)
pseudocode of what I need
import sys

class stdout_PassThru:
    def __init__(self, in_old_stdout):
        self.old_stdout = in_old_stdout
    def write(self, msg):
        self.old_stdout.write(msg)
        uploadLogToCloudwatch(msg)

def uploadLogToCloudwatch(msg):
    # Botocore stuff to upload to CloudWatch
    pass

myPassThru = stdout_PassThru(sys.stdout)
sys.stdout = myPassThru
I've tried googling this, but the best I ever get is StringIO stuff, where I can capture stdout but can't do anything with it until the function I called ends and I can insert code again. I would like to run my upload-log code every time stdout is used.
Is this even possible?
Please and thank you.
EDIT: Someone suggested redirecting output to a file. The problem is that that just streams/writes to the file as things are output. I need a function that does work on each call to stdout, which a plain stream doesn't give me. Having the function called every time stdout flushes itself would also be good.
I solved my problem; the solution was sort of hidden in some other answers.
The initial problem I had with this solution is that when it is tested within a Jupyter Notebook, the sys.stdout = myClass(sys.stdout) causes Jupyter to... wait? I'm not sure, but it never finishes processing the cell.
But when I put it into a Python file and ran it with python test.py, it ran perfectly and as expected.
This lets me, in a sense, pass through calls to print while executing my own function on every call to print.
import sys

def addLog(message):
    # my boto function to upload CloudWatch logs
    pass

class sendToLog:
    def __init__(self, stream):
        self.stream = stream
    def write(self, o):
        self.stream.write(o)
        addLog(o)
        self.stream.flush()
    def writelines(self, o):
        self.stream.writelines(o)
        addLog(o)
        self.stream.flush()
    def __getattr__(self, attr):
        # delegate anything else (encoding, isatty, ...) to the real stream
        return getattr(self.stream, attr)

sys.stdout = sendToLog(sys.stdout)
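
One caveat worth adding (my note, not part of the original answer): the replacement stays active for the rest of the session, and an exception inside addLog will break every subsequent print. A small sketch that scopes the redirection and always restores the real stream:

import sys

original_stdout = sys.stdout
sys.stdout = sendToLog(original_stdout)
try:
    print('this line is echoed and uploaded')
    # ... long-running training job ...
finally:
    # always restore the real stream, even if the job raises
    sys.stdout = original_stdout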

How to get input from user and pass it to an interactive command line program triggered by subprocess call in python?

I have a command-line program which, when triggered, provides a list of options and prompts the user to select one to continue.
I have launched the program via a subprocess call, and it is now prompting for a value. How do I pass the value from the user to the command-line program via subprocess?
subprocess.call just waits for the process to finish and gives you a return code; there is no way to interact with it. If you instead use subprocess.Popen, you get the ability to communicate with the subprocess while it is running via stdin and stdout:
import subprocess, sys

program = subprocess.Popen("python3",
                           # give us pipes to communicate
                           stdin=subprocess.PIPE,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
data = input("input to subprocess: ")
[out, err] = program.communicate((data + "\n").encode())
print(out.decode())
print(err.decode(), file=sys.stderr)
Doing a bit of input, then some output, then more input can get messy, though: reading from stdout blocks, so determining when the output has stopped and more input is expected is tricky.
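
For genuinely interactive prompt/response programs, the third-party pexpect library (Unix-oriented, and not part of the original answer) avoids guessing when the output has stopped: it waits for a known prompt pattern before sending input. A rough sketch, where ./my_cli_tool and the prompt text are placeholders:

import pexpect

# spawn the interactive program and wait for its menu prompt
child = pexpect.spawn('./my_cli_tool', encoding='utf-8')
child.expect('Select an option:')  # blocks until the prompt appears
print(child.before)                # everything the program printed before the prompt
child.sendline('2')                # answer the prompt
child.expect(pexpect.EOF)          # read until the program exits
print(child.before)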

Output data from subprocess command line by line

I am trying to read a large data file (millions of rows, in a very specific format) using a pre-built (C) routine. I then want to yield the results, line by line, via a generator function.
I can read the file OK, but whereas just running:
<command> <filename>
directly in Linux prints the results line by line as it finds them, I've had no luck replicating this within my generator function. It seems to output the entire lot as a single string that I need to split on newlines, and of course everything then needs reading before I can yield line 1.
This code will read the file, no problem:
import subprocess
import config

file_cmd = '<command> <filename>'
for rec in subprocess.check_output([file_cmd], shell=True).decode(config.ENCODING).split('\n'):
    yield rec
(ENCODING is set in config.py to iso-8859-1 - it's a Swedish site)
The code I have works, in that it gives me the data, but it tries to hold the whole lot in memory. I have larger files than this to process which are likely to blow the available memory, so this isn't an option.
I've played around with bufsize on Popen, but haven't had any success (and I also couldn't decode or split after the Popen, though I guess the fact that I need to split at all is actually my problem!).
I think I have this working now, so I will answer my own question in case somebody else is looking for this later ...
import shlex
import subprocess

proc = subprocess.Popen(shlex.split(file_cmd), stdout=subprocess.PIPE)
while True:
    output = proc.stdout.readline()
    if output == b'' and proc.poll() is not None:
        break
    if output:
        yield output.decode(config.ENCODING).strip()
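
A slightly shorter variant (my suggestion, not from the original answer): since Python 3.6, Popen accepts an encoding argument, so the pipe yields str and can be iterated line by line directly:

import shlex
import subprocess

def read_lines(file_cmd, encoding='iso-8859-1'):
    proc = subprocess.Popen(shlex.split(file_cmd),
                            stdout=subprocess.PIPE,
                            encoding=encoding)  # the pipe now yields str, not bytes
    with proc.stdout:
        for line in proc.stdout:  # iterating reads one line at a time
            yield line.rstrip('\n')
    proc.wait()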

subprocess.Popen hangs main application

I'm trying to follow an example of code that I found here, and I've modified it a bit so it looks like this in my main application:
def send_to_printer(pdffile):
    acrobat = r'C:\Program Files (x86)\Adobe\Acrobat 11.0\Acrobat\Acrobat.exe'
    # '"%s"' is to wrap double quotes around paths,
    # as subprocess will use list2cmdline internally if we pass it a list,
    # which escapes double quotes, and Adobe Reader doesn't like that
    # (printer_name is defined elsewhere in the application)
    cmd = '"{}" /N /T "{}" "{}"'.format(acrobat, pdffile, printer_name)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    exit_code = proc.wait()
If I run this bit of code all on its own and hand it the PDF I'm trying to print, it works quite well. It's when I try to call it from my main application that problems arise. Basically, what I'm doing is collecting a bunch of individual PDFs, assembling them together, and then printing them so they are double-sided.
The code that is calling this looks like this.
from PyPDF2 import PdfFileReader, PdfFileWriter

output1 = PdfFileWriter()
for pdf in args[:len(args)//2]:
    page = PdfFileReader(pdf).getPage(0)
    output1.addPage(page)
outputStream1 = open('front_pages_to_print.pdf', 'wb')
output1.write(outputStream1)
outputStream1.close()

send_to_printer('front_pages_to_print.pdf')
When I run the above code before sending it to the printer, it prints the first pages and then hangs. I've also tried just calling the individual files on their own, but that results in the same behavior: it prints the first page and hangs. I read up a bit on it, and supposedly using proc.wait() can cause a deadlock if you don't use communicate(), as mentioned here. However, the code that I am following has the line stdout, stderr = proc.communicate(), which I am assuming handles this? I have to be honest, though: I'm trying to understand the code and don't quite get it. Does anyone have any suggestions?
Thanks
Edit - This is on Windows 10. I forgot to mention that.
So after fooling around with the debugger, it turned out it was waiting on
stdout, stderr = proc.communicate()
exit_code = proc.wait()
Neither of these ever received a response, I guess? I don't quite understand why, but when I took them out it started working. It could be that Adobe opened, sent the document to the printer, and then closed before proc.communicate() could be called.
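
One way to make that robust rather than accidental (my suggestion, not from the original post): if you don't need Acrobat's output, don't create pipes at all; then nothing can fill up and nothing needs to be drained. A fire-and-forget sketch:

import subprocess

def send_to_printer(pdffile, acrobat, printer_name):
    cmd = '"{}" /N /T "{}" "{}"'.format(acrobat, pdffile, printer_name)
    # DEVNULL discards output without creating a pipe that could fill up;
    # we deliberately do not wait, since Acrobat may outlive the print job
    subprocess.Popen(cmd,
                     stdout=subprocess.DEVNULL,
                     stderr=subprocess.DEVNULL)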
