Subprocess contradiction between two programs (python-3.x)

I know this may seem weird, but I'm trying to understand why the following happens:
I'm editing a Python program at work and when I run the following Python function:
import subprocess
import sys

def execute_shell_cmd(cmd):
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    for c in iter(lambda: process.stdout.read(1), b''):
        print("type_c = ", type(c))
        sys.stdout.write(c)
    for e in iter(lambda: process.stderr.read(1), b''):
        sys.stdout.write(e)

execute_shell_cmd("ls -l")
In the output, type_c is bytes and sys.stdout.write(c) runs normally, printing one byte at a time.
But when I run this function from a standalone program, I get the following error:
TypeError: write() argument must be str, not bytes
How is that possible?

In Python 3, sys.stdout is always a text (str-typed) stream; its encoding comes from the environment and can be overridden with the PYTHONIOENCODING environment variable (or forced to UTF-8 with PYTHONUTF8).
sys.stdout.buffer (the buffer attribute of the underlying io.TextIOWrapper) is the raw byte stream beneath the text-encoded stdout stream.
Since you're reading bytes from the subprocess, you'll need to also write to the byte-typed stream.
for c in iter(lambda: process.stdout.read(1), b''):
    sys.stdout.buffer.write(c)
If, on the other hand, you do expect to be working with text, you may wish to configure the subprocess object to decode output to strings.
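For example, a minimal sketch using text mode, which makes the pipes yield str instead of bytes (text=True was added in Python 3.7; on older versions, universal_newlines=True does the same thing):

import subprocess
import sys

# text=True makes the stdout/stderr pipes yield str instead of bytes
process = subprocess.Popen("ls -l", stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE, shell=True, text=True)
out, err = process.communicate()
sys.stdout.write(out)  # out is already str, so the text stream accepts it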


Python: how to write to stdin of a subprocess and read its output in real time

I have 2 programs.
The first (which could actually be written in any language, and therefore cannot be altered at all) looks like this:
#!/bin/env python3
import random
while True:
    s = input()  # get input from stdin
    i = random.randint(0, len(s))  # process the input
    print(f"New output {i}", flush=True)  # print the processed input to stdout
It runs forever, reading from stdin, processing the input, and writing the result to stdout.
I am trying to write a second program in Python using the asyncio library.
It executes the first program as a subprocess, feeds it input via its stdin, and retrieves the result from its stdout.
Here is my code so far:
#!/bin/env python3
import asyncio
import asyncio.subprocess as asp

async def get_output(process, input):
    out, err = await process.communicate(input)
    print(err)  # shows that the program crashes
    return out
    # other attempt to implement
    process.stdin.write(input)
    await process.stdin.drain()  # flush input buffer
    out = await process.stdout.read()  # program is stuck here
    return out

async def create_process(cmd):
    process = await asp.create_subprocess_exec(
        cmd, stdin=asp.PIPE, stdout=asp.PIPE, stderr=asp.PIPE)
    return process

async def run():
    process = await create_process("./test.py")
    out = await get_output(process, b"input #1")
    print(out)  # b'New output 4'
    out = await get_output(process, b"input #2")
    print(out)  # b''
    out = await get_output(process, b"input #3")
    print(out)  # b''
    out = await get_output(process, b"input #4")
    print(out)  # b''

async def main():
    await asyncio.gather(run())

asyncio.run(main())
I struggle to implement the get_output function. It takes a bytestring as a parameter (as required by the input argument of the .communicate() method), writes it to the program's stdin, reads the response from its stdout and returns it.
Right now, only the first call to get_output works properly. This is because the implementation of the .communicate() method calls the wait() method, effectively causing the program to terminate (which it isn't meant to). This can be verified by examining the value of err in the get_output function, which shows the first program reached EOF. And thus, the other calls to get_output return an empty bytestring.
I have tried another way, even less successful, since the program gets stuck at the line out = await process.stdout.read(). I haven't figured out why.
My question is: how do I implement the get_output function so that it captures the program's output in (near) real time and keeps the program running? It doesn't have to use asyncio, but I have found this library to be the best one so far for that.
Thank you in advance!
If the first program is guaranteed to print only one line of output in response to the line of input that it has read, you can change await process.stdout.read() to await process.stdout.readline() and your second approach should work.
The reason it didn't work for you is that your run function has a bug: it never sends a newline to the child process. Because of that, the child process is stuck in input() and never responds. If you add \n at the end of the bytes literals you're passing to get_output, the code works correctly.
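Putting both fixes together, a minimal sketch of get_output under that one-line-of-output assumption (the trailing \n is what lets the child's input() return):

async def get_output(process, input):
    process.stdin.write(input)  # input must end with b"\n"
    await process.stdin.drain()  # flush the input buffer
    return await process.stdout.readline()  # read exactly one response line

# called as, e.g.: out = await get_output(process, b"input #1\n")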

asyncio readline from c subprocess stdout seems to block on windows

Ok so I'm trying to run a C program from a python script. Currently I'm using a test C program:
#include <stdio.h>
#include <unistd.h>  /* for sleep() */

int main() {
    while (1) {
        printf("2000\n");
        sleep(1);
    }
    return 0;
}
To simulate the program that I will be using, which takes readings from a sensor constantly.
Then I'm trying to read the output (in this case "2000") from the C program with subprocess in python:
#!/usr/bin/python
import subprocess

process = subprocess.Popen("./main", stdout=subprocess.PIPE)
while True:
    for line in iter(process.stdout.readline, ''):
        print line,
but this is not working. Judging from print statements, it runs the .Popen line and then waits at for line in iter(process.stdout.readline, ''): until I press Ctrl-C.
Why is this? This is exactly what most examples that I've seen have as their code, and yet it does not read the file.
Is there a way of making it run only when there is something to be read?
It is a block buffering issue.
What follows is a version of my answer to the question Python: read streaming input from subprocess.communicate(), extended for your case.
Fix stdout buffer in C program directly
stdio-based programs are, as a rule, line buffered when running interactively in a terminal and block buffered when their stdout is redirected to a pipe. In the latter case, you won't see new lines until the buffer overflows or is flushed.
To avoid calling fflush() after each printf() call, you could force line buffered output by calling in a C program at the very beginning:
setvbuf(stdout, (char *) NULL, _IOLBF, 0); /* make line buffered stdout */
As soon as a newline is printed the buffer is flushed in this case.
Or fix it without modifying the source of the C program
There is the stdbuf utility, which allows you to change the buffering type without modifying the source code, e.g.:
from subprocess import Popen, PIPE

process = Popen(["stdbuf", "-oL", "./main"], stdout=PIPE, bufsize=1)
for line in iter(process.stdout.readline, b''):
    print line,
process.communicate()  # close process' stream, wait for it to exit
There are also other utilities available, see Turn off buffering in pipe.
Or use pseudo-TTY
To trick the subprocess into thinking that it is running interactively, you could use the pexpect module or its analogs; for code examples that use the pexpect and pty modules, see Python subprocess readlines() hangs. Here's a variation on the pty example provided there (it should work on Linux):
#!/usr/bin/env python
import os
import pty
import sys
from select import select
from subprocess import Popen, STDOUT

master_fd, slave_fd = pty.openpty()  # provide tty to enable line buffering
process = Popen("./main", stdin=slave_fd, stdout=slave_fd, stderr=STDOUT,
                bufsize=0, close_fds=True)
timeout = .1  # ugly but otherwise `select` blocks on process' exit
# code is similar to _copy() from pty.py
with os.fdopen(master_fd, 'r+b', 0) as master:
    input_fds = [master, sys.stdin]
    while True:
        fds = select(input_fds, [], [], timeout)[0]
        if master in fds:  # subprocess' output is ready
            data = os.read(master_fd, 512)  # <-- doesn't block, may return less
            if not data:  # EOF
                input_fds.remove(master)
            else:
                os.write(sys.stdout.fileno(), data)  # copy to our stdout
        if sys.stdin in fds:  # got user input
            data = os.read(sys.stdin.fileno(), 512)
            if not data:
                input_fds.remove(sys.stdin)
            else:
                master.write(data)  # copy it to subprocess' stdin
        if not fds:  # timeout in select()
            if process.poll() is not None:  # subprocess ended
                # and no output is buffered <-- timeout + dead subprocess
                assert not select([master], [], [], 0)[0]  # race is possible
                os.close(slave_fd)  # subprocess doesn't need it anymore
                break
rc = process.wait()
print("subprocess exited with status %d" % rc)
Or use pty via pexpect
pexpect wraps pty handling into a higher-level interface:
#!/usr/bin/env python
import pexpect

child = pexpect.spawn("./main")
for line in child:
    print line,
child.close()
The question "Q: Why not just use a pipe (popen())?" explains why a pseudo-TTY is useful.
Your program isn't hung; it just runs very slowly. Your program is using buffered output; the "2000\n" data is not written to stdout immediately, but will eventually make it. In your case, it might take BUFSIZ/strlen("2000\n") seconds (probably 1638 seconds) to complete.
After this line:
printf("2000\n");
add
fflush(stdout);
See readline docs.
Your code:
process.stdout.readline
is waiting for EOF or a newline.
I cannot tell what you are ultimately trying to do, but since readline waits for a complete line, make sure each printf ends with a newline, e.g. printf("2000\n");. That, together with flushing stdout as shown above, should at least get you started.

Continuous communication between parent and child subprocess in Python (Windows)?

I have this script:
import subprocess

p = subprocess.Popen(["myProgram.exe"],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE)
while True:
    out, _ = p.communicate(input().encode())
    print(out.decode())
which works fine until the second input, where I get:
ValueError: Cannot send input after starting communication
Is there a way to have multiple messages sent between the parent and child processes on Windows?
[EDIT]
I don't have access to the source code of myProgram.exe
It is an interactive command line application returning results from queries
Running myProgram.exe < in.txt > out.txt works fine with in.txt:
query1;
query2;
query3;
Interacting with another running process via stdin/stdout
To simulate the use case where a Python script starts a command line interactive process and sends/receives text over stdin/stdout, we have a primary script that starts another Python process running a simple interactive loop.
This can also be applied to cases where a Python script needs to start another process and just read its output as it comes in without any interactivity beyond that.
primary script
import subprocess
import threading
import queue
import time

if __name__ == '__main__':

    def enqueue_output(outp, q):
        for line in iter(outp.readline, ''):
            q.put(line)
        outp.close()

    q = queue.Queue()
    p = subprocess.Popen(["/usr/bin/python", "/test/interact.py"],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         # stderr=subprocess.STDOUT,
                         bufsize=1,
                         encoding='utf-8')
    th = threading.Thread(target=enqueue_output, args=(p.stdout, q))
    th.daemon = True
    th.start()

    for i in range(4):
        print("dir()", file=p.stdin)
        print(f"Iteration ({i}) Parent received: {q.get()}", end='')
        # p.stdin.write("dir()\n")
        # while q.empty():
        #     time.sleep(0)
        # print(f"Parent: {q.get_nowait()}")
interact.py script
if __name__ == '__main__':
    for i in range(2):
        cmd = raw_input()
        print("Iteration (%i) cmd=%s" % (i, cmd))
        result = eval(cmd)
        print("Iteration (%i) result=%s" % (i, str(result)))
output
Iteration (0) Parent received: Iteration (0) cmd=dir()
Iteration (1) Parent received: Iteration (0) result=['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'cmd', 'i']
Iteration (2) Parent received: Iteration (1) cmd=dir()
Iteration (3) Parent received: Iteration (1) result=['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'cmd', 'i', 'result']
This Q&A was leveraged to simulate non-blocking reads from the target process: https://stackoverflow.com/a/4896288/7915759
This method provides a way to check for output without blocking the main thread; q.empty() will tell you whether there is any data. You can play around with blocking calls too, using q.get(), or with a timeout, q.get(timeout=2), where the timeout parameter is a number of seconds (it can also be a fractional value such as 0.5).
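For example, a small sketch of a poll with a timeout (queue.Queue.get raises queue.Empty when nothing arrives in time):

import queue

try:
    line = q.get(timeout=2)  # wait up to 2 seconds for a line from the reader thread
except queue.Empty:
    line = None  # the child produced no output within the timeout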
Text based interaction between processes can be done without the thread and queue, but this implementation gives more options on how to retrieve the data coming back.
The Popen() parameters bufsize=1 and encoding='utf-8' make it possible to use <stdout>.readline() from the primary script and set the encoding to an ASCII-compatible codec understood by both processes (1 is not the size of the buffer; it's a symbolic value indicating line buffering).
With this configuration, both processes can simply use print() to send text to each other. It should be compatible with a lot of interactive, text-based command line tools.
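As noted, the exchange can also be done without the thread and queue; a minimal blocking sketch (each dir() command makes interact.py print two lines, so we read both):

import subprocess

p = subprocess.Popen(["/usr/bin/python", "/test/interact.py"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     bufsize=1, encoding='utf-8')
for i in range(2):
    print("dir()", file=p.stdin, flush=True)  # send one command line
    print(p.stdout.readline(), end='')  # the "cmd=..." echo line
    print(p.stdout.readline(), end='')  # the "result=..." line

The downside is that each readline() blocks until the child responds, which is why the thread-and-queue version gives more flexibility.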

Python - Using timeout while printing line by line in a subprocess with Popen

(in Python 3.5)
I am having difficulty printing stdout line by line (while the program is running) while also keeping the timeout function (to stop the program after some time).
I have:
import subprocess as sub
import io
file_name = 'program.exe'
dir_path = r'C:/directory/'
p = sub.Popen(file_name, cwd = dir_path, shell=True, stdout = sub.PIPE, stderr = sub.STDOUT)
And while running "p", do these 2 things:
for line in io.TextIOWrapper(p.stdout, encoding="utf-8"):
    print(line)
And do:
try:
    outs = p.communicate(timeout=15)  # just to use timeout
except Exception as e:
    print(str(e))
    p.kill()
The program should print every output line but should not run the simulation for more than 15 seconds.
If I use p.communicate before reading p.stdout, it will wait for the timeout or for the program to finish. If I do it the other way around, the timeout is never enforced.
I would like to do it without threading, and if possible without io too; it seems to be possible, but I don't know how (I need more practice and study). :-(
PS: The program I am running was written in Fortran and is used to simulate water flow. If I run the exe from Windows, it opens a cmd window and prints a line at each timestep. I am doing a sensitivity analysis, changing the inputs of the exe file.
That's because your process and its child processes are not being killed correctly.
Just modify your try/except as below:
try:
    pid_id = p.pid
    outs = p.communicate(timeout=15)  # just to use timeout
except Exception as e:
    print(str(e))
    import subprocess
    # this will forcefully kill the process and all child processes associated with p
    subprocess.Popen('taskkill /F /T /PID %i' % pid_id)
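Alternatively, a sketch of line-by-line printing with a manual deadline and no threads (assuming the Fortran program flushes a line per timestep; note that readline can still block past the deadline if the child goes silent):

import subprocess as sub
import time

deadline = time.monotonic() + 15  # 15-second budget, as in the question
p = sub.Popen('program.exe', cwd=r'C:/directory/', shell=True,
              stdout=sub.PIPE, stderr=sub.STDOUT)
for line in iter(p.stdout.readline, b''):
    print(line.decode('utf-8', errors='replace'), end='')
    if time.monotonic() > deadline:
        p.kill()  # or use the taskkill approach above to kill the whole tree
        break
p.wait()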

Python subprocess: chaining commands with subprocess.run

I'm experimenting with subprocess.run in Python 3.5. To chain two commands together, I would have thought that the following should work:
import subprocess
ps1 = subprocess.run(['ls'], universal_newlines=True, stdout=subprocess.PIPE)
ps2 = subprocess.run(['cowsay'], stdin=ps1.stdout)
However, this fails with:
AttributeError: 'str' object has no attribute 'fileno'
ps2 was expecting a file-like object, but the output of ps1 is a simple string.
Is there a way to chain commands together with subprocess.run?
subprocess.run() can't be used to implement ls | cowsay without the shell because it doesn't allow the individual commands to run concurrently: each subprocess.run() call waits for the process to finish, which is why it returns a CompletedProcess object (note the word "completed" there). ps1.stdout in your code is a string, which is why you have to pass it via the input parameter rather than the stdin parameter, which expects a file/pipe (something with a valid .fileno()).
Either use the shell:
subprocess.run('ls | cowsay', shell=True)
Or use subprocess.Popen, to run the child processes concurrently:
from subprocess import Popen, PIPE
cowsay = Popen('cowsay', stdin=PIPE)
ls = Popen('ls', stdout=cowsay.stdin)
cowsay.communicate()
ls.wait()
See How do I use subprocess.Popen to connect multiple processes by pipes?
Turns out that subprocess.run has an input argument to handle this:
ps1 = subprocess.run(['ls'], universal_newlines=True, stdout=subprocess.PIPE)
ps2 = subprocess.run(['cowsay'], universal_newlines=True, input=ps1.stdout)
The following also works, without using input:
ps1 = subprocess.run(['ls'], universal_newlines=True, stdout=subprocess.PIPE)
ps2 = subprocess.run(['cowsay', ps1.stdout], universal_newlines=True)
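On Python 3.7+, the first step can be written a bit more compactly: capture_output=True implies stdout=PIPE and stderr=PIPE, and text=True is a clearer alias for universal_newlines=True:

import subprocess

ps1 = subprocess.run(['ls'], capture_output=True, text=True)
ps2 = subprocess.run(['cowsay'], input=ps1.stdout, text=True)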
