Does select() behave differently on python2 and python3? - linux

I want to read stdout and stderr from a subprocess in the same thread, as described in this post. While the code works as expected under Python 2.7, the select() call under Python 3.3 does not seem to do what it should.
Have a look - here is a script that prints two lines on both stdout and stderr, then waits, and repeats this a couple of times:
import time, sys

for i in range(5):
    sys.stdout.write("std: %d\n" % i)
    sys.stdout.write("std: %d\n" % i)
    sys.stderr.write("err: %d\n" % i)
    sys.stderr.write("err: %d\n" % i)
    time.sleep(2)
The problematic script starts the script above in a subprocess and reads its stdout and stderr as described in the posted link:
import subprocess
import select

p = subprocess.Popen(['/usr/bin/env', 'python', '-u', 'test-output.py'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
r = [p.stdout.fileno(), p.stderr.fileno()]
while p.poll() is None:
    print("select")
    ret = select.select(r, [], [])
    for fd in ret[0]:
        if fd == p.stdout.fileno():
            print("readline std")
            print("stdout: " + p.stdout.readline().decode().strip())
        if fd == p.stderr.fileno():
            print("readline err")
            print("stderr: " + p.stderr.readline().decode().strip())
Note that I start the Python subprocess with the -u option, which causes Python not to buffer stdout and stderr. I also print some text before calling select() and readline() to see where the script blocks.
And here is the problem: running the script under Python 3, after each cycle the output blocks for 2 seconds despite the fact that two more lines are waiting to be read. As indicated by the text printed before each call, it is select() that blocks, not readline().
My first thought was that under Python 3 select() only resumes on a flush, whereas under Python 2 it returns whenever data is available - but then only one line would be read every 2 seconds, which is not the case.
So my question is: is this a bug in Python 3's select()? Did I misunderstand the behavior of select()? And is there a way to work around this behavior without having to start a thread for each pipe?
Output when running Python3:
select
readline std
stdout: std: 0
readline err
stderr: err: 0
select <--- here the script blocks for 2 seconds
readline std
stdout: std: 0
select
readline std
stdout: std: 1
readline err
stderr: err: 0
select <--- here the script should block (but doesn't)
readline err
stderr: err: 1
select <--- here the script blocks for 2 seconds
readline std
stdout: std: 1
readline err
stderr: err: 1
select <--- here the script should block (but doesn't)
readline std
stdout: std: 2
readline err
stderr: err: 2
select
.
.
Edit: note that it makes no difference whether the child process is a Python script. The following C++ program has the same effect:
#include <iostream>
#include <cstdio>
#include <unistd.h>

int main() {
    for (int i = 0; i < 4; ++i) {
        std::cout << "out: " << i << std::endl;
        std::cout << "out: " << i << std::endl;
        std::cerr << "err: " << i << std::endl;
        std::cerr << "err: " << i << std::endl;
        fflush(stdout);
        fflush(stderr);
        usleep(2000000);
    }
}

It seems that the reason is buffering on top of subprocess.PIPE: the first readline() call reads all available data (i.e., two lines) and returns the first one.
After that, there is no unread data left in the pipe itself, so select() does not return immediately. You can check this by doubling the readline() calls:
print("stdout: " + p.stdout.readline().decode().strip())
print("stdout: " + p.stdout.readline().decode().strip())
and ensuring that the second readline() call doesn't block.
One solution is to disable buffering using bufsize=0:
p = subprocess.Popen(['/usr/bin/env', 'python', '-u', 'test-output.py'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=0)
Another possible solution is to do a non-blocking readline(), or to ask the pipe's file object how much data is sitting in its read buffer, but I don't know whether that is possible.
You can also read directly from p.stdout.fileno() to implement non-blocking readline().
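For example, here is a minimal sketch of that idea (my illustration, not from the original answer): os.read() on the raw descriptors bypasses Python's buffered file objects entirely, so select() always agrees with how much has actually been consumed.

import os
import select
import subprocess

p = subprocess.Popen(['/usr/bin/env', 'python', '-u', 'test-output.py'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
fds = [p.stdout.fileno(), p.stderr.fileno()]
while fds:
    ready, _, _ = select.select(fds, [], [])
    for fd in ready:
        data = os.read(fd, 4096)  # reads only what is there; never over-buffers
        if not data:              # empty read: the child closed this end
            fds.remove(fd)
        else:
            print(data.decode(), end='')

Note that this yields raw chunks rather than complete lines; splitting into lines is left to the caller.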
Update: Python2 vs. Python3
The reason why Python 3 differs from Python 2 here most likely lies in the new I/O module (PEP 3116). See this note:
The BufferedIOBase methods signatures are mostly identical to that of RawIOBase (exceptions: write() returns None, read()'s argument is optional), but may have different semantics. In particular, BufferedIOBase implementations may read more data than requested or delay writing data using buffers.

Related

Go pipe write end being closed, why?

I just read some Go code that does something along the following lines:
type someType struct {
    ...
    ...
    rpipe io.ReadCloser
    wpipe io.WriteCloser
}

var inst someType
inst.rpipe, inst.wpipe, _ = os.Pipe()

cmd := exec.Command("some_binary", args...)
cmd.Stdout = inst.wpipe
cmd.Stderr = inst.wpipe
if err := cmd.Start(); err != nil {
    ....
}
inst.wpipe.Close()
inst.wpipe = nil
some_binary is a long-running process.
Why is inst.wpipe closed and set to nil? What would happen if it's not closed? Is it common/necessary to close inst.wpipe?
Is dup2(pipe_fd[1], 1) the C analogue of cmd.Stdout = inst.wpipe; inst.wpipe.Close()?
That code is typical of a program that wants to read output generated by some other program. The os.Pipe() function returns a connected pair of os.File entities (or, on error—which should not be simply ignored—doesn't) where a write on the second (w or wpipe) entity shows up as readable bytes on the first (r / rpipe) entity. But—this is the key to half the answer to your first question—how will a reader know when all writers are finished writing?
For a reader to get an EOF indication, all writers that have or had access to the write side of the pipe must call the close operation. By passing the write side of the pipe to a program that we start with cmd.Start(), we allow that command to access the write side of the pipe. When that command closes that pipe, one of the entities with access has closed the pipe. But another entity with access hasn't closed it yet: we have write access.
To see an EOF, then, we must close off our own access to the wpipe, with wpipe.Close(). So that answers the first half of:
Why is inst.wpipe closed and set to nil?
The set-to-nil part may or may not have any function; you should inspect the rest of the code to find out if it does.
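The same EOF rule is easy to demonstrate from Python, the language used elsewhere on this page (a sketch of mine, not from the original post): if the os.close(w) line below is removed, the read() call never returns.

import os
import subprocess

r, w = os.pipe()
p = subprocess.Popen(['echo', 'hello'], stdout=w, stderr=w)
os.close(w)  # drop the parent's copy of the write end

with os.fdopen(r, 'rb') as rpipe:
    data = rpipe.read()  # returns only once every write end is closed
print(data.decode(), end='')
p.wait()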
Is dup2(pipe_fd[1], 1) the C analogue of cmd.Stdout = inst.wpipe; inst.wpipe.Close()?
Not precisely. The dup2 level is down in the POSIX OS area, while cmd.Stdout is at a higher (OS-independent) level. The POSIX implementation of cmd.Start() will wind up calling dup2 (or something equivalent) like this after calling fork (or something equivalent). The POSIX equivalent of inst.wpipe.Close() is close(wfd), where wfd is the POSIX file number in wpipe.
In C code that doesn't have any higher level wrapping around it, we'd have something like:
int fds[2];
if (pipe(fds) < 0) ... handle error case ...
pid = fork();
switch (pid) {
case -1:
    ... handle error ...
case 0: /* child */
    if (dup2(fds[1], 1) < 0 || dup2(fds[1], 2) < 0) ... handle error ...
    if (execve(prog, args, env) < 0) ... handle error ...
    /* NOTREACHED */
default: /* parent */
    if (close(fds[1]) < 0) ... handle error ...
    ... read from fds[0] ...
}
(although if we're careful enough to check for an error from close, we probably should be careful enough to check whether the pipe system call gave us back descriptors 0 and 1, or 1 and 2, or 2 and 3, here—though perhaps we handle this earlier by making sure that 0, 1, and 2 are at least open to /dev/null).

Python3: Popen with infile and outfile doesn't stop at the end of infile

I want to call an interactive command-line program (a C++ program using cin/cout).
I want to define the input through an input file.
I want the output to be written in an output file.
After all the inputs inside the input file are used, the executed program should be killed, even if it would naturally keep running.
The target machine is Windows 10.
To achieve this, I tried using the following:
from subprocess import *

with open("input.txt", "r") as ifile:
    with open("output.txt", "w") as ofile:
        p = Popen(["./example.exe"], stdin=ifile, stdout=ofile,
                  stderr=None, universal_newlines=True)
        p.wait()
I also tried it with call instead of Popen, but with the same result.
I also tried p.stdin.write/readline, but everything I came up with either hangs (e.g., because the exe program waits inside cin) or mangles the order.
This is my cpp code for testing:
#include <iostream>
#include <string>

using namespace std;

int main()
{
    string inputString("");
    for (int i = 0; i < 200; ++i) {
        cout << "type a number: " << endl;
        cin >> inputString;
        if (inputString == "1") {
            cout << "eins" << endl;
        } else if (inputString == "2") {
            cout << "zwei" << endl;
        } else if (inputString == "blub") {
            break;
        } else {
            cout << "unknown" << endl;
        }
    }
    return 0;
}
I would expect the Python code to stop running after all input lines inside the input file are used.
Instead, the last line gets used as input repeatedly (an arbitrary* number of times), or the program gets terminated in the middle.
*arbitrary: the output file always has exactly 400 lines.
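The repetition suggests that cin >> simply fails once stdin hits EOF, so the loop keeps running on stale or empty input. On the Python side, a minimal sketch of the requested kill-after-EOF behavior (my suggestion, not from the original post; the 2-second grace period is an assumption) could look like this:

from subprocess import Popen, TimeoutExpired

with open("input.txt", "r") as ifile, open("output.txt", "w") as ofile:
    p = Popen(["./example.exe"], stdin=ifile, stdout=ofile,
              universal_newlines=True)
    try:
        p.wait(timeout=2)   # give the program a chance to exit on its own
    except TimeoutExpired:
        p.kill()            # still running (e.g., looping on a failed cin)
        p.wait()            # reap the killed process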

Circle Piping to and from 2 Python Subprocesses

I need help regarding the subprocess module. This question might sound repetitive, and I have seen a number of articles related to it in a number of ways. But even so, I am unable to solve my problem. It goes as follows:
I have a C program, 2.c; its contents are as follows:
#include <stdio.h>

int main()
{
    int a;
    scanf("%d", &a);
    while (1)
    {
        if (a == 0) // Specific case for the first input
        {
            printf("%d\n", (a + 1));
            break;
        }
        scanf("%d", &a);
        printf("%d\n", a);
    }
    return 0;
}
I need to write a Python script which first compiles the code using subprocess.call() and then opens two processes using Popen to execute the respective C program. Now the output of the first process must be the input of the second and vice versa. So essentially, if my initial input was 0, then the first process outputs 2, which is taken by the second process. It in turn outputs 3, and so on infinitely.
The below script is what I had in mind, but it is flawed. If someone can help me I would very much appreciate it.
from subprocess import *
call(["gcc","2.c"])
a = Popen(["./a.out"],stdin=PIPE,stdout=PIPE) #Initiating Process
a.stdin.write('0')
temp = a.communicate()[0]
print temp
b = Popen(["./a.out"],stdin=PIPE,stdout=PIPE) #The 2 processes in question
c = Popen(["./a.out"],stdin=PIPE,stdout=PIPE)
while True:
    b.stdin.write(str(temp))
    temp = b.communicate()[0]
    print temp
    c.stdin.write(str(temp))
    temp = c.communicate()[0]
    print temp
a.wait()
b.wait()
c.wait()
If you want the output of the first command a to go as the input of the second command b and in turn b's output is a's input - in a circle, like a snake eating its tail - then you can't use .communicate() in a loop: .communicate() doesn't return until the process is dead and all the output is consumed.
One solution is to use a named pipe (if open() doesn't block in this case on your system):
#!/usr/bin/env python3
import os
from subprocess import Popen, PIPE

path = 'fifo'
os.mkfifo(path)  # create named pipe
try:
    with open(path, 'r+b', 0) as pipe, \
         Popen(['./a.out'], stdin=PIPE, stdout=pipe) as b, \
         Popen(['./a.out'], stdout=b.stdin, stdin=pipe) as a:
        pipe.write(b'10\n')  # kick-start it
finally:
    os.remove(path)  # clean up
It emulates the a < fifo | b > fifo shell command from @alexander barakin's answer.
Here's a more complex solution that funnels the data via the Python parent process:
#!/usr/bin/env python3
import shutil
from subprocess import Popen, PIPE

with Popen(['./a.out'], stdin=PIPE, stdout=PIPE, bufsize=0) as b, \
     Popen(['./a.out'], stdout=b.stdin, stdin=PIPE, bufsize=0) as a:
    a.stdin.write(b'10\n')  # kick-start it
    shutil.copyfileobj(b.stdout, a.stdin)  # copy b's stdout to a's stdin
This code connects a's output to b's input using redirection via an OS pipe (as the a | b shell command does).
To complete the circle, b's output is copied to a's input in the parent Python code using shutil.copyfileobj().
This code may have buffering issues: there are multiple buffers in between the processes: C stdio buffers, buffers in Python file objects wrapping the pipes (controlled by bufsize).
bufsize=0 turns off the buffering on the Python side and the data is copied as soon as it is available. Beware, bufsize=0 may lead to partial writes—you might need to inline copyfileobj() and call write() again until all read data is written.
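A sketch of that inlined copy loop (my illustration, assuming the a and b objects from the code above; with bufsize=0 both pipes are raw, unbuffered file objects):

while True:
    chunk = b.stdout.read(4096)   # returns as soon as some data is available
    if not chunk:                 # EOF: b exited and closed its stdout
        break
    view = memoryview(chunk)
    while view:                   # write() may accept only part of the chunk
        n = a.stdin.write(view)
        view = view[n:]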
Call setvbuf(stdout, (char *) NULL, _IOLBF, 0) to make stdout line-buffered inside your C program:
#include <stdio.h>

int main(void)
{
    int a;
    setvbuf(stdout, (char *) NULL, _IOLBF, 0); /* make line buffered stdout */
    do {
        scanf("%d", &a);
        printf("%d\n", a - 1);
        fprintf(stderr, "%d\n", a); /* for debugging */
    } while (a > 0);
    return 0;
}
Output
10
9
8
7
6
5
4
3
2
1
0
-1
The output is the same.
Due to the way the C child program is written and executed, you might also need to catch and ignore the BrokenPipeError exception at the end, on a.stdin.write() and/or a.stdin.close() (a process may already be dead while there is still uncopied data from b).
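For example (a sketch of that clean-up, using the a object from above):

try:
    a.stdin.close()  # may raise if process a is already gone
except BrokenPipeError:
    pass  # nothing left to flush; safe to ignore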
The problem is here:
while True:
    b.stdin.write(str(temp))
    temp = b.communicate()[0]
    print temp
    c.stdin.write(str(temp))
    temp = c.communicate()[0]
    print temp
Once communicate() has returned, it does nothing more; you have to run the process again. Also, you don't need 2 processes open at the same time.
Moreover, the init phase is no different from the running phase, except that you provide the initial input data.
What you could do to simplify it and make it work:
from subprocess import *
call(["gcc","2.c"])
temp = str(0)
while True:
    b = Popen(["./a.out"],stdin=PIPE,stdout=PIPE)
    b.stdin.write(temp)
    temp = b.communicate()[0]
    print temp
    b.wait()
Otherwise, to see 2 processes running in parallel, proving that you can do that, just fix your loop as follows (by moving the Popen calls into the loop):
while True:
    b = Popen(["./a.out"],stdin=PIPE,stdout=PIPE) #The 2 processes in question
    c = Popen(["./a.out"],stdin=PIPE,stdout=PIPE)
    b.stdin.write(str(temp))
    temp = b.communicate()[0]
    print temp
    c.stdin.write(str(temp))
    temp = c.communicate()[0]
    print temp
Better yet, have b's output feed c's input:
while True:
    b = Popen(["./a.out"],stdin=PIPE,stdout=PIPE)
    c = Popen(["./a.out"],stdin=b.stdout,stdout=PIPE)
    b.stdin.write(str(temp))
    temp = c.communicate()[0]
    print temp

Wait for a thread to join with time limit

I've got a thread that invokes a function MyFunc with parameters params. Basically, it outputs dots to a stream while MyFunc is running, with a 500 ms timeout. I need to wait for the thread for 1 minute; then I need to output either "MyFunc successfully completed" if the function finished its work within 1 minute, or "Timeout" if it is still running after 1 minute. How can I do that?
std::future<void> f = std::async(std::launch::async, MyFunc, params);
std::chrono::milliseconds span(500);
while (f.wait_for(span) == std::future_status::timeout)
    std::cout << '.';
You can use wait_for() without a problem:
std::future<void> f = std::async(std::launch::async, MyFunc, params);

auto because = std::async(std::launch::async, [&]()
{
    // for your use, you may want to change it from 0 seconds to something
    // like 1 second, or 500 ms
    while (f.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
        std::cout << ".";
}).wait_for(std::chrono::seconds(60));

if (because == std::future_status::ready)
    std::cout << "Successfully Completed\n";
else
    std::cout << "Timeout";
Alternatively, remember when you started waiting, or count the number of times you have waited; check those values on each iteration to determine whether more than 1 minute has passed, and in that case exit the loop.

Qt program with Shell

I want to write a testing program. It will open a special *.tests file and test the target program with the tests from the file.
I need to:
Run some program, e.g. ./main -testing 45 563 67
Listen to the result.
How can I do it? I want to run the program main with some tests and listen to its result.
You should use the QProcess class to start your program.
QString program = "./main";
QStringList arguments;
arguments << "-testing" << "45" << "563" << ...;
QProcess *myProcess = new QProcess(parent);
myProcess->start(program, arguments);
Then you can use waitForFinished() to wait for it to finish.
exitCode() will give you the return code.
The readAllStandardOutput() (or readAllStandardError()) methods allow you to read what the process has written to the console.
