Find execution time for subprocess.Popen python - multithreading

Here's the Python code to run an arbitrary command, returning its stdout data or raising an exception on non-zero exit codes:
proc = subprocess.Popen(
    cmd,
    stderr=subprocess.STDOUT,  # Merge stdout and stderr
    stdout=subprocess.PIPE,
    shell=True)
The subprocess module does not support measuring execution time, nor a timeout, i.e. the ability to kill a process that has been running for more than X seconds.
What is the simplest way to implement get_execution_time and a timeout in a Python 2.6 program meant to run on Linux?

Good question. Here is the complete code for this:
import time, subprocess # Importing modules.
timeoutInSeconds = 1 # Our timeout value.
cmd = "sleep 5" # Your desired command.
proc = subprocess.Popen(cmd,shell=True) # Starting main process.
timeStarted = time.time() # Save start time.
cmdTimer = "sleep "+str(timeoutInSeconds) # Waiting for timeout...
cmdKill = "kill "+str(proc.pid)+" 2>/dev/null" # And killing process.
cmdTimeout = cmdTimer+" && "+cmdKill # Combine commands above.
procTimeout = subprocess.Popen(cmdTimeout,shell=True) # Start timeout process.
proc.communicate() # Process is finished.
timeDelta = time.time() - timeStarted # Get execution time.
print("Finished process in "+str(timeDelta)+" seconds.") # Output result.

Related

gsutil without -m multithreading / parallel default behavior

I am trying to find out what the defaults are if gsutil mv is called without the -m option. In the config.py source code it looks like, even without -m, the default is to use the number of CPU cores as the process count along with 5 threads per process. So by default, on a 4-core machine you would get 4 processes and 5 threads, basically multi-threaded out of the box. How do we find out what -m does? I think I saw in some documentation that -m defaults to 10 threads, but how many processes are spawned? I know you can override these settings, but what is the default with -m?
should_prohibit_multiprocessing, unused_os = ShouldProhibitMultiprocessing()
if should_prohibit_multiprocessing:
  DEFAULT_PARALLEL_PROCESS_COUNT = 1
  DEFAULT_PARALLEL_THREAD_COUNT = 24
else:
  DEFAULT_PARALLEL_PROCESS_COUNT = min(multiprocessing.cpu_count(), 32)
  DEFAULT_PARALLEL_THREAD_COUNT = 5
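To make sure I'm reading those defaults right, here is a quick sketch of the arithmetic they imply (the 4-core figure is just my example machine from above):
import multiprocessing

processes = min(multiprocessing.cpu_count(), 32)    # 4 on a 4-core machine
threads_per_process = 5                             # DEFAULT_PARALLEL_THREAD_COUNT
total_workers = processes * threads_per_process     # 20 concurrent operations on that machine
print(processes, threads_per_process, total_workers)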
Also, would a mv command in a for loop take advantage of -m, or would it just feed gsutil one command at a time, rendering the parallelism useless? The reason I ask is that the loop below took 24 hours to complete for 50,000 files, and I want to know whether the -m option would have helped. I'm not sure whether calling the gsutil command on each iteration would allow full threading, or whether it would just run with 10 processes and 10 threads, making it twice as fast.
#!/bin/bash
for files in $(cat listing2.txt) ; do
    echo "Renaming: $files --> ${files#removeprefix-}"
    gsutil mv gs://testbucket/$files gs://testbucket/${files#removeprefix-}
done
Thanks to the commenter @guillaume blaquiere:
I engineered a Python program that multiprocesses the API calls to move the files in the cloud with 25 concurrent processes. I will share the code here in the hope that it helps others.
import time
import subprocess
import multiprocessing


class GsRenamer:
    def __init__(self):
        self.gs_cmd = '~/google-cloud-sdk/bin/gsutil'

    def execute_jobs(self, cmd):
        try:
            print('RUNNING PARALLEL RENAME: [{0}]'.format(cmd))
            print(cmd)
            subprocess.run(cmd, check=True, shell=True)
        except subprocess.CalledProcessError as e:
            print('[{0}] FATAL: Command failed with error [{1}]'.format(cmd, e))

    def get_filenames_from_gs(self):
        self.file_list = []
        cmd = [self.gs_cmd, 'ls', 'gs://gs-bucket/jason_testing']
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
        output = p.stdout.readlines()
        for files in output:
            files = files.decode('utf-8').strip()
            tokens = files.split('/')[-1]
            self.file_list.append(tokens)
        self.file_list = list(filter(None, self.file_list))

    def rename_files(self, string_original, string_replace):
        final_rename_list = []
        for files in self.file_list:
            renamed_files = files.replace(string_original, string_replace)
            rename_command = "{0} mv gs://gs-bucket/jason_testing/{1} " \
                             "gs://gs-bucket/jason_testing/{2}".format(
                                 self.gs_cmd, files, renamed_files)
            final_rename_list.append(rename_command)

        final_rename_list.sort()
        pool = multiprocessing.Pool(processes=25)
        pool.map(self.execute_jobs, final_rename_list)


def main():
    gsr = GsRenamer()
    gsr.get_filenames_from_gs()
    #gsr.rename_files('sample', 'jason')
    gsr.rename_files('jason', 'sample')


if __name__ == "__main__":
    main()

Streaming read from subprocess

I need to read output from a child process as it's produced -- perhaps not on every write, but well before the process completes. I've tried solutions from the Python3 docs and SO questions here and here, but I still get nothing until the child terminates.
The application is for monitoring training of a deep learning model. I need to grab the test output (about 250 bytes for each iteration, at roughly 1-minute intervals) and watch for statistical failures.
I cannot change the training engine; for instance, I cannot insert stdout.flush() in the child process code.
I can reasonably wait for a dozen lines of output to accumulate; I was hopeful of a buffer-fill solving my problem.
Code: variations are commented out.
Parent
cmd = ["/usr/bin/python3", "zzz.py"]
# test_proc = subprocess.Popen(
test_proc = subprocess.run(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT
)
out_data = ""
print(time.time(), "START")
while not "QUIT" in str(out_data):
out_data = test_proc.stdout
# out_data, err_data = test_proc.communicate()
print(time.time(), "MAIN received", out_data)
Child (zzz.py)
from time import sleep
import sys

for _ in range(5):
    print(_, "sleeping", "."*1000)
    # sys.stdout.flush()
    sleep(1)
print("QUIT this exercise")
Despite sending lines of 1000+ bytes, the buffer (tested elsewhere as 2kb; here, I've gone as high as 50kb) filling doesn't cause the parent to "see" the new text.
What am I missing to get this to work?
Update with regard to links, comments, and iBug's posted answer:
Popen instead of run fixed the blocking issue. Somehow I missed this in the documentation and my experiments with both.
universal_newlines=True neatly changed the bytes return to string: easier to handle on the receiving end, although with interleaved empty lines (easy to detect and discard).
Setting bufsize to something tiny (e.g. 1) didn't affect anything; the parent still has to wait for the child to fill the stdout buffer, 8k in my case.
export PYTHONUNBUFFERED=1 before execution did fix the buffering problem (see the sketch after this list). Thanks to wim for the link.
Unless someone comes up with a canonical, nifty solution that makes these obsolete, I'll accept iBug's answer tomorrow.
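For reference, a minimal sketch of setting that environment variable from the parent itself rather than exporting it in the shell (this assumes the child is a Python script, since PYTHONUNBUFFERED only affects the Python interpreter):
import os
import subprocess
import time

env = dict(os.environ, PYTHONUNBUFFERED="1")   # force the child interpreter not to buffer stdout
test_proc = subprocess.Popen(
    ["/usr/bin/python3", "zzz.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    env=env,
)
for out_data in test_proc.stdout:
    print(time.time(), "MAIN received", out_data)
    if b"QUIT" in out_data:
        break
test_proc.communicate()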
subprocess.run always spawns the child process, and blocks the thread until it exits.
The only option for you is to use p = subprocess.Popen(...) and read lines with s = p.stdout.readline() or p.stdout.__iter__() (see below).
This code works for me, if the child process flushes stdout after printing a line (see below for extended note).
cmd = ["/usr/bin/python3", "zzz.py"]
test_proc = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT
)
out_data = ""
print(time.time(), "START")
while not "QUIT" in str(out_data):
out_data = test_proc.stdout.readline()
print(time.time(), "MAIN received", out_data)
test_proc.communicate() # shut it down
See my terminal log (dots removed from zzz.py):
ibug@ubuntu:~/t $ python3 p.py
1546450821.9174328 START
1546450821.9793346 MAIN received b'0 sleeping \n'
1546450822.987753 MAIN received b'1 sleeping \n'
1546450823.993136 MAIN received b'2 sleeping \n'
1546450824.997726 MAIN received b'3 sleeping \n'
1546450825.9975247 MAIN received b'4 sleeping \n'
1546450827.0094354 MAIN received b'QUIT this exercise\n'
You can also do it with a for loop:
for out_data in test_proc.stdout:
    if "QUIT" in str(out_data):
        break
    print(time.time(), "MAIN received", out_data)
If you cannot modify the child process, unbuffer (from the expect package, installable with APT or YUM) may help. This is my working parent code, without changing the child code.
test_proc = subprocess.Popen(
    ["unbuffer"] + cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT
)

Setting timeout when using os.system function

Firstly, I'd like to say that I have just begun to learn Python, and I want to execute a Maven command inside my Python script (see the partial code below):
os.system("mvn surefire:test")
But unfortunately, this command sometimes runs for too long, so I want to know how to set a timeout threshold to control it.
That is to say, if the execution time exceeds X seconds, the program should skip the command.
Also, are there other useful solutions to this problem? Thanks in advance!
Use the subprocess module instead. By using a list and sticking with the default shell=False, we can simply kill the process when the timeout hits.
import subprocess

my_timeout = 60  # example timeout in seconds

p = subprocess.Popen(['mvn', 'surefire:test'])
try:
    p.wait(my_timeout)
except subprocess.TimeoutExpired:
    p.kill()
Also, you can use the timeout command in the terminal.
Do it like this:
import os
os.system('timeout 5s [Type Command Here]')
Also, you can use s, m, h, d for seconds, minutes, hours, and days.
You can also send a different signal to the command. If you want to learn more, see:
https://linuxize.com/post/timeout-command-in-linux/
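For example, a small sketch combining both points (this assumes the GNU coreutils timeout command is available on the machine):
import os

# Send SIGKILL if the Maven run is still going after 5 seconds.
exit_status = os.system('timeout --signal=KILL 5s mvn surefire:test')
print(exit_status)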
Simple answer
os.system does not support a timeout.
You can use Python 3's subprocess instead, which supports a timeout parameter, such as:
yourCommand = "mvn surefire:test"
timeoutSeconds = 5
subprocess.check_output(yourCommand, shell=True, timeout=timeoutSeconds)
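Note that when the timeout is hit, check_output raises subprocess.TimeoutExpired rather than returning; a minimal sketch of catching it (same command and timeout as above):
import subprocess

try:
    output = subprocess.check_output("mvn surefire:test", shell=True, timeout=5)
except subprocess.TimeoutExpired:
    print("Command did not finish within 5 seconds")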
Detailed Explanation
Going further, I have encapsulated this into a function, getCommandOutput, for you:
import subprocess

def getCommandOutput(consoleCommand, consoleOutputEncoding="utf-8", timeout=2):
    """Get command output from the terminal.

    Args:
        consoleCommand (str): console/terminal command string
        consoleOutputEncoding (str): console output encoding, default is utf-8
        timeout (int): max seconds to wait for the console command
    Returns:
        (isRunCmdOk (bool), consoleOutput (str))
    Raises:
    """
    # print("getCommandOutput: consoleCommand=%s" % consoleCommand)
    isRunCmdOk = False
    consoleOutput = ""
    try:
        # consoleOutputByte = subprocess.check_output(consoleCommand)
        consoleOutputByte = subprocess.check_output(consoleCommand, shell=True, timeout=timeout)

        # commandPartList = consoleCommand.split(" ")
        # print("commandPartList=%s" % commandPartList)
        # consoleOutputByte = subprocess.check_output(commandPartList)

        # print("type(consoleOutputByte)=%s" % type(consoleOutputByte))  # <class 'bytes'>
        # print("consoleOutputByte=%s" % consoleOutputByte)  # b'640x360\n'
        consoleOutput = consoleOutputByte.decode(consoleOutputEncoding)  # '640x360\n'
        consoleOutput = consoleOutput.strip()  # '640x360'
        isRunCmdOk = True
    except subprocess.CalledProcessError as callProcessErr:
        cmdErrStr = str(callProcessErr)
        print("Error %s for run command %s" % (cmdErrStr, consoleCommand))
    # print("isRunCmdOk=%s, consoleOutput=%s" % (isRunCmdOk, consoleOutput))
    return isRunCmdOk, consoleOutput
Demo:
isRunOk, cmdOutputStr = getCommandOutput("mvn surefire:test", timeout=5)

how do I make my python program to wait for the subprocess to be completed

I have a Python program that should execute a command line (the command line is a psexec command that calls a batch file on a remote server).
I used Popen to call the command line. The batch file on the remote server produces a return code of 0.
Now I have to wait for this return code, and based on the return code I should continue my program execution.
I tried to use .wait() or .check_output(), but for some reason they did not work for me.
cmd = """psexec -u CORPORATE\user1 -p force \\\sgeinteg27 -s cmd /c "C:\\Planview\\Interfaces\\ProjectPlace_Sree\\PP_Run.bat" """
p = subprocess.Popen(cmd, bufsize=2048, shell=True,
stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p.wait()
print(p.returncode)
##The below block should wait until the above command runs completely.
##And depending on the return code being ZERO i should continue the rest of
##the execution.
if p.returncode ==0:
result = tr.test_readXid.readQuery(cert,planning_code)
print("This is printed depending if the return code is zero")
Here is the end of the batch file execution and the return code.
Can anybody help me with this?
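For reference, one common source of trouble with this pattern is that stdout is redirected to a pipe but never read, so wait() can block forever once the pipe buffer fills. A minimal sketch of the usual fix, using communicate() to drain the pipes while waiting (the cmd placeholder below stands for the full psexec command string from the question):
import subprocess

cmd = "psexec ..."   # the full psexec command string from the question goes here

p = subprocess.Popen(cmd, shell=True,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()    # drains both pipes and waits for the process to exit
print(p.returncode)

if p.returncode == 0:
    # Continue the rest of the program only when psexec reports success.
    print("This is printed depending if the return code is zero")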

Indicate no more input without closing pty

When controlling a process using a PTY master/slave pair, I would like to indicate to the process in question that stdin has closed and I have no more content to send, but I would still like to receive output from the process.
The catch is that I only have one file descriptor (the PTY "master") which handles both input from the child process and output to the child process. So closing the descriptor would close both.
Example in python:
import subprocess, pty, os

master, slave = pty.openpty()
proc = subprocess.Popen(["/bin/cat"], stdin=slave, stdout=slave)
os.close(slave)         # now belongs to child process
os.write(master, "foo")
magic_close_fn(master)  # <--- THIS is what I want
while True:
    out = os.read(master, 4096)
    if out:
        print out
    else:
        break
proc.wait()
You need to get separate read and write file descriptors. The simple way to do that is with a pipe and a PTY. So now your code would look like this:
import subprocess, pty, os

master, slave = pty.openpty()
child_stdin, parent_stdin = os.pipe()
proc = subprocess.Popen(["/bin/cat"], stdin=child_stdin, stdout=slave)
os.close(child_stdin)          # now belongs to child process
os.close(slave)
os.write(parent_stdin, "foo")  # Write to the write end (our end) of the child's stdin
# Here's the "magic" close function
os.close(parent_stdin)
while True:
    out = os.read(master, 4096)
    if out:
        print out
    else:
        break
proc.wait()
I had to do this today, ended up here and was sad to see no answer. I achieved this using a pair of ptys rather than a single pty.
stdin_master, stdin_slave = os.openpty()
stdout_master, stdout_slave = os.openpty()

def child_setup():
    os.close(stdin_master)   # only the parent needs this
    os.close(stdout_master)  # only the parent needs this

with subprocess.Popen(cmd,
                      start_new_session=True,
                      stderr=subprocess.PIPE,
                      stdin=stdin_slave,
                      stdout=stdout_slave,
                      preexec_fn=child_setup) as proc:
    os.close(stdin_slave)    # only the child needs this
    os.close(stdout_slave)   # only the child needs this
    stdin_pty = io.FileIO(stdin_master, "w")
    stdout_pty = io.FileIO(stdout_master, "r")
    stdin_pty.write(b"here is your input\r")
    stdin_pty.close()        # no more input (EOF)
    output = b""
    while True:
        try:
            output += stdout_pty.read(1)
        except OSError:
            # EOF
            break
    stdout_pty.close()
I think what you want is to send the CTRL-D (EOT, End Of Transmission) character, isn't it? This will close the input in some applications, but others will quit.
perl -e 'print qq,\cD,'
or purely shell:
echo -e '\x04' | nc localhost 8080
Both are just examples. By the way, the CTRL-D character is \x04 in hex.
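Applied to the original Python example, a minimal sketch of that idea (assuming the pty is left in its default canonical mode, where the VEOF character, CTRL-D / 0x04, at the start of a line makes the child see end-of-file while the master stays open for reading):
import subprocess, pty, os

master, slave = pty.openpty()
proc = subprocess.Popen(["/bin/cat"], stdin=slave, stdout=slave)
os.close(slave)              # now belongs to the child process

os.write(master, b"foo\n")   # a complete line, which cat echoes back
os.write(master, b"\x04")    # VEOF (CTRL-D) at the start of a line => EOF for the child

while True:
    try:
        out = os.read(master, 4096)
    except OSError:          # on Linux, reading the master raises EIO once the child has exited
        break
    if not out:
        break
    print(out)
proc.wait()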
