I have a polling job written in Python that is executed every 15 minutes to check if the status of an entry in the job table is True. If the status is true then I need to take the values from the table and pass them as arguments to another script that executes something.
I am creating the child processes using Process from Python's multiprocessing module, but I am unable to exit the polling job (parent script) after starting these processes. The polling job keeps waiting until the children complete, even though there is a sys.exit() after creating the children.
#pollingjob.py
import sys
import multiprocessing
from multiprocessing import Process
from secondscript import secondscriptfunction
def createParallelBatches(a,b,c):
    for i in [1,2,3]:
        p1 = Process(target=secondscriptfunction, args=(a,b,c)).start()
    sys.exit()

if __name__=='__main__':
    # Check the table for *status=True* rows
    # If such rows exist, call createParallelBatches with the column values
What I am failing to understand is why sys.exit() won't let me exit the program, leaving the spawned processes as orphans. I tried the subprocess module but it behaves in the same way. I don't want the parent process waiting on its children to complete. Any help would be appreciated. Thanks.
You need to launch an independent subprocess externally. This is one way to do it:
Put secondscriptfunction into its own executable Python file.
File runscript.py
import sys
from secondscript import secondscriptfunction
if __name__=='__main__':
    secondscriptfunction(*sys.argv[1:])  # unpack the command-line arguments into the function
Then use subprocess.Popen in your polling script:
File pollingjob.py
import sys
import subprocess
import shlex

def createParallelBatches(a,b,c):
    for i in [1,2,3]:
        command = "python runscript.py %s %s %s " % (a,b,c)
        cmd_args = shlex.split(command)
        subprocess.Popen(cmd_args)
    sys.exit()
Just remove the process object from the _children set of the current process object, and the parent process will exit immediately.
The multiprocessing module tracks child processes in a private set and joins them when the current process exits. You can remove children from that set if you don't care about them.
process = multiprocessing.Process(target=proc_main)
process.start()
# discard the child from the private _children set so it is not joined at exit
multiprocessing.current_process()._children.discard(process)
exit(0)
I have code that is architecturally close to what is posted below (unfortunately I can't post the full version because it's proprietary). I have a self-updating executable and I'm trying to test this feature. We assume that the full path to this file will be in A.some_path after executing input. My problem is that the assertion fails, because on the second call os.stat still returns the previous file stats (I suppose it assumes nothing could have changed, so re-reading is unnecessary). I have tried launching this manually and the self-updating works completely fine: the file really is removed and recreated, and its stats change. Is there any guaranteed way to force os.stat to re-read the file stats for the same path, or an alternative option that makes this work (other than recreating the A object)?
from pathlib import Path
import unittest
import os

class A:
    some_path = Path()

    def __init__(self, _some_path):
        self.some_path = Path(_some_path)

    def get_path(self):
        return self.some_path

class TestKit(unittest.TestCase):
    def setUp(self):
        pass

    def check_body(self, a):
        some_path = a.get_path()
        modification_time = os.stat(some_path).st_mtime
        # Launching self-updating executable
        self.assertTrue(modification_time < os.stat(some_path).st_mtime)

    def check(self):
        a = A(input('Enter the file path\n'))
        self.check_body(a)

def Tests():
    suite = unittest.TestSuite()
    suite.addTest(TestKit('check'))
    return suite

def main():
    tests_suite = Tests()
    unittest.TextTestRunner().run(tests_suite)

if __name__ == "__main__":
    main()
I have found the origin of the problem: I launched the self-updater via os.system, which waits until the process is done. But first, during the self-update we launch several detached processes and should actually wait until all of them have ended; and second, even the signal that the process has ended doesn't mean the OS has fully released the file, so at the point of the assertTrue we are not yet done with all our routines. For my task I simply used sleep, but a proper solution should inspect the running processes on the system and wait for them to finish, or at least retry the check several times with a delay between attempts.
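For reference, a minimal sketch of that "several attempts with a delay" idea; the helper name, timeout, and poll interval below are my own assumptions, not part of the original code:

import os
import time

def wait_for_mtime_change(path, old_mtime, timeout=30.0, interval=0.5):
    # Poll os.stat until the file's mtime moves past old_mtime or the timeout expires.
    # FileNotFoundError is tolerated because the updater may remove the file
    # before recreating it.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if os.stat(path).st_mtime > old_mtime:
                return True
        except FileNotFoundError:
            pass
        time.sleep(interval)
    return False

In check_body, the final assertion could then become self.assertTrue(wait_for_mtime_change(some_path, modification_time)).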
I am starting two processes via multiprocessing and this is working fine. The only problem I have is the print and debug statements coming from these two processes.
The hope is to use the REPL and start the processes in the background. However, I can't get this to work: I always get the debug statements and therefore can't use the REPL anymore. This is how I start the processes:
processes = [
    Process(target=start_viewer, args=()),
    Process(target=start_server, args=(live, amount, fg))
]

for p in processes:
    p.start()
Any idea on how to "mute" the process, or get them in the background?
If I understand you correctly, you want to suppress the printing from one of the processes.
You can achieve this by redirecting that process's standard output.
Add sys.stdout = open("/dev/null", 'w') to the process which you want to "mute".
Full working example below.
from multiprocessing import Process
from time import sleep
import sys
def start_viewer():
    sys.stdout = open("/dev/null", 'w')
    while True:
        print("start_viewer")
        sleep(1)

def start_server():
    while True:
        print("start_server")
        sleep(1)

if __name__ == '__main__':
    processes = [
        Process(target=start_viewer, args=()),
        Process(target=start_server, args=())
    ]
    for p in processes:
        p.start()
Be aware that "/dev/null" simply discards whatever is printed; if you want to keep the output, redirect to a text file instead. Also, for cross-platform support you should use os.devnull rather than hard-coding "/dev/null".
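For what it's worth, here is a portable variant of the same redirection, assuming the rest of the example stays unchanged; os.devnull resolves to "/dev/null" on POSIX and "nul" on Windows:

import os
import sys
from time import sleep

def start_viewer():
    sys.stdout = open(os.devnull, 'w')   # cross-platform null device
    while True:
        print("start_viewer")            # silently discarded
        sleep(1)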
I am trying to write a shell that needs to run socket connections on a separate thread. In my testing, when print() is used while cmd.Cmd.cmdloop() is waiting for input, the output is displayed incorrectly.
from core.shell import Shell
import time
import threading
def test(shell):
    time.sleep(2)
    shell.write('Doing test')

if __name__ == '__main__':
    shell = Shell(None, None)
    testThrd = threading.Thread(target=test, args=(shell,))
    testThrd.start()
    shell.cmdloop()
When the above command runs, here is what happens:
python test.py
Welcome to Test shell. Type help or ? to list commands.
>>asd
*** Unknown syntax: asd
>>[17:59:25] Doing test
As you can see, printing from another thread adds output after the >> prompt rather than on a new line. How can I make it appear on a new line, followed by a fresh prompt?
What you can do is redirect stdout from your core.shell.Shell to a file-like object such as StringIO. You would also redirect the output from your thread into a different file-like object.
Now, you can have some third thread read both of these objects and print them out in whatever fashion you want.
You said core.shell.Shell inherits from cmd.Cmd, which allows redirection as a parameter to the constructor:
import io
import time
import threading
from core.shell import Shell
def test(output_obj):
    time.sleep(2)
    print('Doing test', file=output_obj)
cmd_output = io.StringIO()
thr_output = io.StringIO()
shell = Shell(stdout=cmd_output)
testThrd = threading.Thread(target=test, args=(thr_output,))
testThrd.start()
# in some other process/thread
cmd_line = cmd_output.readline()
thr_line = thr_output.readline()
That's quite difficult. Both your threads share the same stdout, so the output from each of them is sent concurrently to the stdout buffer, where it is printed in some arbitrary order.
What you need to do is coordinate the output from both threads, and that's a tough nut to crack. Even bash doesn't do that!
That said, maybe you can try using a lock to make sure your threads access stdout in a controlled manner. Check out: http://effbot.org/zone/thread-synchronization.htm
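As a rough sketch of that lock idea (the safe_print name is my own, and cmd.Cmd's own prompt writes would still bypass it unless you route them through the same helper):

import sys
import threading

stdout_lock = threading.Lock()

def safe_print(*args, **kwargs):
    # Serialize writes so lines from different threads do not interleave.
    with stdout_lock:
        print(*args, **kwargs)
        sys.stdout.flush()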
I have a Python3 script that uses subprocess.call to run a program on about 2,300 input files in a directory, and there are two output files for each input file. I have these two outputs going into two different directories. I would like to learn how to multiprocess my script so several files can be processed at the same time. I have been reading about the multiprocessing library in Python but it might be too advanced for me to understand. Below is the script, if the experts have any input. Thanks so much!
Script:
import os
import subprocess
import argparse
parser = argparse.ArgumentParser(description="This script aligns DNA sequences in files in a given directory.")
parser.add_argument('--root', default="/shared/testing_macse/", help="PATH to the input directory containing CDS orthogroup files.")
parser.add_argument('--align_NT_dir', default="/shared/testing_macse/NT_aligned/", help="PATH to the output directory for NT aligned CDS orthogroup files.")
parser.add_argument('--align_AA_dir', default="/shared/testing_macse/AA_aligned/", help="PATH to the output directory for AA aligned CDS orthogroup files.")
args = parser.parse_args()
def runMACSE(input_file, NT_output_file, AA_output_file):
    MACSE_command = "java -jar ~/bin/MACSE/macse_v1.01b.jar "
    MACSE_command += "-prog alignSequences "
    MACSE_command += "-seq {0} -out_NT {1} -out_AA {2}".format(input_file, NT_output_file, AA_output_file)
    # print(MACSE_command)
    subprocess.call(MACSE_command, shell=True)

Orig_file_dir = args.root
NT_align_file_dir = args.align_NT_dir
AA_align_file_dir = args.align_AA_dir

try:
    os.makedirs(NT_align_file_dir)
    os.makedirs(AA_align_file_dir)
except FileExistsError as e:
    print(e)

for currentFile in os.listdir(args.root):
    if currentFile.endswith(".fa"):
        runMACSE(args.root + currentFile, args.align_NT_dir + currentFile[:-3] + "_NT_aligned.fa", args.align_AA_dir + currentFile[:-3] + "_AA_aligned.fa")
Subprocess functions run any command-line executable in a separate process. You are running java. Multiprocessing runs python code in separate processes, just as threading runs python code in separate threads. The API for the two is intentionally similar. So multiprocessing cannot substitute for non-python subprocess calls.
It would be a waste of processes to use multiple Python processes to initiate multiple java processes. You could just as well use multiple threads to make multiple subprocess calls. Or use the asyncio module.
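For instance, a sketch of the threads-plus-subprocess approach using the standard-library thread pool; it reuses runMACSE and args from the question's script, and max_workers=10 is an arbitrary choice:

import os
from concurrent.futures import ThreadPoolExecutor

def run_one(current_file):
    runMACSE(args.root + current_file,
             args.align_NT_dir + current_file[:-3] + "_NT_aligned.fa",
             args.align_AA_dir + current_file[:-3] + "_AA_aligned.fa")

fa_files = [f for f in os.listdir(args.root) if f.endswith(".fa")]
with ThreadPoolExecutor(max_workers=10) as pool:
    pool.map(run_one, fa_files)

Each thread blocks in subprocess.call while its java process runs, so 10 threads keep at most 10 alignments running at once.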
Or make your own scheduler. Wrap your for-if in a generator function.
def fa_file(path):
    for currentFile in os.listdir(path):
        if currentFile.endswith(".fa"):
            yield currentFile

fafiles = fa_file(args.root)
Make an array of, say, 10 Popen objects. Sleep for some appropriate interval. Upon waking, loop through the array and replace finished subprocesses (.poll() returns something other than None) for as long as next(fafiles) returns something.
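A rough sketch of that scheduler; the pool size and sleep interval are arbitrary, and build_command is a hypothetical helper that returns the argument list for one input file:

import subprocess
import time

running = []
while True:
    # drop subprocesses that have finished (.poll() is not None)
    running = [p for p in running if p.poll() is None]
    try:
        # top the pool back up to 10 while input files remain
        while len(running) < 10:
            running.append(subprocess.Popen(build_command(next(fafiles))))
    except StopIteration:
        if not running:
            break
    time.sleep(5)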
EDIT: If you did the file processing in Python code that calls compiled C code (the way Pillow does for images, for instance), then you could use multiprocessing and a Queue loaded with the files to process.
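A hedged sketch of that Queue-based layout, where process_file and paths_to_process stand in for hypothetical pure-Python per-file work and your list of input files:

from multiprocessing import Process, Queue

def worker(queue):
    while True:
        path = queue.get()
        if path is None:          # sentinel: no more work
            break
        process_file(path)        # hypothetical pure-Python processing

if __name__ == '__main__':
    q = Queue()
    procs = [Process(target=worker, args=(q,)) for _ in range(4)]
    for p in procs:
        p.start()
    for path in paths_to_process:
        q.put(path)
    for _ in procs:
        q.put(None)               # one sentinel per worker
    for p in procs:
        p.join()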
This is a sample demo of how I want to use threading.
import threading
import time
def run():
    print("started")
    time.sleep(5)
    print("ended")

thread = threading.Thread(target=run)
thread.start()

for i in range(4):
    print("middle")
    time.sleep(1)
How can I make this threading demo work even across multiple files?
Example:
# Main.py
import background
""" Here I will have a main program and \
I want the command from the background file to constantly run. \
Not just once at the start of the program """
The second file:
# background.py
while True:
    print("This text should always be printing "
          "even when my code in the main function is running")
Put all the lines that come before your for loop into background.py. When it is imported, it will start the thread running. Change the run function to do your infinite while loop.
You may also want to set daemon=True when starting the thread so it will exit when the main program exits.
main.py
import time
import background
for i in range(4):
print("middle")
time.sleep(1)
background.py
import threading
import time
def run():
    while True:
        print("background")
        time.sleep(.5)

thread = threading.Thread(target=run, daemon=True)
thread.start()
Output
background
middle
background
middle
background
background
background
middle
background
background
middle
background
background