subprocess.Popen hangs for ~70 seconds using Python3? - linux

In my program I have this utility function for executing commands in shell, here's a simplified version of it:
import subprocess
import time
def run_command(cmd):
    s = time.time()
    print('starting subprocess')
    proc = subprocess.Popen(cmd.split(),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    print('subprocess started after ({0}) seconds'.format(time.time() - s))
My program uses this function from different threads to execute commands.
Occasionally the "Popen" line takes around 70 seconds to complete: out of thousands of invocations a day, across different program runs, this happens about 4-5 times. As far as I know, Popen is non-blocking. What is weird to me is that when it does happen, it always takes the same ~70 seconds to start. It's important to note that while this happens, I have 3-4 other threads waiting in a loop:
while some_counter > 0:
    time.sleep(0.5)
They do so for at most 60 seconds. After they give up and finish their flow I see another ~14 seconds until the "Popen" call finishes. Is there a problem running "Popen" from some threads in parallel to having other threads in a "wait loop"?
Update 1:
I now see that this problem started after I switched from Fedora 27 + Python 3.6 to Fedora 31 + Python 3.7.

Related

Why do the threads keep running even when the Python script has finished its execution

I am curious why the threads started in a Python script keep running even after the last statement of the script has executed (which, I believe, means the script has completed).
I have shared below the code I am talking about. Any insights on this would be helpful:
======================================================================================
import time
import threading
start=time.perf_counter()
def do_something():
    print("Waiting for a sec...")
    time.sleep(60)
    print("Waiting is over!!!")
mid1=time.perf_counter()
t1=threading.Thread(target=do_something)
t2=threading.Thread(target=do_something)
mid2=time.perf_counter()
t1.start()
mid3=time.perf_counter()
t2.start()
finish=time.perf_counter()
print(start,mid1,mid2,mid3,finish)
What output do you see? This is what I see:
Waiting for a sec...
Waiting for a sec...
95783.4201273 95783.4201278 95783.4201527 95783.4217046 95783.4219945
Then it's quiet for a minute, and displays:
Waiting is over!!!
Waiting is over!!!
and then the script ends.
That's all as expected. As part of shutting down, the interpreter waits for all running threads to complete (unless they were created with daemon=True, which you should probably avoid until you know exactly what you're doing). You told your threads to sleep for 60 seconds before finishing, and that's what they did.
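A shortened sketch of the same behavior, with the wait made explicit. The 0.2-second delay and the `log` list are illustrative choices, not part of the original script. A thread created from the main thread is non-daemon by default, which is why the interpreter waits for it at shutdown; calling join() simply makes that wait visible in the code.

```python
import threading
import time

def do_something(delay, log):
    log.append("started")
    time.sleep(delay)          # stands in for the 60-second sleep in the question
    log.append("finished")

log = []
worker = threading.Thread(target=do_something, args=(0.2, log))
print(worker.daemon)           # daemon flag is inherited from the creating thread
worker.start()
print("main thread reached its last statement before the join")
worker.join()                  # explicit wait; otherwise it happens implicitly at exit
print(log)
```

Without the join() call the program would still take about as long to exit, because the shutdown sequence waits for every non-daemon thread.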

Python: running many subprocesses from different threads is slow

I have a program with 1 process that starts a lot of threads.
Each thread might use subprocess.Popen to run some command.
I see that the time to run the command increases with the number of threads.
Example:
>>> def foo():
...     s = time.time()
...     subprocess.Popen('ip link show'.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True).communicate()
...     print(time.time() - s)
...
>>> foo()
0.028950929641723633
>>> [threading.Thread(target=foo).start() for _ in range(10)]
0.058995723724365234
0.07323050498962402
0.09158825874328613
0.11541390419006348 # !!!
0.08147192001342773
0.05238771438598633
0.0950784683227539
0.10175108909606934 # !!!
0.09703755378723145
0.06497764587402344
Is there another way of executing a lot of commands from single process in parallel that doesn't decrease the performance?
Python's threads are, of course, concurrent, but they do not really run in parallel because of the GIL. Therefore, they are not suitable for CPU-bound applications. If you need to truly parallelize something and allow it to run on all CPU cores, you will need to use multiple processes. Here is a nice answer discussing this in more detail: What are the differences between the threading and multiprocessing modules?.
For the above example, multiprocessing.pool may be a good choice (note that there is also a ThreadPool available in this module).
from multiprocessing.pool import Pool
import subprocess
import time
def foo(*args):
    s = time.time()
    subprocess.Popen('ip link show'.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True).communicate()
    return time.time() - s
if __name__ == "__main__":
    with Pool(10) as p:
        result = p.map(foo, range(10))
        print(result)
        # [0.018695592880249023, 0.009021520614624023, 0.01150059700012207, 0.02113938331604004, 0.014114856719970703, 0.01342153549194336, 0.011168956756591797, 0.014746427536010742, 0.013572454452514648, 0.008752584457397461]
        result = p.map_async(foo, range(10))
        print(result.get())
        # [0.00636744499206543, 0.011589527130126953, 0.010645389556884766, 0.0070612430572509766, 0.013571739196777344, 0.009610414505004883, 0.007040739059448242, 0.010993719100952148, 0.012415409088134766, 0.0070383548736572266]
However, if your function is similar to the example in that it mostly just launches other processes and doesn't do a lot of calculations - I doubt parallelizing it will make much of a difference because the subprocesses can already run in parallel. Perhaps the slowdown occurs because your whole system gets overwhelmed for a moment because of all those processes (could be CPU usage is high or too many disk reads/writes are attempted within a short time). I would suggest taking a close look at system resources (Task Manager etc.) while running the program.
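Since a ThreadPool is mentioned as an alternative, here is a minimal sketch of that variant. It assumes the work is dominated by waiting on the child process rather than by Python-level computation. The command (a tiny Python one-liner run through sys.executable) and the names `CMD` and `run_once` are illustrative stand-ins, not part of the original post.

```python
import subprocess
import sys
import time
from multiprocessing.pool import ThreadPool

# Hypothetical stand-in command so the sketch runs anywhere;
# substitute 'ip link show'.split() or any other command.
CMD = [sys.executable, "-c", "print('ok')"]

def run_once(_):
    """Run the command once and return (stdout, elapsed seconds)."""
    s = time.perf_counter()
    proc = subprocess.run(CMD, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.stdout.strip(), time.perf_counter() - s

if __name__ == "__main__":
    with ThreadPool(4) as pool:               # at most 4 commands in flight
        results = pool.map(run_once, range(8))
    print([out for out, _ in results])
    print([round(t, 3) for _, t in results])
```

The pool size caps how many child processes exist at once, which may help if the slowdown comes from briefly overloading the system.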
Maybe it has nothing to do with Python: opening a new shell means opening new files, since basically everything is a file on Linux.
Take a look at your limit for open files with this command (the default is often 1024):
ulimit -n
and try raising it with this command to see if your code gets faster:
ulimit -n 2048
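The same limit can also be read from inside the Python process with the standard-library resource module (Unix only). This is a small sketch for inspection; whether the limit is actually related to the slowdown is an assumption to verify, not an established cause.

```python
import resource

# RLIMIT_NOFILE is the per-process limit on open file descriptors,
# the same value that `ulimit -n` reports in the shell.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)

# Each Popen with stdout=PIPE and stderr=PIPE keeps pipe ends open in the
# parent until they are read and closed, so many concurrent children
# consume descriptors quickly.
```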

Python Counter too slow

I just wanted to code a little timer on my work pc. Funny thing is, the counter is too slow, meaning it runs longer than it should. I am really confused. The delay grows the smaller the intervals of updating become. Is my pc too slow? The CPU is around 30% while running this... idk.
python3.6.3
import time
def timer(sec):
    start = sec
    print(sec)
    while sec > 0:
        sec = sec-0.1 #the smaller this value, the slower
        time.sleep(0.1)
        print(round(sec,2))
    print("Done! {} Seconds passed.".format(start))
start = time.time() #For Testing
timer(10)
print(time.time()-start)
Putting your process to sleep requires a system call (a call into the kernel), and a hardware timer interrupt is needed to wake the process up once the time has passed. Sleeping does not cost much CPU computation, but waiting for the interrupt and for the kernel to schedule the process again takes extra time on every iteration, so each time.sleep(0.1) lasts slightly longer than 0.1 seconds and the overshoot accumulates.
Rather than sleeping for a constant amount of time, I suggest sleeping only for the time remaining until the next milestone (get the current time, round it up to the next step, and sleep for the difference).
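The milestone idea could look roughly like this. It is a sketch, not the only way to do it: the function name `countdown` and its `clock` and `sleep` parameters are illustrative, and time.monotonic is used here instead of time.time so that system clock adjustments do not affect the timer.

```python
import time

def countdown(total, step, clock=time.monotonic, sleep=time.sleep):
    """Print the remaining time every `step` seconds without accumulating drift."""
    start = clock()
    ticks = round(total / step)
    for i in range(1, ticks + 1):
        deadline = start + i * step      # absolute time of the i-th milestone
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)             # sleep only for what is left
        print(round(total - i * step, 2))
    return clock() - start

if __name__ == "__main__":
    elapsed = countdown(1, 0.1)
    print("Done! {:.3f} seconds passed.".format(elapsed))
```

Because every deadline is computed from the original start time, a late wake-up in one iteration shortens the next sleep instead of pushing all later ticks back.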
Try it this way; you can use normal arithmetic operators on time.time():
import time
start = time.time()
seconds = 5
while True:
    # elapsed time is "now minus start", not the other way around
    if time.time() - start > seconds:
        print("{} seconds elapsed.".format(seconds))
        break

Python 3: create new process when another one finishes

I have an array of data to process and a handler that runs for a long time (1-2 minutes) and uses a lot of memory for its calculations.
raw = ['a', 'b', 'c']
def handler(r):
    # do something long
    pass
Since handler requires a lot of memory, I want to execute it in a separate subprocess and kill it after execution to release the memory. Something like the following snippet:
from multiprocessing import Process
for r in raw:
    process = Process(target=handler, args=(r,))
    process.start()
The problem is that this approach immediately starts len(raw) processes at once, which is not good.
Also, the subprocesses don't need to exchange any data. Just run them consecutively.
Therefore it would be great to run a few processes at the same time and add a new one once an existing one finishes.
How could this be implemented (if it's even possible)?
To run your processes sequentially, just join each process within the loop:
from multiprocessing import Process
for r in raw:
    process = Process(target=handler, args=(r,))  # args must be a tuple, hence the trailing comma
    process.start()
    process.join()
That way you're sure that only one process is running at a time (no concurrency).
That's the simplest way. To run more than one process but limit the number of processes running at the same time, you can use a multiprocessing.Pool object and apply_async.
I've built a simple example which computes the square of the argument and simulates heavy processing:
from multiprocessing import Pool
import time
def target(r):
    time.sleep(5)
    return r * r
raw = [1,2,3,4,5]
if __name__ == '__main__':
    with Pool(3) as p: # 3 processes at a time
        reslist = [p.apply_async(target, (r,)) for r in raw]
        for result in reslist:
            print(result.get())
Running this I get:
<5 seconds wait, time to compute the results>
1
4
9
<5 seconds wait, 3 processes max can run at the same time>
16
25
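A Pool normally reuses its worker processes, so memory held by a worker is not necessarily returned between tasks. If releasing memory after each item matters, one option is the Pool's maxtasksperchild parameter, which replaces a worker with a fresh process after it has completed the given number of tasks. A minimal sketch under that assumption; the short sleep and the returned process id are only there for illustration:

```python
import os
import time
from multiprocessing import Pool

def handler(r):
    time.sleep(0.1)              # stand-in for the long, memory-hungry work
    return r * r, os.getpid()    # the pid shows which worker ran the task

raw = [1, 2, 3, 4, 5]

if __name__ == '__main__':
    # At most 3 workers at a time; each worker exits after one task
    # and is replaced by a fresh process.
    with Pool(processes=3, maxtasksperchild=1) as p:
        for value, pid in p.imap(handler, raw):
            print(value, "computed in process", pid)
```

imap is used here because it submits one item per task by default, so "one task per child" corresponds to one element of raw.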

What happens to threads when I don't explicitly call the join method?

I need to make some network calls for data in my program. I intend to call them in parallel but not all of them need to complete.
What I have right now is:
thread1 = makeNetworkCallThread()
thread1.start()
thread2 = makeLongerNetworkCallThread()
thread2.start()
thread1.join()
foo = thread1.getData()
thread2.join()
if conditionOn(foo):
    foo = thread2.getData()
# continue with code
The problem with this is that even if the shorter network call succeeded, I need to wait for as long as the longer network call takes to complete.
What will happen if I move the thread2.join() inside the if statement? The join method might never get called. Will that cause some problems with stale threads etc?
thread2 will still continue to run (subject to the caveats of the GIL, but since it is a network call that's probably not a concern) whether join is called or not. The difference is whether the main context waits for the thread to end before going on to do other things - if you're able to continue processing without that longer network call completing, then there should be no issues.
Do keep in mind that the program will not actually end (the interpreter will not exit) until all threads have been completed. Depending on the latency of this long network call to the run time of the rest of your program (in the event you don't wait), it might appear that the program reaches its end but doesn't actually exit until the network call wraps up. Consider this silly example:
# Python 2.7
import threading
import time
import logging
def wasteTime(sec):
    logging.info('Thread to waste %d seconds started' % sec)
    time.sleep(sec)
    logging.info('Thread to waste %d seconds ended' % sec)
if __name__ == '__main__':
    logging.basicConfig(format='%(asctime)s %(message)s', level=logging.INFO)
    t1 = threading.Thread(target=wasteTime, args=(2,))
    t2 = threading.Thread(target=wasteTime, args=(10,))
    t1.start()
    t2.start()
    t1.join()
    logging.info('Main context done')
This is the logging output:
$ time python test.py
2015-01-15 09:32:12,239 Thread to waste 2 seconds started
2015-01-15 09:32:12,239 Thread to waste 10 seconds started
2015-01-15 09:32:14,240 Thread to waste 2 seconds ended
2015-01-15 09:32:14,241 Main context done
2015-01-15 09:32:22,240 Thread to waste 10 seconds ended
real 0m10.026s
user 0m0.015s
sys 0m0.010s
Note that although the main context reached its end after 2 seconds (the amount of time it took for thread1 to complete), the program doesn't completely exit until thread2 is completed (ten seconds after start of execution). In situations like this (particularly if the output is being logged as such), it's my opinion that it is better to explicitly call join at some point and explicitly identify in your logs that this is what the program is doing so that it doesn't look to the user/operator like it has hung. For my silly example, that might look like adding lines like this to the end of the main context:
logging.info('Waiting for thread 2 to complete')
t2.join()
Which will generate somewhat less mysterious log output:
$ time python test.py
2015-01-15 09:39:18,979 Thread to waste 2 seconds started
2015-01-15 09:39:18,979 Thread to waste 10 seconds started
2015-01-15 09:39:20,980 Thread to waste 2 seconds ended
2015-01-15 09:39:20,980 Main context done
2015-01-15 09:39:20,980 Waiting for thread 2 to complete
2015-01-15 09:39:28,980 Thread to waste 10 seconds ended
real 0m10.027s
user 0m0.015s
sys 0m0.010s
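Applied to the pattern in the question, waiting for the longer call only when it is needed might look like the following sketch. It uses concurrent.futures in place of custom thread classes; `short_call`, `long_call` and `condition_on` are hypothetical stand-ins for the network calls and for conditionOn(foo).

```python
import time
from concurrent.futures import ThreadPoolExecutor

def short_call():
    time.sleep(0.1)                   # simulated short network call
    return "short result"

def long_call():
    time.sleep(0.5)                   # simulated longer network call
    return "long result"

def condition_on(value):
    # Hypothetical check standing in for conditionOn(foo) in the question.
    return value is None

def fetch():
    pool = ThreadPoolExecutor(max_workers=2)
    first = pool.submit(short_call)
    second = pool.submit(long_call)
    foo = first.result()              # wait only for the short call
    if condition_on(foo):
        foo = second.result()         # wait for the long call only when needed
    pool.shutdown(wait=False)         # do not block here on the long call
    return foo

if __name__ == "__main__":
    print(fetch())
```

As with plain threads, the interpreter still waits for the worker running long_call before it exits, so the process may outlive the last statement by up to the duration of that call.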
