I am running test cases for a MATLAB-based program. I have several hundred test cases to run, and since each test case uses a single core, I have been using multiprocessing's Pool and map to do the work in parallel.
The program takes command-line arguments, which I pass by executing a bash script. I have written code that creates a CSV file of the bash commands that need to be called for each test case. I read each test case from the CSV file into the variable testcase_to_run, which creates a collection of individual lists (needed in this format to be fed into the map function, I believe).
I have a 12-core machine, so I run (12 - 1) instances at a time in parallel. I have noticed that with certain test cases and their arguments, not every test case gets run; I am seeing up to 20% of test cases simply not being run (the bash script's first command is to create a new file to store results).
from multiprocessing import Pool
import subprocess

number_to_run_in_parallel = 11
testcase_to_run = ([testcase_1 arguments], [testcase_2 arguments], ... [testcase_250 arguments])

def execute_test_case(work_data):
    subprocess.call(work_data, shell=True)

def pool_handler(number_to_run_in_parallel):
    p = Pool(number_to_run_in_parallel)
    p.map(execute_test_case, testcase_to_run)

if __name__ == "__main__":
    pool_handler(number_to_run_in_parallel)
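For reference, a minimal sketch of the CSV-reading step described above (the file name and layout are assumptions on my part: one bash command per row, already split into its arguments):

import csv

testcase_to_run = []
with open("testcases.csv", newline="") as f:
    for row in csv.reader(f):
        testcase_to_run.append(row)  # each row becomes one argument list passed to map()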
It's my first time asking a question on here, so bear with me.
I'm trying to make a Python 3 program that runs executable files for x amount of time and creates a log of all their output in a text file. For some reason, the code I have so far works with only some executables. I'm new to Python, and especially to subprocess, so any help is appreciated.
import time
import subprocess

def CreateLog(executable, timeout=5):
    time_start = time.time()
    process = subprocess.Popen(executable, stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL, text=True)
    f = open("log.txt", "w")
    while process.poll() is None:
        output = process.stdout.readline()
        if output:
            f.write(output)
        if time.time() > time_start + timeout:
            process.kill()
            break
    f.close()
I was recently experimenting with crypto mining and came across nanominer. I tried using this Python code on nanominer, and the log file was empty. I am aware that nanominer already logs its own output, but the point is: why does the Python code fail?
You are interacting through .poll() (R U dead yet?) and .readline().
It's not clear you want to do that.
There seem to be two cases for your long-lived child:
it runs "too long" silently
it runs forever, regularly producing output text at e.g. one-second intervals
The 2nd case is the easy one.
Just use for line in process.stdout:, consume the line,
peek at the clock, and maybe send a .kill() just as you're already doing.
No need for .poll(), as the child exiting will produce EOF on that pipe.
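A minimal sketch of that loop (the function name and log path are illustrative, not from your code):

import time
import subprocess

def log_output(executable, timeout=5):
    deadline = time.time() + timeout
    process = subprocess.Popen(executable, stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL, text=True)
    with open("log.txt", "w") as f:
        for line in process.stdout:   # the loop ends at EOF, i.e. when the child exits
            f.write(line)
            if time.time() > deadline:
                process.kill()        # stop a chatty child that has run too long
                break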
For the 1st case, you will want to set an alarm.
See https://docs.python.org/3/library/signal.html#example
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
After "too long", five seconds, your handler will run.
It can do anything you desire.
You'll want it to have access to the process handle,
which will let you send a .kill().
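Putting that together, a sketch of the alarm approach (POSIX only; the closure over process is one way to give the handler access to it, and the names are illustrative):

import signal
import subprocess

def create_log(executable, timeout=5):
    process = subprocess.Popen(executable, stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL, text=True)

    def handler(signum, frame):
        process.kill()            # the child's death closes the pipe, ending the read loop

    signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout)         # deliver SIGALRM after `timeout` seconds

    with open("log.txt", "w") as f:
        for line in process.stdout:
            f.write(line)

    signal.alarm(0)               # cancel the alarm if the child finished early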
I'm writing an optimization routine to brute-force search a solution space for optimal hyperparameters, and apply_async does not appear to be doing anything at all. I'm on Ubuntu Server 16.04, Python 3.5, and PyCharm CE 2018, running on an Azure virtual machine. My code looks like this:
import multiprocessing
from itertools import repeat

class optimizer(object):
    def __init__(self, n_proc, frame):
        # Set Class Variables
        pass

    def prep(self):
        # Get Data and prepare for optimization
        pass

    def ret_func(self, retval):
        self.results = self.results.append(retval)
        print('Something')

    def search(self):
        p = multiprocessing.Pool(processes=self.n_proc)
        for x, y in zip(repeat(self.data), self.grid):
            job = p.apply_async(self.bot.backtest, (x, y), callback=self.ret_func)
        p.close()
        p.join()
        self.results.to_csv('OptimizationResults.csv')
        print('***************************')
        print('Exiting, Optimization Complete')

if __name__ == '__main__':
    multiprocessing.freeze_support()
    opt = optimizer(n_proc=4, frame='ytd')
    opt.prep()
    print('Data Prepped, beginning search')
    opt.search()
I was running this exact setup on a Windows Server VM, and I switched over due to issues with multiprocessing not utilizing all cores. Today I configured my machine and was able to run the optimization only once. After that, it mysteriously stopped working with no change from me. Also, I should mention that it only produces output about one time in ten that I run it. Very odd behavior. I expect to see:
Something
Something
Something
.....
Which would typically be the best "to-date" results of the optimization (omitted for clarity). Instead I get:
Data Prepped, beginning search
***************************
Exiting, Optimization Complete
If I call get() on the async object, the results are printed as expected, but only one core is utilized because the results are being gathered in the for loop. Why isn't apply_async doing anything at all? I should mention that I use the "stop" button in PyCharm to terminate the process; I'm not sure if that has something to do with it.
Let me know if you need more details about prep() or bot.backtest().
I found the error! Basically, I was converting a dict() to a list() and passing the values from the list into my function. The order of the list parameters was different every time I ran the function, and one of the parameters needed to be an integer, not a float.
For some reason, on Windows, the order of the dict was preserved when converting it to a list; that was not the case on Ubuntu! Very interesting.
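To illustrate the failure mode (the parameter names here are made up, not from the actual code): on Python 3.5, plain dicts make no ordering guarantee, so building a positional argument list from dict values can change order between runs.

params = {'window': 20, 'threshold': 0.5}      # hypothetical hyperparameters
args = list(params.values())                   # order is not guaranteed on Python 3.5

# Safer: fix the order explicitly, or pass keyword arguments instead
args = [params['window'], params['threshold']]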
I have here a Python program written for an Enigma 2 Linux set-top box:
VirtualZap Python program for Enigma 2 based set top boxes
I want to automate the execution of the following function every minute:
def aktualisieren(self):
    self.updateInfos()
You can find the defined function at lines 436 and 437.
My problem is that the class VirtualZap contains only one constructor and no main method with the actual program run, so it is difficult to implement threads or coroutines. Is there any possibility to automate the execution of the aktualisieren function?
There is the Advanced Python Scheduler (APScheduler):

from apscheduler.schedulers.blocking import BlockingScheduler

def aktualisieren(self):
    self.updateInfos()

scheduler = BlockingScheduler()
scheduler.add_job(aktualisieren, 'interval', minutes=1)
scheduler.start()
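Since VirtualZap has no main loop of its own, the background variant of the scheduler may fit better than the blocking one. This is only a sketch of the idea, wiring the scheduler up in the existing constructor and passing the bound method:

from apscheduler.schedulers.background import BackgroundScheduler

class VirtualZap:
    def __init__(self):
        # existing initialization ...
        self.scheduler = BackgroundScheduler()
        self.scheduler.add_job(self.aktualisieren, 'interval', minutes=1)
        self.scheduler.start()   # runs in a background thread, so the box's own event loop keeps control

    def aktualisieren(self):
        self.updateInfos()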
We are running performance tests with nGrinder. We have use cases where we would like to run multiple test scripts in parallel.
On their website it is stated that one user can only run one test at a time, so we set up two users, but I see the same behavior: only one test script is running and the others are waiting in the READY state.
Is there any way in nGrinder to run multiple test scripts in parallel?
It's only possible to run multiple tests concurrently when the tests are submitted by different users and there are enough free agents available to run them all.
I suspect you don't have enough free agents to run both.
You can run many scripts using only one agent. I would divide agents based on transaction groups rather than on scripts.
Inside Grinder there is parallel.py; I have used this before to run scripts in parallel.
See this link: https://github.com/DealerDotCom/grinder/blob/master/grinder/examples/parallel.py
from net.grinder.script.Grinder import grinder

scripts = ["TestScript1", "TestScript2", "TestScript3"]

# Ensure modules are initialised in the process thread.
for script in scripts:
    exec("import %s" % script)

def createTestRunner(script):
    exec("x = %s.TestRunner()" % script)
    return x

class TestRunner:
    def __init__(self):
        tid = grinder.threadNumber
        if tid % 4 == 2:
            self.testRunner = createTestRunner(scripts[1])
        elif tid % 4 == 3:
            self.testRunner = createTestRunner(scripts[2])
        else:
            self.testRunner = createTestRunner(scripts[0])

    # This method is called for every run.
    def __call__(self):
        self.testRunner()
I have a problem: I am writing a program in Python 3.2 that requires a loop to run uninterrupted and separate from the rest of the program, but at the same time it must be able to send and receive data (such as a string) to and from the main part of the script. The parts would work something like this:
# Continuing loop (LOOP)
while True:
    data.read()
    if data[2] == "ff":
        string += data
    if request == True:
        SEND(string, MAIN)
        string = []

# Main program (MAIN)
hexValues = REQUEST(string, LOOP)
So, it would be like having two Python processes running at the same time and talking to each other.
Is this even possible? If so, how should I do it?
EDIT: I am using Ubuntu GNU/Linux and Python 3.2.
This is what the threading module is for. You can also look at multiprocessing.
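A minimal sketch of that pattern with threading (everything here is illustrative; the dummy counter stands in for whatever data.read() does in the real program): the loop runs in a background thread and hands results to the main program through a Queue, which works on Python 3.2.

import threading
import queue

results = queue.Queue()

def read_loop():
    # stands in for the continuously running loop; here it just counts
    n = 0
    while True:
        n += 1
        if n % 1000000 == 0:
            results.put(n)        # hand data to the main program

t = threading.Thread(target=read_loop)
t.daemon = True                   # don't keep the interpreter alive just for this loop
t.start()

# Main program: block until the loop sends something
value = results.get()
print(value)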