Run test scripts in parallel in nGrinder - performance-testing

We are running performance tests with nGrinder. We have use cases where we would like to run multiple test scripts in parallel.
The nGrinder website states that one user can only run one test at a time, so we set up two users, but I see the same behavior: only one test script is running and the others are waiting in the READY state.
Is there any way in nGrinder to run multiple test scripts in parallel?

Multiple tests can only run concurrently when they are submitted by different users and there are enough free agents available to run both.
I suspect you don't have enough agents to run both tests.

You can run many scripts using only one agent. I would divide agents based on transaction groups rather than on scripts.
Grinder itself ships with parallel.py; I have used it before to run scripts in parallel.
See this link: https://github.com/DealerDotCom/grinder/blob/master/grinder/examples/parallel.py
from net.grinder.script.Grinder import grinder

scripts = ["TestScript1", "TestScript2", "TestScript3"]

# Ensure modules are initialised in the process thread.
for script in scripts: exec("import %s" % script)

def createTestRunner(script):
    exec("x = %s.TestRunner()" % script)
    return x

class TestRunner:
    def __init__(self):
        tid = grinder.threadNumber

        if tid % 4 == 2:
            self.testRunner = createTestRunner(scripts[1])
        elif tid % 4 == 3:
            self.testRunner = createTestRunner(scripts[2])
        else:
            self.testRunner = createTestRunner(scripts[0])

    # This method is called for every run.
    def __call__(self):
        self.testRunner()
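With this thread-number split, threads whose grinder.threadNumber % 4 is 0 or 1 run TestScript1, so roughly half of the worker threads execute the first script and a quarter each execute the other two; the modulo logic can be adjusted to change that mix.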

Related

Issue with spawning multiple processes

I am running some code and getting this error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
Could someone help me with this?
Here is the code I am running:
import time
import multiprocessing

def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")

start = time.perf_counter()
big = []
for i in range(50):
    p = multiprocessing.Process(target=heavy, args=(500, i))
    big.append(p)
    p.start()
for i in big:
    i.join()
end = time.perf_counter()
print(end - start)
Depending on the OS, the new process will either be forked from the current one (Linux) or spawned (Windows, macOS). A simplistic view of forking is that a copy of the current process is made: all the objects already created will exist in the new copy, but they will not be the same objects as in the initial process. Indeed, objects in one process are not directly accessible from the other process. Spawning, on the other hand, starts a new Python interpreter, re-runs the file (with a __name__ different from '__main__'), and then runs the target passed when creating the Process. The re-running of the file is part of the 'bootstrapping phase'.
In your case, since you do not use the if __name__ == '__main__': guard, the re-run file tries to start new processes itself, which would lead to an infinite tree of new processes. The multiprocessing module takes care of this case and detects that you are trying to start new processes even before the target has run.
For a simple test of what happens when you use multiprocessing, see the code in this answer.
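As a minimal sketch of the fix described above (reusing the heavy function from the question), the process-spawning code is moved under the if __name__ == '__main__': guard so that the re-imported module does not start processes again:

import time
import multiprocessing

def heavy(n, myid):
    # CPU-bound busy work, as in the question
    for x in range(1, n):
        for y in range(1, n):
            x ** y
    print(myid, "is done")

if __name__ == "__main__":
    # Only the main process runs this block; spawned children re-import
    # the module but skip everything below the guard.
    start = time.perf_counter()
    processes = []
    for i in range(50):
        p = multiprocessing.Process(target=heavy, args=(500, i))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    print(time.perf_counter() - start)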

Get number of active instances for BackgroundScheduler jobs

I have a simple BackgroundScheduler and a simple task. The BackgroundScheduler is configured to run only a single instance for that task:
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
scheduler.add_job(run_task, 'interval', seconds=10)
scheduler.start()
When a task starts, it takes much more than 10 seconds to complete and I get the warning:
Execution of job "run_tasks (trigger: interval[0:00:10], next run at: 2020-06-17 18:25:32 BST)" skipped: maximum number of running instances reached (1)
This works as expected.
My problem is that I can't find a way to check if an instance of that task is currently running.
The docs describe many ways to get all scheduled jobs or an individual job, but I can't find a way to check whether a job is currently running or not.
I would ideally want something like:
def job_in_progress():
    job = scheduler.get_job(job_id)
    instances = job.get_instances()
    return instances > 0
Any ideas?
Not great, because you have to access a private attribute, but this is the only thing I could find.
def job_in_progress():
    job = scheduler.get_job(job_id)
    instances = scheduler._instances[job_id]
    return instances > 0
If someone else posts a better approach, use that instead of this.
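Another option, not from the thread and only a sketch (assuming APScheduler 3.x and the scheduler object from the question), is to keep your own counter via the scheduler's event listeners instead of touching the private attribute:

from collections import defaultdict
from apscheduler.events import (
    EVENT_JOB_SUBMITTED, EVENT_JOB_EXECUTED, EVENT_JOB_ERROR)

# job_id -> number of instances we believe are currently running
running_instances = defaultdict(int)

def _track(event):
    if event.code == EVENT_JOB_SUBMITTED:
        running_instances[event.job_id] += 1
    else:  # EVENT_JOB_EXECUTED or EVENT_JOB_ERROR means the run finished
        running_instances[event.job_id] = max(0, running_instances[event.job_id] - 1)

scheduler.add_listener(
    _track, EVENT_JOB_SUBMITTED | EVENT_JOB_EXECUTED | EVENT_JOB_ERROR)

def job_in_progress(job_id):
    return running_instances[job_id] > 0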

Celery worker with multithreading - how to update results concurrently

I created a Flask API with a Celery worker. The user clicks a "start tests" button, which makes a POST request that returns a URL the user can poll every 5 seconds for test results (needed to update the frontend progress bar). The Celery task uses threading. My goal is to update the Celery task state based on the results of the threads concurrently; I don't want to wait until all my threads finish to return their results. My Celery task looks like this:
import time
from queue import Queue
from random import randint
from threading import Thread

@celery.task(bind=True)  # bind=True instructs Celery to pass "self", used to record status updates
def run_tests(self, dialog_cases):
    """
    Testing running as a background task
    """
    results = []
    test_case_no = 1
    test_controller = TestController(dialog_cases)
    bot_config = [test_controller.url, test_controller.headers, test_controller.db_name]
    threads = []
    queue = Queue()
    start = time.perf_counter()
    threads_list = list()
    for test_case in test_controller.test_cases:
        t = Thread(target=queue.put({randint(0, 1000): TestCase(test_case, bot_config)}))
        t.start()
        threads_list.append(t)
    for t in threads_list:
        t.join()
    results_dict_list = [queue.get() for _ in range(len(test_controller.test_cases))]
    for result in results_dict_list:
        for key, value in result.items():
            cprint.info(f"{key}, {value.test_failed}")
Now: TestCase is an object whose constructor runs a function that makes a few iterations and then records whether the test failed or passed. I have another Flask endpoint which returns the status of the task. The question is how to get the values returned by the threads as they finish, without having to wait until they are all done. I tried Queue, but this only returns results once everything is over.
You can simply use update_state to modify the state of the task from each of those threads, if that is what you want. Furthermore, you can create your own custom states. Since you want to know the result of each test the moment it finishes, it seems like a good idea to have a custom state for each test that you update from each thread during runtime.
An alternative is to refactor your code so that each test is actually a Celery task. Then you use the Chord or Group primitives to build your workflow. As you want to know the state during runtime, Group is probably the better fit, because then you can monitor the state of the GroupResult object...
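As a rough sketch of the first suggestion (the "PROGRESS" state name and the meta layout are my own choices, not something from the thread), each thread can push its result into the task state the moment it finishes:

import threading

@celery.task(bind=True)
def run_tests(self, dialog_cases):
    test_controller = TestController(dialog_cases)
    bot_config = [test_controller.url, test_controller.headers, test_controller.db_name]
    results = {}                 # shared between threads
    lock = threading.Lock()

    def run_one(index, test_case):
        outcome = TestCase(test_case, bot_config)   # blocks until this test finishes
        with lock:
            results[index] = outcome.test_failed
            # Publish a snapshot as soon as this thread is done, so the
            # status endpoint already sees partial progress.
            self.update_state(state="PROGRESS",
                              meta={"finished": len(results),
                                    "total": len(test_controller.test_cases),
                                    "failed": dict(results)})

    threads = [threading.Thread(target=run_one, args=(i, tc))
               for i, tc in enumerate(test_controller.test_cases)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

The status endpoint can then read this meta from the task's AsyncResult while the threads are still running.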

Multithread celery worker for task division

I'm currently building an application that, based on some input, runs some scans.
The issue I'm encountering is that a bottleneck is being created by some of the scans, and I was wondering if there is a way to run these tasks on a different thread/worker.
I'll elaborate a little bit more.
I start my worker with the command
pipenv run celery -A proj worker -B -l info
### Tasks.py ###
from celery import shared_task

@shared_task
def short_task_1():
    return

@shared_task
def short_task_2():
    return

@shared_task
def long_task_1():
    return

### handler.py ###
def handle_scan():
    short_task_1.delay()
    short_task_2.delay()
    long_task_1.delay()
A possible solution I found is assigning the short tasks to one worker and the long ones to another, but I can't find in the docs how to choose which worker a task is assigned to when calling delay().
Will having another worker to handle these tasks help? If another thread is the solution, what's the best way of doing it?
I ended up doing the following.
delay() does not work when you are trying to use multiple task queues, mainly because delay() gives you no way to pick a queue and simply uses the default one. To target a specific queue, apply_async() must be used.
For example, a task that was called with .delay(arg1, arg2)
now (with multiple queues in mind) needs to be called with .apply_async(args=[arg1, arg2], queue='queue_name').
So, here is how I did it in the end, thanks to @DejanLekic.
tasks.py
@shared_task
def short_task_1():
    return

@shared_task
def short_task_2():
    return

@shared_task
def long_task_1():
    return
Same as before. But here is the new handler
def handle_scan():
    # fast queue, with args if required
    short_task_1.apply_async(args=[arg1, arg2], queue='fast_queue')
    short_task_2.apply_async(args=[arg1, arg2], queue='fast_queue')
    # slow queue
    long_task_1.apply_async(args=[arg1, arg2], queue='slow_queue')
I start the workers by doing the following (mind the pipenv):
pipenv run celery -A proj worker -B --loglevel=info -Q slow_queue,fast_queue
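A side note, not from the thread: if you would rather keep calling delay(), Celery can also route tasks to queues declaratively. A minimal sketch, assuming the tasks live in proj/tasks.py:

# in the module where the Celery app is configured
app.conf.task_routes = {
    'proj.tasks.long_task_1': {'queue': 'slow_queue'},
    'proj.tasks.short_task_1': {'queue': 'fast_queue'},
    'proj.tasks.short_task_2': {'queue': 'fast_queue'},
}

# With routes configured, delay() sends each task to its assigned queue:
long_task_1.delay()
short_task_1.delay()

To actually isolate the slow scans, each queue can also be served by its own worker, e.g. one worker started with -Q fast_queue and another with -Q slow_queue.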

Python multiprocessing not running all items

I am running test cases for a MATLAB-based program. I have several hundred test cases to run, and since each test case uses a single core I have been using multiprocessing, Pool, and map to do this work in parallel.
The program takes command line arguments, which I execute through a bash script. I have written code which creates a CSV file of the bash commands that need to be called for each test case. I read each test case from the CSV file into the variable testcase_to_run, which creates a set of individual lists (needed in this format to be fed into the map function, I believe).
I have a 12-core machine, so I run (12-1) instances at a time in parallel. I have noticed that with certain test cases and their arguments, not every test case gets run; I am seeing up to 20% of test cases simply not being run (the bash script's first command is to create a new file to store results).
from multiprocessing import Pool
import subprocess

number_to_run_in_parallel = 11
testcase_to_run = ([testcase_1 arguments], [testcase_2 arguments], ....[testcase_250 arguments])

def execute_test_case(work_data):
    subprocess.call(work_data, shell=True)

def pool_handler(number_to_run_in_parallel):
    p = Pool(number_to_run_in_parallel)
    p.map(execute_test_case, testcase_to_run)

if __name__ == "__main__":
    pool_handler(number_to_run_in_parallel)
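No answer is recorded here, but a common first diagnostic (my own sketch, not from the thread) is to have each worker return the command's exit status and stderr, so that test cases that fail to start or exit early become visible instead of silently disappearing:

from multiprocessing import Pool
import subprocess

def execute_test_case(work_data):
    # subprocess.run waits for the command and captures its output,
    # so a non-zero exit code or error text is not lost.
    completed = subprocess.run(work_data, shell=True, capture_output=True, text=True)
    return work_data, completed.returncode, completed.stderr

def pool_handler(commands, workers=11):
    with Pool(workers) as p:
        results = p.map(execute_test_case, commands)
    for cmd, code, err in results:
        if code != 0:
            print(f"FAILED ({code}): {cmd}\n{err}")

if __name__ == "__main__":
    pool_handler(["echo case-1", "echo case-2"])  # placeholder commands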
