LoadRunner and the need for pacing - memory-leaks

Running a single script with only two users as a single scenario without any pacing, just think time set to 3 seconds and random (50%-150%) I experience that the web app server runs of of memory after 10 minutes every time (I have run the test several times, and it happens at the same time every time).
First I thouhgt this was a memory leak in the application, but after some thought I figured it might have to do with the scenario design.
The entire script having just one action including log in and log out within the only action block takes about 50 seconds to run and I have the default as soon as the previous iteration ends set not the with delay after the previous iteration ends or fixed/random intervalls set.
Could not using fixed/random intervalls cause this "memory leak" to happen? I guess non of the settings mentioned would actually start a new iteration before the one before ends, this obvioulsy leading to accumulation of memory on the server resulting in this "memory leak". But with no pacing set is there a risk for this to happen?
And having no iterations in my script, could I still be using pacing?

To answer your last question: NO.
Pacing is explicitly used when a new iteration starts. The iteration start is delayed according to pacing settings.
Speculation/Conclusions:
If the web-server really runs out of memory after 10 minutes, and you only have 2 vu's, you have a problem on the web-server side. One could manually achieve this 2vu load and crash the web-server. The pacing in the scripts, or manual user speeds are irrelevant. If the web-server can be crashed from remote, it has bugs that need fixing.
Suggestion:
Try running the scenario with 4 users. Do you get OUT OF MEMORY on the web-server after 5 mins?

If there really is a leak, your script/scenario shouldn't be causing it, but I would think that you could potentially cause it to appear to be a problem sooner depending on how you run it.
For example, let's say with 5 users and reasonable pacing and think times, the server doesn't die for 16 hours. But with 50 users it dies in 2 hours. You haven't caused the problem, just exposed it sooner.

i hope its web server problem.pacing is nothing but a time gap in between iterations,it's not effect actions or transactions in your script

Related

Graceful ramp down of user load in locust

Is there a way to do graceful ramp down of my user load while running load tests using locust framework ?
For e.g. using the below command -
I want to run a single user for a time period of 5 minutes in a loop, but here what happens is that the last iteration end abruptly with lets say 5 requests on some of the tasks and 4 on some others.
Yes! There is a new flag (developed by yours truly :) called --stop-timeout that will make locust wait for the users to complete (up to the specified time span)
There is (as of now) no way to do actual ramp down (gradually reducing the load by shutting down users over time), but hopefully this is enough for you. Such a feature might be added at a later time.
You can create your own Shape that allows you to specify how many user should be present at any time of the test.
This is explained there: https://github.com/locustio/locust/pull/1505

nodejs setTimout loop stopped after many weeks of iterations

Solved : It's a node bug. Happens after ~25 days (2^32 milliseconds), See answer for details.
Is there a maximum number of iterations of this cycle?
function do_every_x_seconds() {
//Do some basic things to get x
setTimeout( do_every_x_seconds, 1000 * x );
};
As I understand, this is considered a "best practice" way of getting things to run periodically, so I very much doubt it.
I'm running an express server on Ubuntu with a number of timeout loops .
One loop that runs every second and basically prints the timestamp.
One that calls an external http request every 5 seconds
and one that runs every X 30 to 300 seconds.
It all seemes to work well enough. However after 25 days without any usage, and several million iterations later, the node instance is still up, but all three of the setTimout loops have stopped. No error messages are reported at all.
Even stranger is that the Express server is still up, and I can load http sites which prints to the same console as where the periodic timestamp was being printed.
I'm not sure if its related, but I also run nodejs with the --expose-gc flag and perform periodic garbage collection and to monitor that memory is in acceptable ranges.
It is a development server, so I have left the instance up in case there is some advice on what I can do to look further into the issue.
Could it be that somehow the event-loop dropped all it's timers?
I have a similar problem with setInterval().
I think it may be caused by the following bug in Node.js, which seem to have been fixed recently: setInterval callback function unexpected halt #22149
Update: it seems the fix has been released in Node.js 10.9.0.
I think the problem is that you are relying on setTimeout to be active over days. setTimeout is great for periodic running of functions, but I don't think you should trust it over extended time periods. Consider this question: can setInterval drift over time? and one of its linked issues: setInterval interval includes duration of callback #7346.
If you need to have things happen intermittently at particular times, a better way to attack this would be to schedule cron tasks that perform the tasks instead. They are more resilient and failures are recorded at a system level in the journal rather than from within the node process.
A good related answer/question is Node.js setTimeout for 24 hours - any caveats? which mentions using the npm package cron to do task scheduling.

What is the default / maximum value for WEBJOBS_RESTART_TIME?

I have a continuous webjob and sometimes it can take a REALLY, REALLY long time to process (i.e. several days). I'm not interested in partitioning it into smaller chunks to get it done faster (by doing it more parallel). Having it run slow and steady is fine with me. I was looking at the documentation about webjobs here where it lists out all the settings but it doesn't specify the defaults or maximums for these values. I was curious if anybody knew.
Since the docs say
"WEBJOBS_RESTART_TIME - Timeout in seconds between when a continuous job's process goes down (for any reason) and the time we re-launch it again (Only for continuous jobs)."
it doesn't matter how long your process runs.
Please clarify your question as most part of it is irrelevant to what you're asking at the end.
If you want to know the min - I'd say try 0. For max try MAX_INT (2147483647), that's 68 years. That should do it ;).
There is no "max run time" for a continuous WebJob. Note that, in practice, there aren't any assurances on how long a given instance of your Web App hosting the WebJob is going to exist, and thus your WebJob may restart anyway. It's always good design to have your continuous job idempotent; meaning it can be restarted many times, and pick back up where it left off.

Performance issue while using Parallel.foreach() with MaximumDegreeOfParallelism set as ProcessorCount

I wanted to process records from a database concurrently and within minimum time. So I thought of using parallel.foreach() loop to process the records with the value of MaximumDegreeOfParallelism set as ProcessorCount.
ParallelOptions po = new ParallelOptions
{
};
po.MaxDegreeOfParallelism = Environment.ProcessorCount;
Parallel.ForEach(listUsers, po, (user) =>
{
//Parallel processing
ProcessEachUser(user);
});
But to my surprise, the CPU utilization was not even close to 20%. When I dig into the issue and read the MSDN article on this(http://msdn.microsoft.com/en-us/library/system.threading.tasks.paralleloptions.maxdegreeofparallelism(v=vs.110).aspx), I tried using a specific value of MaximumDegreeOfParallelism as -1. As said in the article thet this value removes the limit on the number of concurrently running processes, the performance of my program improved to a high extent.
But that also doesn't met my requirement for the maximum time taken to process all the records in the database. So I further analyzed it more and found that there are two terms as MinThreads and MaxThreads in the threadpool. By default the values of Min Thread and MaxThread are 10 and 1000 respectively. And on start only 10 threads are created and this number keeps on increasing to a max of 1000 with every new user unless a previous thread has finished its execution.
So I set the initial value of MinThread to 900 in place of 10 using
System.Threading.ThreadPool.SetMinThreads(100, 100);
so that just from the start only minimum of 900 threads are created and thought that it will improve the performance significantly. This did create 900 threads, but it also increased the number of failure on processing each user very much. So I did not achieve much using this logic. So I changed the value of MinThreads to 100 only and found that the performance was much better now.
But I wanted to improve more as my requirement of time boundation was still not met as it was still exceeding the time limit to process all the records. As you may think I was using all the best possible things to get the maximum performance in parallel processing, I was also thinking the same.
But to meet the time limit I thought of giving a shot in the dark. Now I created two different executable files(Slaves) in place of only one and assigned them each half of the users from DB. Both the executable were doing the same thing and were executing concurrently. I created another Master program to start these two Slaves at the same time.
To my surprise, it reduced the time taken to process all the records nearly to the half.
Now my question is as simple as that I do not understand the logic behind Master Slave thing giving better performance compared to a single EXE with all the logic same in both the Slaves and the previous EXE. So I would highly appreciate if someone will explain his in detail.
But to my surprise, the CPU utilization was not even close to 20%.
…
It uses the Http Requests to some Web API's hosted in other networks.
This means that CPU utilization is entirely the wrong thing to look at. When using the network, it's your network connection that's going to be the limiting factor, or possibly some network-related limit, certainly not CPU.
Now I created two different executable files … To my surprise, it reduced the time taken to process all the records nearly to the half.
This points to an artificial, per process limit, most likely ServicePointManager.DefaultConnectionLimit. Try setting it to a larger value than the default at the start of your program and see if it helps.

php sleep or usleep freezes out entire server

I have a big table in a data base which is updated almost every second. I want to query this table every 5 second to get the latest entries (live streaming).
I can't query data base for each visitor (ajax post request every 5 seconds) because mysql will die. This is why I need to cache a file. I'm writing data in a file than a visitor with javascript will open/read/close the file every 5 second.
Everything works fine, but I'm having trouble with cronjob + sleep.
In cpanel I can't set 5 sec cronjobs, this is why I'm running a for() with 12 cycles within 5 sec sleep.
for($i = 0; $i <12; $i++){
mysql_query() /// writing in file, etc.
sleep(5);
}
The problem is this is freezing out the entire server for 60 seconds. Not only the cronjob php file, the entire web page is timed out.
What should I do? Am I doing it right?
Note there is no guarantee that the code you posted is going to run in precisely 60 seconds. In fact, that code is going to run in AT LEAST 60 seconds. I'm not sure if that matters, but this means there's going to be a period of overlap between two or more instances of this script running via cron.
Depending on your queries and the way you're handling your database and cache reads/writes, you could be creating a database lock or deadlock situation, which in turn could cause the script to run long. Running long means it's potentially eating up database/server resources. This situation is what I'm leaning towards as the cause for your web page timeout. I'd need more details regarding your queries and database/cache file.

Resources