JMeter: How to calculate maximum number of threads per machine

The JMeter manual says
Your hardware's capabilities will limit the number of threads you can effectively run with JMeter. It will also depend on how fast your server is (a faster server makes JMeter work harder since it returns responses more quickly). The more JMeter works, the less accurate its timing information may become.
The question I want to ask is: how many threads can I run from a single desktop machine and still get accurate enough results? However, I realize that's going to depend on how we define modern hardware, how fast my application/site is, etc.
So, the better (but harder to answer) question is: how do I profile JMeter to know when I've gone beyond the thread/user count that a single machine can reasonably handle? Accurate, deterministic methods are preferred, but anecdotal rules of thumb are welcome.

First, I suggest you follow best practices for building and running JMeter test plans:
http://www.ubik-ingenierie.com/blog/jmeter_performance_tuning_tips/
http://jmeter.apache.org/usermanual/best-practices.html
Then once your test plan is built, baseline it on the JMeter machine:
Monitor CPU (don't exceed 50%) and swap (ensure there is no swapping in/out at all); a small monitoring sketch for these checks follows after this list
Check GC logs for long pauses
And don't forget that issues which invalidate the test can come from a lot of factors:
Network issues between the injector and the application
TCP stack issues on the JMeter injector
Components between the injector and the application (firewall, load balancer, ...)
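As a rough way to automate the CPU/swap baseline check above, here is a minimal monitoring sketch in Python. It assumes the third-party psutil package is installed on the load generator; the 50% CPU threshold simply mirrors the rule of thumb above and is not an official JMeter figure.

```python
# Minimal sketch: sample load-generator health while a JMeter run is in progress.
# Assumes the third-party "psutil" package is installed (pip install psutil);
# the thresholds are illustrative, not official JMeter guidance.
import time
import psutil

CPU_LIMIT_PCT = 50           # rule of thumb from the answer above
SAMPLE_INTERVAL_SEC = 5

def sample_injector_health(duration_sec: int = 300) -> None:
    swap_out_start = psutil.swap_memory().sout
    end = time.time() + duration_sec
    while time.time() < end:
        cpu = psutil.cpu_percent(interval=SAMPLE_INTERVAL_SEC)
        swap = psutil.swap_memory()
        swapped_out = swap.sout - swap_out_start
        flag = " <-- injector may be the bottleneck" if cpu > CPU_LIMIT_PCT or swapped_out > 0 else ""
        print(f"cpu={cpu:5.1f}%  swap_used={swap.percent:4.1f}%  swapped_out_bytes={swapped_out}{flag}")

if __name__ == "__main__":
    sample_injector_health()
```

If either flag trips repeatedly during the ramp-up, the results are measuring the injector rather than the application, which is exactly the condition the question is trying to detect.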

Related

What affects TPS, hits per second in performance testing?

I am doing a stress test to determine the maximum TPS (transactions per second) and hits per second a server can handle, by making HTTP requests through JMeter.
When I run the script in JMeter from different client machines (same server, same script), I find that the number of TPS (or hits per second) the server can handle is different.
Assume the server can handle a maximum of around 500 TPS when the script runs on client 1, and 400 TPS when the script runs on client 2.
I am very confused about the following issues:
Why is there such a difference between two clients?
What affects TPS, hits per second ?
Although the stress test showed that the server could only handle a maximum of around 500 TPS, is there any way to increase the server's performance and raise the maximum TPS it can handle?
Thanks in advance, especially to anyone who can solve this problem for me!
If you run the same JMeter test from different machines and get different results, it might be that JMeter cannot send requests fast enough.
JMeter is a normal Java application and its default configuration is fine for test development and/or debugging; however, you need to do some tuning when it comes to load test execution.
First of all make sure to follow JMeter Best Practices
Then you need to ensure that JMeter properly utilizes operating system resources; you might want to increase the JVM heap size and adjust the garbage collector configuration so that:
JMeter uses not less than 30% and not more than 80% of the total available heap space
GC doesn't happen too often, as it "pauses" JVM execution
JMeter must not overload the underlying operating system; it should have enough headroom to operate in terms of CPU, RAM, etc., so it is worth checking OS health. This can be done using the JMeter PerfMon Plugin.
And last but not least, if you run into the limits of the machine, you can consider running JMeter in distributed mode so that both client 1 and client 2 run the same test, providing a cumulative 800 TPS or even more (given your server can handle such a load)
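To see why a single injector tops out, a rough back-of-the-envelope sketch using Little's Law may help; the thread counts and response times below are made-up illustrations, not measurements from this question.

```python
# Rough ceiling for one load generator, from Little's Law:
# throughput <= active threads / average response time.
# All numbers below are made up purely for illustration.

def max_tps(active_threads: int, avg_response_time_sec: float) -> float:
    """Upper bound on the throughput a single injector can offer."""
    return active_threads / avg_response_time_sec

# A client that keeps 100 threads busy against 0.2 s responses tops out near 500 TPS.
print(max_tps(100, 0.2))          # 500.0
# A weaker client that effectively sustains only 80 threads tops out near 400 TPS.
print(max_tps(80, 0.2))           # 400.0
# In distributed mode both injectors drive the same plan, so their ceilings add
# (here 900 TPS), assuming the server itself can absorb the combined load.
print(max_tps(100, 0.2) + max_tps(80, 0.2))   # 900.0
```

In the ideal case the ceilings of the two injectors simply add, which is why distributed mode can push past what either client achieves alone, as long as the server keeps up.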
@Dimitri T, thank you very much for your help!

WSO2 APIM - Tuning

I have performed some performance tests on WSO2 APIM on both the WebServices (WSDL) and Gateway interfaces. Everything went well with the Gateway interface; however, I am facing odd behavior when using the WebServices one.
Basically I created a test that adds a user, changes the password, and deletes the user, and ran a test plan using 64 threads. At the very beginning the throughput increases steadily until all 64 threads are active (the throughput peak was 1600 req/sec). However, after that the throughput starts to decrease for no apparent reason.
All 64 threads are still active and running, and CPU usage on the machine hosting wso2am drops. It seems that APIM is giving up on handling requests even though it has threads and processors available for that.
The picture below shows the vmstat output for the processor (user, system and idle) and for context switches and interrupts. You can see that CPU usage and context switches follow the throughput.
And the next picture illustrates the JMeter test result at the end (after the throughput decrease).
Basically what I need is a clue about what may be causing such behavior. I have already tried increasing the thread pools on both wso2am and Tomcat, but it had no effect. It is as if the requests were not arriving at all, even though JMeter has plenty of capacity and had already delivered higher throughput before.
I would bet that a simple configuration change in Tomcat or WSO2 is the answer. Any help is appreciated.
Thanks and Regards
It may be that JMeter is not able to send requests fast enough; try the following steps:
Upgrade JMeter to the latest version (3.1 as of now); you can get the most recent JMeter distribution from the JMeter download page.
Run your test in command-line non-GUI mode. The JMeter GUI should be used for test development and/or debugging only; it is not designed for running load tests.
Remove (or disable) all the listeners during test execution. Later on you can open the JMeter GUI, add the listener of your choice, load the .jtl results file and perform the analysis, or create an HTML Reporting Dashboard out of the results file (a small post-processing sketch follows below).
See the 9 Easy Solutions for a JMeter Load Test "Out of Memory" Failure article for the above points explained in detail and a few more tips on configuring JMeter for maximum performance and throughput.
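Since the listeners are disabled during the run, one way to inspect the throughput curve afterwards is to post-process the .jtl file. Here is a small Python sketch; it assumes the default CSV results format with a header row and a timeStamp column in epoch milliseconds, and the file name results.jtl is only a placeholder.

```python
# Minimal sketch: compute achieved requests-per-second from a JMeter .jtl file
# after the run, instead of keeping listeners enabled during the test.
# Assumes the default CSV output with a header row and a "timeStamp" column
# holding epoch milliseconds.
import csv
from collections import Counter

def throughput_per_second(jtl_path: str) -> Counter:
    per_second = Counter()
    with open(jtl_path, newline="") as f:
        for row in csv.DictReader(f):
            per_second[int(row["timeStamp"]) // 1000] += 1
    return per_second

if __name__ == "__main__":
    counts = throughput_per_second("results.jtl")   # path is just an example
    for second in sorted(counts):
        print(second, counts[second])
```

Plotting these per-second counts makes it easy to see exactly when the throughput starts to fall and whether the drop coincides with the CPU/context-switch change visible in vmstat.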

JMeter web application performance testing doubts?

I am new to JMeter and have a couple of doubts about web application performance testing.
Is it necessary to load all embedded resources in JMeter for performance testing?
I have written a JMeter script that exercises all the REST APIs. Is this enough to measure the application's performance on the server side?
How does ramp-up time affect the performance test?
For how long does the test need to be executed to get an accurate performance report?
Load generation configuration: should the load be generated from machines attached to the application cluster, or from a different LAN?
Kindly find my view on the questions below:
I believe that a load test needs to be as realistic as possible, so representing real browser behavior is a must. Real browsers download embedded resources like scripts, images and styles; moreover, they use a concurrent pool of 2 - 8 threads to do this in parallel, so you need to configure JMeter similarly. However, real browsers download these assets only once; on subsequent requests they serve embedded resources from the cache. So make sure that you configure JMeter to:
download embedded resources
use concurrent pool for it
add HTTP Cache Manager to your test plan
It should be enough from a functional point of view, as static content is usually served separately. However, see point 1: if you have the possibility to simulate real user behavior, go for it.
It is better to have reasonable ramp-up and ramp-down periods so that the load increases gradually and neither the server nor the load generator experiences a sudden stress peak (unless that is exactly your test case). See the note on ramp-up from the JMeter documentation quoted below, and the short calculation sketch after these answers.
Ramp-up needs to be long enough to avoid too large a work-load at the start of a test, and short enough that the last threads start running before the first ones finish (unless one wants that to happen).
Start with Ramp-up = number of threads and adjust up or down as needed.
By default, the thread group is configured to loop once through its elements.
Usually peak load follows the general Pareto principle: during "peak" periods the application serves 80% of requests within a 1-2 hour time frame, and the remaining 20% of requests are more or less equally distributed across the remaining hours of the day. So it should be enough to test your application with the anticipated peak load for a couple of hours. Again, if time allows, I would recommend going for soak testing to see whether there are any memory leaks, and for stress testing to determine the application's load boundaries and whether or not it recovers from stress load.
Theoretically the application shouldn't care about the source of the requests (unless it uses different logic to handle requests from, for example, different geographic regions). One thing is obvious: don't run the load generator and the application under test on the same machine. If one JMeter instance cannot create enough load to implement the test scenario, go for distributed testing.
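Regarding point 3 above, here is the promised ramp-up calculation sketch; the numbers are examples only and should be adjusted to your own test plan.

```python
# Small arithmetic sketch for the ramp-up rule of thumb above.
# Numbers are examples only; adjust to your own test plan.

threads = 64                 # total virtual users in the thread group
ramp_up_sec = 64             # starting point: ramp-up = number of threads
loop_duration_sec = 600      # how long each thread keeps iterating

start_rate = threads / ramp_up_sec
print(f"{start_rate:.1f} new threads started per second")        # 1.0

# Sanity check from the JMeter docs: the last thread must start before the
# first one finishes, i.e. ramp-up must be shorter than the per-thread run time.
print("ramp-up short enough:", ramp_up_sec < loop_duration_sec)  # True
```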
I'd like to add some more perspective:
Question 1 & 2:
The Pareto principle can be applied here too: it takes a lot of effort to properly simulate reality, downloading all the resources a browser uses to render a page, giving the proper 'weights' to different URLs, and simulating user behaviour accurately. This is where many load tests fail, because simulating reality accurately is very, very hard. As the previous response mentions, most static content is often served via CDNs or similar anyway, and what you really want to test is usually your own system's capability to handle traffic.
Considering the above, I would say that if you spend 20% of effort setting up a load test that tests your REST API, you will get 80% of the results you want. If you on the other hand go for a completely realistic test, you will spend another 80% of effort for only 20% more results. The effect of this is that in many cases it is better to go for the simpler test, that does not simulate reality accurately. It gives you the most return on your invested time.
Question 3: Agree fully with the previous response here. Ramp up slowly, unless your specific use case sees very sudden traffic peaks (like if you're an online auction service or ticket sales or similar). It can also be a good idea to configure your test so it spends some time on a "plateau" after ramping up to peak load, rather than stopping the load test as soon as you reach the peak.
Question 4: I would say you need to run the load test long enough to produce stable, statistically significant results. This can be 5 minutes or 5 hours depending on your scenario, but half an hour is probably a good minimum time to aim for in almost all cases. The test duration should not depend on how long your site tends to experience peak load in real life though - not unless you're doing some kind of soak test.
Question 5: Traffic origin is sometimes worth thinking about, as different source locations lead to different network delay between (simulated) clients and server, which affects transaction rates. If you run a load test with 1,000 VUs on a system located in New York, and generate the traffic from Australia, you will not get a lot of transactions per second due to the high network delay. If you run the same test using a load generator in New York instead, your transaction rate will be a lot higher because the network delay is so much lower. Of course, you can always add more concurrent clients/VUs/connections and get the same transaction rate on a high-delay network link that you would on a low-delay link, but at the cost of forcing the server to keep a lot more (TCP) connection state, using more file descriptors and buffer memory. I.e. might not be a very realistic scenario.
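To put rough numbers on the network-delay point, here is a small sketch; the round-trip times and server processing time are illustrative guesses, not measurements.

```python
# Sketch of the point about network delay: a single synchronous virtual user
# cannot complete more than 1 / (round-trip time + server time) requests per
# second, so the same VU count yields very different TPS from far-away load
# generators. The times below are illustrative, not measurements.

def ceiling_tps(virtual_users: int, round_trip_sec: float, server_time_sec: float = 0.05) -> float:
    return virtual_users / (round_trip_sec + server_time_sec)

print(ceiling_tps(1000, round_trip_sec=0.200))   # far-away generator: ~4000 TPS
print(ceiling_tps(1000, round_trip_sec=0.002))   # nearby generator:   ~19230 TPS
```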

Should each website be its own `node.js` process

We host about 150 websites (possibly scaling to 300+) that we are considering migrating to node.js. Most of the sites are fairly low traffic, with fewer than 1 million pageviews per month.
Should each website be its own node.js process, or should we serve all websites using the same node.js process (or a small set of load-balanced processes)? Is there a technical limit, or a reasonable limit, to the number of node processes per server?
Process per site: feels inefficient, but I don't know if it actually is inefficient. It would ensure one buggy site doesn't affect other sites.
Process per core/small set of processes: likely higher performance, but what happens when I need to update a site's codebase, won't it take down other sites? Also, code failures in one site would affect other sites.
Ideally, I would prefer one process per site so that we could host all sites from each worker server. That way when load increases we can just spin up another identical worker server and load balance between the two without having to arbitrarily say SiteA goes to ServerA and SiteB goes to ServerB. Any node.js gurus available to offer some wisdom?
All static file requests will likely be handled by Nginx or something like Varnish.
There are a lot of issues at play here. The big picture answer is, it depends... as it always does when you bring in the whole "performance" discussion. That being said, the simplest way to get a solid Node set up is to note the following basic facts about NodeJS, and I will also comment on their implications as they pertain to your questions.
The concurrency you get with Node works really well in certain situations, namely IO-heavy operations. What we're really talking about here is minimizing the amount of downtime waiting for the next request. Because of this, Node works really well in an environment where there is one process per core on a machine. Node does really well at maximizing the amount of CPU available to serve requests under heavy load. This being said, if you have literally ZERO other work going on in your event loop, you can see minor performance increases (in terms of max requests/second/processor core) by having multiple node processes per core. But I've never seen any benefit from increasing this number past 3, even under circumstances where the entire event loop was literally just a file server.
On the process-per-site comment: this is a bad idea for many reasons. For one, a well-put-together node server can process thousands of requests per second. Our (company name omitted) servers, hosted through Amazon EC2 on medium clusters (lots of RAM, mid CPU clock, 4 cores), typically fail around 3000 requests per second per cluster. Our servers do a fair bit of CPU work; for simple file servers I'm sure you can do much better. Strictly speaking, sure, per site, you will be able to serve more requests by launching each site in its own process/core/escalating quickly here! But it's not necessary from a cost and architecture-complexity point of view. What I WOULD recommend is investing in a setup with a lot of RAM. The ability of your server to cache often-requested files will affect your performance infinitely more than launching an abundance of processes on a given machine.
On the whole RAM thing: the number of processes you want to launch for a given core depends on two things. One is how much synchronous work is done in your event loop. The more synchronous work, the more time between a given request coming in and the event loop being ready to address the next one. If you have a busy event loop, you will be in a situation where you require more processes per CPU core. The other thing that can affect this, particularly relevant for file servers, is the amount of RAM. Node runs much better in a high-RAM environment, but you can say this about ANY file server really... What this has to do with is the number of active asynchronous operations. One downside of the way Node works is that under heavy loads you can get a large number of event handlers active at once. This is great for concurrency/simplicity; however, if your server is busy waiting around for a lot of async disk/IO to happen, it will slow down and crash much sooner than if you had plenty of RAM. If you don't have enough RAM to handle all of these event handlers, you will want to keep to the one-process-per-core arrangement. Otherwise, it is easier for Node to spin up many event handlers simultaneously, and again cause you to crash sooner than you would otherwise.
I don't really have enough information to tell you what you SHOULD do. This depends entirely too much on the architecture of your specific server, sites, size of your sites, amount of data... etc. But these three pieces of knowledge are the basic things that help you get the most out of your Node server. To be honest, your idea about load balancing mixed with the considerations above should do nicely for you. Surely, micro-optimizations are possible, but if you do these things, you should easily see requests/second in the thousands before you start experiencing crashes because of DDoS-type conditions.
No, don't do it. Keep it simple! And check out http://12factor.net/.
A few hundred processes is nothing compared to the simplicity you otherwise lose. It would be a terrible decision, on so many levels, to have more than one site (or, "logical application unit") served by a single Node process.
If you're asking this question, you may want to explore Node more before you "migrate" to Node. Error handling and separation of concerns are more complicated in Node than in other situations. Specifically, neither the domain nor cluster APIs are mature. But really it's the philosophy of clean and simple application deployment that you'd be violating. I could go on and on.

User-Contributed Code Security

I've seen some websites that can run code from the browser, and the code is evaluated on the server.
What is the security best practice for applications that run user-contributed code, besides preventing it from accessing and changing the server's sensitive information?
(for example, using Python with a stripped-down version of the standard library)
How do you prevent DoS, such as non-halting and/or CPU-intensive programs? (We can't use static code analysis here.) What about DoSing the type-check system?
Python, Prolog and Haskell are suggested examples to talk about.
The "best practice" (am I really the only one who hates that phrase?) is probably just not to do it at all.
If you really must do it, set it up to run in a virtual machine (and I don't mean something like a JVM; I mean something that hosts an OS) so it's easy to restore the VM from a snapshot (or whatever the VM in question happens to call it).
In most cases, you'll need to go a bit beyond just that though. Without some extra work to lock it down, even a VM can use enough resources to reduce responsiveness so it can be difficult to kill and restart it (you usually can eventually, but "eventually" is rarely what you want). You also generally want to set some quotas to limit its total CPU usage, probably limit it to using a single CPU (and run it on a machine with at least two), limit its total memory usage, etc. In Windows, for example, you can do (at least most of that) by starting the VM in a job object, and limiting the resources available to the job object.
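The answer above focuses on VM-level containment and Windows job objects. For the Python case the question mentions, a rough Linux process-level analogue of those quotas can be sketched with the standard resource and subprocess modules; this is only a sketch of the quota idea, not a complete sandbox (no filesystem or network isolation), and the limits and script path are hypothetical.

```python
# Rough Linux-only sketch of the "quota" idea from the answer above, applied at
# the process level rather than to a whole VM: run untrusted code in a child
# process with a CPU-time limit, an address-space limit, and a wall-clock
# timeout. This is NOT a complete sandbox (no filesystem or network isolation);
# treat it as a complement to, not a replacement for, VM-level containment.
import resource
import subprocess

def limit_resources() -> None:
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))                      # 5 s of CPU time
    resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))   # 256 MiB of memory
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))                    # no forking

def run_untrusted(path_to_script: str) -> subprocess.CompletedProcess:
    # path_to_script is whatever the user submitted; the limits above are examples.
    return subprocess.run(
        ["python3", "-I", path_to_script],   # -I: isolated mode, ignores env and user site dirs
        preexec_fn=limit_resources,          # applied in the child before exec
        capture_output=True,
        timeout=10,                          # wall-clock cap, catches sleep-based stalls
    )
```

The CPU-time limit and wall-clock timeout together address the non-halting and CPU-intensive cases, while the memory and process limits keep a single submission from starving the host; everything else (filesystem, network, kernel attack surface) still belongs inside the VM-level containment described above.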

Resources