Translating requests per second into concurrent users - performance-testing

We're building a web service that needs to handle about 200 requests per second. But most popular load testing tools talk about running a load test with a certain number of "concurrent users".
Could anyone tell me how to translate my requirement of "200 requests per second" into a "number of concurrent users"? I'm new to the field of performance testing, and from all that I've read so far, this aspect doesn't get addressed.
Thanks
Vimal

This translation is not possible in the general case. The problem is that a user can make multiple requests. If each user made exactly one request (e.g. your service is completely stateless) and each request took exactly one second, your number of concurrent users would coincide with the number of requests per second.
Otherwise (and those are big assumptions to make), you either track users in your logs and derive the respective numbers from there, or you add your assumptions to the requirements for the load test.
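If you are willing to state those assumptions explicitly, Little's Law gives the usual back-of-the-envelope translation: concurrent users ≈ throughput × (response time + think time). A minimal sketch of the arithmetic, where every timing number is an illustrative assumption you would replace with your own measurements:

```js
// Little's Law: concurrentUsers ≈ throughput * secondsPerUserIteration
// All numbers below are illustrative assumptions, not measurements.
const targetRps = 200;      // the requirement: 200 requests per second
const avgResponseSec = 0.5; // assumed average response time
const thinkTimeSec = 2.0;   // assumed pause between one user's requests

const secondsPerIteration = avgResponseSec + thinkTimeSec; // 2.5 s per loop
const concurrentUsers = targetRps * secondsPerIteration;   // 500 users

console.log(`~${concurrentUsers} concurrent users to sustain ${targetRps} req/s`);
```

If real users have no think time and a 0.5 s response time, the same formula gives only 100 users, which is why the assumptions matter so much.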

Related

How to handle multiple API Requests

I am working with the Google Admin SDK to create Google Groups for my organization. I can't add members to a group while creating it; ideally, when I create a new group I'd like to add roughly 60 members. In addition, the ability to add members in bulk (a batch request) after the group is created was deprecated in August 2020. Right now, after I create a group I need to make a web request to the API to add each member of the group individually (about 60 members).
I am using node.js and express; is there a good way to handle 60 web requests to an API? I don't know how taxing this will be on my server. If anyone has any resources to share on the impact this would have on a node.js server, that would be great.
Fortunately, these groups aren't created often, maybe 15 a month.
One idea I have is to offload the work to something like a cloud function, so my node server makes one request to the cloud function, and the cloud function makes all the additional requests to add members to the group. I'm not 100% sure this is necessary, and I'm curious about other approaches.
Limits and Quotas
Note that adding group members may take 10 minutes to propagate.
The rate limit for the Directory API is 3,000 queries per 100 seconds per IP address, which works out to around 30 per second. 60 requests is not a large number, but if you try to send them all within a few milliseconds the system may extrapolate the rate and deem it over the limit. I wouldn't expect that to happen, though it's best to test it on your end with your own system, connection, etc.
Exponential Backoff
If you do need to make many requests, this is the method Google recommends. It involves repeating a failed request while exponentially increasing the wait time between attempts, up to 16 seconds. You can always implement a longer wait before retrying; it's only 60 requests, after all.
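A minimal sketch of that retry loop in node.js. `addMember` is a hypothetical stand-in for whatever call you make against the Directory API; the 16-second cap mirrors the recommendation above:

```js
// Retry a request with exponential backoff: double the wait after each
// failure (plus random jitter), capping the delay at 16 seconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withBackoff(fn, maxRetries = 6) {
  let waitMs = 1000; // start by waiting 1 second
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxRetries) throw err; // out of retries, give up
      await sleep(waitMs + Math.random() * 1000); // jitter avoids bursts
      waitMs = Math.min(waitMs * 2, 16000);       // cap at 16 s
    }
  }
}

// Usage sketch, where addMember is a hypothetical wrapper for your API call:
// await withBackoff(() => addMember(groupKey, email));
```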
Batch Requests
The previously mentioned methods should work for you with no issue; since there are only 60 requests to make, it won't put any real "stress" on the system. That said, the most performant way to handle many requests to the Directory API is to use a batch request. This allows you to bundle up all your member requests into one large batch of up to 1,000 calls. It will also give you a nice cushion in case you need to increase your requests in the future!
EDIT - I am sorry, I missed that you mentioned that batching is deprecated. Only global batching is deprecated. If you send a batch request to a single, specific API, batching is still supported. What you can no longer do is send one batch request that spans different APIs, like Directory and Sheets in the same batch.
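For illustration, a homogeneous batch against the Directory API looks roughly like the sketch below: one multipart/mixed POST where each part is a single member-insert call. Treat the batch endpoint and resource paths as assumptions to verify against Google's current batch documentation:

```js
// Rough sketch of a homogeneous batch request to the Directory API
// (Node 18+ for global fetch). The batch endpoint and the members
// insert path are assumptions -- check Google's current docs.
async function addMembersBatch(groupKey, emails, accessToken) {
  const boundary = 'batch_add_members';
  const parts = emails.map((email, i) =>
    [
      `--${boundary}`,
      'Content-Type: application/http',
      `Content-ID: <item-${i}>`,
      '',
      `POST /admin/directory/v1/groups/${groupKey}/members HTTP/1.1`,
      'Content-Type: application/json',
      '',
      JSON.stringify({ email, role: 'MEMBER' }),
    ].join('\r\n')
  );
  const body = parts.join('\r\n') + `\r\n--${boundary}--`;

  const res = await fetch('https://www.googleapis.com/batch/admin/directory_v1', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${accessToken}`,
      'Content-Type': `multipart/mixed; boundary=${boundary}`,
    },
    body,
  });
  return res.text(); // the response is itself a multipart document to parse
}
```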
References
Limits and Quotas
Exponential Backoff
Batch Requests

Advice on JMeter test plan approach structure

I'm new to JMeter and am currently trying to get the best use out of it to create an API performance test plan.
Let's take the following scenario.
We have an API which returns data such as part availability and order details for a range of parts.
I want to analyse the response times of the API under different load patterns.
Let's say we have 5 users.
- Each user sends a series of repeated requests to the API.
- The request made by each user is unique to that user,
e.g.
User 1 requests parts a, b, c.
User 2 requests parts d, e, f... and so on.
- All users are sending their requests at the same time.
The way I have approached this is to create 5 separate thread groups, one for each user.
Within each thread group is the specific HTTP request that gets sent by that user.
Each HTTP request is governed by its own Loop Controller, where I have set the number of times each request should be sent.
Since I want all users to be sending their requests at once, I have unchecked
"Run Thread Groups consecutively" in the main test plan. At a glance, the test plan looks something like this:
[screenshot: test plan view]
Since I'm new to using JMeter and performance testing, I have a few questions regarding my approach:
Is the way I have structured the test plan suitable and maintainable in terms of increasing the number of users I may wish to test with?
Or would it have been better to have a single thread group with 5 child Loop Controllers, each containing the user-specific request body data?
With my current setup, each thread group uses the default ramp-up time of 1 second. I figured this is okay since each thread group represents only one user, but I think it might delay the start of each test run. Are there better ways to handle this, such as using the scheduler, or adjusting the ramp-up time of each thread group so that they all start at exactly the same time?
Thanks in advance for any advice
Your approach is correct.
If you want the requests to be in parallel they will have to be in separate Thread Groups. Each Thread Group should model a use-case. In your case, the use-case is a particular mix of requests.
By running the test for a sufficiently long time, you will not feel the effects of the ramp-up time.
First of all, your test needs to be realistic: it should represent real users (or user groups) as closely as possible. If the test does that, it is a good test, and vice versa. Some guidelines:
If User1 and User2 represent two different groups of users (e.g. User1 is authenticated and User2 is not, or User1 is an admin and User2 is a guest), they should go into different Thread Groups.
It is better to use Thread Group iterations instead of Loop Controllers, as some test elements, like the HTTP Cookie Manager, have settings such as "Clear Cookies each Iteration" that don't respect iterations produced by a Loop or While Controller; they only honour Thread Group-driven iterations.
The only way to guarantee sending requests at the same time is putting them under one Thread Group and using Synchronizing Timer
When it comes to a real load test, you should always add the load gradually, so you can correlate various metrics like response time, throughput, and error rate with the increasing number of virtual users. The same approach applies to "ramping down": don't turn off the load all at once, so that you can see how your application recovers after the load. You might want to use some of the custom Thread Groups available via the JMeter Plugins project, like:
Stepping Thread Group
Ultimate Thread Group
They provide a flexible and convenient way to set up the desired load pattern.

How to handle the API call rate limit for Docusign in nodejs

How do I handle the API call rate limit for DocuSign in node.js? I am getting an error like "The maximum number of hourly API invocations has been exceeded".
So, you hit your QPH (queries per hour) limit, right?
Well, there are two things you can do to minimize that:
Throttling
Caching
Throttling
When an application has a query limit per period of time, throttling is a common strategy.
Throttling revolves around distributing your requests over a period of time. For example, if you have a limit of 500 requests per hour, you can throttle your application to make 250 requests in the first 30 minutes and 250 after.
This way you avoid hitting the limit. If you have more requests, you save them for the next hour.
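A minimal sketch of that idea in node.js: a small queue that releases jobs at a fixed rate so an hourly quota is spread across the hour. The 500-per-hour figure is just the example number above, not DocuSign's actual limit, and `getEnvelopeStatus` is a hypothetical API call:

```js
// Tiny throttle: run queued jobs at a fixed rate so a per-hour quota
// is spread over the hour instead of being burned in one burst.
function createThrottle(requestsPerHour) {
  const intervalMs = 3_600_000 / requestsPerHour; // 500/h -> one every 7.2 s
  const queue = [];
  setInterval(() => {
    const job = queue.shift();
    if (job) job(); // start the next queued request, if any
  }, intervalMs);
  return (fn) =>
    new Promise((resolve, reject) => {
      queue.push(() => fn().then(resolve, reject));
    });
}

// Usage sketch: wrap each DocuSign call so it waits its turn.
// const throttled = createThrottle(500);
// const status = await throttled(() => getEnvelopeStatus(envelopeId));
```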
Caching
Throttling is good because it gives you control over time. But sometimes requests are similar, so similar in fact that you can just save the answer and reuse it later.
Caching is often used together with throttling: by saving the answers to old requests in your system and reusing them, you remove the need to hit the API for those requests at all, and you gain the ability to answer more user requests (provided you cached them before).
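A sketch of the simplest form of that: an in-memory cache with a time-to-live. In a real app you might use Redis or similar; the TTL and `getEnvelopeStatus` are illustrative assumptions:

```js
// Tiny TTL cache: answer repeated, identical requests from memory
// instead of spending API quota on them again.
const cache = new Map();

async function cached(key, ttlMs, fetchFn) {
  const hit = cache.get(key);
  if (hit && Date.now() - hit.at < ttlMs) return hit.value; // still fresh
  const value = await fetchFn(); // miss or stale: spend one real request
  cache.set(key, { value, at: Date.now() });
  return value;
}

// Usage sketch: identical status lookups within 5 minutes cost one API call.
// const status = await cached(`env:${id}`, 300_000, () => getEnvelopeStatus(id));
```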
Sum up
There is no silver bullet for your problem, and no single line of code will solve it. Instead, you have two methods that, used together, will minimize the problem and perhaps even eliminate it altogether if used correctly.

In Node.js, how many simultaneous requests can I send with the "request" package

How many simultaneous requests can I make with the request package?
I am expecting data back from every request confirming the request was received and processed successfully. Is this hardware or OS dependent? Where do I start looking?
Recent versions of node.js do not enforce a limit on outgoing requests (older versions capped the number of simultaneous connections per host at 5). If you were literally trying to make millions of outgoing connections at the same time, you would probably hit an OS-specific limit on your own node.js server. But the practical limit is more likely to be determined by the target host.
Since all your requests are being sent to the same host, the more likely limit will be determined by the server you are making the requests to. It will have some sort of limit for how many simultaneous requests it can have "in-flight" at the same time before it starts refusing new connections. What that number is depends entirely upon how the server is configured and built. For http://www.google.com, the number is probably hundreds of thousands or millions of requests because they have a huge server farm and requests are balanced across all of them. For some simple single CPU server, the limit would obviously be much smaller than that.
In addition, there is little use in sending zillions of requests to a single-CPU server anyway, because it won't be able to work on all of them at once.
So, if you want to know what would work best for a given target host, you would have to set up an adjustable test harness so you could test scenarios where you send 1, 2, 5, 10, 50, 100, 200, 500, or 1000 requests at a time and see what the average response time is and where you start to get errors (if any).
If you don't want to do any of that type of testing, then a reasonably safe choice that doesn't attempt to fully optimize things is to put no more than 5 requests in flight at the same time.
You can either build something yourself to manage N requests in flight at a time, or you can use one of the existing libraries that will do that for you. The Bluebird promise library has a concurrency option on some of its functions, such as Promise.map(), which will automatically do that for you for whatever concurrency value you set. The async library also has something similar.
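For illustration, here is a hand-rolled version of that N-in-flight pattern (Bluebird's Promise.map() with the concurrency option does the same job for you). The URL list and the limit of 5 are placeholder assumptions:

```js
// Keep at most `limit` requests in flight at once; whenever one
// settles, the next item in line is started.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const i = next++; // claim the next index (safe: JS is single-threaded)
      results[i] = await worker(items[i], i);
    }
  }
  // Launch `limit` runners that drain the shared work list.
  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Usage sketch: fetch many URLs with no more than 5 in flight (Node 18+).
// const bodies = await mapWithConcurrency(urls, 5, (url) => fetch(url).then((r) => r.text()));
```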
If you want more specific help crafting the code to manage how many requests are in flight at a time or to build a test harness for it, please show us some of your code for the source of all the requests so we have some idea how that works (if it's a giant array of requests or what the source of the URLs is).

Instagram real-time API POST rate

I'm building an application using tag subscriptions in the real-time API and have a question related to capacity planning. We may have a large number of users posting to a subscribed hashtag at once, so the question is how often will the API actually POST to our subscription processing endpoint? E.g., if 100 users post to #testhashtag within a second or two, will I receive 100 POSTs or does the API batch those together as one update? A related question: is there a maximum rate at which POSTs can be sent (e.g., one per second or one per ten seconds, etc.)?
The Instagram API seems to lack detailed information about both how many updates are sent per POST and what the rate limits are. From the API docs:
Limits
Be nice. If you're sending too many requests too quickly, we'll send back a 503 error code (server unavailable).
You are limited to 5000 requests per hour per access_token or client_id overall. Practically, this means you should (when possible) authenticate users so that limits are well outside the reach of a given user.
In other words, you'll need to check for a 503 and throttle your application accordingly. I've seen no information on how long they might block you, but it's best to avoid that completely. I would advise you to manage this by placing a rate-limiting mechanism in your own code, such as pushing your API requests through a queue with rate control. That will also give you the benefit of a retry if you're throttled, so you won't lose any of the updates.
Moreover, a mechanism such as a queue in the case of real-time updates is further relevant because of the following from the API docs:
You should build your system to accept multiple update objects per payload - though often there will be only one included. Also, you should acknowledge the POST within a 2 second timeout--if you need to do more processing of the received information, you can do so in an asynchronous task.
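In practice, that means the webhook handler should do nothing except enqueue the payload and acknowledge it. A minimal express sketch of that split; the in-memory array stands in for whatever real queue you use, and `processUpdate` is a hypothetical worker:

```js
// Acknowledge Instagram's POST immediately; do the real work later.
const express = require('express');
const app = express();
app.use(express.json());

const pending = []; // stand-in for a real queue (Redis, SQS, etc.)

app.post('/instagram/updates', (req, res) => {
  // The payload can contain multiple update objects, per the docs above.
  const updates = Array.isArray(req.body) ? req.body : [req.body];
  pending.push(...updates);
  res.sendStatus(200); // respond well inside the 2-second window
});

// A separate loop drains the queue and does the expensive follow-up
// API calls, subject to your own rate limiting.
setInterval(() => {
  const update = pending.shift();
  if (update) processUpdate(update); // hypothetical worker function
}, 250);

function processUpdate(update) {
  // fetch the referenced media, apply throttling, etc.
}

app.listen(3000);
```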
Regarding the number of updates, the API can send you one update or many. The problem is that you can absolutely murder your API quota while handling them, because I don't think you can batch calls to specific media items, at least not using the official Python or Ruby clients or the API console as far as I have seen.
This means that whether you receive 500 updates as one request to your server or split across many, it won't matter: either way you need to go and fetch those items. From what I observed in a real application, these fetches seemed to count against our quota; however, the quota itself seemed to be consumed erratically. Sometimes we saw no calls consumed at all; other times the available calls dropped by far more than we actually made. My advice is to be conservative and treat the 5,000 as a best guess rather than an absolute. You can check the remaining calls by parsing one of the headers they send back.
Use common sense, don't be stupid, and use a rate-limiting mechanism; that should keep you safe and has the added benefit of dealing with failures due to outages (these happen more often than you may think), network hiccups, and accidental rate limiting. You could try to be tricky and use different API keys in a pooling mechanism, but this is likely a violation of the TOS, and if they do anything by IP, you'd have to split the load across different machines with different IPs.
My final advice would be to restructure your application so it does not rely completely on the subscription mechanism. It's less than reliable and very expensive API-wise. It's only truly useful if you just need to do something in your app that doesn't require calling back to Instagram, your number of items is small, or you can filter out the majority of items and avoid calling back to Instagram except when a specific business rule is matched.
Instead, you can do things like query the tag or the user (e.g. recent media) and scale it out that way. Normally this lets you grab 100 items with 1 request rather than 100 items with 100 requests. If you really want to be clever, you could at least merge the subscription notifications asynchronously, combining similar ones into a single batched request by bucketing on shared characteristics such as the tag. Sort of like a map/reduce, but on a small data set. You could of course run an actual map/reduce over your own data from time to time as another way of keeping things async. Again, be careful not to thrash Instagram; rather, use the batching to make your calls in a way that's useful to your app.
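A sketch of that merge step: buffer notifications for a short window, reduce them to the set of distinct tags, and make one recent-media query per tag instead of one call per notification. The window length, the `object_id` field name, and `fetchRecentMediaForTag` are all illustrative assumptions:

```js
// Collapse a burst of subscription notifications into one API call
// per distinct tag, flushed on a short timer window.
const buffered = [];

function onNotification(update) {
  buffered.push(update); // call this from your webhook handler
}

setInterval(() => {
  if (buffered.length === 0) return;
  const batch = buffered.splice(0); // take everything buffered so far
  const tags = [...new Set(batch.map((u) => u.object_id))]; // dedupe by tag
  for (const tag of tags) fetchRecentMediaForTag(tag);
}, 5_000); // the 5-second window is an arbitrary assumption

function fetchRecentMediaForTag(tag) {
  // hypothetical: issue one recent-media query for this tag here,
  // grabbing up to 100 items in a single request
  console.log(`fetching recent media for #${tag}`);
}
```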
Hope that helps.
