I'm new to JMeter and am currently trying to get the best use out of it to create an API performance test plan.
Let's take the following scenario.
We have an API which returns data such as part availability and order details for a range of parts.
I want to analyse the response times of the API under different load patterns.
Let's say we have 5 users.
-Each user sends a series of repeated Requests to the API.
-The request made by each user is unique only to that user.
i.e
User 1 requests parts a,b,c.
User 2 requests parts d,e,f... and so on
-All users are sending their requests at the same time.
The way I have approached this is to create 5 separate thread groups, one for each user.
Within each thread group is the specific HTTP request that gets sent by that user.
Each HTTP request is governed by its own loop controller, where I have set the number of times the request should be sent.
Since I want all users to be sending their requests at once, I have unchecked
“run thread groups consecutively” in the main test plan. At a glance the test plan looks something like this:
[image: test plan view]
Since I'm new to using JMeter and performance testing, I have a few questions regarding my approach:
Is the way I have structured the test plan suitable and maintainable in terms of increasing the number of users that I may wish to test with?
Or would it have been better to have a single thread group with 5 child loop controllers, each containing the user specific request body data?
With my current setup, each thread group uses the default ramp-up time of 1 second. I figured this is okay since each thread group represents only one user. However, I think this might cause a delay at the start of each test run. Are there any other potentially better ways to handle this, such as using the scheduler or incrementing the ramp-up time for each thread group so that they don't all start at exactly the same time?
Thanks in advance for any advice
Your approach is correct.
If you want the requests to be in parallel they will have to be in separate Thread Groups. Each Thread Group should model a use-case. In your case, the use-case is a particular mix of requests.
By running the test for a sufficiently long time you will not feel the effects of ramp-up time.
First of all, your test needs to be realistic: it should represent real users (or user groups) as closely as possible. If the test does that, it is a good test, and vice versa. Something like:
If User1 and User2 represent 2 different groups of users (like User1 is authenticated and User2 is not authenticated, or User1 is admin and User2 is guest) they should go into different Thread Groups.
It is better to use Thread Group iterations instead of Loop Controllers, as some test elements like HTTP Cookie Manager have settings like Clear Cookies each Iteration which don't respect iterations produced by a Loop or While Controller; they consider only Thread Group-driven iterations.
The only way to guarantee sending requests at the same time is putting them under one Thread Group and using a Synchronizing Timer.
When it comes to a real load test, you should always add the load gradually so you can correlate various metrics like response time, throughput and error rate with the increasing number of virtual users. The same approach should be applied for "ramping down": you should not turn off the load all at once, so that you can see how your application recovers after the load. You might want to use some custom Thread Groups available via the JMeter Plugins project, like:
Stepping Thread Group
Ultimate Thread Group
They provide a flexible and convenient way to set the desired load pattern.
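To make the "ramp up gradually, hold, then ramp down gradually" shape concrete, here is a minimal sketch in Python that only computes such a stepped schedule. It is not JMeter or plugin configuration; the function name `stepped_profile` and its parameters are made up for illustration, and the assumption is that users are added and removed in equal steps.

```python
def stepped_profile(max_users, step_users, step_seconds, hold_seconds):
    """Return (start_second, active_users) pairs for ramp-up, hold and ramp-down."""
    # User counts reached on the way up; the final step is capped at max_users.
    levels_up = list(range(step_users, max_users, step_users)) + [max_users]

    profile = []
    t = 0
    # Ramp up: each intermediate level lasts step_seconds.
    for users in levels_up[:-1]:
        profile.append((t, users))
        t += step_seconds
    # Hold the peak load for hold_seconds.
    profile.append((t, max_users))
    t += hold_seconds
    # Ramp down through the same levels in reverse, then stop.
    for users in reversed(levels_up[:-1]):
        profile.append((t, users))
        t += step_seconds
    profile.append((t, 0))
    return profile
```

For example, `stepped_profile(5, 2, 30, 120)` steps through 2, 4 and 5 users, holds 5 users for two minutes, then steps back down to 0, which is the kind of pattern you would then enter into one of the thread groups above.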
I just read this article from Node.js: Don't Block the Event Loop
The Ask
I'm hoping that someone can read over the use case I describe below and tell me whether or not I'm understanding how the event loop is blocked, and whether or not I'm doing it. Also, any tips on how I can find this information out for myself would be useful.
My use case
I think I have a use case in my application that could potentially cause problems. I have a functionality which enables a group to add members to their roster. Each member that doesn't represent an existing system user (the common case) gets an account created, including a dummy password.
The password is hashed with argon2 (using the default hash type), which means that even before I need to wait on a DB promise to resolve (with a Prisma transaction), I have to wait for each member's password to be generated.
I'm using Prisma for the ORM and SendGrid for the email service and no other external packages.
A take-away that I get from the article is that this is blocking the event loop. Since there could potentially be hundreds of records generated (such as importing contacts from a CSV or cloud contact service), this seems significant.
To sum up what the route in question does, including some details omitted before:
Remove duplicates (requires one DB request & then some synchronous checking)
Check remaining for existing user
For non-existing users:
Synchronously create many records & push each to a separate array. One of these records requires async password generation for each non-existing user
Once the arrays are populated, send a DB transaction with all records
Once the transaction is cleared, create invitation records for each member
Once the invitation records are created, send emails in a MailData[] through SendGrid.
Clearly, there are quite a few tasks that must be done sequentially. If it matters, the asynchronous functions are also nested: createUsers calls createInvites calls sendEmails. In fact, from the controller, there is: updateRoster calls createUsers calls createInvites calls sendEmails.
There are architectural patterns that are aimed at avoiding issues brought by potentially long-running operations. Note here that while your example is specific, any long running process would possibly be harmful here.
The first obvious pattern is the cluster. If your app is handled by multiple concurrent independent event loops of a cluster, blocking one, ten or even a thousand loops could be insignificant if your app is scaled to handle this.
Imagine an example scenario where you have 10 concurrent loops, one is blocked for a longer time but the 9 remaining are still serving short requests. Chances are, users would not even notice the temporary bottleneck caused by the one long-running request.
Another, more general pattern is a separate long-running process service, or Command-Query Responsibility Segregation (I'm bringing CQRS to your attention here as the pattern description could introduce more interesting ideas you may not be familiar with).
In this approach, some long-running operations are not handled directly by backend servers. Instead, backend servers use a Message Queue to send requests to yet another service layer of your app, the layer that is solely dedicated to running specific long-running requests. The Message Queue is configured so that it has specific throughput so that if there are multiple long-running requests in short time, they are queued, so that possibly some of them are delayed but your resources are always under control. The backend that sends requests to the Message Queue doesn't wait synchronously, instead you need another form of return communication.
This auxiliary process service can be maintained and scaled independently. The important part here is that the service is never accessed directly from the frontend, it's always behind a message queue with controlled throughput.
Note that while the second approach is often implemented in real-life systems and it solves most issues, it can still be incapable of handling some edge cases, e.g. when long-running requests come faster than they are handled and the queue grows infinitely.
Such cases require careful maintenance and you either scale your app to handle the traffic or you introduce other rules that prevent users from running long processes too often.
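As a rough illustration of the queue-with-controlled-throughput idea above, here is a minimal in-process sketch in Python. It is a stand-in under stated assumptions, not a real message broker: `JobService`, `submit` and `wait_idle` are hypothetical names, the bounded `queue.Queue` plays the role of the message queue, the worker threads play the role of the dedicated service layer, and the `results` dict plays the role of the return channel.

```python
import queue
import threading
import uuid


class JobService:
    """Toy stand-in for a message queue plus a dedicated worker layer."""

    def __init__(self, max_pending, workers=1):
        # Bounded queue: the backlog cannot grow without limit.
        self._queue = queue.Queue(maxsize=max_pending)
        self.results = {}
        self._lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, func, *args):
        """Enqueue a job and return its id at once, or None if the queue is full."""
        job_id = uuid.uuid4().hex
        try:
            self._queue.put_nowait((job_id, func, args))
        except queue.Full:
            return None  # caller is told to retry later instead of waiting
        return job_id

    def _worker(self):
        while True:
            job_id, func, args = self._queue.get()
            try:
                outcome = func(*args)
            except Exception as exc:  # keep the worker alive on job failure
                outcome = exc
            with self._lock:
                self.results[job_id] = outcome
            self._queue.task_done()

    def wait_idle(self):
        """Block until every queued job has been processed."""
        self._queue.join()
```

The caller gets a job id back immediately and looks the outcome up later, which mirrors the "doesn't wait synchronously" part. Returning `None` when the queue is full is one possible rule for refusing work rather than letting the queue grow forever; a real system would more likely use a broker and a separate process.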
DISCLAIMER: If this post is off-topic to this site, please recommend a site where this post would be appropriate.
On Ubuntu 18.04, in bash, I am writing a network-based, threaded application that requires multiple servers. It receives files through the network and processes them, ultimately making an API call that finishes the processing and logs the results to a database for later retrieval and reporting.
So far I have written the application using non-threaded programming models and concepts. That means the files are processed one at a time in real-time. This works great if there is no sudden burst of files and/or a backlog of files to process. The main bottleneck has been the way I sequentially send files to the API one after another, waiting until the entire operation has taken place for one file and the API returns the results. The API has a rate limit of 8 calls per second. But since each call takes from 0.75 to 1 second, my program waits until the operation is done and only processes about 1 file per second through the API. In short, I did not have to worry about scheduling API calls because I could barely do one call per second.
Since the capacity is there to process 8 files per second, and I need more speed, I have been converting my single-threaded, sequential application into a parallel, scalable, multi-threaded application. This new version can spawn enough threads to send 8 files per second to the REST API and much more. So now I have the opposite problem. I am sending too many requests per second to the REST API and am in danger of triggering penalties, etc. Ultimately, when my traffic is higher, I will upgrade my subscription to the API and get more calls per second, but this current dilemma has got me thinking about how to schedule the API calls with different threads.
The purpose of this post is to discuss an idea about how to schedule these REST API calls across various threads. Specifically, I want to discuss how to coordinate timing and usage of the API while maintaining efficiency and yet not overloading the API. In short, I want to coordinate a group of threads so that the API is properly used. Not too fast and not too slow.
Independent of my application, this idea could be useful in a number of generically similar scenarios.
My idea is to create an "air traffic controller" ("ATC") so that the threads of the application have a centralized timing authority to check when they are ready to submit files to the REST API. The ATC would know how many time slots/calls per time period (in this case, calls per second) the API can schedule. The ATC would be listening for the threads to request a time slot ("launch code") which would give them a time slot in the future to perform their API call. The ATC would decide based on the schedule of other launch codes that it has already handed out.
In my case, from the start of the upload of the file to the API, it could take 0.75 to 1 second to complete the processing and receive a response from the API. This does not affect the count of new API calls that can be performed. It is just a consideration of how long the threads will be waiting once they call the API. It may not be relevant to this overall discussion.
Each thread would obviously have to do some error handling. If the API timed out or threw an error, then the thread would have to handle it and get back in line with the ATC -if appropriate- and ask for a new launch code. Maybe it should report the error to the ATC for centralized logging?
In situations where the file processing needs burst above 8 files per second, there would be a scheduling backlog where the threads should wait their turn as assigned by the ATC.
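To make the launch-code idea a bit more concrete, here is a minimal in-process sketch in Python of just the scheduling core, with no TCP listener, authentication or expiry. The class name `SlotScheduler` and method `reserve` are hypothetical, and the sketch assumes the simplest policy: slots are spaced evenly (one every 1/8 second for 8 calls per second), which is stricter than the limit requires because it never allows a burst.

```python
import threading


class SlotScheduler:
    """Hands out evenly spaced future start times ("launch codes")."""

    def __init__(self, calls_per_second):
        self._interval = 1.0 / calls_per_second
        self._next_free = 0.0            # earliest time the next slot may start
        self._lock = threading.Lock()    # lets many threads share one schedule

    def reserve(self, now):
        """Return the earliest start time >= now that keeps to the rate limit."""
        with self._lock:
            launch_at = max(now, self._next_free)
            self._next_free = launch_at + self._interval
            return launch_at
```

A worker thread would call something like `launch_at = scheduler.reserve(time.monotonic())`, sleep until `launch_at`, and then make its API call. During a burst the returned times simply move further into the future, which is the scheduling backlog described above. Passing the clock in as `now` keeps the logic easy to test; across several servers the clocks would have to agree, which is where the timing question comes in.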
Here are some other considerations:
Function
The ATC would be a lightweight daemon that does the following:
- listens on some TCP port
- receives a request
security token (?), thread id, priority
- authenticates the request (?)
- examines schedule
- reserves the next available time slot
- returns the launch code
security token (?), current time, launch timing offset to current time, URL and auth token for the API
- expunges expired launch codes
The ATC would need the following:
- to know what port it is supposed to run on
- to know how many slots per time period it was set to schedule
(e.g. 8 per second)
- to have a super fast read/write access to the schedule (associative array?)
- to know the URL and corresponding auth token for the thread to use
- maybe to know multiple URLs and auth tokens for load balancing
Here are more things to consider:
Security
How could we keep the ATC secure while ensuring high performance?
Network-level security (e.g. firewalls allowing only the IP addresses of the file-processing servers?)
Auth tokens or logins and passwords?
Performance
What would the requirements be for this ATC server? Would this be taxing to a CPU and memory?
Timing
How often would an NTP call be needed? By the ATC server? By the servers which call the API?
Scalability
Being able to provide different URLs and auth tokens would allow the ATC to load balance with different API providers.
Threading of the ATC itself
Would the ATC need to spawn threads to be able to handle each new request?
How does a web server handle requests?
How would the various threads share a common schedule?
In a non-threaded environment, the ATC would possibly keep an associative array in memory to keep performance as high as possible. How would the various threads of the ATC have access to the same schedule?
So here is my question. Does this exist? If not, what are some best practices in trying to build the above?
It seems like a beanstalkd kind of network service except it only provides permission/scheduling and is extremely dependent on timing.
I have an ASP.NET Core Web API application.
In my application I have a Web API method, and I want to prevent multiple requests from the same user from entering it simultaneously. I don't mind requests from different users being processed simultaneously.
I am not sure how to create the lock and where to put it. I thought about creating some kind of dictionary which will contain the user id and performing the lock on the item, but I don't think I'm getting it right. Also, what will happen if there is more than one server and there is a load balancer?
Example:
Let's assume each registered user can do 10 long tasks each month. I need to check for each user whether he has exceeded his monthly limit. If the user sends many simultaneous requests to the server, he might be allowed to perform more than 10 operations. I understand that I need to put a lock on the method, but I do want to allow other users to perform this action simultaneously.
What you're asking for is fundamentally not how the Internet works. The HTTP and underlying IP protocols are stateless, meaning each request is supposed to run independent of any knowledge of what has occurred previously (or concurrently, as the case may be). If you're worried about excessive load, your best bet is to implement rate limiting/throttling tied to authentication. That way, once a user burns through their allotted requests, they're cut off. This will then have a natural side-effect of making the developers programming against your API more cautious about sending excessive requests.
Just to be a bit more thorough, here, the chief problem with the approach you're suggesting is that I know of no way it can be practically implemented. You can use something like SemaphoreSlim to create a lock, but that needs to be static so that the same instance is used for each request. Being static is going to limit your ability to use a dictionary of them, which is what you'll need for this. It can technically be done, I suppose, but you'd have to use a ConcurrentDictionary and even then, there's no guarantee of single-thread additions. So, concurrent requests for the same user could load concurrent semaphores into it, which defeats the entire point. I suppose you could front-load the dictionary with a semaphore for each user from the start, but that could become a huge waste of resources, depending on your user-base. Long and short, it's one of those things where when you're finding a solution this darn difficult, it's a good sign you're likely trying to do something you shouldn't be doing.
EDIT
After reading your example, I think this really just boils down to an issue of trying to handle the work within the request pipeline. When there's some long-running task to be completed or just some heavy work to be done, the first step should always be to pass it off to a background service. This allows you to return a response quickly. Web servers have a limited number of threads to handle requests with, and you want to service the request and return a response as quickly as possible to keep from exhausting your threadpool.
You can use a library like Hangfire to handle your background work or you can implement an IHostedService as described here to queue work on. Once you have your background service ready, you would then just immediately hand off to that any time you get a request to this endpoint, and return a 202 Accepted response with a URL the client can hit to check the status. That solves your immediate issue of not wanting to allow a ton of requests to this long-running job to bring your API down. It's now essentially doing nothing more than just telling something else to do it and then returning immediately.
For the actual background work you'd be queuing, there, you can check the user's allowance and if they have exceeded 10 requests (your rate limit), you fail the job immediately, without doing anything. If not, then you can actually start the work.
If you like, you can also enable webhook support to notify the client when the job completes. You simply allow the client to set a callback URL that you should notify on completion, and then when you've finished the work in the background task, you hit that callback. It's on the client to handle things on their end to decide what happens when the callback is hit. They might for instance decide to use SignalR to send out a message to their own users/clients.
EDIT #2
I actually got a little intrigued by this. While I still think it's better for you to offload the work to a background process, I was able to create a solution using SemaphoreSlim. Essentially you just gate every request through the semaphore, where you'll check the current user's remaining requests. This does mean that other users must wait for this check to complete, but then you can release the semaphore and actually do the work. That way, at least, you're not blocking other users during the actual long-running job.
First, add a field to whatever class you're doing this in:
private static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1, 1);
Then, in the method that's actually being called:
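bool allowed;
await _semaphore.WaitAsync();
try
{
    // get remaining requests for user
    allowed = remaining > 0;
    if (allowed)
    {
        // decrement remaining requests for user (this must be done before the semaphore is released)
    }
}
finally
{
    // always release, even if the check above throws, so other requests are not blocked forever
    _semaphore.Release();
}
if (allowed)
{
    // now do the work
}
else
{
    // handle user out of requests (return error, etc.)
}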
This is essentially a bottleneck. To do the appropriate check and decrementing, only one thread can go through the semaphore at a time. That means if your API gets slammed, requests will queue up and may take a while to complete. However, since this is probably just going to be something like a SELECT query followed by an UPDATE query, it shouldn't take that long for the semaphore to release. You should definitely do some load testing and watch it, though, if you're going to go this route.
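For readers who want to see the gate in isolation, here is the same check-then-decrement idea as a small self-contained Python sketch (Python rather than C# purely so it runs on its own). Everything in it is an assumption for illustration: `MonthlyQuota` and `try_consume` are hypothetical names, and an in-memory counter stands in for the SELECT/UPDATE against a real store, so it only covers a single server process.

```python
import threading
from collections import defaultdict


class MonthlyQuota:
    """One shared gate guarding a per-user check-and-decrement."""

    def __init__(self, limit):
        self._limit = limit
        self._used = defaultdict(int)   # stand-in for the database counter
        self._gate = threading.Lock()   # plays the role of the semaphore

    def try_consume(self, user_id):
        """Atomically use one unit of the user's allowance, if any is left."""
        with self._gate:
            if self._used[user_id] >= self._limit:
                return False
            self._used[user_id] += 1
            return True
        # The long-running work happens after this returns, outside the gate.
```

With a limit of 10, twenty-five concurrent calls for the same user get exactly ten `True` results, while other users are only delayed for the duration of the short check. Behind a load balancer the counter would have to live somewhere shared, for example as an atomic update in the database.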
We have 2 thread groups, where the second depends on the first one's response.
SIGNUP will generate a PHONE NUMBER and PASSWORD in its response, which will be used by the LOGIN thread group.
I don't want to use CSV and would like to capture the response from SIGNUP and use the same credentials (PHONE NUMBER and PASSWORD) to execute LOGIN.
Also, which timer would be better to use?
Any idea how to proceed?
If you have 2 Thread Groups and would like to start the 2nd one only when some information from the 1st one is available, the best way to proceed is using the Inter-Thread Communication Plugin.
It provides a simple FIFO queue which is accessible by different threads (even if they reside in different thread groups), so you can simply put the PHONE NUMBER and PASSWORD into the queue and configure the 2nd Thread Group to operate only when the credentials are available.
There is a SynchronizationPluginsExample.jmx test plan which demonstrates sharing cookies between Thread Groups; you can use it as a basis for your implementation.
The Inter-Thread Communication plugin can be installed using the JMeter Plugins Manager.
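The FIFO hand-off described above can be pictured with a short Python sketch. This is only an analogy for how the two thread groups cooperate, not JMeter or plugin configuration: the function names and the phone numbers and passwords are made up, and a standard `queue.Queue` stands in for the plugin's queue.

```python
import queue
import threading


def signup(n_users, credentials):
    """Producer: stands in for the SIGNUP thread group."""
    for i in range(n_users):
        # Made-up values standing in for data extracted from the SIGNUP response.
        credentials.put((f"555-010{i}", f"secret-{i}"))


def login(n_users, credentials, logged_in):
    """Consumer: stands in for the LOGIN thread group."""
    for _ in range(n_users):
        # Blocks until SIGNUP has produced a pair (gives up after 5 seconds).
        phone, password = credentials.get(timeout=5)
        logged_in.append(phone)


def run(n_users=3):
    credentials = queue.Queue()   # FIFO queue shared by both sides
    logged_in = []
    consumer = threading.Thread(target=login, args=(n_users, credentials, logged_in))
    producer = threading.Thread(target=signup, args=(n_users, credentials))
    consumer.start()              # LOGIN may start first; it simply waits
    producer.start()
    producer.join()
    consumer.join()
    return logged_in
```

The point of the sketch is that LOGIN never needs a CSV file or a fixed delay: it takes the next credential pair as soon as one exists, in the order SIGNUP produced them.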
I want to tailor the application I am making, which communicates with the QuickBooks server and adds things like customers and check expenses, to be as efficient as possible regarding performance. For example, my intention is to have all customer additions (batch process) on one thread and all check expenses or bills (batch process) on another thread, which is logically possible as the two procedures don't interfere with and are not related to one another.
My question is would such a design approach be permissible by Intuit? I guess my concern is regarding any limitations on communication with their servers.
In the docs site, the following throttling policy is mentioned.
What are the throttling limits based on QB accounts, OAuth client, and RealmId at any given time?
EDIT Following line is not valid anymore. FAQ page is updated.
Apart from an upper limit set that ensures no more than 10 requests in progress at any given time;
EDIT
we have a throttling policy across all IDS apis to permit 500 requests/minute per AuthId and per RealmId. The policy permits 200 requests/minute per AuthId for reports endpoints.
Ref - https://developer.intuit.com/docs/0025_quickbooksapi/0058_faq
So, if you follow the above throttling limit then parallel processing using multiple threads is not an issue.
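One way to stay under a per-minute limit when several threads share the same credentials is a small client-side limiter that all threads consult before each request. The Python sketch below is a generic sliding-window counter, not anything provided by Intuit; `SlidingWindowLimiter` and `try_acquire` are hypothetical names, and the limit and window are plain parameters, so the 500 requests/minute figure quoted above would be passed in as `SlidingWindowLimiter(500, 60.0)`.

```python
import threading
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_calls within any window_seconds-long interval."""

    def __init__(self, max_calls, window_seconds):
        self._max_calls = max_calls
        self._window = window_seconds
        self._stamps = deque()          # times of the calls still in the window
        self._lock = threading.Lock()   # shared safely by all worker threads

    def try_acquire(self, now):
        """Record a call at time `now` if the limit allows it; else return False."""
        with self._lock:
            # Forget calls that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self._window:
                self._stamps.popleft()
            if len(self._stamps) < self._max_calls:
                self._stamps.append(now)
                return True
            return False
```

Each thread would call `limiter.try_acquire(time.monotonic())` before sending a request and wait briefly when it returns `False`. Keeping the client-side limit a little below the published one leaves some margin for retries.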
PN - You can't create multiple name entities (e.g. Vendor, Employee and Customer) using parallel threads. The service puts a lock across these 3 entities to ensure a unique name is used when creating a new entity.
Thanks