Kestrel limiting request concurrency - multithreading

Because I was struggling with thread-starvation-like symptoms, I decided to rewrite the whole call chain (I/O and database paths) to async/await.
After that huge rewriting effort, my unpleasant surprise was that it had no effect under medium/high load.
So I reduced the test to the following sequence:
WebStressTool -> IIS (reverse proxy) -> Kestrel (out-of-process) -> await MiddlewarePipeline.Next() -> async Controller.Action -> await db.StoredProcedure (SQL: wait for 1000 ms)
With a fully async/await call chain, the processing time for every request should be near 1000 ms.
I expected that as requests/second increased, the response time would remain the same (~1000 ms).
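For reference, the action under test looks roughly like this (a sketch of my setup; the controller name, connection string placeholder, and the raw ADO.NET calls are reconstructions, not the exact code):

using System.Data.SqlClient;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[Route("api/[controller]")]
public class WaitController : Controller
{
    // Placeholder: points at the test database.
    private const string ConnectionString = "<test db connection string>";

    [HttpGet]
    public async Task<IActionResult> Get()
    {
        // Fully async all the way down: no thread is blocked while
        // SQL Server sits in WAITFOR DELAY for one second.
        using (var conn = new SqlConnection(ConnectionString))
        using (var cmd = new SqlCommand("WAITFOR DELAY '00:00:01'", conn))
        {
            await conn.OpenAsync();
            await cmd.ExecuteNonQueryAsync();
        }
        return Ok();
    }
}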
Here is the result:
1-9 rqs/sec -> response time: 1010 ms (WebStressTool: 9 rqs/sec). It's perfect, as expected.
15 rqs/sec -> response time: 2200 ms (WebStressTool: 9 rqs/sec). Bottleneck.
25 rqs/sec -> response time: 4500 ms (WebStressTool: 9 rqs/sec). Bottleneck.
... and so on.
As requests/sec increase, the response time climbs steeply while throughput stays pinned at 9 rqs/sec.
The normal behavior in a fully async/await scenario should be: as requests/sec increase, the response time stays flat at ~1000 ms (scalability).
So the symptoms are not related to thread starvation but to some bottleneck somewhere... who knows.
After more tests I discovered that the issue appears only when running under IIS (reverse proxy). The application's threads never exceed 40, and response times balloon under load. Repeating the test without IIS (just the Kestrel server), the result was a success: the application scales.
It seems that running Kestrel behind IIS is the issue.
Any suggestions, ideas, hints, etc. would be highly appreciated.
Best regards

Well, I have a terrible suspicion: the stress scenario run on my development machine is throttled by the hidden concurrency limitation of the client edition of IIS; only on Windows Server is there no such limit. This means I spent hundreds of hours improving response times, rewriting hot paths as async, adding caching, etc., misguided by this hidden limitation.
I'm now moving the stress test to Windows Server 2012. I'll be back soon.

Related

OpenSearch (ElasticSearch) latency issue under multi-threading using RestHighLevelClient

We use RestHighLevelClient to query AWS OpenSearch in our service. Recently we have seen some latency issues related to the OpenSearch calls, so I'm doing a stress test to troubleshoot, but I have observed some unexpected behaviors.
When our service receives a request, we start 5 threads and make one OpenSearch call in each thread, in parallel, so that the overall latency is similar to that of a single call. During load tests, even when I send traffic at 1 TPS, I see very different latency numbers across the threads of the same request: usually one or two threads show huge latency compared to the others (for example 390 ms, 300 ms, 1.1 s, 520 ms, 30 ms), as if those threads were blocked by something. Meanwhile, I don't see any search latency spike reported on the OpenSearch service, with the max SearchLatency staying under 350 ms the whole time.
I read that the low-level REST client used inside RestHighLevelClient manages a connection pool with very small default maxConn values, so I overrode DEFAULT_MAX_CONN_PER_ROUTE to 100 and DEFAULT_MAX_CONN_TOTAL to 200 when creating the client, but it doesn't seem to help, judging by the test results before and after updating these two values.
I'm wondering if anyone has seen similar issues or has any ideas on what could be the reason for this behavior. Thanks!

very high max response time and errors when submitting looping form submissions

So my requirement is to run 90 concurrent users executing multiple scenarios (15 scenarios) simultaneously for 30 minutes on a virtual machine. For some of the threads I use a Concurrency Thread Group, and for the others a normal Thread Group.
Now my issues are:
1) After I execute all 15 scenarios, the max response time displayed for each scenario is very high (>40 s). Are there any suggestions to reduce this high max response time?
2) One of the scenarios is a web form submission. There is no issue when submitting only one, but during the 90-concurrent-user execution some of the form submissions get a 500 error code. Is the error because I use looping to achieve the 30-minute duration?
In order to reduce the response time you need to find the reason for it. The causes could be:
Lack of resources like CPU, RAM, etc. Make sure to monitor resource consumption, e.g. using the JMeter PerfMon Plugin.
Incorrect configuration of the middleware (application server, database, etc.). All these components need to be properly tuned for high load. For example, if you set the maximum number of connections on the application server to 10 and you have 90 threads, then 80 threads will queue up waiting for the next available executor; the same applies to the database connection pool (see the sketch after this list).
Inefficient application code. Use a profiler tool to inspect what's going on under the hood and why the slowest functions are so slow; it may be that your application's algorithms are not efficient enough.
If your test succeeds with a single thread and fails under load, that definitely indicates a bottleneck. Try increasing the load gradually and see how many users the application can support without performance degradation and/or errors. HTTP 5xx status codes indicate server-side errors, so it is also worth inspecting your application logs for more insight.
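To make the queuing effect concrete, here is a minimal sketch (mine, not from the original answer; a C# SemaphoreSlim stands in for a 10-connection pool) showing why 90 concurrent users on a pool of 10 see latencies far above the ~1 second a single operation takes:

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class PoolQueueDemo
{
    static async Task Main()
    {
        var pool = new SemaphoreSlim(10);   // "max connections = 10"
        var sw = Stopwatch.StartNew();

        // 90 concurrent "users", each needing one 1-second "database call".
        var users = Enumerable.Range(0, 90).Select(async _ =>
        {
            await pool.WaitAsync();              // queue for a free connection
            try { await Task.Delay(1000); }      // hold it for ~1 second
            finally { pool.Release(); }
            return sw.ElapsedMilliseconds;
        }).ToArray();

        var latencies = await Task.WhenAll(users);
        // Only 10 run at a time, so the last wave finishes near 9 seconds.
        Console.WriteLine($"avg: {latencies.Average():F0} ms, max: {latencies.Max()} ms");
    }
}

Raise the pool size to 90 and the same run completes in roughly 1 second for every user, which is exactly the tuning point the answer makes.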

Thread management in asp.net core / kestrel

I'm troubleshooting performance / scalability issues with an asp.net app we've migrated to asp.net core 2.0. Our app is hosted on azure as an app service, and is falling over far too easily with any moderate traffic.
One thing that's puzzling me is how multiple concurrent requests are handled. From what I've read here, Kestrel uses multiple event loops to handle your requests, but the actual user code runs on the .NET thread pool (that's from here).
So, as an experiment - I've created a new asp.net core 2.0 MVC app, and added a rather nasty action method:
[AllowAnonymous]
public ActionResult Wait1()
{
    // Deliberately blocks a thread-pool thread for one second.
    System.Threading.Tasks.Task.Delay(1000).Wait();
    return new StatusCodeResult((int)HttpStatusCode.OK);
}
Now, when I push this to Azure, I'd expect that if I send, say, 100 requests at the same time, I should be OK, because 100 requests sounds like minor load, right? And the waiting will happen on thread-pool threads, right?
So I do just this and get some rather poor results - sample highlighted in red:
Hmm, not what I expected: about 50 seconds per request... If however I change the frequency so the requests are spaced a second apart, then the response time is fine, back to just over 1000 ms as you'd expect. It seems that once I go over 30 simultaneous requests it starts to suffer, which seems somewhat low to me.
So, I realise that my nasty action method blocks, but I'd have expected it to block on a thread-pool thread and therefore be able to cope with more than 30.
Is this expected behaviour - and if so is it just a case of making sure that no IO-bound work is done without using async code?
Based on my experience, this is expected behaviour. We can get the answer from this blog.
Now suppose you are running your ASP.Net application on IIS and your web server has a total of four CPUs. Assume that at any given point in time, there are 100 requests to be processed. By default the runtime would create four threads, which would be available to service the first four requests. Because no additional threads will be added until 500 milliseconds have elapsed, the other 96 requests will have to wait in the queue. After 500 milliseconds have passed, a new thread is created.
As you can see, it will take roughly 96 × 500 ms ≈ 48 seconds of thread-injection intervals to catch up with the workload, which matches the ~50 seconds per request observed above.
This is a good reason for using asynchronous programming. With async programming, threads aren’t blocked while requests are being handled, so the four threads would be freed up almost immediately.
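A minimal console repro of that thread-injection arithmetic (my own sketch, not from the blog; the 100-item burst and 1-second work item mirror the scenario above):

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ThreadInjectionDemo
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int worker, out int iocp);
        Console.WriteLine($"min worker threads: {worker}");

        // 100 blocking work items: beyond the pool's minimum, new threads
        // are injected only every ~500 ms, so the burst takes far longer
        // than the 1 second each item actually needs.
        var sw = Stopwatch.StartNew();
        Task.WaitAll(Enumerable.Range(0, 100)
            .Select(_ => Task.Run(() => Thread.Sleep(1000)))
            .ToArray());
        Console.WriteLine($"blocking: {sw.Elapsed.TotalSeconds:F1} s");

        // The async equivalent holds no threads while waiting, so all
        // 100 delays overlap and complete in roughly 1 second.
        sw.Restart();
        Task.WaitAll(Enumerable.Range(0, 100)
            .Select(_ => Task.Delay(1000))
            .ToArray());
        Console.WriteLine($"async: {sw.Elapsed.TotalSeconds:F1} s");
    }
}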
I recommend using async code to improve performance:
public async Task<ActionResult> Wait1()
{
    // Awaiting (instead of blocking) returns the thread to the pool
    // for the duration of the one-second delay.
    await Task.Delay(TimeSpan.FromSeconds(1));
    return new StatusCodeResult((int)HttpStatusCode.OK);
}
I also found another SO thread you could refer to.

ThreadPool.SetMinThreads as Warm-Up strategy

My web app faces huge CPU spikes, not because of increasing traffic but because of heavy jobs, such as reports going out. Some of these take the CPU from a healthy 30% load to 100% for the next 2-10 minutes... I'll describe it here as if I had only one server, but I've seen up to 4 servers going crazy because an alignment of the stars made around 50 of my clients want a report at the same time... I'm hosted on Azure and use auto-scale to handle these spikes: if the load goes north of 70% for more than 2 minutes, a new instance comes up.
The thing is, because server 1 is 100% backed up when the new instance comes up, the load balancer will (I hope) direct every new request to server 2 until server 1 can handle more again. Because of this (expected) behavior, I was wondering if I should raise the minimum number of thread-pool threads so the new instance can pick up the incoming requests faster.
My usual request rate is around 15/s, so I thought I should start the pool with at least 50...
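A minimal sketch of what I have in mind (the 50/50 values are just my starting guess):

using System;
using System.Threading;

public static class ThreadPoolWarmUp
{
    // Call once at startup, e.g. from Application_Start or Program.Main.
    public static void Apply(int minWorker = 50, int minIocp = 50)
    {
        ThreadPool.GetMinThreads(out int worker, out int iocp);
        // Only raise the minimums, never lower them; threads above the
        // minimum are still created on demand, just at a throttled rate.
        ThreadPool.SetMinThreads(Math.Max(worker, minWorker),
                                 Math.Max(iocp, minIocp));
    }
}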
What do you guys think?
Edit 1 2017-07-13
So far this is working fine... I'll try a higher setting and see what happens.
This strategy proved very helpful and mitigated a lot of issues. Not all my problems are gone, but the errors/timeouts decreased immensely.

Load testing bottleneck on nodejs with Google Compute Engine

I cannot figure out the cause of the bottleneck on this site: very bad response times once about 400 users are reached. The site is on Google Compute Engine, using an instance group with network load balancing. We created the project with Sails.js.
I have been load testing from Google Container Engine using Kubernetes, running the locust.py script.
The main results for one of the tests are:
RPS: 30
Spawn rate: 5 users/s
Total users: 1000
Avg response time: 27,500 ms (27.5 seconds!)
The response time initially is great, below one second, but when it starts reaching about 400 users the response time starts to jump massively.
I have tested obvious factors that can influence that response time, results below:
Compute Engine instances
(2 × standard-n2, 200 GB disk, 7.5 GB RAM per instance):
Only about 20% CPU utilization
Outgoing network: 340 KB/s
Incoming network: 190 KB/s
Disk operations: 1 op/s
Memory: below 10%
MySQL:
Max_used_connections : 41 (below total possible)
Connection errors: 0
All other results for MySQL also seem fine, no reason to cause bottleneck.
I tried the same test on a freshly created Sails.js project; it did better, but still had terrible results: 5 seconds response time at about 2000 users.
What else should I test? What could be the bottleneck?
Are you doing any file reading/writing? This is a major obstacle in Node.js and will always cause some issues. Caching read files, or removing the need for such code, should be done as much as possible. In my own experience, serving files like images, CSS, and JS through my Node server started causing trouble as the number of concurrent requests increased. The solution was to serve all of this through a CDN.
Another problem could be the MySQL driver. We had some issues with connections not being closed correctly (not using Sails.js, but I think it used the same driver at the time I encountered this), which caused problems on the MySQL server, resulting in long delays when fetching data from the database. You should time/track the number of MySQL queries and make sure they aren't delayed.
Lastly, it could be some specific issue with Sails.js on Google Compute Engine. You should make sure there aren't any open issues on either of these about the same problem you are experiencing.
