Does the startup time of a Cloud Run Docker image delay autoscaling? - node.js

I am currently using Express.js as my main backend. However, I have found that Fastify generally performs better.
The downside of Fastify is its relatively slow startup time.
I am curious whether that would slow down Cloud Run autoscaling.
I've seen that Cloud Run autoscales when usage goes over 60%.
In that case, I think a slow startup could delay responses while autoscaling, which might be a reason not to use Fastify. How exactly does this work?

A slow cold start doesn't influence the autoscaler. The service scales according to CPU usage and the number of requests.
If you track the number of created instances during benchmarks, you can see that when you suddenly have 50 concurrent requests, the Cloud Run autoscaler creates 5 to 10 instances. Why? Because if traffic suddenly increases with a lot of concurrent requests, the slope may continue and you could soon have 100 or 200 concurrent requests, so the Cloud Run service is prepared to absorb that traffic.
If you scale slowly to 50 concurrent requests, you may have only 1 or 2 instances set up.
Anyway, that's just something I noted in my previous tests. In addition, keep in mind that with sustained traffic, a cold start is a marginal case. You will lose a few milliseconds, rarely, and that doesn't justify a framework change.
My recommendation is to keep whatever is best and most efficient for you (you cost at least 100 times more than Cloud Run does!)
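
To put rough numbers on the behaviour described above, here is a minimal sketch in Python of the floor that the per-instance concurrency setting puts under the instance count. The function name `min_instances` is hypothetical, and 80 is assumed here to be the default concurrency. The real autoscaler also reacts to CPU usage and to how fast traffic grows, so the counts you observe (like the 5 to 10 instances mentioned above) can sit well above this floor.

```python
import math


def min_instances(concurrent_requests: int, max_concurrency: int) -> int:
    """Lower bound on instances needed to hold the in-flight requests.

    This is only the floor implied by the concurrency setting; it is not
    the autoscaler's actual algorithm, which may provision more.
    """
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests cannot be negative")
    return math.ceil(concurrent_requests / max_concurrency)


if __name__ == "__main__":
    # Example concurrency settings (assumptions, not recommendations).
    for concurrency in (40, 80):
        print(concurrency, min_instances(50, concurrency))
```

With 50 concurrent requests this gives 2 instances at a concurrency of 40 and 1 instance at 80, which matches the "1 or 2 instances" seen when traffic ramps up slowly.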

Related

IIS - Worker threads not increasing beyond a certain number even though the CPU usage is less than 40 percent

We are running a web API hosted in IIS 10 on an 8-core machine with 16 GB of memory running Windows 10, and throwing a load of, say, 100 to 200 requests per second at the server through JMeter.
Individual transactions take less than 500 milliseconds. When we first apply the load, IIS threads grow to around the 150-160 mark (monitored through Resource Monitor and Performance Monitor) and throughput increases to 22-24 transactions per second. Beyond this point, neither throughput nor the number of threads grows, even though CPU usage is below 40 percent and plenty of physical memory is still available at peak. Resource Monitor does not show any choking at the network or IO level.
The web API makes calls to an Oracle database (3-4 select calls and 2-3 inserts/updates).
We fail to understand what is stopping IIS from growing its thread pool further to process more requests in parallel while all resources, including processing power, memory and network, are available.
We have also set up many performance counters, and there is no queue build-up (probably because JMeter works in synchronous mode).
We have also tried to set the min and max thread settings through machine.config as well as the ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads APIs, but observed no difference; it seems those settings are not taking effect.
It is important to mention that we are using synchronous calls/operations (no async and await). Someone advised us to convert all our blocking IO calls, e.g. database calls, to asynchronous mode to achieve more throughput, but my understanding is that if the thread count can't grow beyond this level, making the calls async might not help and might even hurt throughput. Since our code base is huge, that would be a very costly activity in terms of time and effort, and we don't want to invest in it until we are sure it would really help. If anyone has anything to share on these two problems, please do.
Below is a screenshot of Performance Monitor.

Google Cloud Run not scaling as expected

I'm using Google Cloud Run to run a pretty basic Express / Node.js backend container. I receive a fairly low number of requests per day, and only the occasional concurrent request.
However, I can see on my Cloud Run dashboard that Cloud Run sometimes scales up to 4 instances, and most of the time to at least 2 instances. I know that my app load is so low that I'll pretty much never need more than 1 instance, so why is Cloud Run being so wasteful?
My settings are: maximum 40 concurrent requests, minimum 0 containers and maximum 4 containers.
Container instance counts fluctuate substantially. The green line is idle containers and the blue line is active containers.
My CPU usage is also very low:
You know your workload profile and the requests to expect. The Cloud Run autoscaler does not. Therefore, it overprovisions additional instances in case of a traffic spike.
Of course, YOU know that will never happen, but IT doesn't.
Cloud Run is pretty well designed for average traffic. If you are at one extreme of this standard usage (very low traffic, or very high and very spiky traffic), then yes, the Cloud Run autoscaler's provisioning model doesn't work so well.
However, what's the problem? You pay only when a request is being processed on an instance. If instances are overprovisioned and unused, you won't pay for them. It's a waste of money for Google, not for you.
Your only concern could be for the planet and saving resources, and there you are absolutely right.

ASP.NET Core 2.2 experiencing high CPU usage

So I have hosted an ASP.NET Core 2.2 web service on Azure (S2 plan). The problem is that my application sometimes hits high CPU usage (almost 99%). What I have done so far: checked Process Explorer on Azure. I see a lot of processes there that are consuming CPU. Maybe someone knows whether it's okay for these processes to consume CPU?
Currently, I have no idea where they come from. Maybe it's normal to have them there.
Briefly, about my application:
Currently, there is not much traffic: 500-600 requests a day. Most of the requests communicate with MS SQL by querying records, adding records, etc.
I am also using MS WebSocket, but the high CPU happens when no WebSocket client is connected to the web service, so I hardly believe that it's the cause. I tried to use Apache ab for load testing, but there isn't any pattern in which a load test reliably produces high CPU. Sometimes it happens during load testing, sometimes it doesn't.
So I just updated the screenshot of processes; I see that lots of threads are being locked/used while FluentMigrator is running its logging.
Update*
I will remove the FluentMigrator logging middleware from the Configure method and see how the situation develops.
UPDATE**
So I removed the FluentMigrator logging. Since then I haven't noticed any CPU usage over 90%.
But I am still confused. My CPU usage is spinning. Is this a healthy CPU usage graph or not?
Also, I tried to run a load test on the WebSocket server.
I made a script that calls some WebSocket functions every 100 ms from 6-7 clients. So every 100 ms there are 7 calls to the WebSocket server from different clients, and every function queries or inserts some data (approximately 3-4 queries per WebSocket function).
What I did notice: on Azure S1 DTU 20, after 2 minutes I run out of SQL pool connections. If I increase the DTU to 100, it handles 7 clients properly without any 'no connection pool' errors.
So the first question: is this CPU spinning normal?
Second: should I be getting a 'no SQL connection free' error with this kind of load test on DTU 10 Azure SQL? I am afraid that by creating a scoped service in a singleton WebSocket service I am leaking connections.
This topic is getting too long; maybe I should move it to a new topic?
-
At this stage I would say you need to profile your application and figure out which areas of your code are CPU intensive. In the past I have used dotTrace, which highlighted the most expensive methods in a call tree.
Once you know which areas of your code base are the least efficient, you can begin to refactor them so that they are more efficient. This could simply mean changing some small operations, adding caching for queries, or using distributed locking, for example.
I believe the reason the other DLLs are showing CPU usage is that your code is calling methods within those DLLs.
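
As a rough illustration of what a profiler run gives you, here is a minimal sketch using Python's built-in `cProfile` as a stand-in for dotTrace (which targets .NET). The functions `cheap`, `expensive` and `handle_request` are hypothetical placeholders for your own code; the point is only that sorting by cumulative time brings the costly call path to the top.

```python
import cProfile
import io
import pstats


def cheap() -> int:
    # Placeholder for an inexpensive operation.
    return sum(range(100))


def expensive() -> int:
    # Placeholder for a CPU-heavy operation worth optimising.
    return sum(i * i for i in range(200_000))


def handle_request() -> int:
    # Placeholder request handler that calls both operations.
    cheap()
    return expensive()


def profile_top(func, limit: int = 5) -> str:
    """Run func under the profiler and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_top(handle_request))
```

In the report, `expensive` shows up near the top by cumulative time, which is the kind of signal to look for before deciding what to refactor or cache.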

Is Azure Functions running in Consumption mode appropriate for massively varying, yet time critical Load?

I’m about to start work on an API that will literally go from 0 RPS to a couple hundred thousand HTTP RPS at the same time and run at that rate for ~2 mins. All processing of those 30 million requests must finish by the end of that 2 min period. This would happen 7 times a WEEK.
Going serverless with Azure Functions in Consumption Plan Hosting Mode sounds appealing. This document describes that a scale controller exists to coordinate app instances, but doesn't really discuss what I can expect from it for HTTP triggers. I can’t find any info that says the scale controller will be able to respond in the time frame I'd need.
The best I could find was this info, which says it took nearly 8 minutes to scale up in the author's tests.
Is this a bad use case for Azure Functions in consumption mode?
Obviously, spinning up a testing harness that is capable of issuing 30 million requests within 2 minutes is an undertaking of its own, and an expensive one. I'd like to learn from others who have already done so.
Based on my experience, this scenario is not properly covered by the Consumption Plan. It can scale out to many instances, but not very rapidly. A 2-minute window is far too short to rely on.
I was mostly working with queues, not HTTP, but I saw delays of up to 40 minutes caused by the slow pace of scaling up.
If you can predict which 2 minutes are going to be heavy-loaded, your best bet could be to provision the capacity with a script (or another Function).

How does one configure "Autoscale" to deal with Web instances which have long wait times due to external processes?

I am using MVC3, ASP.NET4.5, C#, Razor, EF6.1, SQL Azure
I have been doing some load testing using JMeter, and I have found some surprising results.
I have a test of 30 concurrent users, ramping up over 10 secs. The test plan is fairly simple:
Login
Navigate to page
Do query
Navigate back
Logout
I am using "small" "standard" instances.
I have noticed that web instances may be waiting on external processes, such as database queries, so the web CPU can be low while the instance is still a bottleneck. The CPU could be idling at 40% while waiting for a result set from the DB, which could also be why the extra instance is not triggered. This is a real issue. How do you trigger an extra instance based on longer wait times? At the moment the only way around this is to have 2 instances up permanently, or to set scaling up proactively against a schedule.
Use async calls and you won't have to worry about scaling up. The waiting threads will be asleep, freeing up resources to handle other users.
If you still see lengthened response times after that, it's probably the external process that's choking and needs to be scaled up.
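
To show why async calls help here, this is a minimal sketch in Python's `asyncio` rather than ASP.NET, with `asyncio.sleep` standing in for an awaited database query. The names `handle_request`, `serve` and `run_demo` and the latency value are all hypothetical. Thirty simulated users wait on the "database" at the same time without each blocking a thread, so the total time stays close to one query's latency instead of thirty.

```python
import asyncio
import time


async def handle_request(request_id: int, db_latency: float) -> int:
    # asyncio.sleep stands in for an awaited database call; while it is
    # pending, the event loop is free to serve other requests.
    await asyncio.sleep(db_latency)
    return request_id


async def serve(n_requests: int, db_latency: float):
    """Handle n_requests concurrently and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(handle_request(i, db_latency) for i in range(n_requests))
    )
    return list(results), time.perf_counter() - start


def run_demo(n_requests: int = 30, db_latency: float = 0.1):
    return asyncio.run(serve(n_requests, db_latency))


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"handled {len(results)} requests in {elapsed:.2f}s")
```

If the external process itself is the bottleneck, the simulated latency grows and the total time grows with it. That is the case where scaling the database, not the web tier, is what helps.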
