Why is my Azure node.js app becoming unresponsive? - node.js

I recently deployed a Node.js backend service to Azure and have the following problem. The service becomes unresponsive after a certain amount of time and only comes back to life when an external request is sent. The problem is that it takes about 3 minutes for the container to start back up and actually answer the request. I'm running Node 14 LTS. I also added a health check yesterday, but Azure simply doesn't keep the app alive. Here is the metric from Azure:
I verified that Azure is trying to reach the correct endpoint, and it is. I also have "Always On" enabled, and I verified that the app itself is not crashing. I log every request, and all of a sudden requests are no longer received, which means the health endpoint doesn't respond either, yet this does not result in a container restart. Azure just waits for an external request to appear and then decides to start everything back up, which takes too long.
I feel like it's some kind of configuration issue, because the app itself is not very complex and I never experienced crashes when doing local development.

According to the official documentation, Always On does not take effect on the Free pricing tier, which is the tier you are currently using.
How do I decrease the response time for the first request after idle time?
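For reference, the setup described in the question (a health endpoint plus a log line for every request) could look roughly like the sketch below. The `/health` path, the response body and the function names are assumptions for illustration, not the asker's actual code; the path has to match whatever health check path is configured in Azure.

```javascript
const http = require("http");

// Hypothetical health path; it must match the path configured for the health check.
const HEALTH_PATH = "/health";

// Logs every incoming request, answers the health path with 200, everything else with 404.
function handleRequest(req, res, log = console.log) {
  log(`${new Date().toISOString()} ${req.method} ${req.url}`);

  if (req.method === "GET" && req.url === HEALTH_PATH) {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ status: "ok", uptimeSeconds: Math.round(process.uptime()) }));
    return;
  }

  res.writeHead(404, { "Content-Type": "text/plain" });
  res.end("Not found");
}

// Starts the server; falling back to port 3000 is an assumption for local runs.
function startServer(port = process.env.PORT || 3000) {
  return http.createServer((req, res) => handleRequest(req, res)).listen(port);
}
```

Calling `startServer()` from the entry point starts listening. If the log shows no health requests at all during the idle period, that is consistent with the container having been stopped by the platform rather than with the app crashing.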

Related

Azure Functions service not recognizing request sent from outside client

We have a service which pings our EP1 Premium service, and yesterday we received 3 client-side timeout errors after 2 minutes of waiting. When opening the trace in Application Insights, the requests that timed out are not even logged and show no trace of ever being received on the Azure side, so they stay unanswered. Looking at the metrics provided in the Azure Functions app, I found that 1-2 minutes after the request was sent, the app loses all ability to work: its Total App Domains falls to 0, as do all connections, threads and so on. This state lasts until the next request is received, thereby "skipping" the request that happened beforehand. This is a big issue, as I need to make sure requests get answered in a timely manner.
The client service sent HTTP requests to the Azure Functions app expecting an answer, only to time out while the Azure-side doesn't have any record of ever receiving the request.
I believe this issue is related to the cold start behaviour of the Azure Functions Consumption plan. The "skipping" mechanism is explained below:
Apps may scale to zero when idle, meaning some requests may have additional latency at startup. The consumption plan does have some optimizations to help decrease cold start time, including pulling from pre-warmed placeholder functions that already have the function host and language processes running. https://learn.microsoft.com/en-us/azure/azure-functions/functions-scale#cold-start-behavior
Please also consider having a look at this article, which explains the behaviour: https://azure.microsoft.com/en-us/blog/understanding-serverless-cold-start/

High response duration on first request for .net core api on Azure

I have deployed a .Net Core API to Azure as an App Service.
I have set the Always On feature to true.
When I log the requests, I see that Azure Always On requests are coming in every 5 minutes.
I use the API over HTTPS, but the Always On requests are sent over HTTP. I don't know if this is the cause.
The first request sometimes takes 10 seconds, but after the first request the duration is around 100 ms.
What is missing here?
I have logged the durations:
There are quite a few reasons why this might be the case:
You're connecting to resources that take time to connect to the first time
Some information is being cached and needs to be read the first time
There is initialization code present
Lazy instantiation of (static/singleton) instances
... other ...
Add some logging to your application, maybe enable Application Insights if you haven't done so already and go try to find the culprit.
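To illustrate the lazy-instantiation case and the suggested logging, here is a minimal sketch in Node.js (the question itself is about .NET, so this only shows the pattern). The helper names `lazy` and `timeCall` are made up for this example. The first call pays the initialization cost, later calls reuse the cached value, and the logged durations make the difference visible.

```javascript
// Wraps an initializer so it runs once, on first use, and caches whatever it returns
// (a plain value, or a Promise for asynchronous setup such as opening a connection).
function lazy(init) {
  let ready = false;
  let value;
  return () => {
    if (!ready) {
      value = init(); // only the first caller pays this cost
      ready = true;
    }
    return value;
  };
}

// Runs fn, logs how long the call took in milliseconds, and returns fn's result.
function timeCall(label, fn, log = console.log) {
  const start = process.hrtime.bigint();
  const result = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  log(`${label}: ${elapsedMs.toFixed(2)} ms`);
  return result;
}
```

A possible use: wrap the expensive setup as `const getClient = lazy(createClient)` (with `createClient` standing in for your own setup code), then either call `getClient()` once during startup to warm it up, or compare `timeCall("first", getClient)` with `timeCall("second", getClient)` to confirm where the time goes.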

Heroku - restart on failed health check

Heroku does not support health checks on its own. It will restart services that crashed, but there is nothing like health checks.
It sometimes happens that a service becomes unresponsive while the process is still running. In most modern cloud solutions, you can provide a health endpoint which is periodically called by the cloud hosting service, and if that endpoint returns an error or does not respond at all, the host shuts down the service and starts a new one.
That seems like an industry standard these days, but I am unable to find any solution for this on Heroku. I can even use an external service with the Heroku CLI, but just calling some endpoint is not sufficient: if there are multiple instances, they all share the same URL and the load balancer calls one of them at random, so it is possible to never hit the failed instance at all. Even when I do hit it, health checks usually have a rule like "after 3 failed health checks in a row, restart that instance", which is highly unlikely to trigger if there are 10 instances and only one of them becomes unhealthy.
Do you have any solution to this?
You are right that this is an industry standard, and it's a shame that it's not provided out of the box.
I can think of 2 solutions (both involve running some extra code that does all of this):
a) use the Heroku API, which allows you to get the IP of individual dynos, and then you can call each dyno however you want
b) in each dyno instance, you can send a request to a web server like https://iamaalive.com/?dyno=${process.env.HEROKU_DYNO_ID}
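Option b) could be sketched as follows. This is only an illustration under assumptions: the monitor URL is the placeholder from the answer, the 30-second interval and all function names are made up, and `HEROKU_DYNO_ID` is assumed to be set in the dyno's environment. The dyno side sends a periodic heartbeat tagged with its ID, and the monitor side flags dynos whose last heartbeat is too old.

```javascript
const https = require("https");

// Placeholder monitor endpoint taken from the answer's example URL.
const MONITOR_URL = "https://iamaalive.com/";

// Builds the per-dyno heartbeat URL, e.g. https://iamaalive.com/?dyno=<id>.
function buildHeartbeatUrl(baseUrl, dynoId) {
  const url = new URL(baseUrl);
  url.searchParams.set("dyno", dynoId);
  return url.toString();
}

// Default sender: fire-and-forget GET; failures are logged, never thrown.
function sendHeartbeat(url) {
  https
    .get(url, (res) => res.resume()) // drain the response so the socket is released
    .on("error", (err) => console.error("heartbeat failed:", err.message));
}

// Dyno side: sends a heartbeat every intervalMs and returns a function that stops it.
function startHeartbeat({
  dynoId = process.env.HEROKU_DYNO_ID || "unknown",
  intervalMs = 30000, // assumed interval
  send = sendHeartbeat,
} = {}) {
  const target = buildHeartbeatUrl(MONITOR_URL, dynoId);
  const timer = setInterval(() => send(target), intervalMs);
  timer.unref(); // the heartbeat alone should not keep the process alive
  return () => clearInterval(timer);
}

// Monitor side: returns the IDs of dynos whose last heartbeat is older than maxAgeMs.
function findStaleDynos(lastSeenByDyno, nowMs, maxAgeMs) {
  return Object.entries(lastSeenByDyno)
    .filter(([, seenMs]) => nowMs - seenMs > maxAgeMs)
    .map(([dynoId]) => dynoId);
}
```

Note that a timer-based heartbeat only shows that the event loop is still running. To cover a web server that has stopped answering, the heartbeat could be sent only after a successful request to the dyno's own health route. Restarting the dynos that the monitor flags is the extra code the answer refers to.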

dotnet core webapi, docker and azure. Performance issues during first api call

We have a dotnet core 3.0 solution running in a docker image running on Azure. For now, we haven't set it up in a k8s cluster. Our app service plan is PremiumV2, which basically means that we're running on dedicated hardware and not sharing our resources with anyone else.
We have a simple API call to get the executing user based on the JWT. It validates the JWT, gets the user's email from the claims and queries Cosmos to get more information about the user. When the request is sent from Postman the first time, it takes roughly 320 ms, whereas subsequent requests take around 50 ms. If we wait roughly 10 minutes, the request is back at around 300 ms, and again the subsequent requests take around 50 ms. This indicates that the behavior is reproducible. It's worth mentioning that we don't see this behavior only with this call: every "first" request to our API takes more time than the following ones.
Looking into Application Insights, Cosmos is apparently not the bottleneck here. We've also configured the App Service to be "Always On".
Any ideas on how we can track down this issue? Has anyone else experienced the same behavior? Are there any settings or configuration we should look at in Azure?

Azure Bot Service using over 1GB of data transfer out per day. Why? How can I stop that?

I created a QnA bot using the Azure Bot service, and now I'm seeing data transfers out of my subscription of over 1 GB a day! I cannot figure out why, but since it's billable, I'd like to know why and how I can stop it.
The bot isn't being used yet, so no one is sending queries to it. I'm confused how this is happening.
Here's a screen shot of the graph for use in the last hour as well as a screen shot of the billing for the last few days showing the sudden jump in use.
Is this normal?
If you add AzureWebJobsDisableHomepage with a value of true, to the App settings, the data out will stop.
The setting itself is documented here: https://github.com/Azure/azure-webjobs-sdk-script/wiki/Configuration-Settings (although it doesn't provide an explanation for how this setting affects a bot specifically)
The reasoning behind what is happening is a little complex. Azure Functions are not normally "in memory" and available all the time. There is a small spin-up time that is not ideal for a bot. So, apparently there is a job set up with consumption plan bots to ping the bot every 10 seconds (and by "ping", I mean retrieve the root of the site). If you open the Log Stream, you'll see an HTTP GET request every 10 seconds. Adding AzureWebJobsDisableHomepage doesn't disable the request, but changes the status of what is returned from "OK" to "NoContent".
This will be added to the Bot Service arm template soon (so future consumption plan bots do not automatically accrue these data usages).
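As a rough back-of-the-envelope check on the figures in this thread (a ping every 10 seconds, a little over 1 GB out per day), the arithmetic can be sketched as below. Treating 1 GB as 10^9 bytes is an assumption, and the function names are made up for illustration.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

// Number of pings in one day for a given ping interval in seconds.
function pingsPerDay(intervalSeconds) {
  return Math.floor(SECONDS_PER_DAY / intervalSeconds);
}

// Average response size (bytes) needed to reach a daily outbound total at that ping rate.
function bytesPerResponse(totalBytesPerDay, intervalSeconds) {
  return totalBytesPerDay / pingsPerDay(intervalSeconds);
}

console.log(pingsPerDay(10)); // 8640 pings per day
console.log(Math.round(bytesPerResponse(1e9, 10) / 1000)); // 116 (kB per response)
```

So a homepage response on the order of 100 kB, fetched 8640 times a day, is enough to account for about 1 GB of outbound data. That fits the explanation that returning "NoContent" instead of the homepage stops the transfer.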
