Keep a node application running on azure app service - node.js

I have deployed a Node.js web application on App Service in Azure. The issue is that my application is occasionally killed for an unknown reason. I have searched exhaustively through all the log files using Kudu.
If I restart the App Service, the application starts working again.
Is there any way I can restart my Node application once it has crashed? Something that runs forever, no matter what. For example, if an error occurs in ASP.NET code deployed in IIS, IIS never crashes; it keeps serving other incoming requests.
Something like using forever/pm2 in azure app service.

node.js in Azure App Services is powered by IISNode, which takes care of everything you described, including monitoring your process for failures and restarting it.
Consider the following POC:
var http = require('http');
http.createServer(function (req, res) {
    if (req.url == '/bad') {
        throw 'bad';
    }
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('bye');
}).listen(process.env.PORT || 1337);
If I host this in a Web App and issue the following sequence of requests:
GET /
GET /bad
GET /
Then the first will yield HTTP 200, the second will throw on the server and yield HTTP 500, and the third will yield HTTP 200 without me having to do anything. IISNode will just detect the crash and restart the process.
So you shouldn't need PM2 or similar solution because this is built in with App Services. However, if you really want to, they now have App Services Preview on Linux, which is powered by PM2 and lets you configure PM2. More on this here. But again you get this out of the box already.
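To make the restart-on-crash behaviour concrete, here is a minimal sketch of what a supervisor such as IISNode or PM2 does conceptually. It is not how either tool is implemented; keepAlive, maxRestarts and the injectable spawn parameter are hypothetical names introduced only for illustration.

```javascript
const { fork } = require('child_process');

// Hypothetical helper: start `script` as a child process and start it again
// whenever it exits abnormally, up to `maxRestarts` times.
// `spawn` defaults to child_process.fork and is injectable for testing.
function keepAlive(script, { maxRestarts = 5, spawn = fork } = {}) {
  let restarts = 0;

  function start() {
    const child = spawn(script);
    child.on('exit', (code) => {
      // code is 0 on a clean exit, non-zero on a crash,
      // and null when the process was killed by a signal.
      if (code !== 0 && restarts < maxRestarts) {
        restarts += 1;
        start();
      }
    });
    return child;
  }

  start();
  return {
    get restarts() {
      return restarts;
    },
  };
}

// Usage (assumes a server.js next to this file):
// keepAlive('./server.js', { maxRestarts: 10 });
```

On App Service with IISNode you would not run something like this yourself; the sketch only shows why the third request in the sequence above succeeds without any manual intervention.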
Another thing to consider is the Always On setting, which is off by default:
By default, web apps are unloaded if they are idle for some period of time. This lets the system conserve resources. In Basic or Standard mode, you can enable Always On to keep the app loaded all the time. If your app runs continuous web jobs, you should enable Always On, or the web jobs may not run reliably.
This is another possible root cause for your issue, and the solution is to enable Always On for your Web App (see the link above).

I really want to thank itaysk for the support on this issue.
The issue was not what I suspected. The node server was in fact being restarted correctly on failure.
My website was becoming unresponsive for a different reason. Here is what was happening:
We used rethinkdbdash to connect to a RethinkDB database, and we were using its connection pool. There was a coding/design issue. We have around 15 change feeds implemented along with socket.io, and the change feeds were being initialised for every logged-in user. This kept increasing the number of active connections in the pool. rethinkdbdash has a default limit of 1000 connections in the pool, and with so many live connections the pool was exhausted. Each new request then waited for a free connection that never became available, so it waited forever and blocked any new requests from being served.
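One possible way to avoid opening a change feed per user is to open each feed once and fan its changes out to every connected socket. The following is a minimal sketch of that idea, not the fix that was actually applied; SharedFeeds and openFeed are hypothetical names, and openFeed stands in for whatever code runs the rethinkdbdash change feed query.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: opens each named feed at most once and fans every
// change out to all subscribers, so the number of database connections
// depends on the number of feeds, not on the number of logged-in users.
class SharedFeeds {
  // openFeed is an assumed callback: (name, onChange) => void.
  // It should start the underlying change feed and call onChange per change.
  constructor(openFeed) {
    this.openFeed = openFeed;
    this.emitters = new Map();
  }

  // Register a listener for a feed; returns a function that unsubscribes it.
  subscribe(name, listener) {
    let emitter = this.emitters.get(name);
    if (!emitter) {
      emitter = new EventEmitter();
      emitter.setMaxListeners(0); // many sockets may listen to one feed
      this.emitters.set(name, emitter);
      this.openFeed(name, (change) => emitter.emit('change', change));
    }
    emitter.on('change', listener);
    return () => emitter.off('change', listener);
  }
}

// Usage with socket.io (assumed wiring, 'orders' is a made-up feed name):
// io.on('connection', (socket) => {
//   const stop = feeds.subscribe('orders', (c) => socket.emit('orders', c));
//   socket.on('disconnect', stop);
// });
```

Unsubscribing on disconnect matters as much as sharing the feed; otherwise listeners accumulate in the same way the pooled connections did.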

Related

Why is my NodeJS express app shutting down after half an hour of idle?

I have a Node.js application which uses Express.
My problem:
The app works well on localhost until I stop it.
I bought shared hosting on Namecheap.com and deployed my app.
Everything works except one thing: 24/7 availability.
After half an hour of idle (when the server gets no requests) the application shuts down until it gets a new request, and this causes a long load time.
There are no errors.
My question is: is there anything I can do to prevent this?
Something like custom code that does not let the server go idle?
The application uses: Express, MySQL, express-session, nodecache, body and cookie parsers, gzip compression, nodemailer, path, fs
I am using pm2.
The code is long so I won't paste it here; if you have suggestions I'll check and provide more info!
(App registered in cPanel Application Manager powered by Phusion Passenger, NodeEnv=Production)

SignalR long polling repeatedly calls /negotiate and /hub POST and returns 404 occasionally on Azure Web App

We have enabled SignalR on our ASP.NET Core 5.0 web project running on an Azure Web App (Windows App Service Plan). Our SignalR client is an Angular client using the @microsoft/signalr NPM package (version 5.0.11).
We have a hub located at /api/hub/notification.
Everything works as expected for most of our clients, the web socket connection is established and we can call methods from client to server and vice versa.
For a few of our clients, we see a massive number of requests to POST /api/hub/notification/negotiate and POST /api/hub/notification within a short period of time (multiple requests per minute per client). It seems that those clients switch to long polling instead of using web sockets, since we see the POST /api/hub/notification requests.
We have the suspicion that the affected clients could maybe sit behind a proxy or a firewall which forbids the web sockets and therefore the connection switches to long polling in the first place.
The following screenshot shows requests to the hub endpoints for one single user within a short period of time. The list is very long since this pattern repeats as long as the user has opened our website. We see two strange things:
The client repeatedly calls /negotiate twice every 15 seconds.
The call to POST /notification?id=<connectionId> takes exactly 15 seconds and the following call with the same connection ID returns a 404 response. Then the pattern repeats and /negotiate is called again.
For testing purposes, we enabled only long polling in our client. This works for us as expected too. Unfortunately, we currently don't have access to the browsers or the network of the users where this behavior occurs, so it is hard for us to reproduce the issue.
Some more notes:
We currently have just one single instance of the Web App running.
We use the Redis backplane for a scale-out scenario in future.
The ARR affinity cookie is enabled and Web Sockets in the Azure Web App are enabled too.
The Web App instance doesn't suffer from high CPU usage or high memory usage.
We didn't change any SignalR options except for adding the Redis backplane. We just use services.AddSignalR().AddStackExchangeRedis(...) and endpoints.MapHub<NotificationHub>("/api/hub/notification").
The website runs on HTTPS.
What could cause these repeated calls to /negotiate and the 404 returns from the hub endpoint?
How can we further debug the issue without having access to the clients where this issue occurs?
Update
We have now implemented a custom logger for the @microsoft/signalr package, which we pass to the configureLogger() overload. This logger writes to our Application Insights, which allows us to track the client-side logs of the clients where our issue occurs.
The following screenshot shows a short snippet of the log entries for one single client.
We see that the WebSocket connection fails (Failed to start the transport "WebSockets" ...) and the fallback transport ServerSentEvents is used. We see the log The HttpConnection connected successfully, but almost exactly 15 seconds after the ServerSentEvents transport is selected, a handshake request is sent which fails with the server message Server returned handshake error: Handshake was canceled. After that, some consequential errors occur and the connection gets closed. Then the connection is established again and everything starts over: a new handshake error occurs after those 15 seconds, and so on.
Why does it take so long for the client to send the handshake request? It seems like those 15 seconds are the problem, since this is too long for the server and the server cancels the connection due to a timeout.
We still think that this may have something to do with the client's network (proxy, firewall, etc.).
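For readers who want to collect client-side logs in the same way, a custom logger along the lines described in this update could look roughly like the sketch below. It is an illustration, not the code we run: ForwardingLogger and its sink callback are hypothetical names, and the trackTrace call in the usage comment assumes the Application Insights JavaScript SDK is already initialised as appInsights. The only contract relied on is that @microsoft/signalr calls log(logLevel, message) on the object passed to configureLogger().

```javascript
// Names for the numeric signalR.LogLevel values (Trace = 0 ... None = 6).
const LEVEL_NAMES = ['Trace', 'Debug', 'Information', 'Warning', 'Error', 'Critical', 'None'];

// Hypothetical logger: forwards entries at or above `minLevel` to `sink`.
// The logger does its own filtering here rather than relying on the library.
class ForwardingLogger {
  constructor(sink, minLevel = 2) {
    this.sink = sink; // assumed callback: (entry) => void
    this.minLevel = minLevel; // 2 corresponds to Information
  }

  // Called by the SignalR client for every log entry.
  log(logLevel, message) {
    if (logLevel < this.minLevel) {
      return;
    }
    this.sink({
      level: LEVEL_NAMES[logLevel] || String(logLevel),
      message: message,
      time: new Date().toISOString(),
    });
  }
}

// Usage in the client (assumed wiring):
// const connection = new signalR.HubConnectionBuilder()
//   .withUrl('/api/hub/notification')
//   .configureLogger(new ForwardingLogger((entry) =>
//     appInsights.trackTrace({ message: entry.level + ': ' + entry.message })))
//   .build();
```

Keeping the timestamp in each entry makes gaps such as the 15 seconds before the handshake request easy to spot in the collected logs.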
Fiddler
We used Fiddler to block the WebSockets for testing. As expected, the fallback mechanism starts and ServerSentEvents is used as the transport. In contrast to the logs we see from our issue, the handshake request is sent immediately and not after 15 seconds. Then everything works as expected.
You should check which pricing tier your project uses, Free or Standard.
If you are still on the Free tier, there are some restrictions; switch to a connection string that belongs to the Standard tier.
Official doc: Azure SignalR Service limits

Getting 500 server error.. My site works fine and only throws these errors occasionally. How can I better diagnose the problem?

My site works fine locally. It even works fine with my backend using azure web services and front end using netlify but occasionally after several api calls (I'm not overloading the server because these api calls are done one by one) I get LOTS of errors that are all the same. 500 internal server error. I look at the logs and they give me some numbers 500 1013 109 329 2144 391
The reason for this could be:
- A network issue on your server
- A server request timeout
- The web app taking too long to respond to a request when connecting to another resource (database, different server, etc.)
To resolve that, I would suggest increasing the idle timeout of your app.
In the app settings of your web app, add SCM_COMMAND_IDLE_TIMEOUT = 3600
By default, Web Apps are unloaded if they are idle for some period of time. This lets the system conserve resources. In Basic or Standard mode, you can enable ‘Always On’ to keep the app loaded all the time.
You may also check the diagnostic log stream to get more details on this issue and the blog post for Troubleshooting Azure App Service Apps Using Web Server Logs.
Hope it helps.

Is it a good idea to have a separate copy of the socket.io.js file instead of relying on the file served by a socket.io app?

Consider this scenario:
The Socket.io app went down (or restarted) for some reason and took about 3 seconds before it started again (assuming the use of a production process manager, e.g. PM2).
Within the 3-second downtime a client tried to request the client socket.io.js script (localhost:xxxx/socket.io/socket.io.js), which resulted in a failed request (error 500, 404, or net::ERR_CONNECTION_REFUSED) before the server started again.
After the 3-second downtime the server file is available again.
So now I have no other way but to inform the user to refresh to resume real-time transactions.
I cannot retry connecting to the socket.io server because I do not have the client script.
But if it is served from somewhere else, perhaps from the same directory as jQuery, I could just check whether io is available again by writing a simple retry function that fires every few seconds.
In general, it's a good idea to use the version served by Socket.IO, as you'll have guaranteed compatibility. However, as long as you stay on top of making sure you deploy the right versions, it's perfectly fine to host that file somewhere else. In fact, it's even preferred since you're taking the static load off your application servers and putting it elsewhere.
An easy way to do what you want is to configure Nginx or similar to cache that file and serve a stale copy when the upstream server (your Node.js with Socket.IO server) is down. https://serverfault.com/q/357541/52951
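The "simple retry function" mentioned in the question could be sketched as follows. This is only an illustration; waitFor, intervalMs and maxAttempts are hypothetical names, and what the check function tests (for example whether io exists yet, or whether a connection attempt succeeded) is left to the caller.

```javascript
// Hypothetical helper: call `check` every `intervalMs` milliseconds until it
// returns a truthy value, then resolve with that value. Rejects after
// `maxAttempts` unsuccessful attempts. An exception thrown by `check`
// counts as "not ready yet".
function waitFor(check, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  return new Promise((resolve, reject) => {
    let attempts = 0;

    function attempt() {
      attempts += 1;
      let result;
      try {
        result = check();
      } catch (err) {
        result = undefined;
      }
      if (result) {
        resolve(result);
        return;
      }
      if (attempts >= maxAttempts) {
        reject(new Error('gave up after ' + attempts + ' attempts'));
        return;
      }
      setTimeout(attempt, intervalMs);
    }

    attempt();
  });
}

// Usage in the browser (assumed wiring):
// waitFor(() => window.io, { intervalMs: 3000 })
//   .then((io) => { const socket = io(); })
//   .catch(() => { /* ask the user to refresh */ });
```

Bounding the number of attempts keeps a permanently unavailable server from leaving a timer running in the page indefinitely.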

node.exe process with IISNode process stops running

I am using iisnode to run my node.js app. However, after about an hour, the node.exe process stops running (I need it running since I have a setInterval() method that pulls data from the database every few seconds). Any advice?
Also, if I set up my server with process.env.PORT, how do I connect to it using socket.io on the client-side? I understand that I have to use
io.configure(function () {
    io.set("transports", ["xhr-polling"]); // no websockets
    io.set("polling duration", 10);
    io.set("log level", 1); // no debug msg
});
This is a correct configuration for socket.io when the application is hosted in IIS using iisnode. On the client side, you connect to the server using the regular HTTP address of the endpoint exposed by IIS. Note that in order for socket.io to work out of the box, you need to host your node.js application as an IIS WebSite rather than a virtual directory within a web site: your app should be addressable with http://foobar.com/ rather than http://foobar.com/myapp/.
When you say the node.exe process stops running do you mean it is terminated and disappears or hangs? IIS will terminate worker processes (including any child processes they spawned, which is node.exe in this case) after a period of inactivity (when no HTTP requests arrive that target this server). The duration of that time period is configurable in the Application Pool settings.
If you rely on logic in your application that requires code to be run at intervals, IIS itself does not provide the best hosting model, as process lifetime is tied closely to HTTP messaging. You really need either a durable server (e.g. Windows Service, check out http://nssm.cc/), or some form of a web cron.

Resources