Nodemon Crash after app has been running for several hours


I have a web application running as a service on an Ubuntu EC2 instance. For the past 24 hours, the application has been crashing randomly after running for 2-4 hours, with the message shown in the image below. The error is:
[nodemon] app crashed - waiting for file changes before starting...
I have run into this error before, but it is usually caused by a syntax error, which prevents the application from starting at all. In this case, the app functions normally for several hours before crashing. I have no idea where to even start, as nothing above the message in the log looks like it could be causing the crash. The only clue is that the website appears to receive three GET / requests before the server can respond, and then it crashes. Most of the posts I've found online about this error also describe the application failing to start; they don't cover an app that runs normally and then crashes.
Any help would be greatly appreciated.
Thanks!
[Image: error log from journalctl]

It looks like a silent error. I would log every input (e.g. HTTP requests and timeouts) with a timestamp, and also log the time of the crash. When a crash occurs, compare its time with the events that happened right before it.
Also check the logs in /var/log/ to see whether the program was terminated by the system (for example, by the kernel's out-of-memory killer) or by another program.
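As a rough illustration of the logging suggested above, here is a minimal sketch that timestamps every request and records the reason for a crash before the process exits. The helper names `stamp` and `withRequestLog` are made up for this example; they are not part of nodemon, Express, or Node.

```javascript
// Hypothetical helper: prefix a log message with an ISO timestamp.
function stamp(message) {
  return `${new Date().toISOString()} ${message}`;
}

// Hypothetical helper: wrap a request handler so each request is logged
// when it arrives and again when its response has been sent.
function withRequestLog(handler, log = console.log) {
  return (req, res) => {
    const started = Date.now();
    log(stamp(`--> ${req.method} ${req.url}`));
    res.on('finish', () => {
      const elapsed = Date.now() - started;
      log(stamp(`<-- ${req.method} ${req.url} ${res.statusCode} ${elapsed}ms`));
    });
    handler(req, res);
  };
}

// Record why the process is dying, then exit, so the crash time can be
// compared with the last requests in the log.
process.on('uncaughtException', (err) => {
  console.error(stamp(`uncaughtException: ${err.stack || err}`));
  process.exit(1);
});
process.on('unhandledRejection', (reason) => {
  console.error(stamp(`unhandledRejection: ${(reason && reason.stack) || reason}`));
  process.exit(1);
});
```

Assuming an Express `app` (which is itself a `(req, res)` function), this could be wired up with `require('http').createServer(withRequestLog(app)).listen(3000)`; the port is only an example. A request that has a `-->` line but no matching `<--` line shortly before the crash is a good place to start looking.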

Related

Manage concurrent request process nodejs

Different users are trying to access different application routes that do heavy data manipulation. In the meantime, one of the requests failed with an internal server error and my whole application crashed. As a result, the other requests failed too, because the build had crashed. Is there any solution to handle this situation?

If your program has thrown an uncaught error and crashed then there's nothing you can really do other than start it back up again. You could use something like pm2 to automatically restart your node process when it crashes, and then at least future requests that come in should work (although you will lose any in memory data from before the last crash).
Another thing that I think would help you would be to move your backend onto a serverless architecture where each invocation of your code is independent of the others.
And of course try to fix the code so that it handles things gracefully and doesn't actually throw errors. :)
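One way to read "handles things gracefully" is that a failure in one request should produce a 500 for that request only, not bring the process down. The following is a minimal sketch under that assumption; `safeHandler` is a hypothetical name, not an Express or Node API.

```javascript
// Hypothetical wrapper: run a request handler and turn any thrown error or
// rejected promise into a 500 response instead of an uncaught exception.
function safeHandler(handler, log = console.error) {
  return async (req, res) => {
    try {
      await handler(req, res);
    } catch (err) {
      log(`${req.method} ${req.url} failed:`, err);
      // Only send a 500 if nothing has been written to the client yet.
      if (!res.headersSent) {
        res.statusCode = 500;
        res.end('Internal Server Error');
      }
    }
  };
}
```

With Express this might be used as `app.get('/report', safeHandler(buildReport))`, where `buildReport` stands in for one of the heavy routes. Note that the wrapper only catches errors thrown while the handler (or the promise it returns) is running; an error thrown later from an unrelated timer or event callback would still be uncaught.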

Google App Engine - nodejs application goes down over night

Hi, I am using Google App Engine to host a single-instance Node.js application. The application works fine and my scripts show no errors in the logs. The application is currently only in testing and is not used overnight; however, I often come to work the next day and find the server returning internal server errors. No errors are shown in my application log other than the 502 errors I get when trying to access it the next day. I see hundreds of calls to /_ah/_background/ overnight, some of which appear to have timed out. At this point I must restart my instance for the app to function again.
I am completely stumped. Because my app uses WebSockets, I must use manual scaling and a single instance. I would appreciate any help or suggestions.

I would venture a guess that you have a deferred task stuck running. Tasks that run through the Task Queue API are set by default to retry continuously; see the TaskQueue API documentation.
To stop the tasks right now, open the Google Cloud Console and select your project. Then select App Engine, then Task queues. Click on the queue that is running (probably default). There should be an option to pause the queue. This should prevent the 500 errors from occurring, but it will not fix the reason the task is failing.

Automatic reboot whenever there's an uncaught exception in a continuous WebJob

I'm currently creating a continuous WebJob that polls an API and then forwards messages to an Azure Service Bus. I've managed to get this to work just fine, but I have one problem: what if my app crashes for whatever reason? What if there's an uncaught exception, or something goes wrong, and the app stops running? How do I get it to run again?
I created a test app which sends a message every second to the Service Bus, then crashes on the 11th message due to an intentionally placed NullReferenceException. I did this in order to investigate the behaviour if the app crashes.
What happens is that the app runs just fine for the first 10 seconds (as expected). Messages are being sent, and everything looks good. Then after the 10th second, when the exception occurs, nothing happens. No log in Azure saying there was an exception, no reboot - nothing. It just stands there as "running", but messages are no longer being sent.
How do I deal with this? It's essential that the application is able to reboot if it fails. Are there any standard ways to do this? Best practices?
Any help would be appreciated :)

It is generally better to handle most failure scenarios in your own code than to rely on the hosting environment to react to failures.
My suggestion would be to wrap the work in your executable script in try/catch blocks that cover the different failure scenarios and, instead of throwing the exceptions, log them yourself or retry the operation if required.
For example, when you receive junk data and processing fails, you can retry the operation, e.g. 3 times, and then push a log entry to a dead-letter account so that such junk inputs can be handled manually. Don't let the flow stop by throwing the exception; instead, handle it yourself by logging a message that flags the need for manual intervention.
In a GUI or web application, an exception is not fatal to the flow, because the next user click re-initiates it and the system responds. A background processor has no such trigger, so it is best to avoid anything that blocks the control flow.
Hope this helps.
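The retry-then-dead-letter idea can be sketched as follows. The question is about a .NET WebJob, so this JavaScript version is only an illustration of the control flow, written in the same language as the other examples on this page; `processWithRetry`, `work`, and `deadLetter` are hypothetical names.

```javascript
// Hypothetical helper: try `work(item)` up to `maxAttempts` times. If every
// attempt fails, hand the item to `deadLetter` instead of throwing, so the
// surrounding polling loop keeps running.
async function processWithRetry(item, work, deadLetter, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const value = await work(item);
      return { ok: true, value, attempts: attempt };
    } catch (err) {
      lastError = err;
      console.error(`attempt ${attempt}/${maxAttempts} failed: ${err.message}`);
    }
  }
  // All attempts failed: park the item for manual follow-up.
  await deadLetter(item, lastError);
  return { ok: false, error: lastError, attempts: maxAttempts };
}
```

The caller checks `ok` rather than catching an exception, which keeps a single bad message from stopping the loop. If `deadLetter` itself fails, that error still propagates, so it is worth logging around that call as well.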

Periodic app restarts in a Docker container

We're running a Node.js/Express application which runs for a few hours and will then start to throw 504 errors for no good reason. Since we're currently unable to track these errors down we need to restart the application every hour or so to ensure it's still running during the weekend.
Our Ubuntu server runs Dokku, which then has a container setup for our application. Every time the application spits a 504 we have to run docker restart appid as root.
So what's the best way of automatically restarting the node process every hour?

"throw 504 errors for no good reason"
It's throwing these because your application is crashing.
"currently unable to track these errors down"
You have to track them down. They are very likely unhandled exceptions, which you can catch and log via:
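process.on('uncaughtException', function (error) {
  // Log the error with a timestamp, then exit: the process state is unreliable after this
  console.error(new Date().toISOString(), error.stack || error);
  process.exit(1);
});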
"So what's the best way of automatically restarting the node process every hour?"
Since I'd feel bad not at least attempting to address your actual question, even though you are most certainly fixing the symptom instead of the problem in a seriously bad way...
Use cron. Put a script in /etc/cron.hourly/restart_express to do it. Make sure the script file has execute permissions and conforms to the run-parts naming constraints (no dots, etc).

Node.JS with forever on Heroku

So, I need to run my Node.js app on Heroku. It works very well, but when my app crashes, I need something to restart it, so I added forever to package.json and created a file named forever.js with this:
var forever = require('forever');
var child = new (forever.Monitor)('web.js', {
  max: 3,
  silent: false,
  options: []
});
//child.on('exit', this.callback);
child.start();
forever.startServer(child);
In my Procfile (which Heroku uses to know what to start) I put:
web: node forever.js
Alright! Now every time my app crashes it restarts automatically, but from time to time (almost every hour) Heroku starts throwing H99 - Platform error. About this error, they say:
Unlike all of the other errors which will require action from you to correct, this one does not require action from you. Try again in a minute, or check the status site.
But I just manually restart my app and the error goes away, if I don't do that, it may take hours to go away by itself.
Can anyone help me here? Maybe this is a forever problem? A heroku issue?

This is an issue with free Heroku accounts: Heroku automatically kills unpaid apps after 1 hour of inactivity, and then spins them back up the next time a request comes in. (As mentioned below, this does not apply to paid accounts. If you scale up to two servers and pay for the second one, you get two always-on servers.) - https://devcenter.heroku.com/articles/dynos#dyno-sleeping
This behavior is probably not playing nicely with forever. To confirm this, run heroku logs and look for the lines "Idling" and "Stopping process with SIGTERM", and then see what comes next.
Instead of using forever, you might want to try using the Cluster API and automatically create a new child each time one dies. http://nodejs.org/api/cluster.html#cluster_cluster is a good example; you'd just put your code into the else block.
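A minimal sketch of "create a new child each time one dies" with Node's built-in cluster module might look like this. `superviseWorkers` is a hypothetical helper written for this example, and the commented usage assumes the app's entry point is the `web.js` file from the question.

```javascript
const cluster = require('cluster');

// Hypothetical helper: fork `count` workers and fork a replacement whenever
// one exits. `clusterApi` defaults to Node's cluster module; it is a
// parameter only so the logic can be exercised without real processes.
function superviseWorkers(count, clusterApi = cluster) {
  clusterApi.on('exit', (worker, code, signal) => {
    console.error(`worker ${worker.process.pid} exited (${signal || code}); starting a new one`);
    clusterApi.fork();
  });
  for (let i = 0; i < count; i++) {
    clusterApi.fork();
  }
}

// Possible usage (left commented out so that loading this file forks nothing):
//
//   if (cluster.isMaster) {
//     superviseWorkers(require('os').cpus().length);
//   } else {
//     require('./web.js'); // the "else block": the actual app runs in each worker
//   }
```

This sketch restarts workers unconditionally. A worker that crashes immediately on startup would be restarted in a tight loop, so a real setup would probably want a delay or a restart limit, similar to the `max: 3` option in the forever configuration above.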
The upshot is that your app is now much more stable, plus it gets to use all of the available CPU cores (4 in my experience).
The downside is that you cannot store any state in memory. If you need to store sessions or something along those lines, try out the free Redis To Go addon (heroku addons:add redistogo).
Here's an example that's currently running on heroku using cluster and Redis To Go: https://github.com/nfriedly/node-unblocker
UPDATE: Heroku has recently made some major changes to how free apps work, and the big one is they can only be online for a maximum of 18 hours per day, making it effectively unusable as a "real" web server. Details at https://blog.heroku.com/archives/2015/5/7/heroku-free-dynos
UPDATE 2: They changed it again. Now, if you verify your ID, you can run 1 free dyno constantly: https://blog.heroku.com/announcing_heroku_free_ssl_beta_and_flexible_dyno_hours#flexible-free-dyno-hours
