Context
I've added configuration validation to some of the modules that compose my Node.js application. When they start, each one checks whether it is properly configured and has access to the resources it needs (e.g. it can write to a directory). If it detects that something is wrong, it sends a SIGINT to itself (process.pid) so the application is gracefully shut down (I close the HTTP server, close any connections to Redis and so on). I want the operator to realize there is a configuration and/or environment problem and fix it before starting the application.
I use pm2 to start/stop/reload the application, and I like the fact that pm2 will automatically restart it if it crashes later on, but I don't want it to restart my application in the scenario above, because the root cause won't be eliminated by simply restarting the app; pm2 will just keep restarting it up to max_restarts (which defaults to 10 in pm2).
Question
How can I prevent pm2 from repeatedly restarting my application when it aborts during startup?
I know pm2 has the --wait-ready option, but given that we are talking about multiple modules with asynchronous startup logic, I find it very hard to determine where/when to call process.send('ready').
Possible solution
I'm considering making all my modules emit an internal "ready" event and wiring the whole thing up by chaining those "ready" events so that I can finally send the "ready" to pm2, but I would like to ask first whether that would be a bit of over-engineering.
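Here is a minimal sketch of the idea, using promises rather than hand-wiring the "ready" events; the module names and their init() functions are made up, and each one is expected to reject when its configuration/environment check fails:

```js
// Hypothetical modules; each init() validates config/resources and rejects on failure.
const db = require('./db');
const httpServer = require('./http-server');

Promise.all([db.init(), httpServer.init()])
  .then(() => {
    // Every module validated its config and acquired its resources, so it is
    // now safe to tell pm2 we are ready (requires `pm2 start --wait-ready`).
    if (process.send) process.send('ready');
  })
  .catch((err) => {
    console.error('Startup validation failed:', err);
    process.exit(1); // abort startup instead of running half-configured
  });
```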
Thanks,
Roger
Related
I am having a small problem with an application run by pm2 in cluster mode. Normally everything works fine, but due to the logic of my application, and having recently switched to cluster mode, I am now facing an issue I can't handle properly without refactoring my application from the ground up.
My application uses Express for HTTP request handling and also uses global variables to store data, timers, etc. After switching to pm2 cluster mode, only one of the instances has a value while the others don't, which leads to inconsistencies across the different instances. The behaviour is clear, but I would have to refactor many things to make the application as a whole work properly again.
I have already seen things like INSTANCE_VAR, but could not figure out how that would help me.
All I can think of at the moment is: am I able to force pm2 to send an HTTP request to all instances simultaneously, or, if not, can I tell pm2 to handle my request with a specific instance that I choose at runtime from the outside, without interfering with the other instances?
I am actually facing something similar, where I want a specific API request to make an update available that is registered on only one process. I found a way to send a message/event to a specific process; I think it may help.
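Roughly, it looks like the sketch below, using pm2's programmatic API; the 'refresh' topic is just a placeholder, and the exact packet shape can differ between pm2 versions, so check the process-communication docs for the version you run:

```js
// --- sender: a small admin script run outside the instances ---
const pm2 = require('pm2');

pm2.connect((err) => {
  if (err) throw err;

  // Target only the pm2 process with id 1 (ids come from `pm2 list`).
  pm2.sendDataToProcessId(1, {
    id: 1,
    type: 'process:msg',
    topic: 'refresh',          // hypothetical topic name
    data: { refresh: true }
  }, (err) => {
    pm2.disconnect();
    if (err) console.error(err);
  });
});

// --- receiver: inside the application code of each instance ---
process.on('message', (packet) => {
  if (packet && packet.topic === 'refresh') {
    // Only the targeted instance gets here; update its in-memory state.
  }
});
```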
We currently use pm2 to keep our Node.js process alive; we don't use cluster mode (or the related load-balancing feature).
Our PHP team uses supervisord to keep their PHP processes alive, as Laravel suggests. Now we are investigating the possibility of using supervisord to manage our Node.js process too. We mainly need two things from a process manager: keeping a process alive, and logging the event when it crashes and restarts.
In terms of keeping a process alive, I do find pm2 and supervisord share some similarity. But pm2 has more restart policies, e.g. pm2 has a cron restart option which supervisord doesn't have (correct me if I am wrong). Without the cron feature we would just have to resort to a cronjob, so it is nice to have but not a must.
supervisord has process groups and priority ordering, for which, from my experience with Node, I don't find many use cases.
So to us it seems doable, but we don't have enough experience with supervisord and we are afraid we may be missing something here, especially a big one like "you should not do that in the first place!". Has anyone done this before?
BTW, my question is kind of the opposite of Running a python script in virtual environment with node.js pm2
When you use pm2 or supervisord, the crash message can be swallowed by the process manager. So the key point is that when the worker crashes, you should have a way to know about it, for example by hooking into pm2 so that a message is sent to your monitoring system whenever a worker crashes.
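For example, something along these lines with pm2's programmatic bus API could forward exit/restart events to a monitoring system; the exact event names and packet fields may vary between pm2 versions, so treat it as a sketch:

```js
const pm2 = require('pm2');

pm2.connect((err) => {
  if (err) throw err;

  // launchBus exposes the event bus pm2 uses to report process events.
  pm2.launchBus((err, bus) => {
    if (err) throw err;

    bus.on('process:event', (packet) => {
      // packet.event is typically something like 'online', 'exit', 'restart', 'stop'.
      if (packet.event === 'exit' || packet.event === 'restart') {
        // Replace this console.log with a call to your monitoring system.
        console.log(`[monitor] ${packet.process.name}: ${packet.event}`);
      }
    });
  });
});
```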
I have a simple StatelessService and I want to know when it is being closed down so I can perform some quick clean-up. But it never seems to call OnCloseAsync.
When the service is running and I use the 'Restart' command on the running node via Service Fabric Explorer, it removes the services and restarts the node. But it never calls the OnCloseAsync override, even though the service is knowingly being closed down.
Nor does it signal the cancellationToken that is passed into the RunAsync method. So there is no indication that the service is being shut down. Are there any circumstances in which it does call OnCloseAsync? Because at the moment I cannot see much point in it.
I wonder about the reasoning behind issuing the restart command; what behavior do you expect?
It does, however, explain the behavior you see. From the docs (keep in mind that a restart is just a combined stop and start):
Stopping a node puts it into a stopped state where it is not a member of the cluster and cannot host services, thus simulating a down node. This is useful for injecting faults into the system to test your application.
Now, if we take a look at the lifecycle, we read this:
After CloseAsync() finishes on each listener and RunAsync() also finishes, the service's StatelessService.OnCloseAsync() method is called, if present. OnCloseAsync is called when the stateless service instance is going to be gracefully shut down.
So the basic problem is that your service is not gracefully shut down. The restart command kills the process, and no cancellation will be issued.
I have searched for packages for Node like these: https://www.npmjs.com/package/kafka and https://www.npmjs.com/package/no-kafka
My question is: do these packages make Node.js subscribe to Kafka all the time, or do I need a package like forever or pm2 to achieve that?
The purpose of something like forever is to keep your node app running all the time (restarting it if it should crash).
This is pretty much separate from what those two packages do or don't do. They run inside your node app. If you want them to run all the time, then you need them to be used in a node app that is always running.
You can either write a rock solid node app that doesn't ever crash so it runs continually or you can attempt to do that and also run something like forever so that if your app dies, forever will automatically restart it.
Do these packages make Node.js subscribe to Kafka all the time?
Or do I need a package like forever or pm2 to achieve that?
No. forever and pm2 have no influence on what Kafka does or doesn't do. They just make sure your app is restarted if it crashes or exits for some reason.
If you are using the consumer side of the kafka API, then you will have to do some research and testing to see how good the library is at keeping you connected all the time, even when the server you are connecting to temporarily restarts or there is a temporary internet glitch.
From what I can tell by looking at the code for this implementation, if there's an error on an open connection, the connection is just closed and there is no reconnect logic, so you would probably have to write reconnect logic yourself by subscribing to the error or close events and then attempting to reconnect when the connection is lost.
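As a rough illustration (not tied to either of those packages), the reconnect logic could look something like this, assuming a hypothetical createClient() factory that returns an event-emitter-style client with a connect() method:

```js
// Generic reconnect sketch: createClient() is a hypothetical factory returning
// a client that emits 'error'/'close' events and exposes connect().
function connectWithRetry(createClient, delayMs = 5000) {
  const client = createClient();
  let scheduled = false;

  const reconnect = (reason) => {
    if (scheduled) return; // avoid scheduling twice when both events fire
    scheduled = true;
    console.warn(`connection lost (${reason}), retrying in ${delayMs}ms`);
    setTimeout(() => connectWithRetry(createClient, delayMs), delayMs);
  };

  client.on('error', (err) => reconnect(err.message));
  client.on('close', () => reconnect('close'));

  client.connect();
  return client;
}
```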
I am using WebRTC, Node.js and Express.js as the framework to create an audio, video and chat application. I have used forever so that the script runs continuously.
As my application deals with audio, video and chat, user presence plays an important role. We need to have the system up and running at all times and avoid crashes and restarts. If that happens, we are going to lose all information about the users who were online.
I need suggestions on the best approaches to avoid such situations.
Also, while moving new features to the production server, what steps should we take so that the application doesn't stop and we don't lose user information?
What if the server goes down, or we have to take it down? What are the different techniques that can be used so that we don't lose the presence information of the online users in the system, and can restore it if necessary?
1) Use the Node cluster module to fork multiple processes, one per core, so if one process dies, another will be booted up automatically (see the sketch after this list). Check out: http://nodejs.org/api/cluster.html
2) Use domains to catch errors from async operations instead of using try-catch or uncaughtException: http://nodejs.org/api/domain.html. I'm not saying that try-catch or uncaughtException is bad, though!
3) Use forever/supervisor to monitor your services.
4) Add a daemon to run your Node app, e.g. Upstart: http://upstart.ubuntu.com
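A minimal sketch of point 1 with the built-in cluster module, assuming a plain HTTP server listening on port 3000:

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  os.cpus().forEach(() => cluster.fork());

  // If a worker dies, boot up a replacement so the service stays available.
  cluster.on('exit', (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} died (${signal || code}), forking a new one`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => res.end('ok')).listen(3000);
}
```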
Hope this helps!
I also answered at this post: How do I prevent node.js from crashing? try-catch doesn't work