Monitor node.js scripts running on ubuntu instance - node.js

I have a node.js script that run once in a day on ubuntu EC2 instance. This script pulls data from some hundered thousand remote APIs and save to our local database. Is there any way we can monitor this node.js script on remote server? There have been few instances where script crashed due to some reason and we were unable to figure it out without SSHing into instance and checking the logs. I have however created a small system after first few crashes which send us an email whenever script crashes due to some uncaught exception and also when script completes execution.
However, we need to develop a better system where we can monitor the progress of script via web interface of our admin application which is deployed over some other instance and also trigger start/stop of script via this interface. What are possible options for achieving this?

If you like to stay in Node.js, then there are several process monitoring tools:
PM2 comes with lots of other features besides monitoring processes. You can monitor your processes via CLI or their official web interface: https://keymetrics.io/. A quick search on npm also gives a bunch of nice unofficial gui tools: https://www.npmjs.com/search?q=pm2+web
Forever is not as feature rich as PM2 but will do the basic process operations and couple of gui are also available in npm.

There are two problems here that you are trying to solve:
Scheduling work to be done
Monitoring a process for failure
At a simple level, this is easy: schedule a cron job and restart failed things so they keep trying.
However, when things don't go smoothly, it helps to have a lot more granularity over what you are scheduling, and how it is executed. This would also give you the visibility over each little piece of work.
Adding a little more complexity, you can end up with something like this:
Schedule the script that starts everything (via cron, if that's comfortable)
That script generates several jobs that need to be executed into a queue
A worker process (or n worker processes) consume that queue and execute pending jobs
You can monitor both the progress of the jobs, as well as the state of each worker (# of crashes, failures, jobs completed, etc.). The other tools mentioned above are good candidates for this (forever, pm2, etc.)
When jobs fail, other workers can pick up the small piece of work that was in progress and restart it. This is much more efficient than restarting the entire process, and also lets you parallelize things across n workers based on how you can split up the workloads.
You could easily throw the status onto a web app so you can check in periodically rather than have to dig through server logs.
You can also get more intelligent with different types of failures. Network error? Retry 5 times. Rated limited? Gradual back-off. Crash? Don't retry and notify via email. etc

I have tried this with pm2, you can get the info of the task, then cat out or grab the log files. Or you could have a logging server, see also: https://github.com/papertrail/remote_syslog2

Related

Heroku node timeout because of enormous task

Our node app gets quite big and one job takes quite some time to execute. We run this job with a cronjob, but by calling the URL. Now Heroku has problems with this, because the job takes more than 30 seconds to finish. So we receive a time-out and after that it tries to execute it immediately again, and again, till our Memory quota is about 300% and the app crashes.
Now I want to fix this. Locally we don't have any problems running this script at all. It takes about a minute (for now, but in the future if we have more users it may take more time) to finish and memory stays stable.
Now running this script on the background should fix the problem according https://devcenter.heroku.com/articles/request-timeout#debugging-request-timeouts
Overe here https://devcenter.heroku.com/articles/asynchronous-web-worker-model-using-rabbitmq-in-node#getting-started I read about JackRabbit. But it seems like it's used for systems like RabbitMQ https://github.com/hunterloftis/jackrabbit
So my question: anyone who has experience with background tasks in node? Can and should I use JackRabbit for my background tasks, or are there better solutions? My background task just contains a very complex ExpressJS task, which takes some time to execute so....
I'm the Node.js platform owner at Heroku (and I actually wrote the web worker article you referenced).
Your use case sounds like it may fit the scheduler very well:
https://devcenter.heroku.com/articles/scheduler
It's a great replacement for cron-type jobs.

JXcore, How external process monitoring works?

I am a newbie and trying to figure out how process monitoring works with JXcore. I saw the documentation but need few steps in order to make my server application starting multithreaded and monitored properly.
Thanks in advance!
I'll try to explain it to you. There is no shame to be a newbie! :)
JXcore offers you two types of application monitoring.
1) One of them is Process Monitor and this is a process, which runs as separate instance. Your applications may subscribe to it for being monitored. Monitor verifies them in regular intervals, and if it finds that your application is gone it tries to relaunch it. For example, if your application servers http and should be online all the time - Process Monitor will ensure, that it is really running.
The fastest way to start to monitor your application is to:
launch the monitor: > jx monitor start
launch your application with automatic subscription to the monitor: > jx monitor run app.js
After that, when your application crashes, Process Monitor will restart it. You can test it by just killing your application's process.
Process monitor also gives you information about currently monitored processes. You can browse to see the list of them:
http://127.0.0.1:17777/json
2) Second type of a monitoring feature is process and thread recovery. With Process Recovery you can achieve the same as with the Process Monitoring, so there is no reason to use them both at the same time.
Another scenario could be:
Let's say you have a multithreaded application and only to recovering it's threads is enough.
Your application is launched with a command:
jx mt-keep:3 app.js
which means, that you run it with 3 threads.
To enable Thread Recovery is enough to subscribe to process.on('restart') event like this:
process.on('restart', function (cb) {
process.release();
cb();
});
Remember, to call cb() callback. As you probably saw it in the docs, the thread will not restart until you invoke this callback. Before restart, you may back-up things etc.
Basically that's it. Feel free to play with it!

Executing process on Linux from WSGI based web application

I have a dashboard and I want a process to run when the user clicks on a button. That process might take a long time to complete.
My options so far:
using popen or something similar to execute the process
having a daemon monitor a directory. When this directory is changed (a file created) the daemon will do the job and then delete the file before idling again.
using cron, running every 5 seconds and also monitoring some directory.
Which one is more Linux-friendly? Is there any I have not considered?
This is what task queueing systems like Celery and Redis Queue are for.
Another option is to have a daemon (as in your 2nd option) that listen on some socket. Then, your WSGI application could just connect & send a command. There are many possibilities for how the communication over the socket would take place, choosing the right one depends a lot on the actual case.
This have the advantage that you can eventually have the two application (WSGI and the daemon) run on different computers or VMs at some point.

Maintaining a long-running task on Linux

My system includes a task which opens a network socket, receives pushed data from the network, processes it, and writes it out to disk or pings other machines depending on the messages. This task is intended to run forever, and the service is designed to have this task always running. But sometimes it crashes.
What's the best practice for keeping a task like this alive? Assume it's okay for the task to be dead for up to 30 seconds before we restart it.
Some obvious ideas include having a watchdog process that checks to make sure the process is still running. Watchdog could be triggered by cron. But how does it know if the process is alive or not? Write a pidfile? touch a heartbeat file? An ideal solution wouldn't continuously spin up more processes if the machine gets bogged down to the point where the watchdog is running faster than the heartbeat.
Are there standard linux tools for this? I can imagine a solution that uses a message queue, but I'm not sure if that's a good idea or not.
Depending on the nature of the task that you wish to monitor, one method is to write a simple wrapper to start up your task in a fork().
The wrapper task can then do a waitpid() on the child and restart it if it is terminated.
This does depend on modifying the source for the task that you wish to run.
sysvinit will restart processes that die, if added to inittab.
If you're worried about the process freezing without crashing and ending the process, you can use a heartbeat and hard kill the active instance, letting init restart it.
You could use monit along with daemonize. There are lots of tools for this in the *nix world.
Supervisor was designed precisely for this task. From the project website:
Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.
It runs as a daemon (supervisord) controlled by a command line tool, supervisorctl. The configuration file contains a list of programs it is supposed to monitor, among other settings.
The number of options is quite extensive, -- have a look at the docs for a complete list. In your case, the relevant configuration section might be something like this:
[program:my-network-task]
command=/bin/my-network-task # where your binary lives
autostart=true # start when supervisor starts?
autorestart=true # restart automatically when stopped?
startsecs=10 # consider start successful after how many secs?
startretries=3 # try starting how many times?
I have used Supervisor myself and it worked really well once everything was set up. It requires Python, which should not be a big deal in most environments but might be.

which one to use windows services or threading

We are having a web application build using asp.net 3.5 & SQL server as database which is quite big and used by around 300 super users for managing around 5000 staffs.
Now we are implementing SMS functionality into the application which means the users will be able to send and receive SMS. Every two minute the SMS server of the third party is pinged to check whether there are any new messages. Also SMS are hold in queue and send every time interval of 15 to 30 minutes.
I want this checking and sending process to run in the background of the application all the time, even if the user closes the browser window.
I need some advice on how do I do this?
Will using thread will achieve this or do I need to create a windows service for it or are there any other options?
More information:
I want to execute a task in a timer, what will happen if I close the browser window, the task wont be completed isn't it so.
For example I am saving 10 records to the database in a time interval of 5 minutes, which means every 5 minutes when the timer tick event fires, a record is inserted into the database.
How do I run this task if I close the browser window?
I tried looking at windows service but how do I pass a generic collection of data to it for processing.
There really is no thread or service choice, a service can (and usually is!) multi threaded, a thread can start a service.
There are three basic choices you can:-
Somehow start another thread running when a user logs in -- this is probably a very poor choice for what you want, as you cannot really keep it running once the user session is lost.
Write a fully fledged windows service which is starts on OS startup and continues running unitl the server is shutdown. You can make this dependant on the SQLserver service, so it starts after the DB is available. This is the "best" solution but may be overkill for your purposes. Aslo you need to know the services API to write it properly as you need to respond correctly to shutdown and status requests.
You can schedule your task periodically using either the Windows schedular, or, preferably the schedular which is built in to SQLServer, I think this would be the most suitable option for your needs.
Distinguish between what the browser is doing and what's happening server-side.
Your Web App is sitting server-side waiting for requests from whatever browsers may be running, and servicing those requests, in servicing those requests I guess it may well put messages on a queue and have a look in a database for any new messages.
You want the daemon processor, which talks to the third-party SMS, to be triggered by time rather than by browser function. Either of your suggestions would work:
A competely independent service could run and work against the queues and database.
Your web app, which I assume is already a service, could spawn a thread
In either case we have a few technical questions of avoiding any race conditions between the browser-request processing and the daemon - but databases and queueing systems can deal with that.
So I would decide between stand-alone daemon and background thread like this:
Which is easier to implement? I'm a Java EE developer, I know in my app server I have an API for specifying code to be run according to a timer, the API deals with the threading issues. So for me that's very easy. I don't know what you have available. Timers are not quite as trivial as they may appear - so having a reliable API is beneficial. If this was a more complex requirement, where the daemon code were gnarly and might possibly interfere with the WebApp code then I might prefer to keep it conspicuously separate.
Which is easier to deploy and administer? Deploy separate Web App and daemon, or deploy one thing. In the Java EE world we could have a single Enterprise Application with all the code, so that's a single thing to deploy, start and control.
One other thing to consider: Scaling and Resilience. You might choose to have more than one copy of your web app running, either to provide fail-over capabilities or just because you need the extra power. In which case how many daemons would you have? Would it be a problem to have two daemons running? You might need some extra code to mediate between two daemons, for example log in the database the time of last work, each daemon can say "Oh, my buddy balready did the 10:30 job, I'll go back to sleep"

Resources