Monitor cron jobs in real-time - linux

I have an issue currently where I've got a cron job set to run at midnight each day to reset daily API requests for a service that I run. The job failed recently which caused me a whole bunch of headaches and I've been trying to find a solution to monitor all of my cron jobs so I don't have a situation like this happen again.
I haven't been able to find a sufficient solution, however, so I am considering creating a platform that lets you monitor cron jobs in real time: see current and past logs, the last run date, and whether the last run succeeded or failed. It would notify you if a job failed or hasn't completed within a specified window of time.
I believe this might be a pain point and a good solution for others as well.
What are your thoughts? Do you think this would be useful, do you have any suggestions, or do you think it would be a waste of time?
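The core check described above (flag a job that failed, or that has not completed within its expected window) could be sketched roughly like this. It is only an illustration in Python, not an existing tool: the `JobStatus` record, the `check_jobs` function and the field names are all made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class JobStatus:
    """Hypothetical record a monitor might keep per cron job."""
    name: str
    last_success: Optional[datetime]  # None if the job never succeeded
    last_exit_code: Optional[int]     # None if the job never reported
    max_interval: timedelta           # longest allowed gap between successes


def check_jobs(jobs: List[JobStatus], now: datetime) -> List[str]:
    """Return one alert message per job that failed or is overdue."""
    alerts = []
    for job in jobs:
        if job.last_exit_code not in (None, 0):
            alerts.append(f"{job.name}: last run failed (exit {job.last_exit_code})")
        elif job.last_success is None or now - job.last_success > job.max_interval:
            alerts.append(f"{job.name}: no successful run within {job.max_interval}")
    return alerts
```

Each cron job would report its exit code and finish time to wherever these records live, and `check_jobs` would run on its own schedule, ideally on a different machine than the jobs it watches.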

Have you heard of Rundeck? (https://www.rundeck.com/open-source)
It looks like it's exactly what you're looking for.
You install it on a server, and it's like a Web UI for a crontab.
You define jobs you want to run using the Web UI, how often you want them to run and you can see some history of the past executions, their status and their output. You can also see when the next execution will happen.
I think there are also some alerting features to notify you if a job fails. I'm not sure if it can notify you based on the job execution time, though.
This might be a good fit for what you're looking for.

Two years later, I am asking myself exactly the same questions. Surely you have created such a service by now, haven't you? Every backend coder needs this from time to time, in theory. I'm surprised this question hasn't received more activity or votes. I did get an answer pointing to https://uptimerobot.com/cron-job-monitoring/, which might be a good solution; I need to test it out. It does not seem to be promoted enough, as it's not easy to find. There is also https://cronitor.io/docs/cron-job-monitoring, which can transmit (somewhat limited) telemetry data and has a lot of SDKs for use from within programming languages.

Related

Looking for time based persistent scheduler - node js

I have been looking for a time based persistent scheduler. I looked into some applications (Agenda, node-cron, node-schedule). But I couldn't find anything that satisfies my criteria.
My application sends out reminders to our customers based on their event timings. I am hesitant to run a regular cron job because I would have to run it every 15 minutes or so, and each run would need a database call. I am trying not to use resources unnecessarily.
In addition to that, I am already running a lot of cron jobs. In my case, when the job is completed, I want the cron to be cancelled/finished, not to live in memory until the server restarts.
I tried using the applications mentioned above (Agenda, node-cron, node-schedule) by setting exact timestamps. But the cron lives on forever even after the job is completed, and if I restart the server, all the scheduled jobs are gone. So persistence is also an issue I am facing.
My server uses node js. If there are any other languages/tools to make this work, I am all ears.
Looking forward to your help.
I tried following this solution. But this solution is for one predefined event. In my case, the number of reminders to be sent out is dynamic and jobs have to be scheduled on the fly.
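One way to picture the requirements above (jobs created on the fly, run once, removed when done, and still there after a restart) is a small persistent store that the process sleeps against instead of polling on a fixed interval. The sketch below is in Python with SQLite purely for illustration; `OneShotJobStore` and its methods are hypothetical names, not part of Agenda, node-cron or node-schedule.

```python
import sqlite3
from typing import List, Optional, Tuple


class OneShotJobStore:
    """Hypothetical store for run-once jobs that survive a restart."""

    def __init__(self, path: str = "jobs.db") -> None:
        # A file path keeps jobs across restarts; ":memory:" is handy for tests.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY, due REAL NOT NULL, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def schedule(self, due: float, payload: str) -> None:
        """Persist a job that should run once at Unix time `due`."""
        self.db.execute("INSERT INTO jobs (due, payload) VALUES (?, ?)", (due, payload))
        self.db.commit()

    def next_due(self) -> Optional[float]:
        """Earliest due time, so the caller can sleep until then instead of polling."""
        row = self.db.execute("SELECT MIN(due) FROM jobs").fetchone()
        return row[0]

    def take_due(self, now: float) -> List[Tuple[float, str]]:
        """Return all jobs due at or before `now` and delete them from the store."""
        rows = self.db.execute(
            "SELECT id, due, payload FROM jobs WHERE due <= ? ORDER BY due", (now,)
        ).fetchall()
        self.db.executemany("DELETE FROM jobs WHERE id = ?", [(r[0],) for r in rows])
        self.db.commit()
        return [(r[1], r[2]) for r in rows]
```

A caller would sleep until `next_due()`, call `take_due(time.time())`, and send the reminders it gets back. Note that this sketch deletes a job before it has run, so a crash in between would lose that reminder; a real system would probably mark jobs as in progress instead.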

preferred way to schedule job @Scheduled vs crontab

I have to run a utility periodically, say, every minute.
I have two options: Spring Boot's @Scheduled, or the crontab of the Linux box that we use to deploy the artifact.
So, my question is: which way should I use?
What are the pros and cons of each solution? Please also suggest any other solution if you can.
I don't have many points for comparing these two, only what I learned from the situation I am facing now. I just built a new endpoint and am doing performance and stress testing for it on production. I have yet to decide the cron schedule times, and those may need slight tweaking after more observation. Setting the schedule via @Scheduled means I have to deploy/restart the application every time I make a change.
An application restart generally takes more time than a crontab edit.
Other than this, a few points on availability and scalability:
Setting the schedule only via crontab on a single server means a single point of failure if the server goes down.
Setting it via @Scheduled could mean the same.
If you have multiple instances of the server, the endpoint could get triggered twice, which you may not want. The worst case is when scaling up happens long after you wrote the @Scheduled endpoint, back when it was deployed on a single server, and you have forgotten about it. As soon as scaling up happens, the process will start getting hit twice.
So neither of these seems ideal in terms of availability and scalability.
In such situations, ideally a distributed cron management system (I have heard about Rundeck) is needed. It decides which of the available servers is called to hit the desired endpoint and, if needed, calls the next server in case the first one is down.
If anything needs investigating, the Rundeck logs can be checked to find the server that was actually called.
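The "call one server, and fall back to the next if it is down" idea can be illustrated with a few lines of Python. This is only a sketch of the general pattern, not how Rundeck is implemented; `run_on_first_available` and the `trigger` callback are hypothetical.

```python
from typing import Callable, List, Tuple


def run_on_first_available(
    servers: List[str], trigger: Callable[[str], None]
) -> Tuple[str, List[str]]:
    """Call `trigger(server)` on each server in order until one succeeds.

    Returns the server that ran the job and the servers that failed before it.
    Raises RuntimeError if every server fails.
    """
    failed = []
    for server in servers:
        try:
            trigger(server)  # hypothetical: e.g. an HTTP call to the endpoint
        except Exception:
            failed.append(server)
            continue
        return server, failed
    raise RuntimeError(f"all servers failed: {failed}")
```

Because exactly one scheduler makes the call, the endpoint is hit once per run no matter how many application instances exist, and the return value records which server actually handled it.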

How to schedule node.js code to run at a specific time daily outside times given in scheduler?

So, I see that Heroku provides the option to run a command at a specific time. Information on the scheduler here.
LINK: https://elements.heroku.com/addons/scheduler
However, if you go through the steps when setting it up, they do not provide a lot of flexibility on when you can run your code daily. For example, you can only run code at 4:00pm or 4:30pm, not 4:10pm.
How can I make a node.js file run on Heroku at a specific time (like 4:10pm or 2:15pm, some time outside the options given on Heroku) on a daily basis?
There also appears to be no documentation on their website explaining how to do this for node.js.
This might be just a workaround, but you could start the process at the nearest time slot before your desired time, let it wait passively until your desired time, and only then do the actual task.
Note, however, that as Heroku mentions in the documentation, Heroku Scheduler isn't guaranteed to run the task, even though it's very reliable. If the task is critical or has to run every day without fail, you should probably create a separate process that handles the scheduling.
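The wait-until-the-desired-time workaround boils down to computing how long to sleep. Here is a rough Python sketch of that calculation (the question is about node.js, but the logic is the same); `seconds_until` and `run_at` are hypothetical helper names, and the code simply uses whatever clock and timezone the process sees.

```python
import time
from datetime import datetime, timedelta


def seconds_until(now: datetime, hour: int, minute: int) -> float:
    """Seconds from `now` to the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target < now:
        target += timedelta(days=1)  # already passed today, so wait for tomorrow
    return (target - now).total_seconds()


def run_at(hour: int, minute: int, task) -> None:
    """Sleep until hour:minute, then run `task` once."""
    time.sleep(seconds_until(datetime.now(), hour, minute))
    task()
```

For example, a process started by the scheduler at 4:00pm that calls `run_at(16, 10, task)` would sleep for about ten minutes and then run the task.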
We've added a simple Heroku add-on called Cron To Go that does exactly that - you can use Cron expressions for accuracy and schedule one-off Dynos, just like with Heroku Scheduler.
There's also a simple Node example here.

Azure App Service: How can I determine which process is consuming high CPU?

UPDATE: I've figured it out. See the end of this question.
I have an Azure App Service running four sites. One of the sites has two deployment slots in addition to the primary one. Recently I've been seeing really high CPU utilization for the App Service plan as a whole.
The dark orange line shows the CPU percentage. This is just after restarting all my sites, which brought it down to this level.
However, when I look at the CPU use reported by each site, it's really low.
The darker blue line shows the CPU time, which is basically nothing. I did this for all of my sites, and all the graphs look the same. Basically, it seems that none of my sites are causing the issue.
A couple of the sites have web jobs, so I took a look at the logs but everything is running fine there. The jobs run for a few seconds every few hours.
So my question is: how can I determine the source of this CPU utilization? Any pointers would be greatly appreciated.
UPDATE: Thanks to the replies below, I was able to get more detail on what was happening. I ended up getting what I needed from the SCM / Kudu tools. You can get there by going to your web app in Azure and choosing Advanced Tools from the side nav. From the Kudu dashboard, choose Process Explorer. The value in the Total CPU Time column is not directly useful, because it's the time in seconds that the process has run since it started, which might have been minutes or days ago.
However, if you make a record of the value at intervals, you can look at the change over time, and one process might jump out at you. In my case, it was my WebJobs process. Every 60 seconds, this one process was consuming about 10 seconds of processor time, just within one environment.
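Comparing two recordings of the cumulative CPU time is simple arithmetic, shown here as a small Python sketch. The function name and the process names in the usage note are made up for illustration; the input is just whatever totals you copied down from Process Explorer at two points in time.

```python
from typing import Dict, List, Tuple


def cpu_usage_between(
    earlier: Dict[str, float], later: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Given two snapshots of cumulative CPU seconds per process,
    return (process, seconds used in between), busiest first."""
    deltas = []
    for name, total in later.items():
        # A process missing from the earlier snapshot started in between.
        used = total - earlier.get(name, 0.0)
        deltas.append((name, used))
    return sorted(deltas, key=lambda item: item[1], reverse=True)
```

With snapshots taken 60 seconds apart, a process whose total went from 300 to 310 seconds would show up at the top with 10 seconds used, which matches the kind of jump described above.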
The great thing about this Kudu dashboard is, if you can catch the problem while it is actually happening, you can hit the Start Profiling button and capture a diagnostic session. You can then open this up in Visual Studio and get some nice details about where the CPU time is being spent.
Just in case anyone else is seeing similar issues, I'll provide more details about my particular case. As I mentioned, my WebJobs exe was the culprit, and I found that all the CPU time was being spent in StackExchange.Redis.SocketManager, which manages connections to Azure Redis Cache. In my main web app, I create only one connection, as recommended. But since my web jobs only run every once in a while, I was creating a new connection to Azure Redis Cache each time one ran, which apparently can lead to issues. I changed my code to create the Redis Cache connection once when the WebJob process starts up and to reuse that connection whenever an individual WebJob runs.
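The change described here, creating the connection once and reusing it for every job run, follows a common pattern that can be sketched in a few lines. This is a Python illustration of the pattern only, not the actual C# code; `FakeConnection` is a hypothetical stand-in for the real Redis client.

```python
from typing import Optional


class FakeConnection:
    """Hypothetical stand-in for an expensive client such as a Redis connection."""
    opened = 0  # counts how many connections were ever created

    def __init__(self) -> None:
        FakeConnection.opened += 1


_connection: Optional[FakeConnection] = None


def get_connection() -> FakeConnection:
    """Create the connection on first use, then return the same object."""
    global _connection
    if _connection is None:
        _connection = FakeConnection()
    return _connection


def run_job() -> None:
    """Each job run borrows the shared connection instead of opening its own."""
    conn = get_connection()
    # ... use conn here ...
```

However many times `run_job` is called, only one connection object is ever created. The sketch is not thread-safe; jobs running in parallel would need a lock around the first creation.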
Time will tell if this really fixes the issue, but I think it will. When the problem occurred, it always fit the same pattern: After a few days of running fine, my CPU would slowly ramp up over the course of about 12 hours. My thinking is that each time a WebJob ran, it created a connection object, which at first didn't produce trouble, but gradually as WebJobs ran every hour or two, cruft was building up until finally some critical threshold was met and the CPU usage would take off.
Hope this helps someone out there. Best wishes!
Maybe you should go to your web app's SCM site?
%yourAppName%.scm.azurewebsites.net
There is a page there that shows all the processes currently running on your web app (something like Console > Process).
You can also go to the support page (from the right corner of the SCM site).
You can find more info about your performance there and take a memory dump (not needed for this problem, but useful for performance issues).
Based on your description, I think you could use the Crash Diagnoser extension to capture dump files from your Web Apps and WebJobs when the CPU usage percentage rises above a specified threshold, in order to isolate this issue. For more details, you can refer to this official blog.

Heroku node timeout because of enormous task

Our node app is getting quite big, and one job takes quite some time to execute. We run this job with a cron job, but by calling a URL. Heroku has problems with this because the job takes more than 30 seconds to finish. So we receive a timeout, after which it immediately tries to execute the job again, and again, until our memory quota is at about 300% and the app crashes.
Now I want to fix this. Locally we don't have any problems running this script at all. It takes about a minute (for now, but in the future if we have more users it may take more time) to finish and memory stays stable.
Running this script in the background should fix the problem, according to https://devcenter.heroku.com/articles/request-timeout#debugging-request-timeouts
Over here https://devcenter.heroku.com/articles/asynchronous-web-worker-model-using-rabbitmq-in-node#getting-started I read about JackRabbit. But it seems like it's used for systems like RabbitMQ https://github.com/hunterloftis/jackrabbit
So my question: does anyone have experience with background tasks in node? Can and should I use JackRabbit for my background tasks, or are there better solutions? My background task is just a very complex ExpressJS task that takes some time to execute.
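The general shape of "run it in the background" is that the request handler only enqueues the work and returns immediately, while a separate worker does the slow part. The sketch below shows that shape in Python with an in-process queue and thread. It is a simplified stand-in: in the setup discussed here the queue would be an external broker such as RabbitMQ and the worker a separate process, and `handle_request` is a hypothetical handler, not an Express or Heroku API.

```python
import queue
import threading
from typing import Callable

jobs: "queue.Queue[Callable[[], None]]" = queue.Queue()


def worker() -> None:
    """Run queued jobs one at a time, outside any request handler."""
    while True:
        job = jobs.get()
        try:
            job()
        except Exception as exc:
            # Keep the worker alive if a single job fails.
            print(f"job failed: {exc}")
        finally:
            jobs.task_done()


def handle_request(job: Callable[[], None]) -> str:
    """Hypothetical request handler: enqueue the slow job and return at once."""
    jobs.put(job)
    return "202 Accepted"


threading.Thread(target=worker, daemon=True).start()
```

The handler returns in well under the 30-second limit regardless of how long the job takes, so there is no timeout and no retry storm.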
I'm the Node.js platform owner at Heroku (and I actually wrote the web worker article you referenced).
Your use case sounds like it may fit the scheduler very well:
https://devcenter.heroku.com/articles/scheduler
It's a great replacement for cron-type jobs.

Resources