How can I set up a system to tell me if a cron job is NOT running fine? - cron

This is more of a "general architecture" problem. If you have a cron job (or even a Windows scheduled task) running periodically, it's fairly simple to have it send you an email / text message that all is well, but how do I get informed when everything is NOT okay? For example, what if the job doesn't run at its scheduled time, or Windows / Linux has some hang-up of its own that prevents the task from running?
Just seeking thoughts of people who've faced this situation before and come up with interesting solutions...

A way I've done it in the past is to simply put at the top of each script (say, checkUsers.sh):
touch /tmp/lastrun/checkUsers.sh
then have another job that runs periodically and uses find to locate all the "marker" files in /tmp/lastrun that are older than a day.
You can fiddle with the timings, having /tmp/lastrun/hourly/ and /tmp/lastrun/daily/ to separate jobs that have different schedules.
Note that this won't catch scripts that have never run since they will never create the initial file for find-ing. To alleviate that, you can either:
create that file manually when creating the cron job (won't handle situations where someone inadvertently deletes the marker file); or
maintain a list of required marker files somewhere so that you can detect when they're missing as well as outdated.
And, if your cron job is not a script, put the touch directly into crontab:
0 4 * * * ( touch /tmp/lastrun/daily/checkUsers ; /usr/bin/checkUsers )
It's a lot easier to validate a simple find script than to validate every one of your cron jobs.
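As a rough sketch of that checker job (not the exact script I used), something like the following could work. The function name check_markers, the list file of required markers, and the STALE/MISSING output format are all made up for illustration, and -mmin is a GNU/BSD find extension rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical checker: reports marker files that are too old or missing.

# check_markers DIR MAX_AGE_MINUTES REQUIRED_LIST
# REQUIRED_LIST is a text file with one marker path per line, relative to DIR.
# Prints one line per problem and returns 1 if anything is wrong.
check_markers() {
    marker_dir=$1
    max_age=$2
    required_list=$3
    problems=0

    # Markers that exist but were last touched more than MAX_AGE_MINUTES ago.
    stale=$(find "$marker_dir" -type f -mmin +"$max_age" -print)
    if [ -n "$stale" ]; then
        printf '%s\n' "$stale" | sed 's/^/STALE: /'
        problems=1
    fi

    # Markers that should exist but were never created (or were deleted).
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        if [ ! -e "$marker_dir/$name" ]; then
            printf 'MISSING: %s\n' "$marker_dir/$name"
            problems=1
        fi
    done < "$required_list"

    return "$problems"
}

# When run as a script with three arguments, perform the check directly.
if [ "$#" -eq 3 ]; then
    check_markers "$@"
fi
```

Run from cron, any output would end up in the usual cron mail, so you only hear from it when something is stale or missing.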

Related

Will conflicts arise between larger and smaller files in CRON job?

I have set up a cron job that runs a file every 6 hours. The file may run for 4 hours.
If I set up cron for another file, will it affect the previous one, which may run for 4 hours?
No. If the jobs are not working on the same resources, they won't conflict even if they run simultaneously.
The cron daemon doesn't check to see if anything else by the same name is running, if that is what you mean, so cron will not care. However, if your script creates temporary files, for example, without using helper tools like "mktemp", they could conflict with each other, so that will depend on how well written your script is.
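To illustrate the mktemp point, here is a minimal sketch in which every run gets its own scratch directory, so two overlapping runs never write to the same temporary file. The myjob prefix, the new_workdir helper, and the scratch.txt file are placeholders, not anything from the question.

```shell
#!/bin/sh
# Hypothetical job skeleton: each run works in a private temp directory.

# new_workdir: create a uniquely named directory and print its path.
new_workdir() {
    mktemp -d "${TMPDIR:-/tmp}/myjob.XXXXXX"
}

workdir=$(new_workdir) || exit 1

# Remove the scratch directory when the script exits.
trap 'rm -rf "$workdir"' EXIT

# Placeholder for the real work: intermediate data goes into the private
# directory instead of a fixed path such as /tmp/myjob.tmp.
printf 'intermediate data\n' > "$workdir/scratch.txt"
```

A fixed name like /tmp/myjob.tmp is what causes trouble when a second instance starts before the first one finishes.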

cron jobs: Monitor time it takes for jobs to finish

I'm doing a research project that requires me to monitor cron jobs on an Ubuntu Linux system. I have collected data about the jobs' tasks and when they are started; I just don't know of a way to monitor how long they take to finish running.
I could calculate the finish time minus the start time with something like this, but that would require doing it in the shell script of each cron job. That's not necessarily difficult by any means, but it seems a little silly that cron wouldn't log this in some way, so I'm trying to find an easier way :P
tl;dr Figure out time cron jobs take from start to finish
You could just put time in front of the commands in your crontab, and if you're getting notifications about cron script output, the timing will get sent to you.
For example, if you had:
0 1,13 * * * /maint/run_webalizer.sh
add time in front:
0 1,13 * * * time /maint/run_webalizer.sh
and you'll get some output that looks like (the "real" is the time you want):
real 3m1.255s
user 0m37.890s
sys 0m3.492s
If you don't get cron notifications, you can just redirect the output to a file (time writes its report to stderr, so redirect that as well).
See man time. Maybe you can create a wrapper and tell cron to use it as your "shell" or something like that.
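If you go the wrapper route, a sketch along these lines might do. The function name timed_run and the log line format are invented here, and date +%s is assumed to be available (it is on Linux and the BSDs, although it is not strictly POSIX).

```shell
#!/bin/sh
# Hypothetical wrapper: runs a command and appends its duration to a log.

# timed_run LOGFILE COMMAND [ARGS...]
# Returns the wrapped command's exit status.
timed_run() {
    logfile=$1
    shift
    start=$(date +%s)
    job_rc=0
    "$@" || job_rc=$?
    end=$(date +%s)
    # One line per run: timestamp, command, exit status, elapsed seconds.
    printf '%s job="%s" exit=%d seconds=%d\n' \
        "$(date '+%Y-%m-%dT%H:%M:%S')" "$*" "$job_rc" "$((end - start))" \
        >> "$logfile"
    return "$job_rc"
}

# When run as a script, treat the arguments as LOGFILE COMMAND [ARGS...].
if [ "$#" -gt 1 ]; then
    timed_run "$@"
fi
```

Saved as, say, /usr/local/bin/timed_run.sh (a made-up path), the crontab entry would become something like: 0 1,13 * * * /usr/local/bin/timed_run.sh /var/log/cron-times.log /maint/run_webalizer.sh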
Cronitor (https://cronitor.io) is a tool I built exactly for this purpose. It uses http requests to record the start and end of your jobs.
You'll be notified if your job doesn't run on schedule, or if it runs for too long/too short. You can also configure it to send alerts to you via email, sms, but also Slack, Hipchat, Pagerduty and others.
I use the Jenkins CI to do this via its external-monitor-job plugin. Jenkins can track start and end times, track overall execution time over time, save the output of all jobs it tracks, and present success/failure conditions graphically.
https://wiki.jenkins-ci.org/display/JENKINS/Monitoring+external+jobs

How to handle overtime crons

Suppose I have a cron task running every minute. If the task takes more than one minute to run each time, what will happen? Will the next run wait for the first one to finish, or will it start without any checks?
I want to run a cron task every minute, and I don't want overlapping runs like that in case of a long-running task.
Please help.
It depends on what you run. If it's your own script, you can implement a locking/lock checking mechanism to avoid running duplicates.
But that's not cron's job.
Yes, cron will go ahead and start your 1+ minute-running process every minute until something crashes.
You'll want to put a lock of some sort into your job if you can to basically do this at start-up:
if not get_lock()
    print "Another process is running"
    exit
This, of course, assumes that you own the code running. If you're running a command that you didn't code, then I'd recommend building a shell wrapper that implements the above pseudocoded logic where get_lock() will see if another process like this one is running.
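A shell wrapper for that could look roughly like this. It is only a sketch: the name run_locked is made up, and it uses mkdir as the lock because creating a directory is atomic (it either succeeds or fails because the directory already exists). If the job is killed hard, the lock directory is left behind and has to be removed by hand.

```shell
#!/bin/sh
# Hypothetical wrapper: refuses to start a command while a previous run
# still holds the lock directory.

# run_locked LOCKDIR COMMAND [ARGS...]
# Returns 1 if the lock is already held, else the command's exit status.
run_locked() {
    lockdir=$1
    shift
    # get_lock(): mkdir fails if another run already created the directory.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "Another process is running" >&2
        return 1
    fi
    job_rc=0
    "$@" || job_rc=$?
    # Release the lock once the command has finished.
    rmdir "$lockdir"
    return "$job_rc"
}

# When run as a script, treat the arguments as LOCKDIR COMMAND [ARGS...].
if [ "$#" -gt 1 ]; then
    run_locked "$@"
fi
```

On Linux, flock(1) from util-linux is another option, for example flock -n /path/to/lockfile command, and there the lock goes away automatically when the process dies.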
As others have mentioned, cron will run your script every minute regardless of whether another instance of your script is still running.
If you want to avoid this and don't fancy implementing your own locking mechanism, then you could try a cron alternative called The Fat Controller, which is a daemon that will continually re-run scripts. You can optionally specify an interval between runs and a maximum execution time, so if a script goes AWOL it can be killed.
There are some use cases and more information on the website:
http://fat-controller.sourceforge.net/

How to define frequency of a job in application by users?

I have an application that has to launch jobs repeatedly. But (yes, that would have been too easy without a but...) I would like users to define their backup frequency in the application.
In worst case, they would have to choose between :
weekly,
daily,
every 12 hours,
every 6 hours,
hourly
In best case, they should be able to use crontab expressions (see documentation for example)
How to do this? Do I launch a job every minute that checks the last execution time and frequency, and then launches another job if needed? Do I create a sort of queue that will be executed by a master job?
Any clues, ideas, opinions, best practices, experiences are welcome!
EDIT : Solved this problem using Akka scheduler. Ok, this is a technical solution not a design answer but still everything works great.
Each user-defined repetition is an actor that sends a message every period to a new actor, which executes the actual job.
There may be two ways to do this depending on your requirements/architecture:
If you can only use Play:
The user creates the job and the frequency it will run (crontab, whatever).
On saving the job, you calculate the first time it will have to be run. You then add an entry to a table JOBS with the execution time, job id, and any other information required. This is required as Play is stateless and information must be stored in the DB for later retrieval.
You have a job that queries the table for entries whose execution date is earlier than now. It retrieves the first, runs it, removes it from the table, and adds a new entry for the next execution. You should keep some execution counter so that if a task fails (which means the entry is not removed from the DB), it won't block the other tasks by having the job retry it again and again.
The frequency of this job is set to run every second. That way, while there is information in the table, you should execute the requests roughly as often as required. As Play won't spawn a new job while the current one is working, if you have enough tasks this one job will serve all. If not, it will be killed at some point and restored when required.
Of course, the users' crons will not be very precise, as you have to account for your own cron delays plus the execution delays of all the tasks in the queue, which run sequentially. It's not the best approach unless you somehow disallow crons that run every second or more often than every minute (to be safe). Checking the execution time of the crons and killing them if they run over a certain amount of time would be a good idea.
If you can use more than Play:
The better alternative, I believe, is to use Quartz (see this) to create a future execution when the user creates the job, and reprogram it once the execution is over.
There was a discussion on google-groups about it. As far as I remember, you must define a job which starts every 6 hours and checks which backups must be done. So you must remember when the last backup job finished and do the checking yourself. I'm unsure if Quartz can handle such a requirement.
I looked in the source code (always a good source ;-)) and found a method every, which I think should do what you want. However, I'm unsure if this is a clever design, because if you have 1000 users you will then have 1000 jobs. I'm unsure if Play was built to handle such a large number of jobs.
[Update] For cron-expressions you should have a look into JobPlugin.scheduleForCRON()
There are several ways to solve this.
If you don't have a really huge load of jobs, I'd just persist them to a table using the required flexibility. Then check all of them every hour (or the lowest interval you support) and run those eligible. Simple.
Or, if you prefer to use cron syntax anyway, just write (export) jobs to a user crontab using a wrapper which calls back to your running app, or starts the job in a standalone process if that's possible.
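To make the first option (persist the jobs, poll them at the lowest supported interval) concrete outside the application, here is a small shell sketch driven by a single cron entry. Everything in it is an assumption for illustration: the run_due_jobs name, the jobs file format (name, interval in seconds, command), and the state directory holding one last-run timestamp per job.

```shell
#!/bin/sh
# Hypothetical poller: runs every job whose interval has elapsed.

# run_due_jobs JOBS_FILE STATE_DIR
# JOBS_FILE lines look like: NAME INTERVAL_SECONDS COMMAND...
# STATE_DIR/NAME stores the epoch time of that job's last run.
run_due_jobs() {
    jobs_file=$1
    state_dir=$2
    now=$(date +%s)
    while read -r name interval cmd; do
        # Skip blank lines and comments.
        case $name in ''|'#'*) continue ;; esac
        last=0
        if [ -f "$state_dir/$name" ]; then
            last=$(cat "$state_dir/$name")
        fi
        if [ "$((now - last))" -ge "$interval" ]; then
            # stdin comes from /dev/null so the job cannot eat the jobs file.
            sh -c "$cmd" </dev/null || echo "job $name failed" >&2
            echo "$now" > "$state_dir/$name"
        fi
    done < "$jobs_file"
}

# When run as a script, treat the arguments as JOBS_FILE STATE_DIR.
if [ "$#" -eq 2 ]; then
    run_due_jobs "$@"
fi
```

A jobs file line such as: backup_alice 21600 /usr/local/bin/backup alice (a made-up command) would then run roughly every 6 hours, as long as the poller itself is scheduled at least that often.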

Cron expression for every other day?

I'm setting up two builds in TeamCity, with scheduled triggering using cron expressions.
I want the builds to alternate every other day, i.e., one gets built on one day, then the other one gets built the next day.
Under no circumstance do I want the same build to run 2 days back to back.
Is this sort of scheduling even possible using cron expressions?
This is impossible to do using only cron, but you can still get this behavior with a bit of a workaround. Create a simple script or program in whatever language you prefer that keeps track of the last build program to run. Any time it is run have it run the build that was not run last, then save that one as the new 'last build'. Then, run this program using cron every day.
You'll need to figure out what works for saving the last build in a persistent way; one of the simpler approaches would be to use a file.
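A minimal sketch of such a script, using a state file, might look like this. The names pick_build and run_alternating are made up, and the two build commands are placeholders for whatever actually triggers your builds. Note that it records the choice before running, so that a failed build still counts as the last one and the same build never runs two days in a row.

```shell
#!/bin/sh
# Hypothetical alternator: runs build A and build B on alternating calls.

# pick_build STATE_FILE
# Prints "B" if the last recorded build was A, otherwise "A".
pick_build() {
    last=""
    if [ -f "$1" ]; then
        last=$(cat "$1")
    fi
    if [ "$last" = "A" ]; then echo B; else echo A; fi
}

# run_alternating STATE_FILE COMMAND_A COMMAND_B
run_alternating() {
    state_file=$1
    next=$(pick_build "$state_file")
    # Save the new "last build" first, then run it.
    echo "$next" > "$state_file"
    if [ "$next" = "A" ]; then
        sh -c "$2"
    else
        sh -c "$3"
    fi
}

# When run as a script, treat the arguments as STATE_FILE CMD_A CMD_B.
if [ "$#" -eq 3 ]; then
    run_alternating "$@"
fi
```

Scheduled once a day from cron, with the state file kept somewhere that survives reboots, this gives the every-other-day alternation.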
