Why cronjobs use 42 as Report progress on a job? - cron

I usually see that in cronjob services like "bull" package they use 42 as their report progress when the job is running.
job.progress(42)
Why? Is there a specific meaning behind it?

Related

Netsuite - Scheduler not running for some intervals

I have a Map/Reduce script running in every 15 minutes. While I check the schedule status, it seems few times the execution skips. I have set the priority to "High", Concurrency limit to "2". Please see the screenshot for reference. I am not getting why this skips for few cases. Can anyone please help me on this? Thanks in advance.
From the above example, the script skips to execute on 3/16/2021 5:01.42 pm
You can try listing all your scheduled and map/reduce scripts. It is likely that some other scheduled or map/reduce script was running at that time. Netsuite will not interrupt a running task so if something else was running through the next slot Netsuite won't have run it.
The other thing I've seen is scripts scheduled for near midnight California time are often skipped or terminated. I haven't looked at this lately so they may have fixed this but if that time is midnight California then that's likely the reason.

Why in kubernetes cron job two jobs might be created, or no job might be created?

In k8s Cron Job Limitations mentioned that there is no guarantee that a job will executed exactly once:
A cron job creates a job object about once per execution time of its
schedule. We say “about” because there are certain circumstances where
two jobs might be created, or no job might be created. We attempt to
make these rare, but do not completely prevent them. Therefore, jobs
should be idempotent
Could anyone explain:
why this could happen?
what are the probabilities/statistic this could happen?
will it be fixed in some reasonable future in k8s?
are there any workarounds to prevent such a behavior (if the running job can't be implemented as idempotent)?
do other cron related services suffer with the same issue? Maybe it is a core cron problem?
The controller:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/cronjob/cronjob_controller.go
starts with a comment that lays the groundwork for an explanation:
I did not use watch or expectations. Those add a lot of corner cases, and we aren't expecting a large volume of jobs or scheduledJobs. (We are favoring correctness over scalability.)
If we find a single controller thread is too slow because there are a lot of Jobs or CronJobs, we we can parallelize by Namespace. If we find the load on the API server is too high, we can use a watch and UndeltaStore.)
Just periodically list jobs and SJs, and then reconcile them.
Periodically means every 10 seconds:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/cronjob/cronjob_controller.go#L105
The documentation following the quoted limitations also has some useful color on some of the circumstances under which 2 jobs or no jobs may be launched on a particular schedule:
If startingDeadlineSeconds is set to a large value or left unset (the default) and if concurrentPolicy is set to AllowConcurrent, the jobs will always run at least once.
Jobs may fail to run if the CronJob controller is not running or broken for a span of time from before the start time of the CronJob to start time plus startingDeadlineSeconds, or if the span covers multiple start times and concurrencyPolicy does not allow concurrency. For example, suppose a cron job is set to start at exactly 08:30:00 and its startingDeadlineSeconds is set to 10, if the CronJob controller happens to be down from 08:29:00 to 08:42:00, the job will not start. Set a longer startingDeadlineSeconds if starting later is better than not starting at all.
Higher level, solving for only-once in a distributed system is hard:
https://bravenewgeek.com/you-cannot-have-exactly-once-delivery/
Clocks and time synchronization in a distributed system is also hard:
https://8thlight.com/blog/rylan-dirksen/2013/10/04/synchronization-in-a-distributed-system.html
To the questions:
why this could happen?
For instance- the node hosting the CronJobController fails at the time a job is supposed to run.
what are the probabilities/statistic this could happen?
Very unlikely for any given run. For a large enough number of runs, very unlikely to escape having to face this issue.
will it be fixed in some reasonable future in k8s?
There are no idemopotency-related issues under the area/batch label in the k8s repo, so one would guess not.
https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue+label%3Aarea%2Fbatch
are there any workarounds to prevent such a behavior (if the running job can't be implemented as idempotent)?
Think more about the specific definition of idempotent, and the particular points in the job where there are commits. For instance, jobs can be made to support more-than-once execution if they save state to staging areas, and then there is an election process to determine whose work wins.
do other cron related services suffer with the same issue? Maybe it is a core cron problem?
Yes, it's a core distributed systems problem.
For most users, the k8s documentation gives perhaps a more precise and nuanced answer than is necessary. If your scheduled job is controlling some critical medical procedure, it's really important to plan for failure cases. If it's just doing some system cleanup, missing a scheduled run doesn't much matter. By definition, nearly all users of k8s CronJobs fall into the latter category.

Will conflicts arise between larger and smaller files in CRON job?

I have set a cron job for a file for every 6 hours. The file may run for 4hours.
If i set cron for another file , will it affect the previous one which may run for 4hours?
No. If the job is not working on same resources, it wont conflict even if it's running simultaneously.
The cron daemon doesn't check to see if anything else by the same name is running, if that is what you mean, so cron will not care. However, if your script creates temporary files, for example, without using helper-tools like "mktemp" they could conflict with each other - so that will depend how well written your script is.

cron jobs: Monitor time it takes for jobs to finish

I'm doing a research project that requires I monitor cron jobs on a Ubuntu Linux system. I have collected data about the jobs' tasks and when they are started, I just don't know of a way to monitor how long they take to finish running.
I could calculate the time of finishing the task minus starting it with something like this but that would require doing that on the Shell scripts of each cron job. That's not necessarily difficult by any means but it seems a little silly that cron wouldn't in some way log this, so I'm trying to find an easier way :P
tl;dr Figure out time cron jobs take from start to finish
You could just put time in front of your crontabs, and if you're getting notifications about cron script outputs, it'll get sent to you.
For example, if you had:
0 1,13 * * * /maint/run_webalizer.sh
add time in front
0 1,13 * * * time /maint/run_webalizer.sh
and you'll get some output that looks like (the "real" is the time you want):
real 3m1.255s
user 0m37.890s
sys 0m3.492s
If you don't get cron notifications, you can just pipe the output to a file.
man time. Maybe you can create a wrapper and tell Cron to use it as your "shell" or something like that.
Cronitor (https://cronitor.io) is a tool I built exactly for this purpose. It uses http requests to record the start and end of your jobs.
You'll be notified if your job doesn't run on schedule, or if it runs for too long/too short. You can also configure it to send alerts to you via email, sms, but also Slack, Hipchat, Pagerduty and others.
I use the Jenkins CI to do this via its external-monitor-job plugin. Jenkins can track start and end times, track overall execution time over time, save the output of all jobs it tracks, and present success/failure conditions graphically.
https://wiki.jenkins-ci.org/display/JENKINS/Monitoring+external+jobs

How can I setup a system to tell me if a cron job is NOT running fine?

This is more of an "general architecture" problem. If you have a cron job (or even a Windows scheduled task) running periodically, its somewhat simple to have it send you an email / text message that all is well, but how do I get informed when everything is NOT okay? Basically, if the job doesn't run at its scheduled time or Windows / linux has its own set of hangups that prevent the task from running...?
Just seeking thoughts of people who've faced this situation before and come up with interesting solutions...
A way I've done it in the past is to simply put at the top of each script (say, checkUsers.sh):
touch /tmp/lastrun/checkUsers.sh
then have another job that runs periodically that uses find to locate all those "marker" files in tmp/lastrun that are older than a day.
You can fiddle with the timings, having /tmp/lastrun/hour/ and tmp/lastrun/day/ to separate jobs that have different schedules.
Note that this won't catch scripts that have never run since they will never create the initial file for find-ing. To alleviate that, you can either:
create that file manually when creating the cron job (won't handle situations where someone inadvertently deletes the marker file); or
maintain a list of required marker files somewhere so that you can detect when they're missing as well as outdated.
And, if your cron job is not a script, put the touch directly into crontab:
0 4 * * * ( touch /tmp/lastrun/daily/checkUsers ; /usr/bin/checkUsers )
It's a lot easier to validate a simple find script than to validate every one of your cron jobs.

Resources