Repartition of cron - cron

(I apologize in advance for the poor wording of my problem; please bear in mind that English is not my first language.)
I have several processes (crons) and I want to "optimize" the schedule when to launch them.
For example, cron c1 starts every 3 minutes, cron c2 starts every 7 minutes and cron c3 starts every 18 minutes. Assume they last only a few seconds before stopping.
The unit of time here is 1 minute.
Now, what I want is that these crons are distributed so that we don't have a moment where many of them start and then long time interval with no cron at all. For example, if c1 and c3 both start at time 0, then they will start again together every 18 minutes. It would be better to start cron c1 at time 0 and then c3 at time 1, so that they are never launched together.
So the idea is, given a list of crons with periodicity, to plan a schedule, so that there is as much time between each cron as possible and as few as possible moments when two crons start together.
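The c1/c3 example above generalises: two jobs with periods p and q and start offsets a and b will start in the same minute at some point exactly when (a - b) is divisible by gcd(p, q). A minimal Python sketch of that check follows; the function name `ever_collide` is made up for illustration, and periods and offsets are assumed to be whole minutes.

```python
from math import gcd

def ever_collide(period_a, offset_a, period_b, offset_b):
    """Return True if two periodic jobs ever start in the same minute.

    Job A starts at offset_a, offset_a + period_a, ...; likewise for job B.
    They share a start time iff the offsets agree modulo gcd of the periods.
    """
    return (offset_a - offset_b) % gcd(period_a, period_b) == 0

# c1 (every 3 min) and c3 (every 18 min), both started at minute 0.
print(ever_collide(3, 0, 18, 0))   # True
# Shift c3 to minute 1: the two jobs never start together.
print(ever_collide(3, 0, 18, 1))   # False
# Periods 3 and 7 are coprime, so such jobs collide whatever the offsets.
print(ever_collide(3, 0, 7, 1))    # True
```

One consequence is that jobs with coprime periods cannot be kept apart by shifting offsets; only how often and how many of them coincide can be reduced.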
Are there some well-known algorithms about such problems?
The real-life application of this problem is about 200 crons. Some of them are launched every 5, ~10 or ~30 minutes and run only briefly (a few seconds); some (~20 to 25) are launched every 2 hours and run for a few minutes. So the idea is also that the big crons are not launched at the same time.
I am a mathematician myself and not a computer scientist, so I asked this question on https://math.stackexchange.com/, since I consider this being a "nice" question for mathematicians too.

I think you should consider the resources used by each of your crons and then schedule your jobs from that.
I don't think there is a particular algorithm for that.
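Even without a named algorithm, a simple greedy heuristic is easy to try: place the jobs one at a time and give each the start offset that adds the least load to the busiest minute. The sketch below is only a heuristic under stated assumptions (integer periods in minutes, a made-up `weight` per job standing in for its resource use, one full cycle of lcm minutes simulated); it is not guaranteed to be optimal, and all names are hypothetical.

```python
from math import lcm  # Python 3.9+

def greedy_offsets(jobs):
    """jobs: list of (name, period_minutes, weight).

    Returns (offsets, load) where offsets maps name -> start minute and
    load[t] is the total weight of jobs starting at minute t of one cycle.
    Heavier and rarer jobs are placed first; each job takes the offset in
    [0, period) with the lowest peak load over its own start minutes
    (ties broken by total load, then by the smaller offset).
    """
    horizon = lcm(*(period for _, period, _ in jobs))
    load = [0] * horizon
    offsets = {}
    for name, period, weight in sorted(jobs, key=lambda j: (-j[2], -j[1])):
        def cost(offset):
            starts = range(offset, horizon, period)
            return (max(load[t] for t in starts),
                    sum(load[t] for t in starts),
                    offset)
        best = min(range(period), key=cost)
        for t in range(best, horizon, period):
            load[t] += weight
        offsets[name] = best
    return offsets, load

offsets, load = greedy_offsets([("c1", 3, 1), ("c2", 7, 1), ("c3", 18, 1)])
print(offsets)
print("busiest minute has", max(load), "job start(s)")
```

Giving the two-hourly jobs a larger weight than the short ones makes the heuristic spread them out first, which matches the idea of scheduling from resource use.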

Related

Google App Engine cron job schedule repeat on days with DST changes

I have a cron job that should repeat every 3 hours and must land on 9:30 in US/NY time.
If I start the job at 0:30, and run it every 3 hours, will it run at 9:30 on daylight savings days?
The documentation is a bit vague on this: https://cloud.google.com/scheduler/docs/configuring/cron-job-schedules#daylight_savings_time. It uses wall clock time to start, but does it use wall-clock time for each offset as well?
Alternatively, could I start it using 9:30 as the trigger time, and then every 3 hours until 6:30am the next day?
As you mentioned, Cloud Scheduler uses wall-clock time, so it is expected that the execution times shift with the wall clock.
To run a job from 9:30am to 6:30am (next day) you need 2 Scheduler jobs.
The 1st runs every 3 hours starting at 9:30; its last run before midnight is at 21:30:
30 9/3 * * *
The 2nd runs every 3 hours from 0:30 to 6:30:
30 0-6/3 * * *
I use this page when I have doubts about cron format
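To double-check which hours such a field covers, the range-with-step form can be expanded by hand. The helper below is a hypothetical sketch that handles only the explicit `start-end/step` form, so the first job's hour field is written out as `9-23/3`; treating the shorthand `9/3` as equivalent is an assumption about the scheduler's parser. It is not a full cron parser.

```python
def expand_hours(field):
    """Expand an hour field of the form 'start-end/step' into a list of hours."""
    span, step = field.split("/")
    start, end = (int(part) for part in span.split("-"))
    return list(range(start, end + 1, int(step)))

# Minute 30 combined with the two hour fields from the jobs above.
for minute, hours in [(30, "9-23/3"), (30, "0-6/3")]:
    print(hours, "->", [f"{h}:{minute:02d}" for h in expand_hours(hours)])
```

Together the two lists cover 9:30 through 6:30 the next day in 3-hour steps.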
The best way to completely avoid any kind of issue with DST changes is to switch to a timezone unaffected by those, as the documentation states:
If your job requires a very specific cadence, you may want to consider choosing time zones that do not observe daylight savings time.
A good one would be UTC, for example. New York is at UTC-4 during daylight saving time (UTC-5 during standard time), so with the summer offset you just need to add 4 hours to your desired starting time.
That way the scheduled hours are never shifted by a DST change, and whatever you expect to run is run on a fixed cadence.
Regarding the start time of the trigger, it does not matter whether it is 0:30 or 9:30: once you create the job, it runs every X hours/minutes/seconds, as specified beforehand.
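One thing worth checking before hard-coding an offset is that the gap between New York and UTC is not the same all year. A small sketch using Python's standard `zoneinfo` module (the function name and the two dates are arbitrary choices for illustration):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

NEW_YORK = ZoneInfo("America/New_York")

def ny_wall_time_in_utc(year, month, day, hour, minute):
    """Convert a New York wall-clock time on a given date to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=NEW_YORK)
    return local.astimezone(timezone.utc)

print(ny_wall_time_in_utc(2023, 7, 1, 9, 30).strftime("%H:%M"))   # summer (EDT)
print(ny_wall_time_in_utc(2023, 1, 15, 9, 30).strftime("%H:%M"))  # winter (EST)
```

So a fixed UTC schedule keeps a steady cadence, but the New York wall-clock time it corresponds to moves by one hour whenever DST starts or ends.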

How quickly is CRON triggered?

First Example
Suppose I have a CRON job
30 2 * * * ....
Then this would run every time when it is 2:30 at night (local time).
Now suppose I have the time zone Europe/Berlin (Germany) and it's 2017-10-29 (the day DST ends). Then this CRON job would run twice, right?
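Whether the job really runs twice depends on the cron implementation. What can be checked independently is that the wall-clock time 02:30 does occur twice that night. A minimal sketch with Python's `zoneinfo`, using Europe/Berlin (the IANA zone name for Germany); the helper name is hypothetical:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

BERLIN = ZoneInfo("Europe/Berlin")

def utc_instants(year, month, day, hour, minute, tz):
    """Return the distinct UTC instants that share one local wall-clock time."""
    instants = set()
    for fold in (0, 1):  # fold=1 selects the second occurrence of a repeated time
        local = datetime(year, month, day, hour, minute, tzinfo=tz, fold=fold)
        instants.add(local.astimezone(timezone.utc))
    return sorted(instants)

# 02:30 on the night DST ends maps to two different UTC instants.
for instant in utc_instants(2017, 10, 29, 2, 30, BERLIN):
    print(instant.isoformat())
```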
Second Example
Suppose I have the time zone Europe/Berlin (Germany) and the CRON job
30 11 * * * ....
As Germany never had a DST change at 11:30, this will not interfere. But the user could change the local time. To be super clear: This question is NOT about DST.
For the following test cases, I would like to know if/how often the CRON job gets scheduled:
At 11:29:58.0, the user sets the time to 11:31:00
At 11:29:59.1, the user sets the time to 11:31:00
At 11:29:59.6, the user sets the time to 11:31:00
At 11:30:01.0, the user sets the time to 11:29:59.7 - is CRON executed directly afterwards?
They boil down to "How quickly is CRON triggered?", where the 4th one also asks whether CRON remembers that it has already executed for that minute.
Another variant of the same question:
At 11:29:59, the NTP service corrects the time to 11:31:00 - will the job be executed that day at all?
The easiest way to answer this with confidence is to take a look at the source for the cron daemon. There are a few versions online like this, or you can use apt-get source cron.
The tick cycle in cron is to repeatedly sleep for a minute, or less if there is a job coming up. Immediately after emerging from the sleep, it checks the time and treats the result as one of these wakeupKind values:
Expected time - run any jobs we were expecting
Small jump forwards (up to 5 minutes) - run the jobs for the intervening minutes
Medium jump forwards (up to 3 hours, so this would include DST starting in spring) - run any wildcard jobs first (because the catch up could take more than a minute), then catch up on the intervening fixed time jobs
Large jump (3 hours or more either way) - start over with the current time
Jump backwards (up to 3 hours, so including the end of DST) - because any fixed time jobs have 'probably' already run, only run any wildcard jobs until the time is caught up again
If in doubt, the source comments these wakeupKind values clearly.
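The list above can be summarised as a small decision function. This is a Python paraphrase of the description, not cron's actual C code: the function name, the returned labels and the exact boundary handling (for example whether exactly 5 minutes counts as "small") are assumptions, so the source remains the authority.

```python
def classify_wakeup(minutes_off):
    """Classify how far the clock is from the minute cron expected to wake at.

    minutes_off > 0: the clock is ahead of the expected time.
    minutes_off < 0: the clock went backwards.
    """
    three_hours = 3 * 60
    if abs(minutes_off) >= three_hours:
        return "large"     # start over with the current time
    if minutes_off < 0:
        return "backward"  # only wildcard jobs until the time is caught up
    if minutes_off == 0:
        return "expected"  # run the jobs we were expecting
    if minutes_off <= 5:
        return "small"     # run the jobs for the intervening minutes
    return "medium"        # wildcard jobs first, then catch up fixed-time jobs

# The DST transitions fall in the "medium" and "backward" cases.
print(classify_wakeup(60), classify_wakeup(-60))
```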
Edit
To follow up on whether sleep() could be affected by a clock change, it looks like the answer is indirectly there in a couple of the Linux man pages.
Firstly, the notes for the sleep() function confirm that it is implemented by nanosleep()
The notes for nanosleep() say Linux measures the time using the CLOCK_MONOTONIC clock (even though POSIX.1 says it shouldn't)
Scroll down a bit in the docs for clock_settime() to see the explanation of CLOCK_MONOTONIC, which explains it is not affected by jumps in the system time, but it would be affected by incremental NTP style clock sync adjustments.
So in summary, a system admin style clock change will have no effect on the sleep(). But for example if an NTP adjustment came in and said to 'gently' advance the clock, cron would experience a series of slightly short sleep() function calls.
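The difference between the two clocks can be seen from Python as well, which exposes both a wall clock (`time.time()`) and a monotonic clock (`time.monotonic()`). This is only an analogous illustration, not what cron itself does; the helper name is made up.

```python
import time

def timed_sleep(seconds):
    """Sleep, then report the elapsed time on the monotonic and wall clocks."""
    wall_start = time.time()        # wall clock: can be stepped by an admin or NTP
    mono_start = time.monotonic()   # monotonic clock: never jumps backwards
    time.sleep(seconds)
    return time.monotonic() - mono_start, time.time() - wall_start

mono_elapsed, wall_elapsed = timed_sleep(0.2)
print(f"monotonic: {mono_elapsed:.3f}s, wall clock: {wall_elapsed:.3f}s")
```

If the system time were changed during the sleep, only the wall-clock figure would be distorted; the monotonic figure would still reflect the real length of the sleep.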
There are many implementations of cron systems (See here). One of the most commonly used crons is Vixie cron, and its man page states:
Daylight Saving Time and other time changes
Local time changes of less than three hours, such as those caused by the Daylight Saving Time changes, are handled in a special way. This only applies to jobs that run at a specific time and jobs that run with a granularity greater than one hour. Jobs that run more frequently are scheduled normally.
If time was adjusted one hour forward, those jobs that would have run in the interval that has been skipped will be run immediately. Conversely, if time was adjusted backwards, running the same job twice is avoided.
Time changes of more than 3 hours are considered to be corrections to the clock or the timezone, and the new time is used immediately.
source: man 8 cron
I believe this answers most of your points.
In addition to point five:
At 11:29:59, the NTP service corrects the time to 11:31:00 - will the job be executed that day at all?
First off, if NTP corrects the time by more than a minute, you have a very bad clock! This should not happen too often. Generally, you might have such a step when you enable NTP, but after that the corrections should be much smaller.
In any case, if the DeltaT is not too high, generally below 128 ms (the default step threshold of ntpd), your system will slew the time. Slewing the time means changing the virtual frequency of the software clock to make the clock go faster or slower until the requested correction is achieved. Slewing the clock by a larger amount may take some time, too. For example, standard Linux adjusts the time at a rate of 0.5 ms per second.
This implies (under the assumption of Vixie cron, and probably many others):
If NTP jumps more than 3 hours, the job is skipped
If NTP jumps less than 3 hours but more than 128 ms, Vixie cron handles the job nicely by applying the time-change handling quoted from the man page.
If NTP corrects the time by less than 128 ms, cron does not notice the time jump due to the slewing.
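To get a feel for how long slewing takes at the rate quoted above, the arithmetic is just offset divided by rate. A tiny sketch, where the function name and the 100 ms example offset are chosen purely for illustration:

```python
def slew_duration_seconds(offset_ms, slew_rate_ms_per_s=0.5):
    """Seconds needed to slew away a clock offset at a constant rate."""
    return abs(offset_ms) / slew_rate_ms_per_s

# A 100 ms offset corrected at 0.5 ms per second.
print(slew_duration_seconds(100))  # 200.0
```

So even a small offset is spread over several minutes, which is why cron never sees a jump in that case.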
Interesting information:
RFC5905: Network Time Protocol Version 4: Protocol and Algorithms Specification
The NTP FAQ and Howto
https://wiki.gentoo.org/wiki/Cron/en
You're actually asking two related questions. The general answer is it depends[1], but I'll answer based on the Debian Linux installation I'm on right now:
How does cron handle DST changes and other 'special' time-related events?
On my Debian Linux system cron handles 'DST and other time-related changes/fixes' (per the man page) so that jobs don't get run twice or skipped due to changes like DST. (See https://debian-handbook.info/browse/stable/sect.task-scheduling-cron-atd.html for more specifics) Related to the 5th point raised in your second question, I would expect these same facilities to deal with NTP-related time jumps but don't know for certain.
How often is cron triggered and how quickly does it pick up my crontab changes?
Again, on my Debian Linux system the cron daemon wakes up once a minute and will detect and use any crontab changes made since the previous check/run one minute ago. Note that there is no guarantee that cron fires at 12:00:00 or 12:00:59 or any specific time in between (only that it fires when the time is 12:00:??), so if you change a crontab at 12:00:17 but cron fired at 12:00:13, your changes will not be picked up until the next run (most likely at 12:01:13, though there might be a slight amount of variance due to the Linux scheduler).
[1] It Depends...
The precise answer absolutely depends both on the platform (Linux/Unix/BSD/OS X/Windows) and the particular implementation of cron (there have been several over the decades with derivatives of Vixie cron being prevalent on Linux and BSD per https://en.wikipedia.org/wiki/Vixie_cron). If you're running something other than Linux, the man page / documentation for your implementation should provide details as to the specifics of how often it runs, picks up modified crontabs, DST handling etc. If you really need to know the specific details, df778899 is right in that you should look at the source code for your implementation as needed... because sometimes software/documentation is buggy.
On macOS:
$> man cron
...
Available options:
-s Enable special handling of situations when the GMT offset of the local timezone changes, such as the switches between the standard time and daylight saving time.
The jobs run during the GMT offset changes time as intuitively expected. If a job falls into a time interval that disappears (for example, during the switch from standard time) to daylight saving time
or is duplicated (for example, during the reverse switch), then it is handled in one of two ways:
The first case is for the jobs that run every at hour of a time interval overlapping with the disappearing or duplicated interval. In other words, if the job had run within one hour before the GMT
offset change (and cron was not restarted nor the crontab(5) changed after that) or would run after the change at the next hour. They work as always, skip the skipped time or run in the added time as
usual.
The second case is for the jobs that run less frequently. They are executed exactly once, they are not skipped nor executed twice (unless cron is restarted or the user's crontab(5) is changed during
such a time interval). If an interval disappears due to the GMT offset change, such jobs are executed at the same absolute point of time as they would be in the old time zone. For example, if exactly
one hour disappears, this point would be during the next hour at the first minute that is specified for them in crontab(5).
-o Disable the special handling of situations when the GMT offset of the local timezone changes, to be compatible with the old (default) behavior. If both options -o and -s are specified, the option
specified last wins.

Linux task schedule to Hour, minute, second

I'm trying to run a shell script at a specific time, down to the second (H:M:S), but so far all programs such as at only go down to a specific minute (not second).
I don't want to use sleep since it's not accurate. For some reason it ended a couple of hours earlier than it was supposed to!
Your question doesn't seem to define accuracy, but there is always some jitter in scheduling on electronic devices. You might use Quartz to schedule to the second. You could also use at or cron to schedule to the minute and then sleep the appropriate number of seconds.
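The "schedule to the minute, then sleep the remainder" idea needs the number of seconds left until the target time. A minimal sketch of that calculation, assuming naive local time (it ignores DST changes) and using a hypothetical helper name:

```python
from datetime import datetime, timedelta

def seconds_until(hour, minute, second, now=None):
    """Seconds from `now` until the next occurrence of hour:minute:second."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=second, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # already passed today: use tomorrow
    return (target - now).total_seconds()

# Example with a fixed "now" so the result is reproducible.
print(seconds_until(14, 30, 15, now=datetime(2024, 1, 1, 14, 30, 0)))  # 15.0
```

A wrapper started by cron at 14:30 could then call `time.sleep(seconds_until(14, 30, 15))` before running the real work; the remaining jitter is whatever the operating system scheduler adds.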

Preventing cronjobs from overlapping

I have 3 different jobs set up in crontab (call them jobA, jobB, jobC) that run at different intervals and start at different times during the day. For example, jobA runs once per hour at 5 mins past the hour, jobB runs every 30 mins at 9 and 39 mins past the hour, and jobC runs every 15 mins. They are not dependent on each other, but for various reasons they can NOT be running at the same time.
The problem is that sometimes one of the jobs takes a long time to run and another one starts before the first one is done, causing issues.
Is there some way to queue or spool these jobs so that one will not start until the current running one has finished? I tried using this solution but this does not guarantee that the pending jobs will resume in the same order they were supposed to start. A queue would be best, but I cannot find anything about how to do this.
You can't do that using cron alone. Cron is used to run a specific command at a specific time. You can do it with the solution you proposed, but that adds a lot more complexity.
I suggest coding the requirement in a high-level language like Java and using a multi-threaded program to achieve what you need.
Control-M is another scheduling tool, with a lot of other features as well. You would be able to implement the above use case in it.
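As a rough illustration of the "write it as a program with a queue" suggestion, here is a minimal in-process sketch: jobs are put on a FIFO queue and a single worker thread runs them one after another, so they never overlap and keep their submission order. It is written in Python rather than Java, all names are hypothetical, and a real setup where separate cron entries feed the queue would need some persistent queue or lock, which is outside this sketch.

```python
import queue
import threading

def run_serially(jobs):
    """Run (name, callable) pairs one at a time, in the order they were queued."""
    pending = queue.Queue()
    finished = []

    def worker():
        while True:
            name, func = pending.get()
            if func is None:      # sentinel: no more jobs
                break
            func()                # only this thread runs jobs, so they never overlap
            finished.append(name)

    thread = threading.Thread(target=worker)
    thread.start()
    for job in jobs:
        pending.put(job)          # in practice each trigger would enqueue its own job
    pending.put((None, None))
    thread.join()
    return finished

order = run_serially([("jobA", lambda: None), ("jobB", lambda: None), ("jobC", lambda: None)])
print(order)
```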

Autosys box issue

I have an AutoSys box job, auto_task_box, which has 3 child jobs:
: auto_task1_wd - runs every 5 mins, Monday to Friday
: auto_task2_dly - runs at 02:00 every day
: auto_task3_sa - runs at 03:00 every Saturday
The issue is that after the 1st run of auto_task1_wd, the box waits for auto_task2_dly and auto_task3_sa to complete, so the next iteration of auto_task1_wd (i.e. 5 mins later) doesn't happen.
How would I tackle this issue?
I am using AutoSys R11 on Linux.
It sounds like the three jobs should run independently of each other. In this case I would not use a box at all, but just three separate tasks, as I always tend to think of boxes as a way of ensuring relationships between tasks.
I agree with the first answer: these jobs should not be grouped into the same box. Boxes are containers for jobs with like starting conditions. It is a very bad idea to have date/time conditions for jobs in a box. You will get some unexpected runs that way.

Resources