Just a quick question that has been bothering me today. I own five servers, all built from the exact same image and running behind a load balancer. I want to run a process-heavy cron job on these servers every half hour.
I don't want to put the cron on each machine, as it is resource heavy and would block all incoming connections for a good thirty seconds. In addition, I don't really want to put the cron on one machine, just to make sure it is redundant and it will be run.
One possible solution would be to have a remote service run the cron job by accessing a URL that triggers it; I think that would be the most feasible option at this point.
I'm really curious as to what other solutions might be available.
Thanks for your time!
You could set up staggered cron jobs on your 5 machines, so it runs every 2.5 hours on each of your 5 machines. Probably the cleanest way to do that is to schedule a job to run every 30 minutes, and have the job itself be a script that runs conditionally, depending on the current time and which machine it's on.
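As a rough sketch of the conditional script described above (Python is used only for illustration): every machine runs the same cron entry every 30 minutes, and the script decides whether this machine's turn has come. The function names and the idea of giving each machine an index from 0 to 4 (for example through a hypothetical SERVER_INDEX environment variable) are assumptions, not part of the original answer. Slots are counted from the Unix epoch rather than from midnight, because the 48 half-hour slots in a day do not divide evenly by 5.

```python
from datetime import datetime, timezone


def slot_index(now: datetime, interval_minutes: int = 30) -> int:
    """Number of whole intervals elapsed since the Unix epoch."""
    return int(now.timestamp()) // (interval_minutes * 60)


def should_run(server_index: int, now: datetime,
               n_servers: int = 5, interval_minutes: int = 30) -> bool:
    """True if the current slot belongs to this server.

    server_index is a hypothetical per-machine number in range(n_servers).
    """
    return slot_index(now, interval_minutes) % n_servers == server_index


def current_turn(n_servers: int = 5, interval_minutes: int = 30) -> int:
    """Index of the server whose turn it is right now (UTC clock)."""
    now = datetime.now(timezone.utc)
    return slot_index(now, interval_minutes) % n_servers
```

The cron-invoked script would call should_run with its own index and exit immediately when the result is False. This relies on the machines' clocks being reasonably in sync, and it does not cover the case where the machine whose turn it is happens to be down.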
Or, if you have some kind of batch scheduling system, you could run a cron job on one system that submits a batch job, letting the scheduling system choose which server to use. This has the advantage that, assuming your batch system works properly, the job should still run if one of your servers is down. You'll likely need to set up some environment variables in your cron job to let it use the batch system properly.
I have been looking for a time based persistent scheduler. I looked into some applications (Agenda, node-cron, node-schedule). But I couldn't find anything that satisfies my criteria.
My application sends out reminders to our customers based on their event timings. I am hesitant to run a regular cron job because it would have to run every 15 minutes or so, and each run would have to make a database call. I am trying not to use resources unnecessarily.
In addition to that, I am already running a lot of cron jobs. In my case, when a job is completed, I want it to be cancelled/finished, not to live in memory until the server restarts.
I tried using the applications mentioned above (Agenda, node-cron, node-schedule) by setting exact timestamps. But the job lives on forever even after it has completed, and if I restart the server, all the scheduled jobs are gone. So persistence is also an issue I am facing.
My server uses Node.js. If there are any other languages/tools that would make this work, I am all ears.
Looking forward to your help.
I tried following this solution, but it is for one predefined event. In my case, the number of reminders to be sent out is dynamic, and jobs have to be scheduled on the fly.
I set up GitLab to use runners configured on my host. They pick up jobs which are in a specific group.
This works fine, except that the runners take some time to pick the jobs up. The first job is usually picked up quickly and the second one (sequential) takes some more time to be picked up.
This delay is acceptable but I would like to understand whether this is due to GitLab processing requests (in which case I will not have the ability to fine tune), or that this is something settable on the machine hosting the runners?
The polling interval is configurable on the runner using the check_interval parameter, but the default is 3 seconds, so that shouldn't be an issue. Maybe you need to adjust the concurrent parameter so that more jobs can run in parallel?
I have to run a utility periodically, say, every minute.
So I have two options: Spring Boot's @Scheduled, or the crontab of the Linux box that we deploy the artifact to.
So, my question is: which should I use?
What are the pros and cons of each solution? Please also suggest any other solution you know of.
I don't have many points for comparing the two, only what I have run into in my current situation. I just built a new endpoint and am performance and stress testing it in production. I have yet to decide the cron schedule, and it may need slight tweaking after some more observation. Scheduling via @Scheduled means I have to redeploy/restart the application every time I make a change.
Application restart generally takes more time than crontab edit.
Other than this, a few points on availability and scalability:
Setting it only via crontab on a single server means a single point of failure if that server goes down.
Setting it via @Scheduled on a single instance means the same.
If you have multiple instances of the server, the endpoint could get triggered once per instance, which you may not want. The worst case is when you wrote the @Scheduled endpoint long ago, while the application was deployed on a single server, and then forgot about it. As soon as you scale up, the process will start getting triggered more than once.
So neither option is ideal in terms of availability and scalability.
In such situations you ideally need a distributed cron management system (I have heard about Rundeck), which decides which of the available servers should be called to hit the desired endpoint and, if needed, calls the next server when the first one is down.
If anything needs investigating, the Rundeck logs can be checked to find the server that was actually called.
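To make the failover idea concrete, here is a minimal sketch in Python. It is not Rundeck's API or configuration; it only illustrates "call the first server, fall back to the next one if it is down, and log which one was used". The function name trigger_with_failover and the trigger callable (which would, for example, issue the HTTP request to the endpoint) are hypothetical.

```python
import logging
from typing import Callable, Iterable, Optional

log = logging.getLogger("cron-failover")


def trigger_with_failover(servers: Iterable[str],
                          trigger: Callable[[str], bool]) -> Optional[str]:
    """Try each server in order; return the first one that accepted the job.

    trigger(server) is a caller-supplied function that returns True on
    success and may raise OSError (connection refused, timeout, ...).
    Returns None if no server could be reached.
    """
    for server in servers:
        try:
            if trigger(server):
                log.info("job triggered on %s", server)
                return server
            log.warning("%s rejected the job, trying next server", server)
        except OSError as exc:
            # Treat network-level failures as "server is down".
            log.warning("%s unreachable (%s), trying next server", server, exc)
    log.error("no server accepted the job")
    return None
```

The scheduler itself still has to run somewhere reliable, which is the part a dedicated tool takes care of.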
I'm creating a project in Node & Express that allows users to schedule the server to run test scripts e.g. once every ten minutes. I looked into node-schedule which looks great however it seems that all scheduled tasks disappear if the server ever restarts Node.
Cron looks good too but it has the problem that it doesn't seem to have a way to delete scheduled tasks after they have been set up.
If you were doing this, how would you go about it? I really don't want anything that's going to be complex, just need to schedule tasks, be able to delete individual tasks, and keep tasks in the event of a server reboot.
Simplest solution is to store the configurations for Cron in a database (since it takes a string as a parameter). Load the jobs from the db every time the app starts.
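A minimal sketch of that approach, written in Python with SQLite purely for illustration (the question is about Node, where the same pattern applies with whatever scheduling library is in use). The table layout and the function names are assumptions. Each row keeps the cron expression as a string plus the command to run; the app calls load_jobs at startup and registers every row with its in-process scheduler, and deleting a task means removing its row and cancelling the in-memory job.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = "jobs.db") -> sqlite3.Connection:
    """Open (and if necessary create) the job table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " cron_expr TEXT NOT NULL,"
        " command TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def add_job(conn: sqlite3.Connection, cron_expr: str, command: str) -> int:
    """Persist a job and return its id."""
    cur = conn.execute(
        "INSERT INTO jobs (cron_expr, command) VALUES (?, ?)",
        (cron_expr, command),
    )
    conn.commit()
    return cur.lastrowid


def delete_job(conn: sqlite3.Connection, job_id: int) -> None:
    """Remove a single job so it is not re-registered on the next start."""
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    conn.commit()


def load_jobs(conn: sqlite3.Connection) -> List[Tuple[int, str, str]]:
    """Return all stored jobs as (id, cron_expr, command), in insertion order."""
    return conn.execute(
        "SELECT id, cron_expr, command FROM jobs ORDER BY id"
    ).fetchall()
```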
I've seen that Heroku charges $15/mo to run Delayed Job, and $3/mo to run cron tasks daily. Is it possible to skip that entirely and run my own cron tasks manually? Or are they somehow figuring out that I'm running cron tasks?
I'm not entirely sure what you mean by "run my own cron tasks manually". For cron specifically, you need access to crontab, which they can control, as they're their servers. If you have another way of doing it, it would probably be fine, but bear in mind that your app is not tied to a specific server when running under Heroku, and that the server will change between executions.
Also, unless they've changed it since last time I checked, you can run daily cron tasks for free, but hourly costs $3/mo.
EDIT: Yes, daily crons are free. See http://addons.heroku.com/.
If you install the Heroku gem on your computer, you can then run your cron tasks manually as follows:
$ heroku rake cron
(in /disk1/home/slugs/xxxxxx_aa515b2_6c4f/mnt)
Running cron at 2010/04/25 10:28:54...
This will execute the exact same code as Heroku's daily/hourly cron add-on does; that is, for this to work, your application must have a Rakefile with a cron task, for example:
desc "Runs cron maintenance tasks."
task :cron do
  puts "Running cron at #{Time.now.strftime('%Y/%m/%d %H:%M:%S')}..."
  # TODO: your cron code goes here
end
Now, just add the heroku rake cron command to a crontab on any Unix server of yours, or even directly to your personal computer's crontab if you're running Linux or Mac OS X, and you can be scheduling cron jobs for your Heroku application as you please and without being charged for it.
Updating the answer for 2020:
You can use Heroku Scheduler, which is Heroku's own add-on that lets you schedule commands using one-off dynos (so that you only pay for the run time of your jobs). The add-on itself is free, but it doesn't accept cron expressions, only a plain frequency: every day, every hour or every 10 minutes. Also, there's no guarantee that your job will execute at the scheduled time, or at all.
There are other 3rd party add-ons that can help you run one-off dynos using cron expressions for better flexibility and are more resilient than Heroku Scheduler (proper disclosure, my company is the creator of one such add-on).
You can also use custom clock process (see here for more info) which essentially means that you have one dyno or process spawn tasks that run on other dynos. This usually costs more than using the aforementioned add-ons, but you have more granular control over your processes and since you only rely on Heroku, it may be more stable.
Yes, I've successfully used a cron job on my local server which essentially runs
$ heroku rake <rake task>
at whatever intervals I've required. I've used it on both the Aspen and Bamboo stacks.
You can also just install a gem like rufus-scheduler if you're running a Rails app and set up scheduling that way. I don't know if this is bad practice for some reason, but it's what I do with my app, and it seems to work fine.
If you want to have scheduled jobs you can also use http://guardiano.getpeople.in that is a free service (for 10 jobs) for job scheduling.
You just need to set up an HTTP endpoint in your application to receive event notifications via POST or GET, and you can also set some additional params to prevent unauthorized actions.
So you set up a job in Guardiano that will call http://yourapp.com/youraction. Leave "minutes" blank if you want your action to run once in the future, or set minutes to X to run your action every X minutes. That way you only have to create the endpoint in your app, and when it is called you execute something.
So your app can sleep and you don't need to spend money and time setting up jobs and taking care that they are working properly.
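For illustration, here is a minimal sketch of such an endpoint using only Python's standard library. The /youraction path comes from the example URL above; the token query parameter, the SECRET_TOKEN value, and the run_job placeholder are assumptions about how the "additional params" could be used to reject unauthorized calls, not something the service prescribes.

```python
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

SECRET_TOKEN = "change-me"  # hypothetical shared secret configured in the job


def is_authorized(url: str, expected_token: str) -> bool:
    """Check the 'token' query parameter using a constant-time comparison."""
    params = parse_qs(urlparse(url).query)
    supplied = params.get("token", [""])[0]
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


def run_job() -> None:
    """Placeholder for the work the scheduled call should trigger."""


class TriggerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if urlparse(self.path).path != "/youraction":
            self.send_response(404)
        elif not is_authorized(self.path, SECRET_TOKEN):
            self.send_response(403)
        else:
            run_job()
            self.send_response(200)
        self.end_headers()

    # Accept POST the same way; the token is still read from the query string.
    do_POST = do_GET


def serve(port: int = 8000) -> None:
    """Start the endpoint; blocks until interrupted."""
    HTTPServer(("", port), TriggerHandler).serve_forever()
```

The scheduling service would then be pointed at http://yourapp.com/youraction?token=... with the same secret. In a real application the handler would live in whatever web framework the app already uses.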
IMHO, if you need something fast for an MVP, or you need to set up a lot of jobs for different apps, then a free service like that, where you can actually outsource cron jobs, is quite good.
There was also a Heroku add-on called Temporize that did this, but I'm not sure it is still alive and working.