How does autoscaling affect crons for a Django application - cron

When I create an OpenShift Django application with autoscaling and crons that run every minute, are the cron tasks run on each scaled instance or just once for the whole cluster of instances? I've got a cron that triggers a database-heavy task, and I only want one task running at a time.

Yes, the cron cartridge is scaled alongside the web cartridges, so each cron will fire on each web server. You might look into coordinating them with a database table (or something similar) in which a job marks that it has fired, so that the other instances skip that run.
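
One way to sketch the "mark it in a database table" idea is to let every instance try to insert a row keyed by job name and time slot, and let the database's uniqueness constraint pick the winner. The example below is only an illustration under assumptions: the `cron_runs` table, the `claim_run` helper and the per-minute slot are made up for this sketch, and it uses the standard-library `sqlite3` module so it can run on its own. In a Django project the same idea would more likely be a model with a unique constraint on the shared database.

```python
import sqlite3
from datetime import datetime, timezone


def init_schema(conn):
    """Create the (hypothetical) bookkeeping table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cron_runs ("
        " job_name TEXT NOT NULL,"
        " slot TEXT NOT NULL,"
        " PRIMARY KEY (job_name, slot))"
    )
    conn.commit()


def claim_run(conn, job_name, now):
    """Return True only for the first caller that claims this job's slot.

    The slot is the scheduled minute, so every instance whose cron fires
    in the same minute competes for the same primary key.
    """
    slot = now.strftime("%Y-%m-%dT%H:%M")
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cron_runs (job_name, slot) VALUES (?, ?)",
                (job_name, slot),
            )
        return True
    except sqlite3.IntegrityError:
        # Another instance already inserted the row for this slot.
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_schema(conn)
    now = datetime.now(timezone.utc)
    if claim_run(conn, "heavy_task", now):
        print("this instance runs the task")
    else:
        print("another instance already claimed this run")
```

Note that this only deduplicates runs within one time slot. If a run can take longer than a minute and must not overlap with the next one, the table would also need something like a "still running" flag that is cleared when the task finishes.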

Related

AWS and NodeJS architecture for a scheduled/cron task in multi server setup

I am using AWS services in deploying my application which currently has the production site setup on an application load balancer running 2 instances of my NodeJS server.
My current concern is that if I just set up node-cron to trigger a task at 5:00am, it will fire on each server I spin up.
I need to implement an email delivery system that, at 5:00am, queries a database table I made to generate customized emails (I need to iterate over each individual's record, which has a unique array that helps build a list of items for each user). I then fire the object off to AWS SES.
What are some ways you have done this?
Currently, based on my reading, I am looking at two options:
Set up a node-cron child process within one cluster (but if I have auto-scaling, wouldn't this create a duplicate node-cron task?). This would probably require Redis and tracking the process across servers.
OR
Set up EventBridge to fire a request at api.mybackendserver.com/send-email-event, where I then carry out my logic. (This seems like the simpler approach; the drawback would be potential CPU/RAM spikes, which would be fine as I'm regionally based and would do this in off-peak hours.)
EventBridge is definitely a good way to go for cron-style scheduling. If you're worried about usage spikes, you could have the schedule invoke a Lambda function that pushes one event per job to SQS; the EC2 instances would then poll the queue.
Another option would be to schedule a scaling action that increases the number of instances before the cron event occurs.
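
The fan-out step in that answer (one scheduled invocation turning into one queue message per job) can be sketched roughly as follows. This is written in Python for illustration and makes several assumptions: the `enqueue_email_jobs` and `build_entries` helpers and the `{"user_id": ...}` message shape are invented here, and the actual send is injected as a callable so the sketch runs without AWS credentials. With boto3, that callable could wrap `sqs.send_message_batch(QueueUrl=..., Entries=...)`, which accepts at most 10 entries per call.

```python
import json

# SQS accepts at most 10 entries per SendMessageBatch request.
SQS_MAX_BATCH = 10


def build_entries(user_ids):
    """Build one batch entry per user; Ids must be unique within a batch."""
    return [
        {"Id": str(index), "MessageBody": json.dumps({"user_id": user_id})}
        for index, user_id in enumerate(user_ids)
    ]


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enqueue_email_jobs(user_ids, send_batch):
    """Send one message per user through `send_batch`, 10 at a time.

    `send_batch` is a hypothetical hook taking a list of entries. With
    boto3 it might be:
        lambda entries: sqs.send_message_batch(QueueUrl=QUEUE_URL,
                                               Entries=entries)
    where QUEUE_URL is a placeholder for your own queue.
    """
    entries = build_entries(list(user_ids))
    sent = 0
    for batch in chunked(entries, SQS_MAX_BATCH):
        send_batch(batch)
        sent += len(batch)
    return sent


if __name__ == "__main__":
    # Stand-in sender that just reports batch sizes.
    enqueue_email_jobs(range(23), lambda batch: print(len(batch), "entries"))
```

Because each message is consumed by exactly one poller, the number of web servers no longer determines how many times an email is generated.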

Is it possible to have a scheduler starting only on one instance?

I have a backend written in Adonis v4.1 (node.js) that I'm deploying on AWS ECS.
Inside the Task Definition of my API I have Adonis and Redis, linked by bridge network.
I'm using adonis-scheduler to run a cronjob every 30 minutes. This cronjob makes a db query and, for each row, creates a job (with the adonis-queue-pro library). The job concurrency is 1 (so if I have 30 records, they will be processed one job at a time). Each job makes an external API call and updates the record in the db.
On localhost everything works fine, but on the staging environment it sometimes happens that my API Task has more than one instance, and more than one cronjob starts at exactly the same time. So instead of having one job for each record, I have multiple jobs for each record running simultaneously. This is a big problem.
Is there a way to handle this situation so that the cronjob executes on only one instance?
Or maybe could I have a Task Definition with only the scheduler part in one single instance (always) and the API and redis with many instances?
The problem is that the code I wrote is deeply linked to my backend, so I cannot use something like lambda functions or other external services.
With ECS, a service can use either the REPLICA or the DAEMON scheduling strategy. DAEMON ensures that exactly one instance of the task is placed on each container instance in the cluster. In your case, that would mean running exactly one container instance in the cluster. If this constraint is acceptable, give it a go and see if the problem persists.

Allow users to set up schedule for server-side scripts to run in Node

I'm creating a project in Node & Express that allows users to schedule the server to run test scripts e.g. once every ten minutes. I looked into node-schedule which looks great however it seems that all scheduled tasks disappear if the server ever restarts Node.
Cron looks good too but it has the problem that it doesn't seem to have a way to delete scheduled tasks after they have been set up.
If you were doing this, how would you go about it? I really don't want anything that's going to be complex, just need to schedule tasks, be able to delete individual tasks, and keep tasks in the event of a server reboot.
The simplest solution is to store the cron configurations in a database (the schedule is just a string parameter, so it is easy to persist). Load the jobs from the database every time the app starts.
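
A minimal sketch of that persistence pattern is shown below, in Python with the standard-library `sqlite3` module for illustration; the same shape carries over to Node with any database client. The `scheduled_jobs` table, the helper names and the `register` callback are assumptions made for this example. `register` stands in for whatever call your scheduler library uses to start a job from a cron string.

```python
import sqlite3


def init_schema(conn):
    """Create the (hypothetical) table that persists user schedules."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scheduled_jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " cron_expr TEXT NOT NULL,"
        " script TEXT NOT NULL)"
    )
    conn.commit()


def add_job(conn, cron_expr, script):
    """Persist a schedule and return its id."""
    with conn:
        cursor = conn.execute(
            "INSERT INTO scheduled_jobs (cron_expr, script) VALUES (?, ?)",
            (cron_expr, script),
        )
    return cursor.lastrowid


def delete_job(conn, job_id):
    """Remove a schedule; return True if a row was actually deleted."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM scheduled_jobs WHERE id = ?", (job_id,)
        )
    return cursor.rowcount == 1


def load_jobs(conn, register):
    """Call `register(job_id, cron_expr, script)` for every stored job.

    Intended to run once at application start-up so that schedules
    survive a restart. Returns the number of jobs registered.
    """
    rows = conn.execute(
        "SELECT id, cron_expr, script FROM scheduled_jobs ORDER BY id"
    ).fetchall()
    for job_id, cron_expr, script in rows:
        register(job_id, cron_expr, script)
    return len(rows)
```

Keeping the database row as the source of truth also gives a natural way to delete an individual task: remove the row, then stop the in-memory job that was registered under the same id.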

CRON on CloudControl server

I've set up a node.js server with cron jobs via node-cron, which is a JS-land cron implementation. I've noticed that sometimes the jobs are not launching, apparently without errors and following no pattern I can identify.
Since this server is a free one, I was thinking that maybe it goes to sleep when there is no activity, so that the jobs are not launching. I've looked at the docs and I haven't seen any clear indication of this.
I've already seen the Cron addon, but I'm not interested in it. I'd like to make it work within a JS process.
Thanks!
cloudControl uses Container idling (https://www.cloudcontrol.com/dev-center/Platform%20Documentation#deploying-new-versions) for free deployments.
If a free deployment (1 container with 128MB) does not get any requests within a timeframe of one hour the container is idled ("server goes to sleep").

Running Cron Tasks on Heroku

I've seen that Heroku charges $15/mo to run Delayed Job, and $3/mo to run cron tasks daily. Is it possible to skip that entirely and run my own cron tasks manually? Or are they somehow figuring out that I'm running cron tasks?
I'm not entirely sure what you mean by "run my own cron tasks manually". For cron specifically, you need access to crontab, which they can control, as they're their servers. If you have another way of doing it, it would probably be fine, but bear in mind that your app is not tied to a specific server when running under Heroku, and that the server will change between executions.
Also, unless they've changed it since last time I checked, you can run daily cron tasks for free, but hourly costs $3/mo.
EDIT: Yes, daily crons are free. See http://addons.heroku.com/.
If you install the Heroku gem on your computer, you can then run your cron tasks manually as follows:
$ heroku rake cron
(in /disk1/home/slugs/xxxxxx_aa515b2_6c4f/mnt)
Running cron at 2010/04/25 10:28:54...
This will execute the exact same code as Heroku's daily/hourly cron add-on does; that is, for this to work, your application must have a Rakefile with a cron task, for example:
desc "Runs cron maintenance tasks."
task :cron do
  puts "Running cron at #{Time.now.strftime('%Y/%m/%d %H:%M:%S')}..."
  # TODO: your cron code goes here
end
Now, just add the heroku rake cron command to a crontab on any Unix server of yours, or even directly to your personal computer's crontab if you're running Linux or Mac OS X, and you can be scheduling cron jobs for your Heroku application as you please and without being charged for it.
Updating the answer for 2020:
You can use Heroku Scheduler, which is Heroku's own add-on that lets you schedule commands using one-off dynos (so that you only pay for the run time of your jobs). The add-on itself is free, but it does not accept cron expressions, only a plain frequency: every day, every hour or every 10 minutes. Also, there's no guarantee that your job will execute at the scheduled time, or at all.
There are other 3rd party add-ons that can help you run one-off dynos using cron expressions for better flexibility and are more resilient than Heroku Scheduler (proper disclosure, my company is the creator of one such add-on).
You can also use custom clock process (see here for more info) which essentially means that you have one dyno or process spawn tasks that run on other dynos. This usually costs more than using the aforementioned add-ons, but you have more granular control over your processes and since you only rely on Heroku, it may be more stable.
Yes, I've successfully used a cron job on my local server which essentially runs
$ heroku rake <rake task>
at whatever intervals I've required. I've used it on both the aspen and bamboo stacks.
You can also just install a gem like rufus-scheduler if you're running a rails app and setup scheduling that way. I don't know if this is bad practice for some reason, but it's what I do with my app, and it seems to work fine.
If you want to have scheduled jobs you can also use http://guardiano.getpeople.in, which is a free service (for 10 jobs) for job scheduling.
You just need to set up an HTTP endpoint in your application to receive event notifications via POST or GET, and you can also set some additional params to prevent unauthorized actions.
So you set a job in Guardiano that will call http://yourapp.com/youraction, and leave "minutes" blank if you want your action to run once in the future, or set minutes to X to run your action every X minutes. That way you only have to create the endpoint in your app, and when this endpoint is called you execute something.
So your app can sleep and you don't need to spend money and time setting up jobs and taking care that they are working properly.
IMHO if you need something fast for an MVP, or you need to set up a lot of jobs for different apps, then a free service like that, where you can actually outsource cronjobs, is quite good.
There was also a Heroku add-on called Temporize to do that, but I'm not sure it is still alive and working.
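
As a rough illustration of the "additional params to prevent unauthorized actions" mentioned above, the endpoint can require a shared secret in the query string and compare it in constant time before doing any work. This is a sketch under assumptions: the `token` parameter name and the `is_authorized` helper are made up here, and nothing in it reflects how Guardiano itself passes parameters.

```python
import hmac
from urllib.parse import parse_qs, urlparse


def is_authorized(url, expected_token):
    """Return True if the request URL carries the expected shared secret.

    `token` is a hypothetical query parameter configured on the job in
    the scheduling service. hmac.compare_digest avoids leaking how many
    leading characters matched through timing differences.
    """
    params = parse_qs(urlparse(url).query)
    supplied = params.get("token", [""])[0]
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


if __name__ == "__main__":
    secret = "s3cret"  # in practice, read from configuration, not source
    print(is_authorized("http://yourapp.com/youraction?token=s3cret", secret))
```

The handler for the endpoint would call a check like this first and return an error status without running the job when it fails.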
