How to run scheduled job routines in Node.js

Right now I have a weekly email job that works by first checking a last_email_sent timestamp against the current time. It then uses setTimeout to schedule a routine for exactly one week after the last_email_sent timestamp. If the process ever restarts, the setTimeout is queued again, but with a correspondingly shorter delay. This works for a weekly email job, but is there a better way to handle jobs in Node.js? Maybe there's a module I'm not aware of that would let me manage my jobs.
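For reference, the approach described above can be sketched roughly as follows. This is only an illustration of the idea, not the asker's actual code: `msUntilNextRun`, `scheduleWeekly`, `loadLastSent` and `saveLastSent` are hypothetical names, and the two persistence helpers are assumed to read and write the last_email_sent timestamp (in milliseconds) somewhere durable.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Milliseconds to wait until the next run, given when the job last ran.
// Returns 0 if the next run is already overdue (e.g. the process was down).
function msUntilNextRun(lastSentMs, nowMs, intervalMs = WEEK_MS) {
  return Math.max(0, lastSentMs + intervalMs - nowMs);
}

// Schedules `job` and re-arms itself after each run.
// `loadLastSent` and `saveLastSent` are hypothetical persistence helpers.
function scheduleWeekly(job, loadLastSent, saveLastSent, now = Date.now) {
  const delay = msUntilNextRun(loadLastSent(), now());
  setTimeout(() => {
    job();
    saveLastSent(now());
    scheduleWeekly(job, loadLastSent, saveLastSent, now);
  }, delay);
}
```

A one-week delay fits within the largest delay setTimeout accepts in Node.js (2^31 - 1 ms, about 24.8 days); longer intervals would need to be split into several shorter timers.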

There's a handy module on npmjs.org called node-cron.
It'll give you more flexibility.

Many of the modules listed in the node.js wiki under "Message Queues" will help with this type of system. Being a TJ Holowaychuk fanboy, I myself would probably first look at Kue.

Related

Nodejs performing task on fixed time in the future

I have stumbled upon a type of problem that is difficult for me. Let's say we have an API that creates events in the future, for example two weeks from now. During this time, we can post comments, add photos, etc. on the event. After those two weeks pass, I want to close the event and change its type from 'OPEN' to 'CLOSED'. How should I achieve this?
I have tried the agenda library for this, but it seems to be aimed at a different kind of task, namely scheduled tasks. Are there any other options or practices for doing this?
I am using a Postgres database, if that's helpful.

Best way to implement background “timer” functionality in Python/Django

I am trying to implement a Django web application (on Python 3.8.5) which allows a user to create “activities” where they define an activity duration and then set the activity status to “In progress”.
The POST action to the View writes the new status, the duration and the start time (end time, based on start time and duration is also possible to add here of course).
The back-end should then keep track of the duration and automatically change the status to “Finished”.
User actions can also change the status to “Finished” before the calculated end time (i.e. the timer no longer needs to be tracked).
I am fairly new to Python so I need some advice on the smartest way to implement such a concept?
It needs to be efficient and scalable – I’m currently using a Heroku Free account so have limited system resources, but efficiency would also be important for future production implementations of course.
I have looked at the Python threading Timer, and this seems to work on a basic level, but I’ve not been able to determine what kind of constraints this places on the system – e.g. whether the spawned Timer thread might prevent the main thread from finishing and releasing resources (i.e. Heroku Dyno threads), etc.
I have read that persistence might be a problem (if the server goes down), and I haven’t found a way to cancel the timer from another process (the .cancel() method seems to rely on having the original object to cancel, and I’m not sure if this is achievable from another process).
I was also wondering about a more “background” approach, i.e. a single process which is constantly checking the database looking for activity records which have reached their end time and swapping the status.
But what would be the best way of implementing such a server?
Is it practical to read the database every second to find records with an end time of “now”? I need the status to change in real-time when the end time is reached.
Is something like Celery a good option, or is it overkill for a single process like this?
As I said I’m fairly new to these technologies, so I may be missing other obvious solutions – please feel free to enlighten me!
Thanks in advance.
To achieve this you need some kind of task-scheduling functionality. For a quick, simple implementation, the Timer object from the threading module is a good solution.
A more complete solution is to use Celery. If you are new to it, digging into it is a good investment: Celery works as a queue manager that distributes your work easily across several threads or processes.
You mentioned that you want it to be efficient and scalable, so I guess you will later want to implement similar features that require multiprocessing and scheduling; for that reason my recommendation is to use Celery.
You can integrate it into your Django application easily by following the documentation: Integrate Django with Celery.

How does node-cron remember its tasks?

I am trying to understand how node-cron by kelektiv works.
Specifically, if your node app crashes, how does it remember the dates that you scheduled for an event? Does it store the jobs in a database somewhere or somewhere locally on the machine?
Any recommended reading resources or an explanation will be very helpful.
Thank you in advance for your answer.
See this code: https://github.com/kelektiv/node-cron/blob/master/lib/cron.js
They calculate when the next run should happen with sendAt and how much time is left until then with getTimeout, and then start simply sets a setTimeout based on that.
It's a nice piece of code and I suggest you check it out; it's very simple and written in a very understandable way.
And no, it doesn't store the next time in a database; it's just setTimeout.
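To make the mechanism concrete, here is a minimal sketch of the same idea. It is not node-cron's actual code: the class `InMemoryJob`, the `nextRunAfter` callback and the `nextFullHour` rule are hypothetical names made up for illustration, and only the overall shape (compute the next time, compute the delay, arm a setTimeout) follows the description above.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Minimal in-memory scheduler sketch. `nextRunAfter` is a hypothetical
// function: given a Date, it returns the Date of the next scheduled run.
class InMemoryJob {
  constructor(nextRunAfter, onTick) {
    this.nextRunAfter = nextRunAfter;
    this.onTick = onTick;
    this.timer = null; // the only "memory" of the schedule
  }

  // Delay in milliseconds from `now` until the next run.
  getTimeout(now = new Date()) {
    return Math.max(0, this.nextRunAfter(now).getTime() - now.getTime());
  }

  // Arms a single timer; after it fires, the job re-arms itself.
  start() {
    this.timer = setTimeout(() => {
      this.onTick();
      this.start();
    }, this.getTimeout());
  }

  stop() {
    clearTimeout(this.timer);
    this.timer = null;
  }
}

// Example rule: run at the top of every hour (aligned to UTC hours).
function nextFullHour(now) {
  return new Date((Math.floor(now.getTime() / HOUR_MS) + 1) * HOUR_MS);
}
```

Because the schedule lives only in the timer and in the object's fields, a crash loses it. The job comes back only if the application creates and starts it again on boot, at which point the delay is recomputed from the current time.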

Is Cron the right option for this?

I am trying to create a script that watches my college timetable and registers users for a class when it opens. Kind of like an eBay auction sniper. I was wondering if cron is the right tool for this. I need to be able to run the script for every user. The user will enter their username and password and the script will query the timetable.
Looking for some advice on if cron is the tool or if there are other tools out there.
cron runs a particular program or script at a specified time. For example, if you wanted a report compiled and e-mailed every day at 2 a.m., that would be a cron job.
In this sense, cron has a timetable, but I am not sure that it is the sort of timetable of which you are thinking.
From a system-design perspective, the clean way to achieve the effect you want naturally would be to let the students' class requests join a queue, then to have the college's registrar's own computer take requests from the queue as seats became available. However, I assume from your Ebay reference that this is not possible in your case.

How to let users define the frequency of a job in an application?

I have an application that has to launch jobs repeatedly. But (yes, that would have been too easy without a but...) I would like users to define their backup frequency in the application.
In the worst case, they would have to choose between:
weekly,
daily,
every 12 hours,
every 6 hours,
hourly
In the best case, they should be able to use crontab expressions (see documentation for example).
How do I do this? Do I launch a job every minute that checks the last execution time and the frequency, and then launches another job if needed? Do I create a sort of queue that is executed by a master job?
Any clues, ideas, opinions, best practices and experiences are welcome!
EDIT: Solved this problem using the Akka scheduler. OK, this is a technical solution rather than a design answer, but everything works great.
Each user-defined repetition is an actor that sends a message every period to a new actor, which executes the actual job.
There may be two ways to do this depending on your requirements/architecture:
If you can only use Play:
The user creates the job and the frequency it will run (crontab, whatever).
On saving the job, you calculate the first time it will have to be run. You then add an entry to a table JOBS with the execution time, job id, and any other information required. This is required as Play is stateless and information must be stored in the DB for later retrieval.
You have a job that queries the table for entries whose execution date is earlier than now. It retrieves the first one, runs it, removes it from the table and adds a new entry for the next execution. You should keep an execution counter so that if a task fails (which means the entry is not removed from the DB), the job does not block the other tasks by retrying it again and again.
The frequency of this job is set to run every second. That way, while there is information in the table, you should execute the requests about as often as they are required. As Play won't spawn a new job while the current one is working, if you have enough tasks this one job will serve them all. If not, it will be killed at some point and restored when required.
Of course, the users' crons will not be very precise, as you have to account for your own cron delays plus the execution delays of all the tasks in the queue, which run sequentially. This is not the best approach unless you somehow disallow crons that run every second, or more often than every minute (to be safe). Checking the execution time of the crons and killing those that exceed a certain duration would also be a good idea.
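The polling steps above can be sketched as follows. This is written in JavaScript rather than Play's own language and uses an in-memory array as a stand-in for the JOBS table, so every name here (`pollJobs`, `runJob`, `nextRunAt`, `MAX_FAILURES`, the entry fields) is a hypothetical chosen for illustration. Unlike the description, which retrieves one entry per run, this sketch handles every due entry in a single pass.

```javascript
// In-memory stand-in for the JOBS table. Each entry is
// { jobId, runAt (ms since epoch), failures }.
const MAX_FAILURES = 3;

// Runs every entry that is due at `nowMs` and returns the updated table.
// `runJob(jobId)` does the work; `nextRunAt(jobId, nowMs)` computes the
// following execution time from the user's chosen frequency.
function pollJobs(table, nowMs, runJob, nextRunAt) {
  const remaining = [];
  for (const entry of table) {
    if (entry.runAt > nowMs) {
      remaining.push(entry); // not due yet
      continue;
    }
    try {
      runJob(entry.jobId);
      // Success: replace the entry with one for the next execution.
      remaining.push({
        jobId: entry.jobId,
        runAt: nextRunAt(entry.jobId, nowMs),
        failures: 0,
      });
    } catch (err) {
      // Failure: keep the entry but count the attempt, and drop it once
      // the limit is reached so it cannot block the other tasks.
      const failures = entry.failures + 1;
      if (failures < MAX_FAILURES) {
        remaining.push({ ...entry, failures });
      }
    }
  }
  return remaining;
}
```

In a real setup the table would live in the database and the function would be called by the once-per-second job. The failure counter plays the role of the execution counter mentioned above.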
If you can use more than Play:
The better alternative, I believe, is to use Quartz (see this) to create a future execution when the user creates the job, and to reprogram it once the execution is over.
There was a discussion on google-groups about it. As far as I remember, you must define a job which starts every 6 hours and checks which backups must be done. So you must remember when the last backup job finished and do the checking yourself. I'm unsure if Quartz can handle such a requirement.
I looked in the source code (always a good source ;-)) and found a method called every, which I think should do what you want. However, I'm unsure whether this is a clever design, because if you have 1000 users you will then have 1000 jobs. I'm unsure if Play was built to handle such a large number of jobs.
[Update] For cron-expressions you should have a look into JobPlugin.scheduleForCRON()
There are several ways to solve this.
If you don't have a really huge load of jobs, I'd just persist them to a table using the required flexibility. Then check all of them every hour (or the lowest interval you support) and run those eligible. Simple.
Or, if you prefer to use cron syntax anyway, just write (export) jobs to a user crontab using a wrapper which calls back to your running app, or starts the job in a standalone process if that's possible.
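As a rough illustration of the first option (persist the jobs, check them all at the lowest supported interval, run those eligible), here is a small sketch. The job rows and the names `FREQUENCY_MS` and `eligibleJobs` are hypothetical; the frequency labels are the five choices listed in the question.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

// The five frequencies listed in the question, as intervals in milliseconds.
const FREQUENCY_MS = {
  weekly: 7 * 24 * ONE_HOUR_MS,
  daily: 24 * ONE_HOUR_MS,
  'every 12 hours': 12 * ONE_HOUR_MS,
  'every 6 hours': 6 * ONE_HOUR_MS,
  hourly: ONE_HOUR_MS,
};

// Returns the jobs whose interval has elapsed since their last run.
// Each job is a hypothetical row: { id, frequency, lastRunMs }, where
// lastRunMs is null if the job has never run.
function eligibleJobs(jobs, nowMs) {
  return jobs.filter((job) => {
    const interval = FREQUENCY_MS[job.frequency];
    if (interval === undefined) {
      throw new Error(`Unknown frequency: ${job.frequency}`);
    }
    return job.lastRunMs === null || nowMs - job.lastRunMs >= interval;
  });
}
```

The hourly check would call `eligibleJobs` with the current time, run each returned job, and then store the new last-run timestamp for it.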
