How many simultaneous scheduled Jobs can I have in Node - node.js

In this Node app I'm working on, it's possible for users to book appointments.
When an appointment is booked, the users will later get a reminder by mail X hours before the actual appointment.
I'm thinking about using Node-schedule for this task.
For each appointment: set up a future Date, send the reminder mail once, and then delete this particular scheduled job.
But... there might be A LOT of appointments booked when the app grows, and this means there will be A LOT of Node-schedule jobs simultaneously sleeping and waiting to fire...
On a regular day, let's pretend we have 180 appointments booked for the future per client, and let's pretend the app right now has 50 clients. This gives us around 9000 scheduled tasks sleeping and waiting to fire.
Question: Is this perfectly OK? ... or will all these simultaneously scheduled tasks be too much/many?

Short answer: 9000 is not a lot, you're good to go. However, I would advise you to write a benchmark to see for yourself.
I checked node-schedule's source and sure enough, it delegates scheduling to setTimeout for date-based tasks. This means that your jobs are scheduled using node.js internal event loop.
This code is very efficient, as node.js is tailored to handle several thousands of requests simultaneously.
Regardless of the number of pending jobs, node.js will only care about the next task and sleep until its date (unless an async I/O interrupts it), so scheduling a task is essentially O(1).
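A benchmark along the lines suggested above could start from something like this. It is only a rough sketch: it measures plain setTimeout calls rather than node-schedule itself (which adds its own bookkeeping per job), and the function name benchmarkTimers and the one-second spacing between due times are assumptions made for illustration.

```javascript
// Schedule `count` one-shot timers far in the future and report the cost.
// This exercises raw setTimeout only, not node-schedule.
function benchmarkTimers(count, delayMs) {
  const heapBefore = process.memoryUsage().heapUsed;
  const start = process.hrtime.bigint();
  const timers = [];
  for (let i = 0; i < count; i++) {
    // Spread the due times out, roughly as appointment reminders would be.
    timers.push(setTimeout(() => {}, delayMs + i * 1000));
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  const heapDeltaKb = (process.memoryUsage().heapUsed - heapBefore) / 1024;
  // Cancel everything so the process can exit once the measurement is done.
  for (const timer of timers) clearTimeout(timer);
  return { count: timers.length, elapsedMs, heapDeltaKb };
}

console.log(benchmarkTimers(9000, 60 * 60 * 1000));
```

The heap figure is only indicative, since garbage collection can run in the middle of the measurement. Repeating the run with larger counts gives a feel for how the cost grows.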

Related

Multiple setTimeouts on Nodejs

I'm trying to implement an auto order cancel feature in my app. So I'm thinking of adding setTimeouts in Node which will cancel the user's order at a given time.
I tried adding the timer in the app but there are too many constraints.
Will multiple setTimeouts slow down the performance of our server?
Use Agenda instead of setTimeouts.
Agenda uses a MongoDB database to persist scheduled tasks (and the parameters needed for each task) so that even if the server goes down, the tasks will still run at the specified time or interval once it is back up.
References:
https://thecodebarbarian.com/node.js-task-scheduling-with-agenda-and-mongodb
https://medium.com/hacktive-devs/nodejs-scheduling-tasks-agenda-js-4b6824f9457e
Will multiple setTimeouts slow down the performance of our server?
No, it won't slow it down any more so than the CPU time used when each timer runs.
The timer design in node.js is specifically built to manage large numbers of timers well. There should be no issue with having lots of timers (tens of thousands would be fine). There's a sorted list of timers and it only uses an actual OS level timer for the "next" timer event to fire. When that fires, it grabs the next event in the list and sets an OS level timer for that one. When a new timer is created, it is inserted into the sorted list and, if it's not now the first timer in the list, it will just wait its turn until it is the first one in the list.
That said, you may not actually "need" a separate timer for each order. Since you don't need millisecond or even minute level accuracy, you could maintain a list of unfinished orders with a timestamp for when they were last modified, and then have a single interval timer that runs every several minutes and just checks which orders have exceeded your inactive time and should be cancelled. If the order list is sorted by its timestamp, you'd just check orders starting from the oldest end until you find one that doesn't need to be cancelled yet.
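The single-interval idea could look roughly like the sketch below. It assumes the open orders live in an in-memory array kept sorted by lastModified with the oldest first. The names takeExpiredOrders, openOrders and cancelOrder are made up for this example.

```javascript
// Remove and return the orders that have been idle longer than maxIdleMs.
// Assumes `orders` is sorted by lastModified, oldest first, so the scan can
// stop at the first order that is still within the allowed idle time.
function takeExpiredOrders(orders, now, maxIdleMs) {
  let expiredCount = 0;
  while (
    expiredCount < orders.length &&
    now - orders[expiredCount].lastModified > maxIdleMs
  ) {
    expiredCount++;
  }
  // splice mutates `orders` and returns the removed (expired) entries.
  return orders.splice(0, expiredCount);
}

// Usage (hypothetical names): one sweep every 5 minutes, 30 minute idle limit.
// setInterval(() => {
//   for (const order of takeExpiredOrders(openOrders, Date.now(), 30 * 60 * 1000)) {
//     cancelOrder(order);
//   }
// }, 5 * 60 * 1000);
```

Because the list is sorted, each sweep only touches the expired orders plus one more, regardless of how many orders are open. Like any in-memory approach, the list is lost on restart unless it is rebuilt from the database.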

What is better way to making longer delay inside a series of tasks?

I'm trying to build a workflow system which will process a series of tasks and delays. A delay can be changed or removed from a running workflow.
What is the better way to create a longer delay inside a series of tasks (like 3-4 months)? Right now two ways are floating around in my head:
Pre-calculating and saving the delay time. Set up a scheduler that checks the delay repeatedly at a specific interval (1 minute maybe). This will make a lot of database queries, but the delay can be changed instantly.
Scheduling a job for the delay. This can save a lot of database queries, but the problem is maintaining and changing the delay in these long-running jobs. Also, these jobs need to survive a server crash or restart.
Right now I'm not sure how to do it in a better way and still studying about it. If anyone has a similar experience, please share.
You can store the tasks in the database, like:
{
  _id: String,
  status: Enum,
  executionTime: timestamp
}
When you declare a new task, push a new entry into the DB.
At your server start, or when a new task is declared, create a setTimeout that will wake up your node.js when it's necessary.
Optimization
To avoid having X setTimeouts, with X being the number of tasks to execute, keep only one setTimeout, with the time to wait equal to the time until the closest task to execute.
For example, you have three tasks: one must run in 1 hour, one in 2 hours and one in 3 hours. Use a setTimeout of 1 hour. When it is triggered, it executes task 1 and then looks at the remaining tasks to re-arm the timer.
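A minimal sketch of the single-timer idea follows. The in-memory tasks array stands in for the database table, and the names nextDelay, runDueTasks, arm and addTask are invented for this example. One detail matters for delays of several months: setTimeout in Node.js only accepts delays up to 2^31 - 1 ms (about 24.8 days) and treats anything larger as 1 ms, so the sketch clamps the delay and simply re-arms when it wakes up early.

```javascript
// Largest delay setTimeout accepts (2^31 - 1 ms, about 24.8 days).
const MAX_TIMEOUT_MS = 2 ** 31 - 1;

// Hypothetical in-memory stand-in for the task table described above.
const tasks = [];
let timer = null;

// Delay until the earliest pending task, clamped to what setTimeout accepts.
function nextDelay(pending, now) {
  if (pending.length === 0) return null;
  const earliest = Math.min(...pending.map((task) => task.executionTime));
  return Math.min(Math.max(earliest - now, 0), MAX_TIMEOUT_MS);
}

// Run every pending task whose time has come and return the ids that ran.
function runDueTasks(now) {
  const ran = [];
  for (const task of tasks) {
    if (task.status === 'pending' && task.executionTime <= now) {
      task.status = 'done';
      task.run();
      ran.push(task._id);
    }
  }
  return ran;
}

// (Re)arm the single timer. Call at server start and whenever tasks change.
function arm() {
  clearTimeout(timer);
  const pending = tasks.filter((task) => task.status === 'pending');
  const delay = nextDelay(pending, Date.now());
  if (delay === null) return;
  timer = setTimeout(() => {
    runDueTasks(Date.now()); // may run nothing if the delay was clamped
    arm();
  }, delay);
}

function addTask(task) {
  tasks.push({ status: 'pending', ...task });
  arm();
}
```

Changing or removing a delay then comes down to updating the stored executionTime and calling arm() again. In a real system the array would be replaced by queries against the task table, so that pending tasks survive a restart.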

How does node-schedule work in this scenario

I am new to Node.js and I want to use node-schedule, but I have one question; please give me a suggestion regarding it.
https://github.com/node-schedule/node-schedule
When I set up a job that runs every 5 minutes and a run does not complete within 5 minutes, will the scheduler start another thread or not?
Please solve my query.
Thanks.
Since jobs don't seem to have a mechanism to let the scheduler know they are done, jobs will be scheduled according to their scheduled time alone.
In other words: if you schedule a job to run every 5 minutes, it will be started every 5 minutes, even if the job itself takes more than 5 minutes to complete.
To clarify: this doesn't start a new thread for each job, as JS is single-threaded. If a job blocks the event loop (for instance by doing heavy calculations), it is possible for the scheduler to not be able to start a new job when its time has arrived, but blocking the event loop is not a good thing.
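If overlapping runs are a problem, one option is to guard the job function yourself. The wrapper below is a sketch, not part of node-schedule; skipIfRunning is a made-up name, and it assumes the job is an async function (or returns a promise) so that "still running" can be detected.

```javascript
// Wrap an async job so that a new invocation is skipped while the previous
// one is still in progress. Resolves to true if the job ran, false if skipped.
function skipIfRunning(job) {
  let running = false;
  return async function guarded(...args) {
    if (running) return false; // previous run has not finished: skip this tick
    running = true;
    try {
      await job(...args);
      return true;
    } finally {
      running = false; // release the guard even if the job throws
    }
  };
}

// Usage with node-schedule (requires the package to be installed):
// const schedule = require('node-schedule');
// schedule.scheduleJob('*/5 * * * *', skipIfRunning(async () => { /* work */ }));
```

This only prevents overlap inside a single Node.js process. If several instances of the app run the same schedule, a shared lock (for example in the database) would be needed instead.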

How to put Rendezvous function in jmeter thread or vuser for a paticular function

I am new to JMeter; I am familiar with LR (LoadRunner), but I am not able to find some of its functionality in JMeter for "VUser/Thread Group".
When I run my script with 10 threads in JMeter, what does that mean:
all 10 users are performing same action at same time
or
each thread is performing separate actions
or
once one thread complete then another thread will start its execution.
How do I put a 'Rendezvous' function in JMeter for a particular transaction or action, like writing lr_rendezvous("R1"); in LoadRunner to make all vusers hit at the same time? Is that possible in JMeter, and if so, how?
If you set the number of threads to 10, it is the LoadRunner equivalent of executing with 10 virtual users. All threads will start executing at the same time, provided the ramp-up period is 0.
You can use the Synchronizing Timer to achieve a rendezvous in JMeter:
http://jmeter.apache.org/usermanual/component_reference.html#Synchronizing_Timer
https://blazemeter.com/blog/using-jmeter-synchronizing-timer
Concerning the 1st part,
'running script with 10 threads' means that they would all start running at the same time... IF ramp-up time == 0.
If you set ramp-up to [someValue], the thread start times will be staggered. See the article from the JMeter docs:
Each thread will execute the test plan in its entirety and completely independently of other test threads. Multiple threads are used to simulate concurrent connections to your server application.
The ramp-up period tells JMeter how long to take to "ramp-up" to the
full number of threads chosen. If 10 threads are used, and the ramp-up
period is 100 seconds, then JMeter will take 100 seconds to get all 10
threads up and running. Each thread will start 10 (100/10) seconds
after the previous thread was begun. If there are 30 threads and a
ramp-up period of 120 seconds, then each successive thread will be
delayed by 4 seconds.
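The ramp-up arithmetic quoted above can be checked with a few lines of code. This is purely an illustration of the formula (ramp-up period divided by thread count); threadStartOffsets is a made-up helper, not a JMeter API.

```javascript
// Start offset, in seconds, of each thread for a given ramp-up period.
// Each thread starts (rampUpSeconds / threads) after the previous one.
function threadStartOffsets(threads, rampUpSeconds) {
  const step = rampUpSeconds / threads;
  return Array.from({ length: threads }, (_, index) => index * step);
}

console.log(threadStartOffsets(10, 100)); // step of 10 seconds between threads
console.log(threadStartOffsets(30, 120)); // step of 4 seconds between threads
```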
If you are reproducing human behavior, a full rendezvous event (more than one person in the same section of code at the same time engaging in the same function) is exceedingly rare... on the order of a credit card company having only 4-5 people in such an incident on the largest shopping day of the year.
So, if you are headed down this path, consider whether you are trying to reproduce human behavior or whether you have a technical metric you are hitting with a small number of focused users on such an event.
Heavy emphasis on rendezvous use on a resume is a mark of someone you do not want to hire.

How to define frequency of a job in application by users?

I have an application that has to launch jobs repeatedly. But (yes, that would have been too easy without a but...) I would like users to define their backup frequency in the application.
In worst case, they would have to choose between :
weekly,
daily,
every 12 hours,
every 6 hours,
hourly
In best case, they should be able to use crontab expressions (see documentation for example)
How to do this? Do I launch a job every minute that checks the last execution time and frequency and then launches another job if needed? Do I create a sort of queue that will be executed by a master job?
Any clues, ideas, opinions, best practices, experiences are welcome!
EDIT: Solved this problem using the Akka scheduler. OK, this is a technical solution, not a design answer, but still everything works great.
Each user-defined repetition is an actor that sends a message every period to a new actor, which executes the actual job.
There may be two ways to do this depending on your requirements/architecture:
If you can only use Play:
The user creates the job and the frequency it will run (crontab, whatever).
On saving the job, you calculate the first time it will have to be run. You then add an entry to a table JOBS with the execution time, job id, and any other information required. This is required as Play is stateless and information must be stored in the DB for later retrieval.
You have a job that queries the table for entries whose execution date is less than now. Retrieves the first, runs it, removes it from the table and adds a new entry for next execution. You should keep some execution counter so if a task fails (which means the entry is not removed from DB) it won't block execution of the other tasks by the job trying again and again.
The frequency of this job is set to run every second. That way, while there are entries in the table, you should execute the tasks about as often as they are required. As Play won't spawn a new job while the current one is working, if you have enough tasks this one job will serve all. If not, it will be killed at some point and restored when required.
Of course, the crons of the users will not be too precise, as you have to account for your own cron delays plus execution delays on all the tasks in the queue, which will be run sequentially. Not the best approach, unless you somehow disallow crons which run every second or more often than every minute (to be safe). Checking the execution time of the crons and killing them if they run over a certain amount of time would be a good idea.
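The table-polling approach described above is independent of the framework. The sketch below shows the idea in JavaScript rather than Play, with an in-memory array standing in for the JOBS table. The names FREQUENCY_MS and runDueJobs, the field names, and the choice to reschedule a failed job for its next slot are all assumptions made for illustration.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// The fixed frequencies from the question, expressed in milliseconds.
const FREQUENCY_MS = {
  weekly: 7 * 24 * HOUR_MS,
  daily: 24 * HOUR_MS,
  'every 12 hours': 12 * HOUR_MS,
  'every 6 hours': 6 * HOUR_MS,
  hourly: HOUR_MS,
};

// One polling pass: run every entry whose execution time has passed, then
// move it to its next slot. A failure is counted on the entry but does not
// block the remaining entries.
function runDueJobs(jobsTable, now, execute) {
  const ran = [];
  for (const entry of jobsTable) {
    if (entry.executionTime > now) continue;
    try {
      execute(entry);
      entry.failures = 0;
    } catch (err) {
      entry.failures = (entry.failures || 0) + 1;
    }
    entry.executionTime = now + FREQUENCY_MS[entry.frequency];
    ran.push(entry.jobId);
  }
  return ran;
}
```

A single periodic job would call runDueJobs with the current time. Supporting full crontab expressions would mean replacing the FREQUENCY_MS lookup with a cron parser that computes the next execution time.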
If you can use more than Play:
The better alternative, I believe, is to use Quartz (see this) to create a future execution when the user creates the job, and reprogram it once the execution is over.
There was a discussion on google-groups about it. As far as I remember, you must define a job which starts every 6 hours and checks which backups must be done. So you must remember when the last backup job finished and do the check yourself. I'm unsure if Quartz can handle such a requirement.
I looked in the source code (always a good source ;-)) and found a method every, which I think should do what you want. However, I'm unsure if this is a clever design, because if you have 1000 users you will then have 1000 jobs. I'm unsure if Play was built to handle such a large number of jobs.
[Update] For cron-expressions you should have a look into JobPlugin.scheduleForCRON()
There are several ways to solve this.
If you don't have a really huge load of jobs, I'd just persist them to a table using the required flexibility. Then check all of them every hour (or the lowest interval you support) and run those eligible. Simple.
Or, if you prefer to use cron syntax anyway, just write (export) jobs to a user crontab using a wrapper which calls back to your running app, or starts the job in a standalone process if that's possible.

Resources