No Mongo Query gets result when cron is running in background - node.js

I have been NodeJS as server side and MongoDB as our database. It really works great together.
Now I have added node-schedule library into our system , to call a function like a cron-job.
The process takes around hours to complete.
My issue is whenever cron is running , all users to my site gets No response fro server i.e database gets locked.
Stuck on the issue from a week , needs good solution to run cron , without affecting users using the site.

Typically you will want to write a worker and run the worker in a different entry point that is not part of your server. There are multiple ways you could achieve this.
1) Write a worker on another server to interact with your database
2) Write a service worker on another server that interacts with your api
3) Use the same server but setup a cronjob to execute the file that does the work at a specified time.
But you should not do this from the same entry point that your server is running on. You need a different execution file.
There is one thing you can do to run this where it will not bog down your server and that would be for your trigger for node-schedule to run a child process. https://nodejs.org/api/child_process.html

Related

How to synchronize background tasks across multiple FastAPI processes?

I have a FastAPI application which sends emails when the users register on the website.
The app is implemented in such a way that there is a cron task (scheduled every minute) which checks the database and if a flag is set, it will try to send an email.
Deployment: Two instances of the FastAPI application are running connected to a locally hosted MYSQL database
Problem: Since, there are two instances of the Application running, each one will trigger a cron job every minute and this results in sending an email twice.
How to synchronise between multiple processes? Please help me with this issue. Thanks

Where do I save functions that are called with a setTimeout?

I'm learning NodeJS and am trying to stick with the MVC architecture. I'm getting stuck on where to place those functions that update data from an outside source on a set loop, with a 30 second or so delay.
Example: I build an app that takes data from a API, Orders in this case, and stores it in a database. I can add orders to my database locally, and I want the orders database to be synchronized with the outside source mentioned previously, every 30 seconds.
My models directory will contain Order.js which includes an order schema and it will connect to MongoDB via Mongoose. My controller will have API endpoints for CRUD operations.
Where does the function go that refreshes the data from the server? In the controller? Then I would export that function so that I can set up the loop that updates the database in my app.js (or whatever I use to start the application)?
I recommend using something like node-cron to handle the setTimeout for you. It gives you the advantage of cron-like syntax to run your jobs on a schedule and will run while your node app is. I would put these jobs in a separate directory with node cron jobs. The individual node cron job can then import your MongoDB model. Your main application can then import index.js or something similar from the cronjobs dir which imports all your node cron jobs to bootstrap them on application startup.

Cron job on NodeJS server runs multiple times simultaneously due to load balancers

I have cron job services on my nodeJS server (part of a React app) that I deploy using Convox to AWS, which has 4 load balancer servers. This means my cron job runs 4 times simultaneously on each server, when I only want it to run once. How can I stop this from happening and have my cron jobs run only once? As far as I know, there is no reliable way to lock my cron to a specific instance, since instances are volatile and may be deleted/recreated as needed.
The cron job services conduct tasks such as querying and updating our database, sending out emails and texts to users, and conducting external API calls. The services are run using the cron npm package, upon the server starting (after server.listen).
Can you expose these tasks via url? That way you can have an external cron service that requests each job via url against the ELB.
See https://cron-job.org/en/
Another advantage of this approach is you get error reports if a url does not return a 200 status. This could simplify error tracking across all jobs.
Also this provides better redudency and load balancing, as opposed to having a single instance where you run all jobs.
I had the same issue. Se my solution here. Two emails was sent because of two instances on AWS. I lock each sending by unique random number.
My example based on MongoDB.
https://forums.meteor.com/t/help-email-sends-emails-twice/50624

Node.js(&MongoDB) server crashes , Database operations halfway?

I have a node.js app with mongodb backend going to production in a week and i have few doubts on how to handle app crashes and restart .
Say i have a simple route /followUser in which i have 2 database operations
/followUser
----->Update User1 Document.followers = User2
----->Update User2 Document.followers = User1
----->Some other mongodb(via mongoose)operation
What happens if there is a server crash(due to power failure or maybe the remote mongodb server is down ) like this scenario :
----->Update User1 Document.followers = User2
SERVER CRASHED , FOREVER RESTARTS NODE
What happens to these operations below ? The system is now in inconsistent state and i may have error everytime i ask for User2 followers
----->Update User2 Document.followers = User1
----->Some other mongodb(via mongoose)operation
Also please recommend good logging and restart/monitor modules for apps running in linux.
Right now im using domains for to catch exceptions , doing server.close , but before process.exit() i want to make sure all database transactions are done , can i check this by testing if the event loop is empty or not (how?) and then process.exit(1) ?
You need transactions for this, and since MongoDB doesn't have them here is a workaround http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/
One way to address this problem is to add cleanup code to your application that runs whenever the application starts. You write the cleanup code to perform sanity checks on any portions of your data that can be updated in multiple steps like your example and then repairs that data in whatever way make sense for your application.
Depending on the complexity of your application/data, this may also require that you keep a log of actions the app was trying to perform, but that gets complicated real fast. Ideally it's more a matter of refreshing denormalized data and deleting partial data.
You want to do this during startup rather than shutdown as there's no guarantee your shutdown code will fully run and if you're shutting down because of an exception you don't know what the state of your system is at that point.
the solution given by vkurchatkin in this link is a workaround in case your appserver crashes, because you will be able of knowing which transactions were pending at that moment. If you implement this in your code you can create cleanup code when your system restart as suggested by JohnnyHK. The code you mention (catching exceptions, test when closing, etc) will not work because...well.. your server crashed! ;-)
That said, this is done using the database, so you will have to warrantee to a certain point that your database does not crash. I would suggest your use replication for that. It is basically a cluster of servers that recovers itself if one node fails, and also you can make some check to make sure that the data reached the servers and is safe.
Hope this helps.

How to run cronjobs on local files on cloudControl PaaS?

On cloudControl, I can either run a local task via a worker or I can run a cronjob.
What if I want to perform a local task on a regular basis (I don't want to call a publicly accessible website).
I see possible solutions:
According to the documentation,
"cronjobs on cloudControl are periodical calls to a URL you specify."
So calling the file locally is not possible(?). So I'd have to create a page I can call via URL. And I have to perform checks, if the client is on localhost (=the server) -- I would like to avoid this way.
I make the worker sleep() for the desired amount of time and then make it re-run.
// do some arbitrary action
Foo::doSomeAction();
// e.g. sleep 1 day
sleep(86400);
// restart worker
exit(2);
Which one is recommended?
(Or: Can I simply call a local file via cron?)
The first option is not possible, because the url request is made from a seperate webservice.
You could either use HTTP authentication in the cron task, but the worker solution is also completely valid.
Just keep in mind that the worker can get migrated to a different server (in case of software updates or hardware failure), so do SomeAction() may get executed more often than once per day from time to time.

Resources