Where do I save functions that are called with a setTimeout? - node.js

I'm learning NodeJS and am trying to stick with the MVC architecture. I'm getting stuck on where to place those functions that update data from an outside source on a set loop, with a 30 second or so delay.
Example: I build an app that takes data from an API (orders, in this case) and stores it in a database. I can add orders to my database locally, and I want the orders database to be synchronized with that outside source every 30 seconds.
My models directory will contain Order.js which includes an order schema and it will connect to MongoDB via Mongoose. My controller will have API endpoints for CRUD operations.
Where does the function go that refreshes the data from the server? In the controller? Then I would export that function so that I can set up the loop that updates the database in my app.js (or whatever I use to start the application)?

I recommend using something like node-cron instead of managing the setTimeout yourself. It gives you cron-like syntax for running jobs on a schedule, and the jobs keep running for as long as your Node app is running. I would put these jobs in a separate directory for cron jobs. Each individual job can then import your MongoDB (Mongoose) model. Your main application can then import an index.js (or something similar) from that directory, which imports all your cron jobs and bootstraps them on application startup.
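A minimal sketch of that layout, under some assumptions: the file names, createSyncOrdersJob, registerJobs, fetchRemoteOrders and upsertOrder are all hypothetical names chosen for illustration, and the scheduler is passed in as a function so the jobs themselves do not depend on node-cron directly.

```javascript
// jobs/syncOrders.js (hypothetical file name)
// fetchRemoteOrders and upsertOrder are injected placeholders: in a real app they
// might wrap an HTTP client call and a Mongoose upsert on the Order model.
function createSyncOrdersJob({ fetchRemoteOrders, upsertOrder }) {
  let running = false; // prevents overlapping runs when a sync takes longer than the interval

  return async function syncOrders() {
    if (running) return 0; // a previous run is still in progress, skip this tick
    running = true;
    try {
      const orders = await fetchRemoteOrders();
      for (const order of orders) {
        await upsertOrder(order);
      }
      return orders.length;
    } finally {
      running = false;
    }
  };
}

// jobs/index.js (hypothetical): hands every job to a scheduler function.
// `schedule` is assumed to have the shape of node-cron's cron.schedule(expression, task).
function registerJobs(schedule, jobs) {
  for (const { expression, task } of jobs) {
    schedule(expression, task);
  }
  return jobs.length;
}

// In app.js this might be wired up roughly as follows (not executed here):
//   const cron = require('node-cron');
//   registerJobs((expr, task) => cron.schedule(expr, task), [
//     { expression: '*/30 * * * * *', task: syncOrders }, // six fields: every 30 seconds
//   ]);
```

Keeping the data access behind injected functions means the controller, the model and the job can all share the same model code without the job knowing anything about Express.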

Related

Doing tasks before heroku nodejs server is ready

When deploying a new release, I would like my server to do some tasks before it is actually released and starts listening for HTTP requests.
Let's say that those tasks take around a minute and are setting some variables: until the tasks are done I would like the users to be redirected to the old release.
Basically do some nodejs work before the server is ready.
I tried a naive approach:
doSomeTasks().then(() => {
  app.listen(PORT);
});
But as soon as the new version is released, all HTTPS requests made during the tasks fail instead of being redirected to the old release.
I have read https://devcenter.heroku.com/articles/release-phase but it looks like I can only run an external script, which does not work for me since my tasks set cache variables.
I know this is possible with /check_readiness on App Engine, but I was wondering for Heroku.
You have a couple options.
If the work you're doing only changes on release, you can add a task as part of your dyno build stage that will fetch and store data inside of the compiled slug that will be deployed to virtual containers on Heroku and booted as your dyno. For example, you can run a task in your build cycle that fetches data and stores/caches it as a file in your app that you read on-boot.
If this data changes more frequently (e.g. daily), you can utilize “preboot” to capture and cache this data on a per-dyno basis. Depending on the data and architecture of your app you may want to be cautious with this approach when running multiple dynos as each dyno will have data that was fetched independently, thus this data may not match across instances of your application. This can lead to subtle, hard to diagnose bugs.
This is a great option if you need to, for example, pre-cache a larger chunk of data and then fetch only new data on a per-request basis (e.g. fetch the last 1,000 posts in an RSS feed on-boot, then per request fetch anything newer—which is likely to be fewer than a few new entries—and coalesce the data to return to the client).
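The pre-cache-then-top-up idea can be sketched as follows. This is only an illustration: createWarmCache, fetchLatest, fetchNewerThan and the publishedAt field are hypothetical names, and both fetch functions are assumed to return items newest first.

```javascript
// Sketch: warm a cache once at boot, then fetch only newer items per request.
// fetchLatest(limit) and fetchNewerThan(timestamp) are hypothetical data-source functions.
function createWarmCache({ fetchLatest, fetchNewerThan, limit = 1000 }) {
  let items = []; // kept newest first

  // Call once during boot (for example while the dyno is still in preboot).
  async function warm() {
    items = await fetchLatest(limit);
  }

  // Call per request: fetch anything newer than what is cached and merge it in.
  async function read() {
    const newestSeen = items.length > 0 ? items[0].publishedAt : 0;
    const fresh = await fetchNewerThan(newestSeen);
    items = [...fresh, ...items].slice(0, limit); // coalesce and cap the cache size
    return items;
  }

  return { warm, read };
}
```

Because each dyno holds its own copy of `items`, the caveat about instances drifting apart applies to this sketch as well.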
Here's the documentation on customizing a build process for Node.js on Heroku.
Here's the documentation for enabling and working with Preboot on Heroku
I don't think this is a good approach. You can use an external script (an npm script) to do this task and then run it in the release phase. The situation is very similar to running migrations: the script can require whatever libraries it needs, and it can even load the whole application without listening on a port. Let's make it clearer with an example:
// script file
var client = require('cache_client');
// here you can require all the libraries the script needs,
// then execute your logic using sync APIs
client.setCacheVar('xyz', 'xyz');
Then, in package.json, add this script under "scripts". Let's assume the script file is saved as set_cache.js and the npm script is named set_cache:
"scripts": {
  "set_cache": "node set_cache.js"
}
Now you can run this script with npm as npm run set_cache, and use that command in the Procfile:
web: npm start
release: npm run set_cache

nodejs - run a function at a specific time

I'm building a website where some users enter input, and after a specific amount of time an algorithm has to run that takes the users' input stored in the database, creates some results for them, and stores those results in the database as well. The problem is that in nodejs I can't figure out where and how I should implement this algorithm so that it runs after a specific amount of time and only once (every few minutes or seconds).
The app is built with nodejs-expressjs.
For example, let's say that I start the application and after 3 minutes the algorithm should run, take some data from the database, and, once it has created some output, store it in the database again.
What are the typical solutions for that (at least one is enough)? Thank you!
Let's say you have a user request that saves a URL to crawl and gets back the listed products.
One of the simplest ways would be:
On each user request, create a record in a "tasks" table in the DB:
userId | urlToCrawl | dateAdded | isProcessing | ....
Then, in the main node site, have something like setInterval(findAndProcessNewTasks, 60000), so that every minute (or whatever interval you need) it picks up all tasks that are not currently being worked on (where isProcessing is false).
findAndProcessNewTasks queries the DB and runs your algorithm for every record that has not been processed yet. It also sets isProcessing to true.
Once the algorithm has finished, it removes the record from tasks (or sets another field, such as "finished", to true).
Depending on the load and the number of tasks, it may make sense to run your algorithm in another node app.
Typically you would have a message bus (Kafka, RabbitMQ, etc.), with the main app just sending events and worker node.js apps doing the actual job and inserting products into the DB.
This would keep the main app lightweight and allow you to scale the worker apps.
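Here is one way the polling approach above might look. It is a sketch under assumptions: the in-memory store stands in for the "tasks" table, and createMemoryTaskStore, createTaskProcessor, claim, finish and release are hypothetical names rather than part of any library.

```javascript
// In-memory stand-in for the "tasks" table; a real app would back this with the database.
function createMemoryTaskStore() {
  const tasks = new Map();
  let nextId = 1;
  return {
    add(userId, urlToCrawl) {
      const task = { id: nextId++, userId, urlToCrawl, dateAdded: new Date(), isProcessing: false, finished: false };
      tasks.set(task.id, task);
      return task;
    },
    findPending() {
      return [...tasks.values()].filter((t) => !t.isProcessing && !t.finished);
    },
    // Returns false if the task was already taken, so it is never processed twice.
    claim(id) {
      const task = tasks.get(id);
      if (!task || task.isProcessing || task.finished) return false;
      task.isProcessing = true;
      return true;
    },
    finish(id, result) {
      Object.assign(tasks.get(id), { isProcessing: false, finished: true, result });
    },
    release(id) {
      tasks.get(id).isProcessing = false; // failed tasks become eligible again
    },
    get(id) {
      return tasks.get(id);
    },
  };
}

// Builds the function that the interval calls; runAlgorithm is the user's own logic.
function createTaskProcessor(store, runAlgorithm) {
  return async function findAndProcessNewTasks() {
    let processed = 0;
    for (const task of await store.findPending()) {
      if (!(await store.claim(task.id))) continue;
      try {
        const result = await runAlgorithm(task);
        await store.finish(task.id, result);
        processed += 1;
      } catch (err) {
        await store.release(task.id);
      }
    }
    return processed;
  };
}

// Wiring it up (not executed here): setInterval(findAndProcessNewTasks, 60000);
```

With a real database the claim step would need to be an atomic update, otherwise two workers could pick up the same record.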
From your question it's not clear whether you want to run the algorithm on the web server (perhaps processing input from multiple users) or on the client (processing the input from a particular user).
If the former, then use setTimeout(), or something similar, in your main javascript file that creates the web server listener. Your server can then be handling inputs from users (via the app listener) and in parallel running algorithms that look at the database.
If the latter, then use setTimeout(), or something similar, in the javascript code that is being loaded into the user's browser.
You may actually need some combination of the above: code running on the server to periodically do some processing on a central database, and code running in each user's browser to periodically refresh the user's display with new data pulled down from the server.
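For the server-side case, a one-shot delayed run could be wrapped like this. runOnceAfter is a hypothetical helper name, and the sketch assumes the task is a function that may return a promise.

```javascript
// Runs `task` once after `delayMs` milliseconds and resolves with its result.
function runOnceAfter(delayMs, task) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      // Promise.resolve().then(task) also catches synchronous throws from the task.
      Promise.resolve().then(task).then(resolve, reject);
    }, delayMs);
  });
}

// In the file that starts the web server this might be used as (not executed here):
//   app.listen(PORT);
//   runOnceAfter(3 * 60 * 1000, runAlgorithm).catch(console.error);
```

The timer lives in the server process, so a restart before the delay elapses cancels it; anything that must survive restarts needs its state stored somewhere persistent.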
You might also want to implement a websocket and json rpc interface between the client and the server. Then, rather than having the client "poll" the server for the results of your algorithm, you can have the client listen for events arriving on the websocket.
Hope that helps!
If I understand you correctly - I would just send the data to the client-side while rendering the page and store it into some hidden tag (like input type="hidden"). Then I would run a script on the server-side with setTimeout to display the data to the client.

No Mongo Query gets result when cron is running in background

I have been using NodeJS on the server side and MongoDB as our database. They work really well together.
Now I have added the node-schedule library to our system, to call a function like a cron job.
The process takes hours to complete.
My issue is that whenever the cron job is running, all users of my site get no response from the server, i.e. the database gets locked.
I have been stuck on this issue for a week and need a good solution for running the cron job without affecting users of the site.
Typically you will want to write a worker and run the worker in a different entry point that is not part of your server. There are multiple ways you could achieve this.
1) Write a worker on another server to interact with your database
2) Write a service worker on another server that interacts with your api
3) Use the same server but set up a cronjob to execute the file that does the work at a specified time.
But you should not do this from the same entry point that your server is running on. You need a different execution file.
One thing you can do to keep this from bogging down your server is to have your node-schedule trigger run a child process: https://nodejs.org/api/child_process.html
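A rough sketch of the child-process variant, using Node's built-in child_process module. runWorker and worker.js are hypothetical names, and the node-schedule call appears only in a comment as an assumed usage.

```javascript
const { spawn } = require('node:child_process');

// Starts a separate Node process and resolves once it exits.
// nodeArgs is what you would type after `node`, e.g. ['worker.js'].
function runWorker(nodeArgs) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, nodeArgs, { stdio: 'inherit' });
    child.on('error', reject); // the process could not be started
    child.on('exit', (code, signal) => resolve({ code, signal }));
  });
}

// With node-schedule the trigger might then look like this (not executed here):
//   const schedule = require('node-schedule');
//   schedule.scheduleJob('0 2 * * *', () => runWorker(['worker.js']));
```

The long-running work then has its own event loop, so the web server's event loop stays free to answer requests. Heavy queries from the worker can still load the database itself.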

How can I "break up" a long running server side function in a Meteor app?

I have, as part of a meteor application, a server side that gets POST messages of information to feed to the web client via inserts/updates to a Collection. So far so good. However, sometimes these updates can be rather large (50K records a go, every 5 seconds). I was having a hard time keeping up with this until I started using the batch-insert package and then the low-level batch.find.update() and batch.execute() from Mongo.
However, there is still a good amount of processing going on even with 50K records (it does some calculations, analytics, etc). I would LOVE to be able to "thread" that logic so the main event loop can continue along. However, I am not sure there is a real easy way to create "real" threads for this within Meteor. So, barring that, I would like to know the best / proper way of at least "batching" the work so that every N (say 1K or so) records I can release the event loop to process other events (like some client-side DDP messages and the like), then do another 1K records, and so on until all the records I need are done.
I am THINKING the solution lies within using Fibers/Futures -- which appear to be the Meteor way -- but I am not positive that is correct or whether the low-level ideas like "setTimeout()" and/or "setImmediate()" are more appropriate.
TIA!
Meteor is not a one size fits all tool. I think you should decouple your meteor application from your batch processing. Set up a separate meteor instance, or better yet set up a pure node.js server to handle these requests and batch processes. It would look like this:
Create a node.js instance that connects to the same mongo database using the mongodb plugin (https://www.npmjs.com/package/mongodb).
Use express if you're using node.js to handle the post requests (https://www.npmjs.com/package/express).
Do the batch processing/inserts/updates in this instance.
The updates in mongo will be reflected in meteor very quickly. I had a similar situation and used a node server to do some batch data collection and then pass it into a cassandra database. I then used pig latin to run some batch operations on that data, and then inserted it into mongo. My meteor application would reactively display the new data pretty much instantaneously.
You can call this.unblock() inside a server method so that subsequent method calls from the same client can start running in a new fiber instead of waiting for this method to finish. The method itself keeps running in the background. See the example below.
Meteor.methods({
  longMethod: function() {
    this.unblock();
    Meteor._sleepForMs(1000 * 60 * 60);
  }
});
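The batching idea raised in the question can also be sketched in plain Node.js with setImmediate. This is not the Fibers/Futures approach and is not taken from the answers above; processInBatches and handleRecord are hypothetical names. Inside Meteor, a callback that touches Meteor APIs would probably also need to run in a Fiber (for example via Meteor.bindEnvironment), which this sketch does not handle.

```javascript
// Processes `records` in chunks of `batchSize`, yielding to the event loop between chunks.
function processInBatches(records, handleRecord, batchSize = 1000) {
  return new Promise((resolve, reject) => {
    let index = 0;

    function runBatch() {
      try {
        const end = Math.min(index + batchSize, records.length);
        for (; index < end; index++) {
          handleRecord(records[index], index);
        }
      } catch (err) {
        reject(err);
        return;
      }
      if (index < records.length) {
        setImmediate(runBatch); // let pending I/O and other callbacks run first
      } else {
        resolve(index); // number of records handled
      }
    }

    runBatch();
  });
}
```

Each batch still blocks the event loop while it runs, so the batch size is a trade-off between throughput and responsiveness.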

Which all libraries for NodeJS provide persistent scheduling and cron jobs

From what I have read, only Agenda, Node-crontab and schedule-drone provide this feature. I would be grateful if you could briefly describe the mechanism these libraries use for persistent storage of jobs.
I need to send emails by reading the mail options from MongoDB, and I want my NodeJS application to somehow schedule them and stay in sync with them even if NodeJS is stopped temporarily.
For MySQL you can try nodejs-persistable-scheduler.
In other cases you need to build your own solution. For example, I created a collection/table to store the schedule state and rules. Then, if the service crashes or is restarted, I can get all the schedules from the database and restart them from the app.listen callback.
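A minimal sketch of that restore step, assuming each stored schedule is a row with a runAt timestamp in milliseconds and a done flag. The row shape, restoreSchedules, runJob and loadSchedulesFromDb are hypothetical and not taken from any of the libraries mentioned.

```javascript
// Re-arms timers for every stored schedule that has not completed yet.
// `rows` would come from the schedules collection/table; `setTimer` is injectable for testing.
function restoreSchedules({ rows, now, runJob, setTimer = setTimeout }) {
  return rows
    .filter((row) => !row.done)
    .map((row) => {
      const delayMs = Math.max(0, row.runAt - now); // overdue jobs fire right away
      return setTimer(() => runJob(row), delayMs);
    });
}

// On startup this might be called as (not executed here):
//   app.listen(PORT, async () => {
//     const rows = await loadSchedulesFromDb(); // hypothetical loader
//     restoreSchedules({ rows, now: Date.now(), runJob: sendEmail });
//   });
```

runJob should mark the row as done in the database once it succeeds, otherwise the job runs again after the next restart. Note that setTimeout cannot represent delays longer than about 24.8 days, so far-future schedules would need to be re-checked periodically instead.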

Resources