Asynchronous function with multiple emit events (futures in meteor) - node.js

My use case is to read RSS feed items asynchronously and load them into a meteor collection.
I have the feedparser npm module that does the parsing. It emits three events, .on('error'), .on('meta') and .on('readable'), each with a different output.
When I run it in fixtures.js with just console.log statements to print the output, it works fine.
When I use the same code to insert into a collection, I get errors related to the asynchronicity of the function (I assume it has something to do with fibers).
So, I want to make it into a meteor method using futures as below -
http://www.discovermeteor.com/patterns/5828399
I tried but could not wrap my head around handling multiple events in Futures.

If you just want to push something to db at one point, it's enough to synchronize this call. Other than that, you can do whatever you want asynchronously. For example:
var Fiber = Npm.require('fibers');
var item = {};
var onInit = function() {
  // do whatever with item
};
var onData = function() {
  // do whatever with item
};
var onFinish = function() {
  new Fiber(function(){
    Documents.insert(item);
  }).run();
};
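One way to handle several events from a single emitter is to collect data as it arrives and settle exactly once on the terminating event. The sketch below uses a plain Node EventEmitter as a stand-in for the feed parser and a Promise rather than a Future. The `item` and `end` event names, the `collectItems` helper and the `fakeParser` object are assumptions made for illustration, not feedparser's API.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: gathers every 'item' payload and settles once.
function collectItems(emitter) {
  return new Promise((resolve, reject) => {
    const items = [];
    emitter.on('item', (item) => items.push(item)); // may fire many times
    emitter.once('error', reject);                  // fail on the first error
    emitter.once('end', () => resolve(items));      // done: hand over everything
  });
}

// Stand-in for the parser: emits two items, then 'end', asynchronously.
const fakeParser = new EventEmitter();
const collected = collectItems(fakeParser);
setImmediate(() => {
  fakeParser.emit('item', { title: 'first' });
  fakeParser.emit('item', { title: 'second' });
  fakeParser.emit('end');
});

collected.then((items) => {
  // In Meteor, this single point is where the insert would be wrapped in a Fiber.
  console.log(items.length + ' items collected');
});
```

The point of the design is that the database write happens in one place, after all events have been seen, so only that one call needs to be synchronized.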

Although Meteor is a great tool, I think Node and its async model are brilliant and the best tool for what you are doing. As a plan B, consider making this part of your project a plain Node app.
Otherwise,
async from meteor
and

Related

How Nodejs knows if sync or async

I understand what a callback is and what asynchronous means, what I don't get is how to run asynchronous functions in node.
For example, how is this
var action = (function(data,callback) {
  result = data+1;
  callback(result);
});
http.createServer(function (req, res) {
  action(5, function(r){
    res.end(r.toString());
  });
}).listen(80);
different from this
var action = (function(data) {
  result = data+1;
  return result;
});
http.createServer(function (req, res) {
  var r = action(5);
  res.end(r.toString());
}).listen(80);
?
I guess in the first example I'm doing it asynchronously, yet I don't know how Node knows when to do it sync or async... is it a matter of the return? or the fact that in the sync mode we're doing var x = func(data);?
And also: when to use sync or async? Because obviously you don't want to use it when adding +1... is it OK to use async just when performing IO tasks, such as reading from DB?
For example, I'm using the library crypto to encrypt a short string (50 chars at most), is this case a good example where I should already be using async?
I guess in the first example I'm doing it asynchronously...
Your first example isn't async :) Merely passing a callback and calling it when you're done doesn't make a function asynchronous.
Asynchronous means that, basically, you're telling Node: "here, do this for me, and let me know when you're done while I continue doing other stuff".
Your example is not handing anything to Node for future completion. It's doing a calculation and calling the callback immediately after that. That's functionally the same as your second example, where you return the result of the calculation.
However, you can change your first example to something that is asynchronous:
var action = (function(data,callback) {
  setTimeout(function() {
    // declare result locally instead of leaking an implicit global
    var result = data + 1;
    callback(result);
  }, 1000);
});
Here, you're telling Node to delay calling the callback for one second, by using setTimeout. In the meantime, Node won't get stuck waiting for a second; it will happily accept more HTTP requests, and each one will be delayed one second before the response is sent.
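To make the ordering difference visible, here is a minimal sketch. The names `syncAction`, `asyncAction` and `order` are made up for illustration. The callback passed to the first function runs before the call returns, while the callback passed to the second runs on a later turn of the event loop.

```javascript
const order = [];

// Calls the callback before returning: synchronous despite the callback.
function syncAction(data, callback) {
  callback(data + 1);
}

// Hands the work to a timer: the callback runs on a later event-loop turn.
function asyncAction(data, callback) {
  setTimeout(function () {
    callback(data + 1);
  }, 10);
}

syncAction(5, function (r) { order.push('sync callback ' + r); });
order.push('after syncAction');

asyncAction(5, function (r) { order.push('async callback ' + r); });
order.push('after asyncAction');

// Print the recorded order once the timer above has fired.
setTimeout(function () {
  console.log(order.join('\n'));
}, 50);
```

The recorded order is "sync callback 6", "after syncAction", "after asyncAction", "async callback 6": only the timer-based version lets the caller continue first.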
When to use sync or async?
Asynchronous code is "viral": if you rely on functions that are async, your own code that uses those functions will also have to be async (generally by accepting a callback, or using another mechanism to deal with asynchronicity, like promises).
For example, I'm using the library crypto to encrypt a short string (50 chars at most), is this case a good example where I should already be using async?
This depends on which function you're using. AFAIK, most encryption functions in crypto aren't asynchronous, so you can't "make" them asynchronous yourself.
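As a rough illustration with Node's built-in crypto module: hashing with `crypto.createHash` returns its result directly, while key derivation with `crypto.pbkdf2` takes a callback and does its work off the main thread. The wrapper names `sha256Hex` and `deriveKey`, and the iteration count and key length, are assumptions chosen for the example, not a recommendation.

```javascript
const crypto = require('crypto');

// Synchronous: the digest is returned directly; fine for short inputs.
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Asynchronous: the derivation runs in the background and reports via callback.
function deriveKey(password, salt, callback) {
  crypto.pbkdf2(password, salt, 1000, 32, 'sha256', function (err, key) {
    if (err) return callback(err);
    callback(null, key.toString('hex'));
  });
}

console.log(sha256Hex('short string'));
deriveKey('secret', 'salt', function (err, hexKey) {
  if (err) throw err;
  console.log(hexKey);
});
```

For a 50-character string the synchronous call is usually fine; the callback form matters for deliberately slow operations such as key derivation.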
Both examples will run synchronously. Simple async operations are setTimeout and setInterval.
Node actually doesn't care what code you are running. You can block or not (blocking/non-blocking).
In other words, you have an event loop. If your operation is async, it passes program control back to the event loop, so Node can execute any other action that needs to be done. If not, it won't.
If you want a function's result to be delivered asynchronously, you can do that using promises. Note that the wrapped function itself still runs synchronously inside the promise constructor; only the .then callbacks are deferred. Look at the code below:
function is_asynch(){
  return new Promise((resolve,reject)=>{
    resolve( here_your_synch_function() )
  })
}
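A small sketch of what this wrapper does and does not change. The names `slowSquare`, `squareLater` and `steps` are hypothetical. The wrapped function still executes immediately inside the Promise constructor; only the `.then` callback is deferred until the current code has finished.

```javascript
const steps = [];

// Stand-in for here_your_synch_function.
function slowSquare(n) {
  steps.push('work runs');
  return n * n;
}

function squareLater(n) {
  return new Promise((resolve) => {
    resolve(slowSquare(n)); // executes immediately, inside the constructor
  });
}

const pending = squareLater(4).then((value) => {
  steps.push('then runs with ' + value);
  return value;
});
steps.push('caller continues');

pending.then(() => console.log(steps.join(' -> ')));
// prints "work runs -> caller continues -> then runs with 16"
```

So a promise wrapper changes when the result is delivered, not where the work runs; a long computation inside it still blocks the event loop.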

Getting data pushed to an array outside of a Promise

I'm using https://github.com/Haidy777/node-youtubeAPI-simplifier to grab some information from a playlist of Bounty Killers. The way this library is set up, it seems to use promises via Bluebird (https://github.com/petkaantonov/bluebird), which I don't know much about. Looking up the Beginner's Guide for Bluebird gives http://bluebirdjs.com/docs/beginners-guide.html which literally just shows
This article is partially or completely unfinished. You are welcome to create pull requests to help completing this article.
I am able to set up the library
var ytapi = require('node-youtubeapi-simplifier');
ytapi.setup('My Server Key');
As well as list some information about Bounty Killers
ytdata = [];
ytapi.playlistFunctions.getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
  .then(function (data) {
    for (var i = 0, len = data.length; i < len; i++) {
      ytapi.videoFunctions.getDetailsForVideoIds([data[i].videoId])
        .then(function (video) {
          console.log(video);
          // ytdata.push(video); <- Push a Bounty Killer Video
        });
    }
  });
// console.log(ytdata); This gives []
Basically, the above pulls the full playlist (normally there will be some pagination here depending on the length), then takes the data from getVideosForPlaylist, iterates over the list and calls getDetailsForVideoIds for each YouTube video. All good here.
The issue arises with getting data out of this. I would like to push the video object to the ytdata array, and I'm unsure whether the empty array at the end is due to scoping or to a timing problem, such that console.log(ytdata) gets called before the API calls are finished.
How will I be able to get each Bounty Killer video into the ytdata array to be available globally?
console.log(ytdata) gets called before the API calls are finished
Spot on, that's exactly what's happening here, the API calls are async. Once you're using async functions, you must go the async way if you want to deal with the returned data. Your code could be written like this:
var ytapi = require('node-youtubeapi-simplifier');
ytapi.setup('My Server Key');
// this function returns a promise you can "wait" on
function getVideos() {
  return ytapi.playlistFunctions
    .getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
    .then(function (videos) {
      // extract all videoIds
      var videoIds = videos.map(video => video.videoId);
      // getDetailsForVideoIds is called with an array of videoIds
      // and returns a promise, so one API call is enough
      return ytapi.videoFunctions.getDetailsForVideoIds(videoIds);
    });
}
getVideos().then(function (ydata) {
  // this is the only place ydata is full of data
  console.log(ydata);
});
I made use of ES6's arrow function in videos.map(video => video.videoId);, that should work if your nodejs is v4+.
console.log(ytdata) must run only AFTER the promises have resolved. Placing it immediately after your FOR loop is not enough on its own: the loop only starts the detail requests, so the data is NOT available until each inner promise has resolved, and attempting to access it beforehand will give you an empty array.
(your current console.log is not working because that code is executed immediately, before the promise is resolved). Only code inside the THEN block is executed AFTER the promise is resolved.
If you NEED the data available NOW or ASAP and the requests for the videos are taking a long time, can you request 1 video at a time, or on demand, or on a separate thread (using a webworker maybe)? Can you implement caching?
Can you make the requests up front behind the scenes before the user even visits this page? (not sure this is a good idea but it is an idea)
Can you use video thumbnails (like youtube does) so that when the thumbnail is clicked then you start streaming and playing the video?
Some ideas ... Hope this helps
ytdata = [];
ytapi.playlistFunctions.getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
  .then(function (data) {
    // THE CODE INSIDE THIS THEN BLOCK IS EXECUTED WHEN ALL THE VIDEO IDS HAVE BEEN RETRIEVED AND ARE AVAILABLE
    // YOU COULD SAVE THESE TO A DATASTORE IF YOU WANT
    for (var i = 0, len = data.length; i < len; i++) {
      var videoIds = [data[i].videoId];
      ytapi.videoFunctions.getDetailsForVideoIds(videoIds)
        .then(function (video) {
          // THE CODE INSIDE THIS THEN BLOCK IS EXECUTED WHEN ALL THE DETAILS HAVE BEEN DOWNLOADED FOR ALL videoIds provided
          // AGAIN YOU CAN DO WHATEVER YOU WANT WITH THESE DETAILS
          // ALSO NOW THAT THE DATA IS AVAILABLE YOU MIGHT WANT TO HIDE THE LOADING ICON AND RENDER THE PAGE! AGAIN JUST AN IDEA, A DATA STORE WOULD PROVIDE FASTER ACCESS BUT YOU WOULD NEED TO UPDATE THE CACHE EVERY SO OFTEN
          // ytdata.push(video); <- Push a Bounty Killer Video
        });
      // EACH ITERATION ONLY STARTS A REQUEST; THE DETAILS ARRIVE LATER, WHEN ITS THEN BLOCK RUNS
    }
    // THE FOR LOOP HAS COMPLETED HERE, BUT THE DETAIL REQUESTS MAY STILL BE PENDING
  });
// This is executed immediately, before YTAPI has responded.
// console.log(ytdata); This gives []
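If you do want one request per video, a common way to know when every request has finished is to keep the promises and wait on them together with Promise.all. A minimal sketch follows; `fetchDetails` is a hypothetical stand-in for ytapi.videoFunctions.getDetailsForVideoIds, and the playlist data is invented so the example runs without an API key.

```javascript
// Hypothetical stand-in for ytapi.videoFunctions.getDetailsForVideoIds.
function fetchDetails(videoIds) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ id: videoIds[0], title: 'Video ' + videoIds[0] }), 10);
  });
}

function getAllDetails(playlist) {
  // Start one request per video and keep every promise.
  const requests = playlist.map((entry) => fetchDetails([entry.videoId]));
  // Resolves with the results in playlist order once all requests finish.
  return Promise.all(requests);
}

const playlist = [{ videoId: 'a1' }, { videoId: 'b2' }, { videoId: 'c3' }];
const allDetails = getAllDetails(playlist);

allDetails.then((ytdata) => {
  // Only here is ytdata guaranteed to be complete.
  console.log(ytdata.map((v) => v.title));
});
```

Anything that needs the full array should be called from that final then block (or chained after it), rather than reading a global that is filled in later.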

Continuous tasks in nodejs

I'm using Node.js with Express for my web app, and I need to continuously run some code that checks whether some data has changed and then updates my MongoDB.
How can I easily create a background process which runs the whole time together with the main task, so that the background task/process can inform the main task?
What I have tried already:
to solve this problem with a "setInterval" function in the main process --> It works with no problem, but I think it's not a good idea because it blocks the Node event loop
Use child processes --> I could not find a good tutorial on them --> is there an easier method, perhaps a library which could help me?
some background worker libraries --> But they do heavy-load tasks on a child process and then finish, whereas I need to do the work all the time
Update:
Final Solution:
UpdaterEvent.js:
var events = require('events');
function Updater(time) {
  this.time = time;
  this.array = [
    {number: 1},
    {number: 2}
  ];
  var that;
  events.EventEmitter.call(this);
  this.init = function()
  {
    that = this;
    console.log("Constructor");
    //Start interval
    setInterval(that.run,that.time);
  };
  this.run = function()
  {
    that.array.forEach(function (item) {
      if(item.number === 2)
      {
        that.emit('Event');
      }
    });
  };
}
Updater.prototype.__proto__ = events.EventEmitter.prototype;
module.exports = Updater;
and then the code that uses it:
server.js:
var Updater = require('./UpdaterEvent');
var u = new Updater(10000);
u.init();
u.on('Event',function () {
  console.log("Event caught!");
});
I followed the tutorial at:
http://www.sitepoint.com/nodejs-events-and-eventemitter/
The problem is the way you export your Updater constructor function:
exports.Updater = Updater;
When you require it, you do
var Updater = require('./UpdaterEvent');
and then try to run:
var u = new Updater(10000);
The problem is that you do not expose the function itself, but an object with a property called Updater which contains the function. Hence you either have to export it using
module.exports = Updater;
or you have to require it using:
var Updater = require('./UpdaterEvent').Updater;
Either way, then calling new Updater() will work. At the moment, you try to initialize a new object by calling an object instead of a constructor function, hence the error message:
TypeError: object is not a function
You should look into Events and EventEmitter
You could use a child process, but you don't really need to, since Node handles timers and I/O asynchronously. Just create a function for your background process and pass it your eventEmitter object. Use setInterval to check for the data change periodically; avoid a while(true) loop, which would block the event loop and keep any other code (including your event handlers) from running. When the data changes, call eventEmitter.emit('someEvent'); which will trigger a function in your main task to update your MongoDB.
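A minimal sketch of that idea, assuming a hypothetical `Watcher` class and a `readValue` function that returns the data being watched (both invented for this example). The watcher polls with setInterval and emits a `change` event; the main task subscribes and would do the MongoDB update in the handler.

```javascript
const { EventEmitter } = require('events');

// Hypothetical watcher: polls readValue() and emits 'change' when it differs.
class Watcher extends EventEmitter {
  constructor(readValue, intervalMs) {
    super();
    this.readValue = readValue;
    this.intervalMs = intervalMs;
    this.last = readValue();
    this.timer = null;
  }
  start() {
    this.timer = setInterval(() => {
      const current = this.readValue();
      if (current !== this.last) {
        const previous = this.last;
        this.last = current;
        this.emit('change', current, previous);
      }
    }, this.intervalMs);
  }
  stop() {
    clearInterval(this.timer);
  }
}

// Usage: the "main task" reacts to changes; here it just records them.
let value = 1;
const seen = [];
const watcher = new Watcher(() => value, 10);
watcher.on('change', (current) => {
  seen.push(current); // e.g. update MongoDB here
  watcher.stop();     // stop after the first change so the demo exits
});
watcher.start();
setTimeout(() => { value = 2; }, 25);
```

Between ticks the event loop is free, so a short check on an interval does not block the server; only a long-running synchronous check inside the tick would.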

Meteor client synchronous server database calls

I am building an application in Meteor that relies on real time updates from the database. The way Meteor has laid out the examples is to have the database call under the Template call. I've found that when dealing with medium sized datasets this becomes impractical. I am trying to move the request to the server, and have the results passed back to the client.
I have looked at similar questions on SO but have found no immediate answers.
Here is my server side function:
Meteor.methods({
  "getTest" : function() {
    var res = Data.find({}, { sort : { time : -1 }, limit : 10 });
    var r = res.fetch();
    return (r);
  }
});
And client side:
Template.matches._matches = function() {
  var res= {};
  Meteor.call("getTest", function (error, result) {
    res = result;
  });
  return res;
}
I have tried variations of the above code - returning in the callback function as one example. As far as I can tell, having a callback makes the function asynchronous, so it cannot be called onload (synchronously) and has to be invoked from the client.
I would like to pass all database queries server side to lighten the front end load. Is this possible in Meteor?
Thanks
The way to do this is to use subscriptions instead of remote method calls. See the counts-by-room example in the docs. So, for every database call you have a collection that exists client-side only. The server then decides the records in the collection using set and unset.

What's the right way to find out when a series of callbacks (fired from a loop) have all executed?

I'm new to Node.js and am curious what the prescribed methodology is for running a loop on a process (repeatedly) where at the end of the execution some next step is to take place, but ONLY after all the iterations' callbacks have fired.
Specifically I'm making SQL calls and I need to close the sql connection after making a bunch of inserts and updates, but since they're all asynchronous, I have no way of knowing when all of them have in fact completed, so that I can call end() on the session.
Obviously this is a problem that extends far beyond this particular example, so, I'm not looking for the specific solution regarding sql, but more the general practice, which so far, I'm kind of stumped by.
What I'm doing now is actually setting a global counter to the length of the loop object and decrementing from it in each callback to see when it reaches zero, but that feels REALLY kludgy, and I'm hoping there's a more elegant (and JavaScript-centric) way to achieve this monitoring.
TIA
There are a bunch of flow-control libraries available that apply patterns to help with this kind of thing. My favorite is async. If you wanted to run a bunch of SQL queries one after another in order, for instance, you might use series:
async.series([
  function(cb) { sql.exec("SOME SQL", cb) },
  function(cb) { sql.exec("SOME MORE SQL", cb) },
  function(cb) { sql.exec("SOME OTHER SQL", cb) }
], function(err, results) {
  // Here, one of two things is true:
  // (1) one of the async functions passed in an error to its callback
  //     so async immediately calls this callback with a non-null "err" value
  // (2) all of the async code is done, and "results" is
  //     an array of each of the results passed to the callbacks
});
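If you would rather not pull in a library, the counter idea from the question can be tucked into one small helper so the counter is no longer global. This is only a sketch: `runAll` and `fakeQuery` are hypothetical names, and `fakeQuery` stands in for a real asynchronous SQL call.

```javascript
// Runs every task at once; calls done(err, results) after the last callback.
// Each task has the Node-style shape task(callback), callback(err, result).
function runAll(tasks, done) {
  const results = new Array(tasks.length);
  let remaining = tasks.length;
  let failed = false;

  if (remaining === 0) return done(null, results);

  tasks.forEach(function (task, index) {
    task(function (err, result) {
      if (failed) return;        // an earlier error was already reported
      if (err) {
        failed = true;
        return done(err);
      }
      results[index] = result;   // keep results in task order
      remaining -= 1;
      if (remaining === 0) done(null, results);
    });
  });
}

// Hypothetical stand-in for an asynchronous SQL call.
function fakeQuery(name, delayMs) {
  return function (callback) {
    setTimeout(function () { callback(null, name + ' done'); }, delayMs);
  };
}

runAll([fakeQuery('insert', 30), fakeQuery('update', 10)], function (err, results) {
  if (err) throw err;
  console.log(results); // safe to close the connection here
});
```

The final callback fires exactly once, either on the first error or after every task has reported, which is the point at which calling end() on the session is safe.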
I wrote my own queue library to do this (I'll publish it one of these days). Basically, you push queries onto a queue (an array), execute each one as it's removed, and have a callback run when the array is empty.
It doesn't take much to do it.
*edit. I've added this example code. It isn't what I've used before and I haven't tried it in practice, but it should give you a starting point. There's a lot more you can do with the pattern.
One thing to note: queueing effectively makes your actions sequential, so they happen one after another. I wrote my mysql queue script so I could execute queries on multiple tables asynchronously but on any one table in sync, so that inserts and selects happened in the order they were requested.
var queue = function() {
  this.queue = [];
  /**
   * Allows you to pass a callback to run, which is executed at the end.
   * This example uses a pattern where errors are returned from the
   * functions added to the queue and then are passed to the callback
   * for handling.
   */
  this.run = function(callback){
    var errors = [];
    while (this.queue.length > 0) {
      // shift() removes the first task, so the queue really shrinks
      var task = this.queue.shift();
      errors.push(task());
    }
    // the callback is optional, so run() can be called without one
    if (callback) {
      callback(errors);
    }
  };
  this.addToQueue = function(callback){
    this.queue.push(callback);
  };
};
use:
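var q = new queue();
q.addToQueue(function(){
  setTimeout(function(){ console.log('1'); }, 100);
});
q.addToQueue(function(){
  setTimeout(function(){ console.log('2'); }, 50);
});
q.run(function(errors){
  // runs once every task has been started; the timers above fire later
  console.log('queue drained');
});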
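Note that the queue above starts each task but does not wait for asynchronous work inside it, so the two timers still fire in the order 2, 1. A variant that does wait hands each task a `next` callback and only starts the following task when `next` is called. This is a sketch of that pattern with hypothetical names (`SerialQueue`, `add`, `next`), not the library mentioned in the answer.

```javascript
// Sequential queue: each task receives next(err) and calls it when finished.
function SerialQueue() {
  this.tasks = [];
}

SerialQueue.prototype.add = function (task) {
  this.tasks.push(task);
  return this;
};

SerialQueue.prototype.run = function (done) {
  const tasks = this.tasks;
  const errors = [];
  function step() {
    const task = tasks.shift();
    if (!task) return done(errors); // queue drained
    task(function next(err) {
      if (err) errors.push(err);
      step();                        // start the following task only now
    });
  }
  step();
};

const log = [];
const serial = new SerialQueue();
serial.add(function (next) {
  setTimeout(function () { log.push('1'); next(); }, 100);
});
serial.add(function (next) {
  setTimeout(function () { log.push('2'); next(); }, 50);
});
serial.run(function (errors) {
  console.log(log.join(','), 'errors:', errors.length);
});
```

Here the second timer is not even created until the first has fired, so the log reads 1 then 2, and the final callback runs only after both tasks have truly completed.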

Resources