async parallel request - partial render - node.js

What is the proper way to partially render a view following an async parallel request?
Currently I am doing the following
// an example using an object instead of an array
async.parallel({
one: function(callback){
setTimeout(function(){
callback(null, 1);
// can I partially merge the results and render here?
}, 200);
},
two: function(callback){
setTimeout(function(){
callback(null, 2);
// can I partially merge the results and render here?
}, 100);
}
},
function(err, results) {
// results is now equals to: {one: 1, two: 2}
// merge the results and render a view
res.render('mypage.ejs', { title: 'Results'});
});
It basically works, but if I have function1, function2, ..., functionN, the view is rendered only once the slowest function has completed.
I would like to find the proper way to render the view as soon as the first function returns, to minimise the user's delay, and then add the results of the other functions as soon as they are available.

What you want is Facebook's BigPipe: https://www.facebook.com/note.php?note_id=389414033919. Fortunately, this is easy with Node.js because streaming is built in. Unfortunately, template systems are bad at this because async templates are a pain in the butt. However, this is much better than doing any additional AJAX requests.
The basic idea is that you first send a layout:
res.render('layout.ejs', function (err, html) {
if (err) return next(err)
res.setHeader('Content-Type', 'text/html; charset=utf-8')
res.write(html.replace('</body></html>', ''))
// Ends the response.
// `writePartials` should not return anything in the callback!
writePartials(res.end.bind(res, '</body></html>'))
})
You can't send </body></html> yet because your document isn't finished. writePartials would then be a bunch of async functions (partials or pagelets) executed in parallel:
function writePartials(callback) {
  async.parallel([partial1, partial2, partial3], callback)
}
Note: since you've already written a response, there's not much you can do with errors except log them.
What each partial will do is send inline javascript to the client. For example, the layout can have .stream, and the pagelet will replace .stream's innerHTML upon arrival, or when "the callback finishes".
function partialStream(callback) {
  res.render('stream.partial.ejs', function (err, html) {
    // Don't return the error in the callback
    // You may want to display an error message or something instead
    if (err) {
      console.error(err.stack)
      callback()
      return
    }
    res.write('<script>document.querySelector(".stream").innerHTML = ' +
      JSON.stringify(html) + ';</script>')
    callback()
  })
}
Personally, I have .stream.placeholder and replace it with a new .stream element. The reason is that I basically do .placeholder, .placeholder ~ * {display: none} so things don't jump around the page. However, this requires a DIY front-end framework, since the JS suddenly gets more complicated.
There, your response is now streaming. Only requirement is that the client supports Javascript.
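One detail worth guarding against in the pagelet snippet above: JSON.stringify does not escape "<", so rendered HTML that itself contains a closing script tag would end the inline script early. Below is a minimal sketch of a helper that builds the script string with "<" escaped. The name pageletScript and its selector argument are hypothetical and not part of the answer above, Express, or async.

```javascript
// Hypothetical helper: builds the inline <script> a pagelet writes to the response.
function pageletScript(selector, html) {
  // JSON.stringify yields a valid JS string literal; additionally escape "<"
  // so the payload can never contain "</script>" or "<!--".
  const safe = (value) => JSON.stringify(value).replace(/</g, '\\u003c');
  return '<script>document.querySelector(' + safe(selector) +
    ').innerHTML = ' + safe(html) + ';</script>';
}
```

Inside partialStream this would be used as res.write(pageletScript('.stream', html)).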

I think you can't do it just on the backend.
To minimise users' delay you need to send the minimal page to the browser and then to request the rest of the information from the browser via AJAX. Another approach to minimising delays is to send all templates to the browser on the first page load, together with the rendered page, and render all the pages in browser based on the data you request from the server. That's the way I do it. The beauty of nodejs is that you can use the same templating engine both in the backend and frontend and also share the modules.
If your page is composed in such a way that the slow information comes later in the HTML than the fast information, you can write the response in parts with res.write instead of res.render (which renders the complete page). I don't think this approach deserves serious attention, though, as you would get stuck with it sooner than you notice...


Recommended pattern to page through API response until exhausted?

I'm new to Node and the async programming model. I'm having problems dealing with a simple requirement that seems pretty basic in synchronous environments: paging through an API response until the response is empty.
More specifically, the API, on a successful call, will return data and a status of 200 or 206 (partial content). If I see the 206 response, I need to keep making calls to the API (also sending a page query param that I increment each time) until I see the 200 response.
In a synchronous language, the task will be a piece of cake:
// pseudocode
data = []
page = 1
do {
response = api.call(page)
data.append(response.data)
page++
} while (response.status != 200)
return data
Now, in Node, for a single api call, code like this will work:
// fire when '/' has a GET request
app.get('/', (req, res) => {
  axios.get('https://api.com/v1/cats')
    .then(response => {
      // now what??
    });
});
See the //now what?? comment? That's the point where I'm wondering how to proceed. I came across this somewhat-relevant post but am not able to convert this to a format that will work for me in Node and Axios.
Is it enough to just wrap the axios code in a separate function? I don't think so, because if I do this:
function getData(pageNum) {
  axios.get('https://api.com/v1/cats')
    .then(response => {
      // now what??
    });
}
I can't rely on a return value, because as soon as axios.get() is called, the function is over. I can call getData() again after I get the first response, but then, suppose I want to return all the data from these multiple calls as the HTTP response from my Express server . . . how do I do that?
I hope I will not get downvoted for laziness or something. I've really looked around but not found anything relevant.
First, a counter-question: Is the data set so big that you need to worry about using up all the memory? Because if so then it will take more work to structure your code in a way that streams the data all the way through. (In fact I'm not even sure whether express allows streaming... you are using express aren't you?)
From the axios documentation, it looks like response is a readable stream which provides the response body. So reading it is also an asynchronous task. So you should write a function that does that. See the "Stream" page of the nodejs docs for more details. Or I could be persuaded to help with that too, time permitting. But for now, I'll assume you have a function readResponse, which takes an axios response object as an argument and returns a promise, and the promise resolves to an object such as { statusCode: 206, result: ['thing1', 'thing2'] }. I'll also assume that your goal is to get all the result arrays and concatenate them together to get e.g. ['thing1', 'thing2', 'thing3', 'thing4', 'thing5', 'thing6'].
You could write a self-calling version of your getData function. This will retrieve all data from a given page onwards (not just the page itself):
function getData(pageNum) {
  // Return the promise chain so callers can wait for the combined result.
  return axios.get('https://api.com/v1/cats' + (pageNum ? '?page=' + pageNum : ''))
    .then(readResponse)
    .then(function(parsedResponse) {
      if (parsedResponse.statusCode == 200) {
        return parsedResponse.result;
      } else if (parsedResponse.statusCode == 206) {
        return getData(pageNum + 1).then(function(laterData) {
          return parsedResponse.result.concat(laterData);
        });
      } else {
        // error handling here, throw an exception or return a failing promise.
      }
    });
}
Then, to get all data, just call this function with pageNum = 0:
// fire when '/' has a GET request
app.get('/', (req, res) => {
getData(0)
.then(function(results) {
// results is now the array you want.
var response = JSON.stringify(results); // or whatever you're doing to serialise your data
res.send(response);
});
});
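The same idea can also be written as a loop with async/await, which reads closer to the synchronous pseudocode in the question. This is a minimal sketch under stated assumptions. fetchPage is a hypothetical function that takes a page number and resolves to an object with status and data fields (an axios response has both). fetchAllPages is an illustrative name. It is also assumed that data is an array and that pages start at 1.

```javascript
// Sketch: keep requesting pages until the API answers 200 instead of 206.
// `fetchPage` is a hypothetical, caller-supplied function: page -> Promise<{status, data}>.
async function fetchAllPages(fetchPage) {
  const all = [];
  let page = 1; // assumption: the API's first page is 1
  for (;;) {
    const response = await fetchPage(page);
    if (response.status !== 200 && response.status !== 206) {
      throw new Error('Unexpected status ' + response.status + ' on page ' + page);
    }
    all.push(...response.data); // assumption: data is an array
    if (response.status === 200) {
      return all; // 200 marks the last page
    }
    page += 1;
  }
}
```

In the Express handler this could be called as fetchAllPages(page => axios.get('https://api.com/v1/cats', { params: { page } })), followed by .then(results => res.json(results)) and a .catch that sends an error response.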

Receiving 2 HTTP requests on the server when only 1 sent

I am creating an app and using http://c9.io environment to develop it. It is a NodeJS app, which provides some REST endpoints for the client side application to query. Till now, everything was running fine, and today what I observe is that for 1 call sent by the browser to the REST API, 2 requests are being shown as received, and the request handler is being called 2 times. This has slowed the response time for one request.
In Chrome developer tools, it shows only one request sent, however, I am using app.use() to log incoming requests in Express and it prints the same 2 times for each request. Also, the handler is called twice.
This is happening intermittently, not every time. I am behind a corporate network. As I have sent a lot of requests in the day for testing, is there any chance that a monitoring program is sending the requests since it finds it suspicious? I have not edited the code that handles the requests.
Edit: Adding the code for handlers as suggested.
app.get('/suggestions/:keyword', function(r, s) {
sug_db.retrieveSuggestions(r.params.keyword, function(data) {
s.writeHead(200, {'content-type': 'text/html'});
s.write(renderSugg({data: data}))
s.end();
});
});
app.get('/search/:query', function(r, s) {
esc_db.search(r.params.query, function(data) {
s.send(renderResults({query: r.params.query, results:data}));
});
});
As you can see, they do nothing but get some data from a database and return the result as HTTP response. The templating engine I am using is Pug (formerly Jade)
It doesn't look like the code you included in the question can be responsible for running twice. But maybe some code in sug_db.retrieveSuggestions or esc_db.search is.
What I would do is this:
Add some logging inside the code that you provided, both before calling the functions and inside the callback:
app.get('/suggestions/:keyword', function(r, s) {
console.log('*** GET /suggestions/:keyword handler');
sug_db.retrieveSuggestions(r.params.keyword, function(data) {
console.log('GET /suggestions/:keyword callback');
s.writeHead(200, {'content-type': 'text/html'});
s.write(renderSugg({data: data}))
s.end();
});
});
app.get('/search/:query', function(r, s) {
console.log('*** GET /search/:query handler');
esc_db.search(r.params.query, function(data) {
console.log('GET /search/:query callback');
s.send(renderResults({query: r.params.query, results:data}));
});
});
(or change console.log to whatever method of logging you use).
I would check what is actually called twice: the handlers themselves, the callbacks, or neither. The next step would be to examine the functions that the handlers actually call:
sug_db.retrieveSuggestions()
esc_db.search()
renderSugg()
renderResults()
It's important to see what is actually called twice and then examine why it can be happening. But it can happen if, for example, you do something like:
function badFunction(data, callback) {
if (something) {
callback('error');
}
callback('ok');
}
instead of:
function goodFunction(data, callback) {
if (something) {
callback('error');
} else {
callback('ok');
}
}
I would expect that the functions called from the handlers could do something like that and call the callback twice. Maybe the condition or error that they are checking for didn't occur before but does now, which would explain the change in behavior.
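One way to confirm this kind of bug while debugging is to wrap the callback so that a second invocation is reported instead of silently running the response code again. This is a minimal sketch. callOnce is a hypothetical helper, not part of Express, and badFunction is adapted from the example above to take the condition as an argument.

```javascript
// Hypothetical debugging helper: lets the first call through, warns on later ones.
function callOnce(callback, label) {
  let called = false;
  return function (...args) {
    if (called) {
      console.warn('callback for ' + label + ' was invoked more than once');
      return undefined;
    }
    called = true;
    return callback.apply(this, args);
  };
}

// Same bug as badFunction above: nothing stops execution after the error branch.
function badFunction(something, callback) {
  if (something) {
    callback('error');
  }
  callback('ok');
}

// Without the wrapper the callback would receive 'error' and then 'ok'.
badFunction(true, callOnce(function (status) {
  console.log('status:', status);
}, 'badFunction'));
```

Wrapping the callbacks passed to sug_db.retrieveSuggestions and esc_db.search in the same way would show whether those functions are the source of the duplicate calls.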

Display dynamically an image using express and EJS

I have a collection containing the URLs of different images. I retrieve the URL I want and want to pass it to the EJS template like this:
app.get('/',function(req,res){
mongoDB.getUsedHomePageOne(function(err, result){
if(!err){
console.log("getUsedHomePageOne : ");
console.log(result);
app.locals['homePageImg'] = result.url;
}
});
app.render('userPageEjs.html',function(err,renderedData){
console.log(renderedData);
res.send(renderedData);
});
});
and the getUsedHomePageOne looks like:
DBMongo.prototype.getUsedHomePageOne = function(callback){
this.homePageColl.findOne({used:1}, callback);
};
and in the EJS template:
<img src="<%= homePageImg %>"/>
So this won't work unless I load the page twice, I assume because the value gets cached or is computed quickly enough or something.
What is the proper way of doing it?
PS: the 2nd time I load the page, everything will load correctly.
PS2: I don't want to delay the rendering for the image, I would like to load the image once it is ready, but render the HTML page before anyway.
From what I've gathered in your code:
app.get('/',function(req,res){
mongoDB.getUsedHomePageOne(function(err, result){
if(!err){
console.log("getUsedHomePageOne : ");
console.log(result);
app.locals['homePageImg'] = result.url;
app.render('userPageEjs.html',function(err,renderedData){
console.log(renderedData);
res.send(renderedData);
});
}
});
});
Basically, you call an async function to query the DB, but you render the template immediately instead of waiting for the DB function to complete. The normal pattern with async functions whose results are needed further down the line is to call the next function inside the async function's callback. However, this can lead to callback hell (similar to how I've written the fix above), so an alternative like Promises or async.js is usually preferred.
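As a sketch of the Promise-based alternative: Node's built-in util.promisify turns a callback-style function into one that returns a promise, so the handler can await the lookup before rendering. In this example getUsedHomePageOne is a stand-in with a hard-coded result, and renderPage is a hypothetical placeholder for the real render call. Both are assumptions for illustration only.

```javascript
const util = require('util');

// Stand-in for the MongoDB lookup (assumption: the real one calls back with (err, doc)).
function getUsedHomePageOne(callback) {
  setImmediate(() => callback(null, { url: '/img/home.png' }));
}

const getUsedHomePageOneAsync = util.promisify(getUsedHomePageOne);

// `renderPage` is a hypothetical function: locals -> HTML string.
async function homePageHandler(renderPage) {
  // Execution pauses here until the lookup has called back.
  const result = await getUsedHomePageOneAsync();
  // Pass the value as per-render locals rather than via app.locals,
  // which is shared by every request.
  return renderPage({ homePageImg: result.url });
}
```

With the real method, the promisified function would need its receiver bound, for example util.promisify(mongoDB.getUsedHomePageOne.bind(mongoDB)).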

How do I make HTTP requests inside a loop in NodeJS

I'm writing a command line script in Node (because I know JS and suck at Bash + I need jQuery for navigating through DOM)… right now I'm reading an input file and I iterate over each line.
How do I go about making one HTTP request (GET) per line so that I can load the resulting string with jQuery and extract the information I need from each page?
I've tried using the NPM httpsync package… so I could make one blocking GET call per line of my input file but it doesn't support HTTPS and of course the service I'm hitting only supports HTTPS.
Thanks!
A good way to handle a large number of jobs in a controlled manner is the async queue.
I also recommend you look at request for making HTTP requests and cheerio for dealing with the HTML you get.
Putting these together, you get something like:
var q = async.queue(function (task, done) {
request(task.url, function(err, res, body) {
if (err) return done(err);
if (res.statusCode != 200) return done(res.statusCode);
var $ = cheerio.load(body);
// ...
done();
});
}, 5);
Then add all your URLs to the queue:
q.push({ url: 'https://www.example.com/some/url' });
// ...
I would most likely use the async library's eachLimit function. That will allow you to throttle the number of active connections and get a callback when all the operations are done.
// eachLimit takes the limit as its second argument: (collection, limit, iteratee, callback)
async.eachLimit(urls, 5, function(url, done) {
  request(url, function(err, res, body) {
    // do something
    done();
  });
}, function(err) {
  // do something
  console.log('all done!');
});
I was worried about making a million simultaneous requests without putting in some kind of throttling/limiting the number of concurrent connections, but it seems like Node is throttling me "out of the box" to something around 5-6 concurrent connections.
This is perfect, as it lets me keep my code a lot simpler while also fully leveraging the inherent asynchrony of Node.
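Any such implicit limit is likely to depend on the Node version and the HTTP agent settings, so it may be safer to make the limit explicit. Below is a minimal promise-based sketch of a concurrency limiter that needs no library. runLimited and its worker argument are hypothetical names, and worker is assumed to return a promise for each item.

```javascript
// Sketch: process `items` with at most `limit` workers in flight at once.
// `worker` is a hypothetical caller-supplied function: item -> Promise<result>.
async function runLimited(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners); // rejects as soon as any worker fails
  return results; // same order as `items`
}
```

For the script in the question, worker would be a function that fetches one URL and extracts the needed data, and items would be the lines of the input file.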

Block function whilst waiting for response

I've got a NodeJS app i'm building (using Sails, but i guess that's irrelevant).
In my action, i have a number of requests to other services, datasources etc that i need to load up. However, because of the huge dependency on callbacks, my code is still executing long after the action has returned the HTML.
I must be missing something silly (or not quite getting the whole async thing) but how on earth do i stop my action from finishing until i have all my data ready to render the view?!
Cheers
I'd recommend getting very intimate with the async library
The docs are pretty good with that link above, but it basically boils down to a bunch of very handy calls like:
async.parallel([
function(){ ... },
function(){ ... }
], callback);
async.series([
function(){ ... },
function(){ ... }
]);
Node is inherently async, you need to learn to love it.
It's hard to tell exactly what the problem is but here is a guess. Assuming you have only one external call your code should look like this:
exports.myController = function(req, res) {
longExternalCallOne(someparams, function(result) {
// you must render your view inside the callback
res.render('someview', {data: result});
});
// do not render here as you don't have the result yet.
}
If you have two external calls, your code will look like this:
exports.myController = function(req, res) {
longExternalCallOne(someparams, function(result1) {
longExternalCallTwo(someparams, function(result2) {
// you must render your view inside the most inner callback
data = {some combination of result1 and result2};
res.render('someview', {data: data });
});
// do not render here since you don't have result2 yet
});
// do not render here either, as you have neither result1 nor result2 yet.
}
As you can see, once you have more than one long-running async call, things start to get tricky. The code above is just for illustration purposes. If your second call depends on the result of the first one, then you need something like it, but if longExternalCallOne and longExternalCallTwo are independent of each other, you should use a library like async to help parallelize the requests: https://github.com/caolan/async
You cannot stop your code. All you can do is check in all callbacks if everything is completed. If yes, go on with your code. If no, wait for the next callback and check again.
You should not stop your code, but rather render your view in your other resource's callback, so that you wait for the resource to be available before rendering. That's the common pattern in node.js.
If you have to wait for several callbacks to be called, you can check manually each time one is called if the others have been called too (with simple bool for example), and call your render function if yes. Or you can use async or other cool libraries which will make the task easier. Promises (with the bluebird library) could be an option too.
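The manual check described above can be sketched with a simple counter: every callback records its result and decrements the number of pending tasks, and the last one to finish triggers the render. waitForAll is a hypothetical helper written for illustration. Each task is assumed to accept a Node-style callback (err, value).

```javascript
// Sketch: run all tasks, call `done` once when every task has called back.
function waitForAll(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let failed = false;

  if (pending === 0) {
    return done(null, results);
  }

  tasks.forEach(function (task, index) {
    task(function (err, value) {
      if (failed) return; // an earlier task already reported an error
      if (err) {
        failed = true;
        return done(err);
      }
      results[index] = value;
      pending -= 1;
      if (pending === 0) {
        done(null, results); // the last callback to arrive finishes the job
      }
    });
  });
}
```

In a controller, the final callback would be the place to call res.render with the collected results. async.parallel and Promise.all provide the same behaviour without the hand-written bookkeeping.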
I am guessing here, since there is no code example, but you might be running into something like this:
// let's say you have a function, you pass it an argument and callback
function myFunction(arg, callback) {
  // now you do something asynchronous with the argument
  doSomethingAsyncWithArg(arg, function() {
    // now you've got your arg formatted or whatever, render result
    res.render('someView', {arg: arg});
    // now do the callback
    callback();
    // but you also have stuff here!
    doSomethingElse();
  });
}
So, after you render, your code keeps running. How to prevent it? return from there.
return callback();
Now your inner function will stop processing after it calls callback.
