Promise issue with nested loops - node.js

I'm trying to write a server in Node. I'm taking data from a SQL database (I have another function that does that), and inside this function I have that list of data (thisData). In my server I need to query another (offsite) server to get a status update, update the objects in the list based on it, and then respond with the JSON data.
I've been working on this for a long time and have tried to find solutions based on other answers on here, but I've yet to find something that works. I was originally trying to do this with just callbacks; now I'm trying to do it with Promise.all().
I have my object thisData; it has three "areas", each of which contains a list of objects.
Both of these code snippets are inside a standard listener function for the standard http lib:
var server = http.createServer((request, response) => {...
Immediately inside this function I also have a call to get the data:
getData(function(data){
    thisData = JSON.parse(data);
All of the code snippets below live inside this getData callback, which is itself inside my listener function.
I also have a function that returns a promise to do the async web query:
var statusRequest = function(area, obj){
    var url = 'https://www.example.com/data/' + obj.id + '/mode.json';
    var options = {
        method: 'GET',
        uri: url,
        strictSSL: false,
        headers: {
            'Authorization': config.auth_key,
            'Accept': "application/json",
        }
    };
    return new Promise((resolve, reject) => {
        // request's callback signature is (err, response, body)
        request(options, function (err, resp, body) {
            if (err) return reject(err);
            var status = JSON.parse(body).mode;
            if (status == "success") thisData[area][obj.col].status = 0;
            else if (status == "fail") thisData[area][obj.col].status = 1;
            else thisData[area][obj.col].status = -1;
            resolve(thisData[area][obj.col].status);
        });
    });
}
Then I have a loop that traverses the areas and objects, and I add each promise to a list and call Promise.all(), then I respond with the updated list:
var promises = [];
Object.keys(thisData).forEach(function(area){
    for (var col = 0; col < thisData[area].length; col++){
        promises.push(statusRequest(area, thisData[area][col]));
    }
});
Promise.all(promises).then( (val) => {
    console.log(val);
    response.end(JSON.stringify(thisData));
});
I was initially having problems with all the promises resolving only after col had already reached thisData[area].length, so I passed the actual object (which contains its own column information) into statusRequest; that way it shouldn't matter if the col variable changes.
The problem I'm having now is that my console prints the correct status, but my JSON response does not contain the correct status; in fact, it seems as though none of the statuses are updating.
I would really appreciate any help, thanks in advance!
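For reference, here is a minimal, self-contained sketch of the flow described above, with the offsite request simulated by a timer and a made-up data shape, showing that mutations made through the object references are visible by the time Promise.all resolves:
var thisData = {
    areaA: [{ id: 1, col: 0, status: null }, { id: 2, col: 1, status: null }],
    areaB: [{ id: 3, col: 0, status: null }]
};

function statusRequest(area, obj) {
    return new Promise(function (resolve) {
        // stand-in for the offsite HTTP request
        setTimeout(function () {
            obj.status = 0; // mutate through the reference, not a captured index
            resolve(obj.status);
        }, 10);
    });
}

var promises = [];
Object.keys(thisData).forEach(function (area) {
    thisData[area].forEach(function (obj) {
        promises.push(statusRequest(area, obj));
    });
});

Promise.all(promises).then(function (val) {
    console.log(val);                      // [ 0, 0, 0 ]
    console.log(JSON.stringify(thisData)); // statuses are updated here
});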

Related

Request to API within while loop with accumulator

Many people have asked on this site how to loop through a list of URLs and make a GET request to each of them. This doesn't exactly serve my purpose, as the number of times I make a GET request will be dependent on the values I get from the initial API request.
As a general outline of what I currently have:
var total = null;
var curr = 0;
while (total == null || curr < total) {
    request.get('https://host.com/skip=' + curr, function(error, response, body) {
        var data = JSON.parse(body);
        total = data['totalItems'];
        curr += data.items.length;
    });
}
Due to Node.js and how it handles asynchronous requests, this gives me an infinite loop, as total and curr always stay null and 0 respectively. I'm not really sure how to rework this to use Promises and callbacks; can someone please help?
So there are a few ways to do this, but the easiest is probably to just recurse on the function that fetches the results.
It's not tested but should be in the ballpark:
function fetch(skip, accumulator, cb) {
    // do some input sanitization
    request.get('https://host.com/skip=' + skip, (err, res, body) => {
        // commonly you'd just call back with the error, but this way the caller
        // still gets whatever results were fetched before the error occurred
        if (err) return cb(err, accumulator);
        var data = JSON.parse(body);
        accumulator.total = data['totalItems'];
        accumulator.items = accumulator.items.concat(data.items);
        if (accumulator.items.length === accumulator.total) return cb(null, accumulator);
        return fetch(accumulator.items.length, accumulator, cb);
    });
}
fetch(0, { items: [] }, console.log);
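Since the question mentions reworking this with Promises: the same recursion translates directly into a promise-returning function. A sketch under the same assumptions (same hypothetical endpoint and response shape):
function fetchAll(skip, accumulator) {
    return new Promise((resolve, reject) => {
        request.get('https://host.com/skip=' + skip, (err, res, body) => {
            if (err) return reject(err);
            var data = JSON.parse(body);
            accumulator.total = data['totalItems'];
            accumulator.items = accumulator.items.concat(data.items);
            if (accumulator.items.length === accumulator.total) return resolve(accumulator);
            // resolve with the promise for the next page; the chain settles
            // only when the last page has been folded in
            resolve(fetchAll(accumulator.items.length, accumulator));
        });
    });
}

fetchAll(0, { items: [] }).then(console.log).catch(console.error);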

Node.js: given array of URLs, determine which are valid

I am a total scrub with the node http module and I'm having some trouble.
The ultimate goal here is to take a huge list of URLs, figure out which are valid, and then scrape those pages for certain data. So step one is figuring out whether a URL is valid, and this simple exercise is baffling me.
say we have an array allURLs:
["www.yahoo.com", "www.stackoverflow.com", "www.sdfhksdjfksjdhg.net"]
The goal is to iterate over this array and make a GET request to each entry; if a response comes in, the link is added to a list of workingURLs (for now just another array), otherwise it goes into a list of brokenURLs.
var workingURLs = [];
var brokenURLs = [];
for (var i = 0; i < allURLs.length; i++) {
    var url = allURLs[i];
    var req = http.get(url, function (res) {
        if (res) {
            workingURLs.push(?????); // How to derive URL from response?
        }
    });
    req.on('error', function (e) {
        brokenURLs.push(e.host);
    });
}
What I don't know is how to properly obtain the URL from the request/response object itself, or really how to structure this kind of async code - because again, I am a nodejs scrub :(
For most websites using res.headers.location works, but there are times when the headers do not have this property, and that will cause problems for me later on. Also, I've tried console logging the response object itself, and that was a messy and fruitless endeavor.
I have tried pushing the url variable to workingURLs, but by the time any response comes back that would trigger the push, the for loop is already over and url is forever pointing to the final element of the allURLs array.
Thanks to anyone who can help
You need to close over the url value to keep access to it and protect it from changes on the next loop iteration.
For example:
(function(url){
    // use url here
})(allURLs[i]);
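Applied to the loop from the question, the wrapper might look like this (a sketch; it assumes the URLs include a protocol, since http.get needs a full URL):
for (var i = 0; i < allURLs.length; i++) {
    (function (url) {
        // url is fixed for this iteration, so the callbacks can use it safely
        http.get(url, function (res) {
            res.resume(); // drain the response so the socket is freed
            workingURLs.push(url);
        }).on('error', function (e) {
            brokenURLs.push(url);
        });
    })(allURLs[i]);
}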
The simplest solution is to use forEach instead of for.
allURLs.forEach(function(url){
    //....
});
A promisified solution also lets you detect the moment when all the work is done:
var http = require('http');
var allURLs = [
    "http://www.yahoo.com/",
    "http://www.stackoverflow.com/",
    "http://www.sdfhksdjfksjdhg.net/"
];
var workingURLs = [];
var brokenURLs = [];
var promises = allURLs.map(url => validateUrl(url)
    .then(res => (res ? workingURLs : brokenURLs).push(url)));
Promise.all(promises).then(() => {
    console.log(workingURLs, brokenURLs);
});
// ----
function validateUrl(url) {
    return new Promise((ok, fail) => {
        http.get(url, res => ok(res.statusCode == 200))
            .on('error', e => ok(false));
    });
}
// Prevent node.js from exiting early; not needed if a server is listening.
var t = setTimeout(() => { console.log('Time is over'); }, 1000).ref();
You can use something like this (not tested):
const arr = ["", "/a", "", ""];
Promise.all(arr.map(fetch)
.then(responses=>responses.filter(res=> res.ok).map(res=>res.url))
.then(workingUrls=>{
console.log(workingUrls);
console.log(arr.filter(url=> workingUrls.indexOf(url) == -1 ))
});
EDITED
Working fiddle (note that you can't make a request to another site in the browser because of cross-domain restrictions).
UPDATED with #vp_arth suggestions
const arr = ["/", "/a", "/", "/"];
let working=[], notWorking=[],
find = url=> fetch(url)
.then(res=> res.ok ?
working.push(res.url) && res : notWorking.push(res.url) && res);
Promise.all(arr.map(find))
.then(responses=>{
console.log('woking', working, 'notWorking', notWorking);
/* Do whatever with the responses if needed */
});

Perform arbitrary set of asynchronous tasks

My input is streamed from another source, which makes it difficult to use async.forEach. I am pulling data from an API endpoint, but I am limited to 1000 objects per request, and I need to get hundreds of thousands of them (basically all of them). I will know they're finished when a response contains fewer than 1000 objects. Now, I have tried this approach:
/* List all deposits */
var depositsAll = [];
var depositsIteration = [];
async.doWhilst(this._post(endpoint_path, function (err, response) {
    // check err
    /* Loop through the data and gather only the deposits */
    for (var key in response) {
        //do some stuff
    }
    depositsAll += depositsIteration;
    return callback(null, depositsAll);
}, {limit: 1000, offset: 0, sort: 'desc'}),
response.length > 1000, function (err, depositsAll) {
    // check for err
    // return the complete result
    return callback(null, depositsAll);
});
With this code I get an internal async error saying that "iterator is not a function". But in general I am almost sure the logic is not correct as well.
If it's not clear what I'm trying to achieve - I need to perform a request multiple times, and add the response data to a result that at the end contains all the results, so I can return it. And I need to perform requests until the response contains less than 1000 objects.
I also looked into async.queue but could not get the hang of it...
Any ideas?
You should be able to do it like that, but if that example is from your real code, you have misunderstood some of how async works. doWhilst takes three arguments, each of them a function:
1. The function to be called by async. It receives a callback argument that must be invoked when one iteration is done. In your case, you need to wrap this._post inside another function instead of calling it directly.
2. The test function. (You passed the value of response.length > 1000, i.e. a boolean evaluated once, and at that point response is not even defined.)
3. The final function to be called once execution stops.
Example with each needed function separated for readability:
var depositsAll = [];
var responseLength = 1000;
var self = this;

var post = function(asyncCb) {
    self._post(endpoint_path, function(err, res) {
        ...
        responseLength = res.length;
        asyncCb(err, depositsAll);
    });
};

var check = function() {
    return responseLength >= 1000;
};

var done = function(err, deposits) {
    console.log(deposits);
};
async.doWhilst(post, check, done);
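If you would rather avoid the async library altogether, the same loop can be written with plain promises: wrap this._post once, then keep re-invoking a page function while full pages keep coming back. A sketch, reusing the call signature from the question (the deposit-filtering step is left as a comment):
var self = this;
var depositsAll = [];
var params = { limit: 1000, offset: 0, sort: 'desc' };

function page() {
    return new Promise(function (resolve, reject) {
        self._post(endpoint_path, function (err, response) {
            if (err) return reject(err);
            resolve(response);
        }, params);
    }).then(function (response) {
        // filter the deposits out of response here, as in the question
        depositsAll = depositsAll.concat(response);
        if (response.length < 1000) return depositsAll; // short page: done
        params.offset += 1000;
        return page(); // full page: fetch the next one
    });
}

page().then(function (deposits) {
    // complete result
}, function (err) {
    // check for err
});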

Getting original request object during multiple asynchronous calls in nodejs-request

I have multiple HTTP requests in a nodejs app that each returns a word of a sentence. The replies will come at different times, so I'm saving them in a dictionary, with the key being the original sentence's word index. Problem is, when I access the request object, I only get the last one.
var completed_requests = 0;
sentence = req.query.sentence;
sentence = "sentence to be translated";
responses = [];
words = sentence.split(" ");
for (j = 0; j < words.length; j++) {
    var word = words[j];
    var data = {
        word: word
    };
    var options = {
        url: 'example.com',
        form: data,
        index: j
    };
    request.post(options, function(err, httpResponse, body) {
        options = options;
        if (!err) {
            responses[options.index] = body;
            completed_requests += 1;
            if (completed_requests == words.length) {
                var a = "";
                for (var k = 0; k < words.length; k++) {
                    a += responses[k] + " ";
                }
                res.render('pages/index', { something: a });
            }
        } else {
            //err
        }
    });
}
Basically, when I access options.index, the object used isn't the one from the original request but the last one (for some reason). How should I resolve this?
When we take a look at how the code is evaluated by JavaScript, the problem becomes obvious given node.js's async nature:
1. For the first word, the loop for(j=0;j<words.length;j++){ is executed.
2. The value of j is assigned to options.index. For this loop run, options.index now has the value 0.
3. request.post(options, function(err,httpResponse,body){ is executed, but the callback handler will be invoked later.
4. For the second word, the loop body runs again.
5. The value of j is assigned to options.index, which now has the value 1.
6. request.post is executed again, and its callback handler will also be invoked later.
The problem is now apparent: no new options objects are created; the value of j is assigned to the same options.index on every loop run. By the time the first callback handler is invoked, options.index has the value words.length - 1.
To fix the problem, we wrap the creation of the options object in a function executeRequest:
var completed_requests = 0;
sentence = req.query.sentence;
sentence = "sentence to be translated";
responses = [];
words = sentence.split(" ");

function executeRequest(url, form, index) {
    var options = {
        url: url,
        form: form,
        index: index
    };
    request.post(options, function(err, httpResponse, body) {
        // options = options; -- superfluous
        if (!err) {
            responses[index] = body; // index is fixed per call, so each word lands in its own slot
            completed_requests += 1;
            if (completed_requests == words.length) {
                var a = "";
                for (var k = 0; k < words.length; k++) {
                    a += responses[k] + " ";
                }
                res.render('pages/index', { something: a });
            }
        } else {
            //err
        }
    });
}

for (j = 0; j < words.length; j++) {
    var word = words[j];
    var data = {
        word: word
    };
    executeRequest('example.com', data, j);
}
A good read about scoping and hoisting in JavaScript can be found here http://www.adequatelygood.com/JavaScript-Scoping-and-Hoisting.html
You need to use an async routine such as forEach or map; I also suggest you read up on the async nature of Node to understand how to handle callbacks for I/O.
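For example, mapping each word to a promise keeps the word order without any manual bookkeeping, because Promise.all resolves to results in input order. A sketch built on the variables from the question:
var promises = words.map(function (word) {
    return new Promise(function (resolve, reject) {
        request.post({ url: 'example.com', form: { word: word } },
            function (err, httpResponse, body) {
                if (err) return reject(err);
                resolve(body); // the translated word
            });
    });
});

Promise.all(promises).then(function (translated) {
    res.render('pages/index', { something: translated.join(' ') });
}).catch(function (err) {
    // handle err
});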

Nodejs: Resolving promises with generator function

I know there are a lot of good examples on the web and I have read a lot of them, but currently I'm stuck on resolving promises with the new generator functionality in node.js 0.11.x.
For e.g. I have the following function:
SolrBaseDomain.prototype.promisedQuery = function(query, callback) {
    var solrClient = solr.createClient(this.configuration);
    var defer = Q.defer();
    solrClient.search(query, function(err, obj) {
        if (!err) {
            if (obj.response.numFound > 0) {
                defer.resolve(obj.response.docs);
            } else {
                defer.resolve(null);
            }
        } else {
            defer.reject(err);
        }
    });
    var promise = defer.promise;
    return Q.async(function* () {
        var result = yield promise;
        return result;
    });
};
I expected that every call to this method would wait until the promise is fulfilled and the return statement would give back the result of the promise.
But currently it seems that the code inside "Q.async..." is either not executed at all, or the async call completes only after the method's return statement has already run.
It's strange: in every example I know of, this is one of the recommended ways to wait for async calls in node.js, but currently it does not work for me.
I've tried a lot of different variations of the above example, but the result is the same every time; I never get back a valid result.
I have node.js installed in version 0.11.10, and the --harmony flag is set when the code is executed.
Can anyone point me in the right direction? I'm wondering if I'm overlooking something ... :)
Thanks for your feedback.
Best regards
Udo
I expected that every call to this method would wait until the promise is fulfilled and the return statement would give back the result of the promise.
No. Generators will not make functions synchronous, you cannot (and don't want to) block while waiting for a result. When calling a generator function and running sequentially through the async steps that it yields, the result you will get back in the end is still asynchronous - and therefore a promise. Only inside of the generator, your code can use synchronous control flow and yield.
This means that the (then-) callback-based code
SolrBaseDomain.prototype.promisedQuery = function(query) {
    var promise = Q.ninvoke(solr.createClient(this.configuration), "search", query);
    return promise.then(function(obj) {
        if (obj.response.numFound > 0) {
            return obj.response.docs;
        } else {
            return null;
        }
    });
};
becomes
SolrBaseDomain.prototype.promisedQuery = Q.async(function* (query) {
    var promise = Q.ninvoke(solr.createClient(this.configuration), "search", query);
    var obj = yield promise;
    //        ^^^^^
    if (obj.response.numFound > 0) {
        return obj.response.docs;
    } else {
        return null;
    }
});
Try this:
SolrBaseDomain.prototype.promisedQuery = Q.async(function* (query) {
    var solrClient = solr.createClient(this.configuration);
    var obj = yield Q.ninvoke(solrClient, "search", query);
    return obj.response.numFound > 0 ? obj.response.docs : null;
});
This does the same thing for promises as the following does for callbacks:
SolrBaseDomain.prototype.query = function (query, callback) {
    var solrClient = solr.createClient(this.configuration);
    solrClient.search(query, function(err, obj) {
        if (err) return callback(err);
        callback(null, obj.response.numFound > 0 ? obj.response.docs : null);
    });
};
Therefore, if the first returns a promise that resolves to undefined, the callback version will likewise call its callback with undefined.
Following your suggestions, my code now looks like this:
...
SolrBaseDomain.prototype.query = Q.async(function* (query) {
    var solrClient = solr.createClient(this.configuration);
    var obj = yield Q.ninvoke(solrClient, "search", query);
    return obj.response.numFound > 0 ? obj.response.docs : null;
});
...
I share the above query function across all data access layers in order to have a central method that queries the different indexes asynchronously.
For example, in the domain data access layer, the code that uses this function looks like this:
SolrHostDomain.prototype.getByName = Q.async(function* (domain) {
    var queryObject = {
        "domain": domain
    };
    var query = this.getQuery("byName", queryObject);
    var docs = yield this.query(query);
    var domain = null;
    if (docs != null && docs.length > 0) {
        domain = this.dataMapper.merge(docs[0]);
    }
    return domain;
});
Currently I'm not sure whether the generator in the getByName function is necessary, but it seems to work. Dealing with promises is still a somewhat unclear concept for me, since I'm new to node.js.
So maybe, if you can help me on that topic and point me in the right direction, that would be helpful.
The main question for me is: how can I ensure that a synchronous method can call an asynchronous method and get back not a promise but the final result of that promise?
I've searched a long time, but I could not find good documentation describing the use of generator functions or promises in conjunction with synchronous calls. Even the examples focus only on using the mechanism itself, not on how it works together with synchronous functions.
Best regards and many thanks for your help
Udo
Got it!!!
After some trial and error, I think I've got it now and have a working solution:
Query function:
SolrBaseDomain.prototype.query = Q.async(function* (query) {
    var solrClient = solr.createClient(this.configuration);
    var obj = yield Q.ninvoke(solrClient, "search", query);
    return obj.response.numFound > 0 ? obj.response.docs : null;
});
Calling method:
SolrHostDomain.prototype.getByName = function(domain) {
    var queryObject = {
        "domain": domain
    };
    var query = this.getQuery("byName", queryObject);
    var docsPromise = this.query(query);
    var _self = this;
    return docsPromise.then(function(docs) {
        var domain = null;
        if (docs != null && docs.length > 0) {
            domain = _self.dataMapper.merge(docs[0]);
        }
        return domain;
    });
};
The solution was to understand that the "query" method still returns a promise instead of the concrete result, even though yield is used inside it.
So I have to put every piece of code that works on the result of the promise inside the "then" functions (or "done", if no other caller further up the calling hierarchy will follow).
After the promise settles, the code inside the "then" functions is processed.
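For illustration, a caller consumes getByName the same way; solrHostDomain here is a hypothetical instance:
solrHostDomain.getByName('example.org')
    .then(function (domain) {
        // runs only once the promise has settled
        console.log(domain);
    })
    .catch(function (err) {
        console.error(err);
    });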
BR
Udo
