Axios GET is sending the same url multiple times in Promise chain - node.js

I have a Promise chain that runs like this:
// this part is not meant to be syntactically correct
axios.get(<rest_api_that_queries_a_list_of_car_models>).then(res => {
// loop thru list and call a custom module promise
for (...) {
mymodule.getSomething(args).then(res => {
axios.post(<rest_write_to_db>).then(res => {
//we're done
....
// in mymodule
function getSomething(args) {
return getAnotherThing(args).then(res => {
// do stuff
return aThing
...
function getAnotherThing(args) {
return getThatThing(args).then(res => {
// see if pagination is greater than 1 page
if (pages == 1)
return res
let promises = [res]
for (let x = 2; x <= pages; x++) { // include the last page
// change args
promises.push( getThatThing(args))
}
return Promise.all(promises)
}).then(allres => {
return allres
})
...
// this is where it's breaking. this part is syntactically accurate
function getThatThing(args) {
let params = Object.assign(BASE_PARAMS, args.params)
console.log(args.params.model) // this logs a different model every time
return axios.get(URL, {
headers: {
"Accept": ACCEPT,
"Content-Type":CONTENT_TYPE,
},
params: params
}).then (response => {
console.log(response.request.path) // this path contains only the last model every time, so with 10 car models it searches for the last model 10 times
let result = response.data
return result
}).catch(function (error) {
console.log("search error:",error);
return error.response.data.errorMessage[0].error[0].message[0]
})
}
So basically the issue is that the axios.get call in the last function uses the same GET parameters every time, even though I'm printing different parameters right before I make the call. I don't see how that is possible.

I was able to fix the issue by changing this line
let params = Object.assign(BASE_PARAMS, args.params)
to this
let params = {...BASE_PARAMS, ...args.params}
I can't really tell you why this fixed it. I'm assuming the Object.assign set the value to params on a global level. Perhaps someone else could provide more insight.
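A likely explanation: Object.assign(target, ...sources) copies properties into its first argument and returns that same object. Every call to getThatThing therefore writes into the one shared BASE_PARAMS object, and if the request URL is only built after the loop has finished, every request sees the values from the last call. The sketch below shows the difference using made-up values; BASE_PARAMS, FRESH_BASE and the model names are stand-ins, not the question's real data.

```javascript
// Hypothetical stand-in for the question's BASE_PARAMS.
const BASE_PARAMS = { format: "json" };

// Object.assign mutates and returns its first argument.
const a = Object.assign(BASE_PARAMS, { model: "civic" });
const b = Object.assign(BASE_PARAMS, { model: "accord" });

console.log(a === BASE_PARAMS, a === b); // true true
console.log(a.model);                    // accord (the first call's value was overwritten)

// Spreading into a new object literal leaves the base untouched.
const FRESH_BASE = { format: "json" };
const c = { ...FRESH_BASE, model: "civic" };
const d = { ...FRESH_BASE, model: "accord" };

console.log(c.model, d.model);           // civic accord
console.log("model" in FRESH_BASE);      // false
```

Object.assign({}, BASE_PARAMS, args.params) would behave like the spread version, because the target is then a new empty object.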

Related

return response data from async call

I created this function to list all my drives from GDrive.
async getAllDrives(token) {
let nextPageToken = ""
let resultArray = []
const config= {
headers: {
Authorization: `Bearer ${token}`
}
};
const bodyParams = {
pageSize: 2,
fields: 'nextPageToken, drives(id, name)',
q:`hidden=false`,
};
do {
axios.get(
`https://www.googleapis.com/drive/v3/drives`,
config,
bodyParams,
).then(result => {
nextPageToken = result.data.nextPageToken;
resultArray.push(result.data.drives);
resultArray = resultArray.flat();
console.log("result", resultArray);
}).catch(error => {
console.log(error);
//res.send(error);
});
}while(nextPageToken);
resultArray = resultArray.flat();
resultArray.map(drive => {
drive.isSharedDrive = true;
return drive;
});
return JSON.stringify(resultArray);
}
When I look in console.log
then(result => {
nextPageToken = result.data.nextPageToken;
resultArray.push(result.data.drives);
resultArray = resultArray.flat();
console.log("result", resultArray);
})
I have the expected result,
result [
{
kind: 'drive#drive',
id: '**',
name: ' ★ 🌩'
},
]
but return JSON.stringify(resultArray); is empty.
I found a similar question here, How do I return the response from an asynchronous call? but the answer is not satisfying.
You used the async call slightly incorrectly. You are calling axios.get without the await keyword, using .then chaining instead. Since you don't wait for the result, the function returns the still-empty array first, and only afterwards is your callback inside .then called. To simplify, this is what your example does:
function getAllDrives() {
// Local variable where you want your result
let result = [];
// You call axios.get but don't wait for the result
axios.get().then(result => {})
// The still-empty result is returned immediately
return result;
}
And when the response arrives from the remote server, the function inside .then tries to save it to the local variable. But the outer function has already returned, so the caller never sees it.
What you should do instead is call axios.get with the await keyword:
// Wrap awaited calls in a try/catch block so that failures are handled
try {
// Instead of a `then` callback, use the `await` keyword. It yields the
// resolved value of the promise returned by axios.get. If an error occurs,
// it is thrown, and you can handle it inside `catch`.
// Note: axios.get takes (url, config), so the query goes in config.params.
const result = await axios.get(
`https://www.googleapis.com/drive/v3/drives`,
{ ...config, params: bodyParams }
);
// Here is your code as you wrote it inside `then` callback
nextPageToken = result.data.nextPageToken;
resultArray.push(result.data.drives);
resultArray = resultArray.flat();
console.log("result", resultArray);
} catch (error) {
// And here is your error handling code as you wrote it inside `catch`
console.log(error);
}
This way your method will not complete until the request has finished.
You can read more about async/await functions here.
I believe your goal is as follows.
You want to retrieve the drive list using axios.
Your access token can be used for retrieving the drive list using Drive API.
Modification points:
In order to use nextPageToken in the next request, the requests have to run sequentially, each one waiting for the previous response. So, async/await is used. This has already been mentioned in the existing answers.
When I saw your script, I thought that the query parameters need to be included in the 2nd argument of axios.get(), as the params property.
In order to move to the next page, the request must include the pageToken property, set to the nextPageToken from the previous response. In your script, pageToken is not used, so every request returns the first page with the same nextPageToken, and the loop never ends.
When these points are reflected in your script, how about the following modification?
Modified script:
let resultArray = [];
const config = {
headers: {
Authorization: `Bearer ${token}`,
},
params: {
pageSize: 2,
fields: "nextPageToken, drives(id, name)",
q: `hidden=false`,
pageToken: "",
},
};
do {
const { data } = await axios
.get(`https://www.googleapis.com/drive/v3/drives`, config)
.catch((error) => {
if (error.response) {
console.log(error.response.status);
console.log(error.response.data);
}
throw error; // rethrow so the destructuring above never receives undefined
});
if (data.drives.length > 0) {
resultArray = [...resultArray, ...data.drives];
}
nextPageToken = data.nextPageToken;
config.params.pageToken = nextPageToken;
} while (nextPageToken);
resultArray.map((drive) => {
drive.isSharedDrive = true;
return drive;
});
return JSON.stringify(resultArray);
Testing:
When this script is run, the following result is obtained.
[
{"id":"###","name":"###","isSharedDrive":true},
{"id":"###","name":"###","isSharedDrive":true},
,
,
,
]
Note:
From the official document of "Drives: list",
pageSize: Maximum number of shared drives to return per page. Acceptable values are 1 to 100, inclusive. (Default: 10)
So, when pageSize is 100, the number of loops can be reduced. If you want to test the loop using nextPageToken, please reduce the value.
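The pagination loop can also be tried without a network connection. In this minimal sketch, fetchPage and the PAGES table are hypothetical stand-ins for the Drive API; a real version would call axios.get(url, { headers, params }) with params.pageToken set to the token from the previous response.

```javascript
// Hypothetical stand-in for the Drive API: three pages keyed by page token.
const PAGES = {
  "": { drives: [{ id: "a" }, { id: "b" }], nextPageToken: "p2" },
  p2: { drives: [{ id: "c" }, { id: "d" }], nextPageToken: "p3" },
  p3: { drives: [{ id: "e" }] }, // last page: no nextPageToken
};

// Simulates one request and resolves to an axios-like { data } object.
async function fetchPage(pageToken) {
  return { data: PAGES[pageToken] };
}

async function getAllDrives() {
  const drives = [];
  let pageToken = "";
  do {
    // Wait for this page before deciding whether another request is needed.
    const { data } = await fetchPage(pageToken);
    drives.push(...(data.drives || []));
    pageToken = data.nextPageToken; // undefined on the last page ends the loop
  } while (pageToken);
  return drives.map((drive) => ({ ...drive, isSharedDrive: true }));
}

getAllDrives().then((drives) => console.log(drives.map((d) => d.id).join(",")));
// prints a,b,c,d,e
```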
References:
axios
Drives: list
I recommend you study a little more about async/await.
It makes no sense to mark a function async and then use .then().catch() inside it; the purpose of async/await is to replace that chaining syntax.
async getAllDrives(token) {
try {
const getDrives = await this.request(token)
console.log(getDrives)
const results = this.resultArray(getDrives)
return results
} catch (e) {
console.log(e)
}
}
I didn't quite understand your while loop or your objective, so adapt this to your code or remove it.
async request(token) {
let nextPageToken = 1 // ????????
const config = {
headers: {
Authorization: `Bearer ${token}`
}
};
const bodyParams = {
pageSize: 2,
fields: 'nextPageToken, drives(id, name)',
q: `hidden=false`,
};
let getDrives = [];
// loop for each request and create a request array
for (let x = 0; x < nextPageToken; x++) {
const request = axios.get(
`https://www.googleapis.com/drive/v3/drives`,
{ ...config, params: bodyParams } // axios.get takes (url, config)
);
getDrives.push(request)
}
const drives = await Promise.all(getDrives)
return drives
}
async resultArray(drivers) {
// result treatment here
}
Promise.all will resolve to an array of the drive responses.
Note: the response body is in request.data
const request = await axios.get()
const resposta = request.data
Read about
https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Global_Objects/Promise/all

request-promise loop, how to include request data sent with response?

I am trying to use request-promise in a loop and then send a response back to the client with all of the responses. The below code works, however I want to also include the request data with each response so that the request ID can be correlated with the result. Is there a built in way to do this:
promiseLoop: function (req, res) {
var ps = [];
for (var i = 0; i < 3; i++) {
// var read_match_details = {
// uri: 'https://postman-echo.com/get?foo1=bar1&foo2=bar2',
// json: true // Automatically parses the JSON string in the response
// };
var session = this.sessionInit(req, res);
if (this.isValidRequest(session)) {
var assertion = session.assertions[i];
const options = {
method: 'POST',
uri: mConfig.serviceURL,
body: assertion,
headers: {
'User-Agent': 'aggregator-service'
},
json: true
}
logger.trace(options);
ps.push(httpClient(options));
}
}
Promise.all(ps)
.then((results) => {
console.log(results); // Result of all resolve as an array
res.status(200);
res.send(results);
res.end();
}).catch(err => console.log(err)); // First rejected promise
}
Assuming httpClient() is request-promise that you refer to and the assertion value is what you're trying to pass through with this result, you could change this:
ps.push(httpClient(options));
to this:
ps.push(httpClient(options).then(result => {
return {id: assertion, result};
}));
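Then, your promise would resolve to that object which contains both the result and the id and you could access each in the final array of results. Note that assertion is declared with var inside your loop, so it is function-scoped and every .then() callback would see the value from the last iteration; declare it with const or let so that each iteration gets its own binding.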
Your code doesn't show what the current result is. If it's already an object, you could also just add the id property to that object if you'd rather. This is up to you exactly how you put that final result together.
ps.push(httpClient(options).then(result => {
// add id into final result
result.id = assertion;
return result;
}));
Anyway, the general idea is that before putting the promise in the array, you use a .then() handler to slightly modify the returned result, adding in whatever data you want to add and then returning that new modified result so it becomes the resolved value of the promise chain.
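As a minimal, self-contained sketch of that idea: httpClient below is a hypothetical stand-in that simply echoes the request body, and the assertions array and its id field are invented for illustration. The sketch assumes the per-request value is declared with const inside the loop.

```javascript
// Hypothetical stand-in for httpClient(options): echoes the body after a short delay.
function httpClient(options) {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ echoed: options.body.value }), 5)
  );
}

// Invented request data; each entry carries the id we want back with its result.
const assertions = [
  { id: 1, value: "x" },
  { id: 2, value: "y" },
  { id: 3, value: "z" },
];

function sendAll() {
  const ps = [];
  for (let i = 0; i < assertions.length; i++) {
    // const gives each iteration its own binding, so the callback below
    // sees the assertion that belongs to this request.
    const assertion = assertions[i];
    ps.push(
      httpClient({ body: assertion }).then((result) => ({ id: assertion.id, result }))
    );
  }
  return Promise.all(ps);
}

sendAll().then((results) => console.log(JSON.stringify(results)));
```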
To make sure you process all responses, even if some have an error, you can use the newer Promise.allSettled() instead of Promise.all() and then check which responses succeeded or failed when processing the results. Or, you can catch any errors, turn them into resolved promises, but give them a sentinel value (often null) that you can see when processing the final results:
ps.push(httpClient(options).then(result => {
// add id into final result
result.id = assertion;
return result;
}).catch(err => {
console.log(err);
// got an error, but don't want Promise.all() to stop
// so turn the rejected promise into a resolved promise
// that resolves to an object with an error in it
// Processing code can look for an `.err` property.
return {err: err};
}));
Then, later in your processing code:
Promise.all(ps)
.then((results) => {
console.log(results); // Result of all resolve as an array
// filter out error responses
let successResults = results.filter(item => !item.err);
res.send(successResults );
}).catch(err => console.log(err)); // First rejected promise
Promise.allSettled() does not stop at the first error. It makes sure you process all responses, even if some have an error.
const request = require('request-promise');
const urls = ["http://", "http://"];
const promises = urls.map(url => request(url));
Promise.allSettled(promises)
.then((data) => {
// data is an array of {status: "fulfilled", value} or {status: "rejected", reason} objects
})
.catch((err) => {
console.log(JSON.stringify(err, null, 4));
});
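To make the shape of the Promise.allSettled() result concrete, here is a small sketch with two invented promises, one that resolves and one that rejects. It splits the outcomes into values and error messages; the names tasks and settled are illustrative only.

```javascript
// Hypothetical tasks: one resolves, one rejects.
const tasks = [
  Promise.resolve("ok"),
  Promise.reject(new Error("boom")),
];

// allSettled always fulfils, with one outcome object per input promise.
const settled = Promise.allSettled(tasks).then((outcomes) => {
  const values = outcomes
    .filter((o) => o.status === "fulfilled")
    .map((o) => o.value);
  const errors = outcomes
    .filter((o) => o.status === "rejected")
    .map((o) => o.reason.message);
  return { values, errors };
});

settled.then(({ values, errors }) => console.log(values, errors));
// prints [ 'ok' ] [ 'boom' ]
```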

Angular : Receive responses in order with the calls

Hi, I am pretty new to Angular and Observables.
I am trying to GET objects by their IDs in a loop,
but I don't receive the responses in order.
Example
get ID(1)
get ID(2)
get ID(3)
Receive Object ID(2)
Receive Object ID(3)
Receive Object ID(1)
Is it possible to get my objects back in order?
Below is where I call multiple times my service function :
conferences-attendance.component.ts
ExportExcelAttendance() {
for (var i = 0; i < this.contactsAttendance.length; i++) {
this.practiceService.GetPracticebyDBID(this.contactsAttendance[i].practiceId)
.subscribe(
(practice: Practice) => {
this.practicesAttendance.push(practice);
if (this.practicesAttendance.length == this.contactsAttendance.length) {
this.ExportExcelAttendance2();
}
},
error => this.errorMessage = <any>error
);
}
}
Here is the function in my service; it is where I receive the data (not in order with the calls).
practices.service.ts
GetPracticebyDBID(id: string) {
let params: URLSearchParams = new URLSearchParams();
params.set('thisId', id);
let requestOptions = new RequestOptions();
requestOptions.params = params;
return this.http.get('http://ec2-34-231-196-71.compute-1.amazonaws.com/getpractice', requestOptions)
.map((response: Response) => {
return response.json().obj;
})
.catch((error: Response) => Observable.throw(error.json()));
}
forkJoin gives you a little less code, and the array it emits keeps the order of the input observables:
const arrayOfFetches = this.contactsAttendance
.map(attendee => this.practiceService.GetPracticebyDBID(attendee.practiceId) );
Observable.forkJoin(...arrayOfFetches)
.subscribe((practices: Practice[]) => {
this.practicesAttendance = practices;
this.ExportExcelAttendance2();
});
Edit
Snap! #Anas beat me to it. Although, I don't think you need the concatAll()
You should use the concatAll operator to ensure your observables are subscribed to in sequence.
Also, you can use the completed callback to call ExportExcelAttendance2 instead of checking the practicesAttendance length in every response callback.
Check the example below:
let contactsAttendanceObservables = this.contactsAttendance
.map((item) => {
return this.practiceService.GetPracticebyDBID(item.practiceId);
});
Observable.of(...contactsAttendanceObservables)
.concatAll()
.subscribe(
(practice: Practice) => {
this.practicesAttendance.push(practice);
},
(err) => {
// handle any errors.
},
() => {
// completed
this.ExportExcelAttendance2();
}
);
If you still want your observables to run in parallel, you can use the forkJoin operator, which emits the last value of each of the passed observables to the subscriber as a single array, in the same order as the inputs, once all of them have completed.
Check the example below:
let contactsAttendanceObservables = this.contactsAttendance
.map((item) => {
return this.practiceService.GetPracticebyDBID(item.practiceId);
});
Observable.forkJoin(...contactsAttendanceObservables)
.subscribe(
(practices: Practice[]) => {
this.practicesAttendance = practices;
this.ExportExcelAttendance2();
}
);
The forkJoin operator is simple to use. It waits until all observables complete, then emits an array containing the last value emitted by each one.
ExportExcelAttendance() {
const all = this.contactsAttendance.map(it => this.practiceService.GetPracticebyDBID(it.practiceId));
Rx.Observable.forkJoin(all)
.subscribe(
practicesAttendance => this.ExportExcelAttendance2(practicesAttendance),
error => this.errorMessage = < any > error);
}

NodeJs delay each promise within Promise.all()

I'm trying to update a tool that was created a while ago which uses nodejs (I am not a JS developer, so I'm trying to piece the code together) and am getting stuck at the last hurdle.
The new functionality will take in a swagger .json definition, compare the endpoints against the matching API Gateway on the AWS Service, using the 'aws-sdk' SDK for JS and then updates the Gateway accordingly.
The code runs fine on a small definition file (about 15 endpoints) but as soon as I give it a bigger one, I start getting tons of TooManyRequestsException errors.
I understand that this is due to my calls to the API Gateway service being too quick and a delay / pause is needed. This is where I am stuck
I have tried adding;
a delay() to each promise being returned
running a setTimeout() in each promise
adding a delay to the Promise.all and Promise.mapSeries
Currently my code loops through each endpoint within the definition and then adds the response of each promise to a promise array:
promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath));
Once the loop is finished I run this:
return Promise.all(promises)
.catch((err) => {
winston.error(err);
})
I have tried the same with a mapSeries (no luck).
It looks like the functions within the getMethodResponse promise are run immediately and hence, no matter what type of delay I add, they all still just execute. My suspicion is that I need to make getMethodResponse return a function and then use mapSeries, but I can't get this to work either.
Code I tried:
Wrapped the getMethodResponse in this:
return function(value){}
Then added this after the loop (and within the loop - no difference):
Promise.mapSeries(function (promises) {
return 'a'();
}).then(function (results) {
console.log('result', results);
});
Also tried many other suggestions:
Here
Here
Any suggestions please?
EDIT
As requested, some additional code to try to pinpoint the issue.
The code currently working with a small set of endpoints (within the Swagger file):
module.exports = (apiName, externalUrl) => {
return getSwaggerFromHttp(externalUrl)
.then((swagger) => {
let paths = swagger.paths;
let resourcePath = '';
let resourceMethod = '';
let promises = [];
_.each(paths, function (value, key) {
resourcePath = key;
_.each(value, function (value, key) {
resourceMethod = key;
let statusList = [];
_.each(value.responses, function (value, key) {
if (key >= 200 && key <= 204) {
statusList.push(key)
}
});
_.each(statusList, function (value, key) { //Only for 200-201 range
//Working with small set
promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath))
});
});
});
//Working with small set
return Promise.all(promises)
.catch((err) => {
winston.error(err);
})
})
.catch((err) => {
winston.error(err);
});
};
I have since tried adding this in place of the return Promise.all():
Promise.map(promises, function() {
// Promise.map awaits for returned promises as well.
console.log('X');
},{concurrency: 5})
.then(function() {
return console.log("y");
});
Results of this spits out something like this (it's the same for each endpoint, there are many):
Error: TooManyRequestsException: Too Many Requests
X
Error: TooManyRequestsException: Too Many Requests
X
Error: TooManyRequestsException: Too Many Requests
The AWS SDK is called 3 times within each promise. The calls, all initiated from the getMethodResponse() function, are:
apigateway.getRestApisAsync()
return apigateway.getResourcesAsync(resourceParams)
apigateway.getMethodAsync(params, function (err, data) {}
The AWS SDK documentation states that this is typical behaviour when too many consecutive calls are made too quickly. I've had a similar issue in the past, which was resolved by simply adding a .delay(500) to the code being called.
Something like:
return apigateway.updateModelAsync(updateModelParams)
.tap(() => logger.verbose(`Updated model ${updatedModel.name}`))
.tap(() => bar.tick())
.delay(500)
EDIT #2
In the name of thoroughness, here is my entire .js file.
'use strict';
const AWS = require('aws-sdk');
let apigateway, lambda;
const Promise = require('bluebird');
const R = require('ramda');
const logger = require('../logger');
const config = require('../config/default');
const helpers = require('../library/helpers');
const winston = require('winston');
const request = require('request');
const _ = require('lodash');
const region = 'ap-southeast-2';
const methodLib = require('../aws/methods');
const emitter = require('../library/emitter');
emitter.on('updateRegion', (region) => {
region = region;
AWS.config.update({ region: region });
apigateway = new AWS.APIGateway({ apiVersion: '2015-07-09' });
Promise.promisifyAll(apigateway);
});
function getSwaggerFromHttp(externalUrl) {
return new Promise((resolve, reject) => {
request.get({
url: externalUrl,
header: {
"content-type": "application/json"
}
}, (err, res, body) => {
if (err) {
winston.error(err);
reject(err);
}
let result = JSON.parse(body);
resolve(result);
})
});
}
/*
Deletes a method response
*/
function deleteMethodResponse(httpMethod, resourceId, restApiId, statusCode, resourcePath) {
let methodResponseParams = {
httpMethod: httpMethod,
resourceId: resourceId,
restApiId: restApiId,
statusCode: statusCode
};
return apigateway.deleteMethodResponseAsync(methodResponseParams)
.delay(1200)
.tap(() => logger.verbose(`Method response ${statusCode} deleted for path: ${resourcePath}`))
.error((e) => {
return console.log(`Error deleting Method Response ${httpMethod} not found on resource path: ${resourcePath} (resourceId: ${resourceId})`); // an error occurred
logger.error('Error: ' + e.stack)
});
}
/*
Deletes an integration response
*/
function deleteIntegrationResponse(httpMethod, resourceId, restApiId, statusCode, resourcePath) {
let methodResponseParams = {
httpMethod: httpMethod,
resourceId: resourceId,
restApiId: restApiId,
statusCode: statusCode
};
return apigateway.deleteIntegrationResponseAsync(methodResponseParams)
.delay(1200)
.tap(() => logger.verbose(`Integration response ${statusCode} deleted for path ${resourcePath}`))
.error((e) => {
return console.log(`Error deleting Integration Response ${httpMethod} not found on resource path: ${resourcePath} (resourceId: ${resourceId})`); // an error occurred
logger.error('Error: ' + e.stack)
});
}
/*
Get Resource
*/
function getMethodResponse(httpMethod, statusCode, apiName, resourcePath) {
let params = {
httpMethod: httpMethod.toUpperCase(),
resourceId: '',
restApiId: ''
}
return getResourceDetails(apiName, resourcePath)
.error((e) => {
logger.unimportant('Error: ' + e.stack)
})
.then((result) => {
//Only run the comparrison of models if the resourceId (from the url passed in) is found within the AWS Gateway
if (result) {
params.resourceId = result.resourceId
params.restApiId = result.apiId
var awsMethodResponses = [];
try {
apigateway.getMethodAsync(params, function (err, data) {
if (err) {
if (err.statusCode == 404) {
return console.log(`Method ${params.httpMethod} not found on resource path: ${resourcePath} (resourceId: ${params.resourceId})`); // an error occurred
}
console.log(err, err.stack); // an error occurred
}
else {
if (data) {
_.each(data.methodResponses, function (value, key) {
if (key >= 200 && key <= 204) {
awsMethodResponses.push(key)
}
});
awsMethodResponses = _.pull(awsMethodResponses, statusCode); //List of items not found within the Gateway - to be removed.
_.each(awsMethodResponses, function (value, key) {
if (data.methodResponses[value].responseModels) {
var existingModel = data.methodResponses[value].responseModels['application/json']; //Check if there is currently a model attached to the resource / method about to be deleted
methodLib.updateResponseAssociation(params.httpMethod, params.resourceId, params.restApiId, statusCode, existingModel); //Associate this model to the same resource / method, under the new response status
}
deleteMethodResponse(params.httpMethod, params.resourceId, params.restApiId, value, resourcePath)
.delay(1200)
.done();
deleteIntegrationResponse(params.httpMethod, params.resourceId, params.restApiId, value, resourcePath)
.delay(1200)
.done();
})
}
}
})
.catch(err => {
console.log(`Error: ${err}`);
});
}
catch (e) {
console.log(`getMethodAsync failed, Error: ${e}`);
}
}
})
};
function getResourceDetails(apiName, resourcePath) {
let resourceExpr = new RegExp(resourcePath + '$', 'i');
let result = {
apiId: '',
resourceId: '',
path: ''
}
return helpers.apiByName(apiName, AWS.config.region)
.delay(1200)
.then(apiId => {
result.apiId = apiId;
let resourceParams = {
restApiId: apiId,
limit: config.awsGetResourceLimit,
};
return apigateway.getResourcesAsync(resourceParams)
})
.then(R.prop('items'))
.filter(R.pipe(R.prop('path'), R.test(resourceExpr)))
.tap(helpers.handleNotFound('resource'))
.then(R.head)
.then([R.prop('path'), R.prop('id')])
.then(returnedObj => {
if (returnedObj.id) {
result.path = returnedObj.path;
result.resourceId = returnedObj.id;
logger.unimportant(`ApiId: ${result.apiId} | ResourceId: ${result.resourceId} | Path: ${result.path}`);
return result;
}
})
.catch(err => {
console.log(`Error: ${err} on API: ${apiName} Resource: ${resourcePath}`);
});
};
function delay(t) {
return new Promise(function(resolve) {
setTimeout(resolve, t)
});
}
module.exports = (apiName, externalUrl) => {
return getSwaggerFromHttp(externalUrl)
.then((swagger) => {
let paths = swagger.paths;
let resourcePath = '';
let resourceMethod = '';
let promises = [];
_.each(paths, function (value, key) {
resourcePath = key;
_.each(value, function (value, key) {
resourceMethod = key;
let statusList = [];
_.each(value.responses, function (value, key) {
if (key >= 200 && key <= 204) {
statusList.push(key)
}
});
_.each(statusList, function (value, key) { //Only for 200-201 range
promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath))
});
});
});
//Working with small set
return Promise.all(promises)
.catch((err) => {
winston.error(err);
})
})
.catch((err) => {
winston.error(err);
});
};
You apparently have a misunderstanding about what Promise.all() and Promise.map() do.
All Promise.all() does is keep track of a whole array of promises to tell you when the async operations they represent are all done (or one returns an error). When you pass it an array of promises (as you are doing), ALL those async operations have already been started in parallel. So, if you're trying to limit how many async operations are in flight at the same time, it's already too late at that point. So, Promise.all() by itself won't help you control how many are running at once in any way.
You wrote: "I've also noticed since, that it seems this line promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath)) is actually executing promises and not simply adding them to the array. Seems like the last Promise.all() doesn't actually do much."
Yep, when you execute promises.push(getMethodResponse()), you are calling getMethodResponse() immediately right then. That starts the async operation immediately. That function then returns a promise and Promise.all() will monitor that promise (along with all the other ones you put in the array) to tell you when they are all done. That's all Promise.all() does. It monitors operations you've already started. To keep the max number of requests in flight at the same time below some threshold, you have to NOT START the async operations all at once like you are doing. Promise.all() does not do that for you.
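The difference between pushing a promise and pushing a function can be seen in a small sketch. Here getMethodResponse is a hypothetical stand-in that only records when it is started, and runInSeries is an illustrative helper, not part of any library.

```javascript
const started = [];

// Hypothetical async operation that records the moment it is started.
function getMethodResponse(name) {
  started.push(name);
  return new Promise((resolve) => setTimeout(() => resolve(name), 5));
}

// Calling the function starts the work right away.
const promises = ["a", "b", "c"].map((name) => getMethodResponse(name));
console.log(started.length); // 3: everything is running before Promise.all is even called

// Storing functions instead defers the work until each one is invoked.
const startedBefore = started.length;
const tasks = ["d", "e"].map((name) => () => getMethodResponse(name));
console.log(started.length - startedBefore); // 0: nothing new has started yet

// Runs deferred tasks one after another, starting each only when the previous is done.
async function runInSeries(fns) {
  const results = [];
  for (const fn of fns) results.push(await fn());
  return results;
}

const done = Promise.all(promises).then(() => runInSeries(tasks));
done.then((results) => console.log(results, started));
```

Only the second form leaves room for a scheduler (a series runner, a concurrency limit, or a rate limiter) to decide when each operation begins.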
For Bluebird's Promise.map() to help you at all, you have to pass it an array of DATA, not promises. When you pass it an array of promises that represent async operations that you've already started, it can do no more than Promise.all() can do. But, if you pass it an array of data and a callback function that can then initiate an async operation for each element of data in the array, THEN it can help you when you use the concurrency option.
Your code is pretty complex so I will illustrate with a simple web scraper that wants to read a large list of URLs, but for memory considerations, only process 20 at a time.
const rp = require('request-promise');
let urls = [...]; // large array of URLs to process
Promise.map(urls, function(url) {
return rp(url).then(function(data) {
// process scraped data here
return someValue;
});
}, {concurrency: 20}).then(function(results) {
// process array of results here
}).catch(function(err) {
// error here
});
In this example, hopefully you can see that an array of data items are being passed into Promise.map() (not an array of promises). This, then allows Promise.map() to manage how/when the array is processed and, in this case, it will use the concurrency: 20 setting to make sure that no more than 20 requests are in flight at the same time.
Your effort to use Promise.map() was passing an array of promises, which does not help you since the promises represent async operations that have already been started:
Promise.map(promises, function() {
...
});
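For readers who are not using Bluebird, the same idea (pass data plus a function, and let a helper decide when to start each call) can be sketched in plain JavaScript. mapWithConcurrency is a hypothetical helper written for this illustration, not Bluebird's implementation, and fakeRequest is an invented task that counts overlapping calls.

```javascript
// Runs fn over items with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const index = next++; // claim an index synchronously, before awaiting
      results[index] = await fn(items[index], index);
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Hypothetical task that tracks how many calls overlap.
let inFlight = 0;
let maxInFlight = 0;
function fakeRequest(n) {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise((resolve) =>
    setTimeout(() => {
      inFlight--;
      resolve(n * 2);
    }, 5)
  );
}

const limited = mapWithConcurrency([1, 2, 3, 4, 5, 6, 7], 3, fakeRequest);
limited.then((results) =>
  console.log(results.join(","), "max in flight:", maxInFlight)
);
// prints 2,4,6,8,10,12,14 max in flight: 3
```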
Then, in addition, you really need to figure out what exactly causes the TooManyRequestsException error by either reading documentation on the target API that exhibits this or by doing a whole bunch of testing because there can be a variety of things that might cause this and without knowing exactly what you need to control, it just takes a lot of wild guesses to try to figure out what might work. The most common things that an API might detect are:
Simultaneous requests from the same account or source.
Requests per unit of time from the same account or source (such as request per second).
The concurrency option in Promise.map() will easily help you with the first case, but will not necessarily help you with the second, as you can limit to a low number of simultaneous requests and still exceed a requests per second limit. The second needs some actual time control. Inserting delay() statements will sometimes work, but even that is not a very direct method of managing it and will either lead to inconsistent control (something that works sometimes, but not other times) or sub-optimal control (limiting yourself to something far below what you can actually use).
To manage to a request per second limit, you need some actual time control with a rate limiting library or actual rate limiting logic in your own code.
Here's an example of a scheme for limiting the number of requests per second you are making: How to Manage Requests to Stay Below Rate Limiting.
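One very simple form of time control is to start tasks one at a time and keep a minimum interval between starts. The sketch below does only that; it is not the scheme from the linked answer and not a full rate limiter, and runSpaced, sleep and the 20 ms interval are illustrative choices.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Starts tasks one at a time, keeping at least `minIntervalMs` between starts.
// `tasks` is an array of functions that each return a promise when called.
async function runSpaced(tasks, minIntervalMs) {
  const results = [];
  let lastStart = -Infinity;
  for (const task of tasks) {
    const wait = lastStart + minIntervalMs - Date.now();
    if (wait > 0) await sleep(wait);
    lastStart = Date.now();
    results.push(await task());
  }
  return results;
}

// Hypothetical tasks that record when they were started.
const startTimes = [];
const spacedTasks = [1, 2, 3].map((n) => async () => {
  startTimes.push(Date.now());
  return n;
});

const spaced = runSpaced(spacedTasks, 20);
spaced.then((results) => console.log(results.join(",")));
// prints 1,2,3
```

Because each task is awaited before the next starts, this caps the rate at roughly 1000 / minIntervalMs requests per second but can run well below that when individual requests are slow.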

NodeJS constructing array from asynchronous callbacks before returning

I'm writing a function that returns an array of values. Some of the values are calculated in a callback. But I don't know how to handle the asynchronous calls so that all of my results are in the array when it is returned, and not added afterwards.
let array = []
for (const stuff of stuffs) {
if (condition) {
array.push(stuff)
} else {
api.compute(stuff, function (resp) {
array.push(resp.stuff)
})
}
}
res.json({ "stuff": array })
In this example the array is written to the response before the async calls have finished.
How can I make this work asynchronously?
You have to use one of the approaches:
async library
Promise.all
coroutines/generators
async/await
The nicest of these, I think, is async/await. First we wrap your function so that it returns a promise:
const compute = function(stuff) {
return new Promise( (resolve, reject) => {
api.compute(stuff, (resp) => {
resolve(resp.stuff)
});
});
};
Then we modify your route to use an async handler:
app.get('/', async function(req, res, next) {
const array = [];
for (const stuff of stuffs) {
if (condition) {
array.push(stuff);
} else {
// use a new name: redeclaring `stuff` here would shadow the loop variable
const computed = await compute(stuff);
array.push(computed);
}
}
res.json({ stuff: array });
});
Note: You might need to update Node to a version that supports async/await (7.6 or later).
UPDATE:
If you are not aware of how the event loop works, run this snippet and see what it prints:
const sleep = async function(ms) {
console.log(`Sleeping ${ms}ms`);
return new Promise( resolve => setTimeout(resolve, ms));
};
async function job() {
console.log('start');
for (let t = 0; t < 10; t++) {
await sleep(100);
}
}
job();
console.log('oops did not expect that oO');
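You may be surprised: the last line prints right after the first "Sleeping 100ms", because job() returns to its caller at its first await.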
Here is an answer that uses plain callbacks, without any package.
Create a function that recursively processes all your stuffs.
function getArray(stuffs, callback, index = 0, array = []) {
// Did we treat all stuffs?
if (index >= stuffs.length) {
return callback(array);
}
// Treat one stuff
if (condition) {
array.push(stuffs[index]);
// Call next
return getArray(stuffs, callback, index + 1, array);
}
// Get a stuff asynchronously
return api.compute(stuffs[index], (resp) => {
array.push(resp.stuff);
// Call next
return getArray(stuffs, callback, index + 1, array);
});
}
How to call it?
getArray(stuffs, (array) => {
// Here you have your array
// ...
});
EDIT: more explanation
What we want to do is transform the loop you had into a loop that handles asynchronous function calls.
The idea is that one getArray call treats one index of your stuffs array.
After treating one index, the function calls itself again to treat the next index, until all are treated.
-> Treat index 0 -> Treat index 1 -> Treat index 2 -> Return all result
We use the parameters to pass information through the process: index tells us which element of the array to treat, and array keeps track of what we have calculated so far.
EDIT: Improvement to a 100% asynchronous solution
What we have done here is a simple transposition of your initial for loop into asynchronous code. It can be improved by making it fully asynchronous, so that all calls run in parallel; this is faster but slightly more difficult, and the results are stored in completion order rather than input order.
For example :
// Where we store the results
const array = [];
const calculationIsDone = (array) => {
// Here our calculation is done
// ---
};
// Function that's gonna aggregate the results coming asynchronously
// When we did gather all results, we call a function
const gatherCalculResult = (newResult) => {
array.push(newResult);
if (array.length === stuffs.length) {
calculationIsDone(array);
}
};
// Function that makes the calculation for one stuff
const makeCalculation = (oneStuff) => {
if (condition) {
return gatherCalculResult(oneStuff);
}
// Get a stuff asynchronously
return api.compute(oneStuff, (resp) => {
gatherCalculResult(resp.stuff);
});
};
// We trigger all calculation
stuffs.forEach(x => makeCalculation(x));
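The Promise.all approach from the list of options runs the asynchronous calls in parallel and still returns the results in input order. This is a minimal sketch under stated assumptions: the api object is a hypothetical callback-style stand-in for the question's api.compute, and the even/odd test is an arbitrary replacement for the unspecified condition.

```javascript
// Hypothetical callback-style API, standing in for api.compute from the question.
const api = {
  compute(stuff, callback) {
    // Later items finish sooner, to show that completion order does not matter.
    setTimeout(() => callback({ stuff: stuff * 10 }), 30 - stuff * 5);
  },
};

// Wrap the callback API in a promise.
const computeAsync = (stuff) =>
  new Promise((resolve) => api.compute(stuff, (resp) => resolve(resp.stuff)));

// Even numbers are used as-is; odd ones go through the async API.
function buildArray(stuffs) {
  return Promise.all(
    stuffs.map((stuff) => (stuff % 2 === 0 ? stuff : computeAsync(stuff)))
  );
}

const built = buildArray([1, 2, 3, 4, 5]);
built.then((array) => console.log(array));
// prints [ 10, 2, 30, 4, 50 ]
```

Promise.all accepts plain values alongside promises, which is why the synchronous branch can return stuff directly.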
