Node.js: loop through an array of URLs in a synchronous way

I've worked with Node for two years now but cannot solve the following requirement:
I have an array of roughly 50,000 parameters.
I need to loop through the array and make a GET request to the same URL each time, with the parameter appended.
I need to write the result of each call back to the array.
This has to happen one by one, as I cannot call the API from several threads.
I'm sure there is a simple solution, but everything I tried didn't make the code wait for the GET request to return. I know that doing things synchronously in Node is not the way we should do things, but in this particular situation it is by design that the process must not continue until the result comes back.
Any hint appreciated.
Regards

Use a for loop, use a way of making the GET request that returns a promise (such as the got() library), and then use await to pause the loop until each response comes back.
const got = require('got');

const yourArray = [...];

async function run() {
    for (let [index, item] of yourArray.entries()) {
        try {
            let result = await got(item.url);
            // do something with the result
        } catch (e) {
            // either handle the error here or throw to stop further processing
        }
    }
}

run().then(() => {
    console.log("all done");
}).catch(err => {
    console.log(err);
});
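If, as the question describes, you want to write each response back into the array, you can do that inside the loop body. A minimal sketch, assuming each array entry is an object you can attach the response to (the property name here is illustrative):

let result = await got(item.url);
// store the raw response body next to the parameter that produced it
yourArray[index].response = result.body;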

Related

NodeJS await inside forEach function

I am working with a MySQL server, building a Node.js server.
Somewhere in the code I have async code that is not waiting; I believe it has something to do with the forEach function.
Is there anything wrong with my implementation of the following function?
async addResponsesToTasks(input, response, isFromCache) {
    if (!isFromCache) {
        this.saveResponseToCache(input, response);
    }
    console.log("===7===");
    await response.forEach(async (pokemon) => {
        console.log("===8===", pokemon);
        await this.addTaskToFile(pokemon, false);
        console.log("===13===");
    });
    return true;
}
And this output shows that it is not waiting for the creation and save in the DB to finish, and jumps ahead to do other things:
The printed output shows 10, 12, 13, 14, 15 and only then 11.
Hope you could maybe spot what I am missing.
Thanks in advance!
What's happening here is more of a JavaScript thing than a Sequelize thing.
The await is being applied to the lines of code that follow Item.create() inside of this function, but outer functions are continuing to execute.
console.log('9')
saveTaskToDb(task)
console.log('12')
In this code, '12' is going to get logged out before your '11', because this outer code is going to continue to execute, even if the code in saveTaskToDb is awaiting the resolution of the create() method.
To solve this, you'll either need to move all of your asynchronous code into the same code block, or you'll need to return your own promise in the saveTaskToDb function, and await saveTaskToDb.
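A minimal sketch of that second option, with names taken from the snippets above and log numbers following the ordering described in the question (illustrative, not the OP's actual code):

async function saveTaskToDb(task) {
    console.log('10');
    await Item.create(task); // resolves once the row has been written
    console.log('11');
}

async function outer(task) {
    console.log('9');
    await saveTaskToDb(task); // without this await, '12' is logged before '11'
    console.log('12');
}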
So the solution was not using forEach, but rather a for...of loop, like this:
async addResponsesToTasks(input, response, isFromCache) {
    if (!isFromCache) {
        this.saveResponseToCache(input, response);
    }
    for (const pokemon of response) {
        await this.addTaskToFile(pokemon, false);
    }
    return true;
}
Based on this question: Using async/await with a forEach loop
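If the order of the writes doesn't matter and they may run concurrently, the other common pattern from that linked question is map plus Promise.all; inside addResponsesToTasks the loop would become (a sketch using the same method names as above):

// start all addTaskToFile calls at once and wait for every one of them to finish
await Promise.all(response.map((pokemon) => this.addTaskToFile(pokemon, false)));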

Retrieving a value in a function in Node.js

I'm struggling with callbacks in Node.js. I simply want playerNumber to be set to the number of players in my collection of Players. The console.log works, but I can't get the variable out of the function and into the playerNumber variable.
And if there's a simpler way to get this value for use in the rest of my backend code, I'm all ears. I'm clearly new at Node.js, but the code always seems more involved than I'm expecting.
Thanks in advance!
var playerNumber = function countPlayers(callback) {
    Player.count(function(err, numOfDocs) {
        console.log('I have ' + numOfDocs + ' documents in my collection');
        callback(err, numOfDocs);
    });
}
It's probably async, and it's a typical first-timer experience to want to "get back to normal" on the call chain on the way back from an async call. This can't be done, but it's not so bad to live with. Here's how...
Step 1: Promises are better than callbacks. I'll leave the long story to others.
Step 2: Callbacks can be made into promises
In the OP case...
// The promise constructor takes a function that has two functions as params:
// one to call on success, and one to call on error. Instead of a callback,
// call the 'resolve' param with the data and the 'reject' param with any error.
// Mark the function 'async' so callers know it can be 'await'-ed.
const playerNumber = async function countPlayers() {
    return new Promise((resolve, reject) => {
        Player.count(function(err, numOfDocs) {
            err ? reject(err) : resolve(numOfDocs);
        });
    });
}
Step 3: Yes, the callers must deal with this, and the callers of the callers, and so on. It's not so bad.
In the OP case (in the most modern syntax)...
// this has to be async because it 'awaits' the first function
// think of await as stopping serial execution until the async function finishes
// (it's not that at all, but that's an okay starting simplification)
async function printPlayerCount() {
    const count = await playerNumber();
    console.log(count);
}

// as long as we're calling something async (something that must be awaited)
// we mark the function as async
async function printPlayerCountAndPrintSomethingElse() {
    await printPlayerCount();
    console.log('do something else');
}
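One small addition (not in the original answer): since the wrapper rejects when Player.count passes an error, the outermost caller should catch that rejection, for example:

printPlayerCountAndPrintSomethingElse().catch(err => {
    console.error('counting players failed:', err);
});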
Step 4: Enjoy it, and do some further study. It's actually great that we can do such a complex thing so simply. Here's good reading to start with: MDN on Promises.

Updating a user concurrently in mongoose using model.save() throws error

I am getting a mongoose error when I attempt to update a user field multiple times.
What I want to achieve is to update that user based on some conditions after making an API call to an external resource.
From what I observe, I am hitting both conditions at the same time in the processUser() function;
hence, user.save() is getting called almost concurrently, and Mongoose is not happy about that, throwing me this error:
MongooseError [ParallelSaveError]: Can't save() the same doc multiple times in parallel. Document: 5ea1c634c5d4455d76fa4996
I know I am guilty and my code is the culprit here because I am a novice. But is there any way I can achieve my desired result without hitting this error? Thanks.
function getLikes() {
    var users = [user1, user2, ...userN]
    users.forEach((user) => {
        processUser(user)
    })
}

async function processUser(user) {
    var result = await makeAPICall(user.url)
    // I want to update the user based on the returned value from this call
    // I am updating the user using `mongoose save()`
    if (result === someCondition) {
        user.meta.likes += 1
        user.markModified("meta.likes")
        try {
            await user.save()
            return
        } catch (error) {
            console.log(error)
            return
        }
    } else {
        user.meta.likes -= 1
        user.markModified("meta.likes")
        try {
            await user.save()
            return
        } catch (error) {
            console.log(error)
            return
        }
    }
}

setInterval(getLikes, 2000)
There are some issues that need to be addressed in your code.
1) processUser is an asynchronous function. Array.prototype.forEach doesn't respect asynchronous functions as documented here on MDN.
2) setInterval doesn't respect the return value of your function as documented here on MDN; therefore, passing a function that returns a promise (async/await) will not behave as intended.
3) setInterval shouldn't be used with functions that could potentially take longer to run than your interval as documented here on MDN in the Usage Section near the bottom of the page.
Hitting an external api for every user every 2 seconds and reacting to the result is going to be problematic under the best of circumstances. I would start by asking myself if this is absolutely the only way to achieve my overall goal.
If it is the only way, you'll probably want to implement your solution using the recursive setTimeout() mentioned at the link in #2 above, or perhaps an async version of setInterval(); there's one on npm here. A sketch of the first option follows.
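A minimal sketch of that approach, reusing the OP's getLikes/processUser functions and the 2-second delay (the structure is illustrative, not a drop-in fix):

// process users one at a time so user.save() calls never overlap
async function getLikes() {
    const users = [user1, user2 /*, ...userN */];
    for (const user of users) {
        await processUser(user);
    }
}

// re-arm the timer only after the whole run has finished,
// so runs can never overlap even if the API calls are slow
function scheduleGetLikes() {
    setTimeout(async () => {
        try {
            await getLikes();
        } finally {
            scheduleGetLikes();
        }
    }, 2000);
}

scheduleGetLikes();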

Node.js waiting for an async Redis hgetall call in a chain of functions

I'm still somewhat new to working with Node and definitely new to working asynchronously and with promises.
I have an application that is hitting a REST endpoint, then calling a chain of functions. The end of this chain is calling hgetall and I need to wait until I get the result and pass it back. I'm testing with Postman and I'm getting {} back instead of the id. I can console.log the id, so I know that this is because some of the code isn't waiting for the result of hgetall before continuing.
I'm using await to wait for the result of hgetall, but that's only working at the end of the chain. Do I need to do this for the entire chain of functions, or is there a way to have everything wait for the result before continuing on? Here's the last bit of the logic chain:
Note: I've removed some of the logic from the functions below and renamed a few things to make it a bit easier to see the flow and what's going on with this particular issue. So, some of it may look a bit weird.
For this example, it will call GetProfileById().
FindProfile(info) {
    var profile;
    var profileId = this.GenerateProfileIdkey(info); // Yes, this will always give me the correct key
    profile = this.GetProfileById(profileId);
    return profile;
}
This checks with Redis EXISTS to verify that the key exists, then tries to get the id with that key. I am now aware that Key() returns true instead of what Redis actually returns, but I'll fix that once this current issue is resolved.
GetProfileById(profileId) {
    if ((this.datastore.Key(profileId) === true) && (profileId != null)) {
        logger.info('GetProfileById ==> Profile found. Returning the profile');
        return this.datastore.GetId(profileId);
    } else {
        logger.info(`GetProfileById ==> No profile found with key ${profileId}`)
        return false;
    }
}
GetId() then calls the data_store to get the id. This is also where I started to use await and async to try to wait for the result to come through before proceeding. This part does wait for the result, but the functions before it don't seem to wait for this one to return anything. I'm also curious why it only returns the key and not the value, yet when I print out the result in hgetall I get the key and value?
async GetId(key) {
    var result = await this.store.RedisGetId(key);
    console.log('PDS ==> Here is the GetId result');
    console.log(result); // returns [ 'id' ]
    return result;
}
And finally, we have the hgetall call. Again, I'm new to promises and async, so this may not be the best way of handling this, or right at all, but it is getting the result and waiting for it before returning anything:
async RedisGetId(key) {
    var returnVal;
    var values;
    return new Promise((resolve, reject) => {
        client.hgetall(key, (err, object) => {
            if (err) {
                reject(err);
            } else {
                resolve(Object.keys(object));
                console.log(object); // returns {id: 'xxxxxxxxxxxxxx'}
                return object;
            }
        });
    });
}
Am I going to need to async every single function that could potentially end up making a Redis call, or is there a way to make the app wait for the Redis call to return something, then continue on?
Short answer is "Yes". In general, if a call makes an asynchronous request and you need to await the answer, you will need to do something to wait for it.
Sometimes, you can get smart and issue multiple calls at once and await all of them in parallel using Promise.all.
However, it looks like in your case your workflow is sequential (each step needs the previous result), so you will need to await each step individually. This can get ugly, so for redis I typically use something like promisify to make it easier to use native promises with redis. There is even an example of how to do this in the redis docs:
const {promisify} = require('util');
const getAsync = promisify(client.get).bind(client);
...
const fooVal = await getAsync('foo');
Makes your code much nicer.
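Applied to the OP's chain, the same promisify trick works for hgetall, and then every function up the chain (and the route handler that calls FindProfile) must await the call below it; otherwise the route returns a pending promise, which serializes to {}. Note also that the original RedisGetId resolves Object.keys(object), which is why GetId logs [ 'id' ] instead of the value. A rough sketch of the relevant methods, assuming the same client and class structure as in the question:

const { promisify } = require('util');
const hgetallAsync = promisify(client.hgetall).bind(client);

// in the data store class
async RedisGetId(key) {
    // resolves to e.g. { id: 'xxxxxxxxxxxxxx' }, or null if the key does not exist
    return hgetallAsync(key);
}

// further up the chain, each caller awaits the level below
async GetProfileById(profileId) {
    if (profileId != null) {
        return await this.datastore.GetId(profileId);
    }
    return false;
}

async FindProfile(info) {
    const profileId = this.GenerateProfileIdkey(info);
    return await this.GetProfileById(profileId);
}

// and the route handler itself must await FindProfile(...) before sending the response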

How to do proper error handling with async/await

I'm writing an API where I'm having a bit of trouble with the error handling. What I'm unsure about is whether the first code snippet is sufficient or if I should mix it with promises as in the second code snippet. Any help would be much appreciated!
try {
    var decoded = jwt.verify(req.params.token, config.keys.secret);
    var user = await models.user.findById(decoded.userId);
    user.active = true;
    await user.save();
    res.status(201).json({user, 'stuff': decoded.jti});
} catch (error) {
    next(error);
}
Second code snippet:
try {
    var decoded = jwt.verify(req.params.token, config.keys.secret);
    var user = models.user.findById(decoded.userId).then(() => {
    }).catch((error) => {
    });
    user.active = true;
    await user.save().then(() => {
    }).catch((error) => {
    })
    res.status(201).json({user, 'stuff': decoded.jti});
} catch (error) {
    next(error);
}
The answer is: it depends.
Catch every error
Makes sense if you want to react differently to each error.
e.g.:
try {
    let decoded;
    try {
        decoded = jwt.verify(req.params.token, config.keys.secret);
    } catch (error) {
        return response
            .status(401)
            .json({ error: 'Unauthorized..' });
    }
    ...
However, the code can get quite messy, and you'd want to split the error handling a bit differently (e.g. do the JWT validation in some pre-request hook and allow only valid requests through to the handlers, and/or do the findById and save part in a service, and throw once per operation).
You might want to throw a 404 if no entity was found with the given ID.
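For example, that 404 check might look something like this (a sketch in the style of the first snippet, not part of the original answer):

const user = await models.user.findById(decoded.userId);
if (!user) {
    return res.status(404).json({ error: 'User not found' });
}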
Catch all at once
If you want to react in the same way if a) or b) or c) goes wrong, then the first example looks just fine.
a) var decoded = jwt.verify(req.params.token, config.keys.secret);
b) var user = await models.user.findById(decoded.userId);
user.active = true;
c) await user.save();
res.status(201).json({user, 'stuff': decoded.jti});
I read some articles that suggested the need for a try/catch block for each request. Is there any truth to that?
No, that is not required. try/catch with await works conceptually like try/catch works with regular synchronous exceptions. If you just want to handle all errors in one place and want all your code to just abort to one error handler no matter where the error occurs and don't need to catch one specific error so you can do something special for that particular error, then a single try/catch is all you need.
But, if you need to handle one particular error specifically, perhaps even allowing the rest of the code to continue, then you may need a more local error handler which can be either a local try/catch or a .catch() on the local asynchronous operation that returns a promise.
or if I should mix it with promises as in the second code snippet.
The phrasing of this suggests that you may not quite understand what is going on with await because promises are involved in both your code blocks.
In both your code blocks models.user.findById(decoded.userId); returns a promise. You have two ways you can use that promise.
You can use await with it to "pause" the internal execution of the function until that promise resolves or rejects.
You can use .then() or .catch() to see when the promise resolves or rejects.
Both are using the promise returned from your models.user.findById(decoded.userId); function call. So, your phrasing would have been better as "or if I should use a local .catch() handler on a specific promise rather than catching all the rejections in one place".
Doing this:
// skip second async operation if there's an error in the first one
async function someFunc() {
    try {
        let a = await someFunc1();
        let b = await someFunc2(a);
        return b + something;
    } catch(e) {
        return "";
    }
}
Is analogous to chaining your promise with one .catch() handler at the end:
// skip second async operation if there's an error in the first one
function someFunc() {
    return someFunc1().then(someFunc2).catch(e => "");
}
No matter which async function rejects, the same error handler is applied. If the first one rejects, the second one is not executed as flow goes directly to the error handler. This is perfectly fine IF that's how you want the flow to go when there's an error in the first asynchronous operation.
But, suppose you wanted an error in the first function to be turned into a default value so that the second asynchronous operation is always executed. Then, this flow of control would not be able to accomplish that. Instead, you'd have to capture the first error right at the source so you could supply the default value and continue processing with the second asynchronous operation:
// always run second async operation, supply default value if error in the first
async function someFunc() {
    let a;
    try {
        a = await someFunc1();
    } catch(e) {
        a = myDefaultValue;
    }
    try {
        let b = await someFunc2(a);
        return b + something;
    } catch(e) {
        return "";
    }
}
Is analogous to chaining your promise with one .catch() handler at the end:
// always run second async operation, supply default value if error in the first
function someFunc() {
    return someFunc1()
        .catch(err => myDefaultValue)
        .then(someFunc2)
        .catch(e => "");
}
Note: This is an example that never rejects the promise that someFunc() returns, but instead supplies a default value (an empty string in this example) rather than rejecting, to show the different ways of handling errors in this function. That is certainly not required. In many cases, just returning the rejected promise is the right thing, and the caller can then decide what to do with the rejection error.
