why the for loop doesn't want to wait for the process to finish on a function at node js? - node.js

I am trying to create a script to download pages from multiple urls using node js but the loop didn't want to wait for the request to finish and continued printing, I also got a hint to use the async for loop, but still it didn't work.
here's my code
function GetPage(url){
console.log(` Downloading page ${url}`);
request({
url: `${url}`
},(err,res,body) => {
if(err) throw err;
console.log(` Writing html to file` );
fs.writeFile(`${url.split('/').slice(-1)[0]}`,`${body}`,(err) => {
if(err) throw err;
console.log('saved');
});
});
}
var list = [ 'https://www.someurl.com/page1.html', 'https://www.someurl.com/page2.html', 'https://www.someurl.com/page3.html' ]
const main = async () => {
for(let i = 0; i < list.length; i++){
console.log(` processing ${list[i]}`);
await GetPage(list[i]);
}
};
main().catch(console.error);
Output :
processing https://www.someurl.com/page1.html
Downloading page https://www.someurl.com/page1.html
processing https://www.someurl.com/page2.html
Downloading page https://www.someurl.com/page2.html
processing https://www.someurl.com/page3.html
Downloading page https://www.someurl.com/page3.html
Writing html to file
Writing html to file
saved
saved
Writing html to file
saved

There are a couple of problems with your code.
You are mixing code that uses the callback style programming and code that should be using promises. Also, your getPage function is not async (it doesn't return a promise) so you cannot await on it.
You just have to return a promise from your getPage() function, and correctly resolve it or reject it.
function getPage(url) {
return new Promise((resolve, reject) => {
console.log(` Downloading page ${url}`);
request({ url: `${url}` }, (err, res, body) => {
if (err) reject(err);
console.log(` Writing html to file`);
fs.writeFile(`${url.replace(/\//g,'-')}.html`, `${body}`, (writeErr) => {
if (writeErr) reject(writeErr);
console.log("saved");
resolve();
});
});
});
}
You don't have to change your main() function loop will await for the getPage() function.

For loop doesn't wait for callback to be finished, it will continue executing it. You need to turn either getPage function to promise or use Promise.all as shown below.
var list = [
"https://www.someurl.com/page1.html",
"https://www.someurl.com/page2.html",
"https://www.someurl.com/page3.html",
];
function getPage(url) {
return new Promise((resolve, reject) => {
console.log(` Downloading page ${url}`);
request({ url: `${url}` }, (err, res, body) => {
if (err) reject(err);
console.log(` Writing html to file`);
fs.writeFile(`${url}.html`, `${body}`, (writeErr) => {
if (writeErr) reject(writeErr);
console.log("saved");
resolve();
});
});
});
}
const main = async () => {
return new Promise((resolve, reject) => {
let promises = [];
list.map((path) => promises.push(getPage(path)));
Promise.all(promises).then(resolve).catch(reject);
});
};
main().catch(console.error);

GetPage() is not built around promises and doesn't even return a promise so await on its result does NOTHING. await has no magic powers. It awaits a promise. If you don't give it a promise that properly resolves/rejects when your async operation is done, then the await does nothing. Your GetPage() function returns nothing so the await has nothing to do.
What you need is to fix GetPage() so it returns a promise that is properly tied to your asynchronous result. Because the request() library has been deprecated and is no longer recommended for new projects and because you need a promise-based solution anyway so you can use await with it, I'd suggest you switch to one of the alternative promise-based libraries recommended here. My favorite from that list is got(), but you can choose whichever one you like best. In addition, you can use fs.promises.writeFile() for promise-based file writing.
Here's how that code would look like using got():
const got = require('got');
const { URL } = require('url');
const path = require('path');
const fs = require('fs');
function getPage(url) {
console.log(` Downloading page ${url}`);
return got(url).text().then(data => {
// can't just use an URL for your filename as it contains potentially illegal
// characters for the file system
// so, add some code to create a sanitized filename here
// find just the root filename in the URL
let urlObj = new URL(url);
let filename = path.basename(urlObj.pathname);
if (!filename) {
filename = "index.html";
}
let extension = path.extname(filename);
if (!extension) {
filename += ".html";
} else if (extension === ".") {
filename += "html";
}
console.log(` Writing file ${filename}`)
return fs.promises.writeFile(filename, data);
});
}
const list = ['https://www.someurl.com/page1.html', 'https://www.someurl.com/page2.html', 'https://www.someurl.com/page3.html'];
async function main() {
for (let url of list) {
console.log(` processing ${url}`);
await getPage(url);
}
}
main().then(() => {
console.log("all done");
}).catch(console.error);
If you put real URLs in the array, this is directly runnable in nodejs. I ran it myself with my own URLs.
Summary of Changes and Improvements:
Switched from request() to got() because it's promise-based and not deprecated.
Modified getPage() to return a promise that represents the asynchronous operations in the function.
Switched to fs.promises.writeFile() so we are using only promises for asynchronous control-flow.
Added legal filename generation from the base path of the URL since you can't just use a full URL as a filename (at least in some file systems).
Switched to simpler for/of loop

Related

multiple IF conditions execute one by one in nodejs using Async/await

How can i run this if statements synchronously. I have been trying many times but unable to fixed this. (I am nodejs beginner).
i am trying to use async/await here but it is not work.
how can i check first if condition is completed and then second if statement will run!
Please help.
here is my dummy codes:
record1='10';
record2='20';
function main(){
if(record1){
console.log('-------------------------');
console.log('I am record 1')
val='John',
firstJob(val)
}
if(record2){
console.log('I am record 2')
val='Rahul',
firstJob(val)
}
}
async function firstJob(val){
console.log('Hello, I am ' + val)
await secondJob(val);
}
async function secondJob(val){
console.log(val+ ' is a nodeJs beginner!')
await listFiles()
}
function thirdJob(arg){
if (arg='pass'){
console.log('This is end of the one if condition')
}
}
function listFiles (){
return new Promise((resolve, reject) => {
setTimeout(() => {
const path = require('path');
const fs = require('fs');
const { exit } = require('process');
const directoryPath = path.join(__dirname, '../');
console.log('List of Avaialble Files :');
fs.readdir(directoryPath, { withFileTypes: true },function (err, files) {
if (err) {
return console.log('Unable to scan directory: ' + err);
}
files.forEach(function (file) {
if (file.isFile()){
console.log(file);
}
});
});
arg='pass'
thirdJob();
}, 2000)
})
}
main();
The short answer is you can't "make them run synchronously".
You have to patiently wait until they're done to get the answer.
So, without making main async, you have to use the promises the old fashioned way, and sequence the actions using then.
record1='10';
record2='20';
function main(){
Promise.resolve()
.then(() => {
if(record1){
console.log('-------------------------');
console.log('I am record 1');
val='John';
return firstJob(val)
}
})
.then(() => {
if(record2){
console.log('I am record 2')
val='Rahul';
return firstJob(val)
}
});
}
async function firstJob(val){
console.log('Hello, I am ' + val)
await secondJob(val);
}
async function secondJob(val){
console.log(val+ ' is a nodeJs beginner!')
await listFiles()
}
main();
I've just included the snippet for the if and promise stuff. The gist here is that you conditionally chain together your calls to firstJob.
Each call to then allows you to (potentially, it's not required) attach another promise to the execution of the one that just finished. In the snippet above, we're doing that only if the condition is truthy by returning the promise from the calls to firstJob.
By the way, your implementation of listFiles isn't ever going to finish because you never invoke resolve to the promise you made inside the function. This solves the problem by resolving your promise once the looping is done.
function listFiles (){
return new Promise((resolve, reject) => {
setTimeout(() => {
const path = require('path');
const fs = require('fs');
const { exit } = require('process');
const directoryPath = path.join(__dirname, '../');
console.log('List of Avaialble Files :');
fs.readdir(directoryPath, { withFileTypes: true },function (err, files) {
if (err) {
console.log('Unable to scan directory: ' + err);
reject(err);
}
files.forEach(function (file) {
if (file.isFile()){
console.log(file);
}
});
resolve();
});
arg='pass'
thirdJob();
}, 2000)
})
}
Note the added call to resolve once you've completed your loop.
I also added a call to reject in the case that readdir returned an error, since that is the proper way to propagate it with your manual promise.
A few more pointers, generally modules are required once at the top of the file, instead of dynamically inside of a function. The penalty for doing it that way you have is not bad, there is a cache for required modules. It's just not idiomatic.
if (arg='pass'){
Doesn't do any equality check, that's an assignment, you need ==, or === if you want to check for equality.

How to make fs.readFile async await?

I have this nodejs code here which read a folder and process the file. The code works. But it is still printing all the file name first, then only read the file. How do I get a file and then read a content of the file first and not getting all the files first?
async function readingDirectory(directory) {
try {
fileNames = await fs.readdir(directory);
fileNames.map(file => {
const absolutePath = path.resolve(folder, file);
log(absolutePath);
fs.readFile(absolutePath, (err, data) => {
log(data); // How to make it async await here?
});
});
} catch {
console.log('Directory Reading Error');
}
}
readingDirectory(folder);
To use await, you need to use promise versions of fs.readFile() and fs.readdir() which you can get on fs.promises and if you want these to run sequentially, then use a for loop instead of .map():
async function readingDirectory(directory) {
const fileNames = await fs.promises.readdir(directory);
for (let file of fileNames) {
const absolutePath = path.join(directory, file);
log(absolutePath);
const data = await fs.promises.readFile(absolutePath);
log(data);
}
}
readingDirectory(folder).then(() => {
log("all done");
}).catch(err => {
log(err);
});

Problem with async when downloading a series of files with nodejs

I'm trying to download a bunch of files. Let's say 1.jpg, 2.jpg, 3.jpg and so on. If 1.jpg exist, then I want to try and download 2.jpg. And if that exist I will try the next, and so on.
But the current "getFile" returns a promise, so I can't loop through it. I thought I had solved it by adding await in front of the http.get method. But it looks like it doesn't wait for the callback method to finish. Is there a more elegant way to solve this than to wrap the whole thing in a new async method?
// this returns a promise
var result = getFile(url, fileToDownload);
const getFile = async (url, saveName) => {
try {
const file = fs.createWriteStream(saveName);
const request = await http.get(url, function(response) {
const { statusCode } = response;
if (statusCode === 200) {
response.pipe(file);
return true;
}
else
return false;
});
} catch (e) {
console.log(e);
return false;
}
}
I don't think your getFile method is returning promise and also there is no point of awaiting a callback. You should split functionality in to two parts
- get file - which gets the file
- saving file which saves the file if get file returns something.
try the code like this
const getFile = url => {
return new Promise((resolve, reject) => {
http.get(url, response => {
const {statusCode} = response;
if (statusCode === 200) {
resolve(response);
}
reject(null);
});
});
};
async function save(saveName) {
const result = await getFile(url);
if (result) {
const file = fs.createWriteStream(saveName);
response.pipe(file);
}
}
What you are trying to do is getting / requesting images in some sync fashion.
Possible solutions :
You know the exact number of images you want to get, then go ahead with "request" or "http" module and use promoise chain.
You do not how the exact number of images, but will stop at image no. N-1 if N not found. then go ahed with sync-request module.
your getFile does return a promise, but only because it has async keyword before it, and it's not a kind of promise you want. http.get uses old callback style handling, luckily it's easy to convert it to Promise to suit your needs
const tryToGetFile = (url, saveName) => {
return new Promise((resolve) => {
http.get(url, response => {
if (response.statusCode === 200) {
const stream = fs.createWriteStream(saveName)
response.pipe(stream)
resolve(true);
} else {
// usually it is better to reject promise and propagate errors further
// but the function is called tryToGetFile as it expects that some file will not be available
// and this is not an error. Simply resolve to false
resolve(false);
}
})
})
}
const fileUrls = [
'somesite.file1.jpg',
'somesite.file2.jpg',
'somesite.file3.jpg',
'somesite.file4.jpg',
]
const downloadInSequence = async () => {
// using for..of instead of forEach to be able to pause
// downloadInSequence function execution while getting file
// can also use classic for
for (const fileUrl of fileUrls) {
const success = await tryToGetFile('http://' + fileUrl, fileUrl)
if (!success) {
// file with this name wasn't found
return;
}
}
}
This is a basic setup to show how to wrap http.get in a Promise and run it in sequence. Add error handling wherever you want. Also it's worth noting that it will proceed to the next file as soon as it has received a 200 status code and started downloading it rather than waiting for a full download before proceeding

NodeJS Async / Await - Build configuration file with API call

I would like to have a configuration file with variables set with data I fetch from an API.
I think I must use async and await features to do so, otherwise my variable would stay undefined.
But I don't know how to integrate this and keep the node exports.myVariable = myData available within an async function ?
Below is the code I tried to write to do so (all in the same file) :
const fetchAPI = function(jsonQuery) {
return new Promise(function (resolve, reject) {
var reqOptions = {
headers: apiHeaders,
json:jsonQuery,
}
request.post(apiURL, function (error, res, body) {
if (!error && res.statusCode == 200) {
resolve(body);
} else {
reject(error);
}
});
});
}
var wallsData = {}
const fetchWalls = async function (){
var jsonQuery = [{ "recordType": "page","query": "pageTemplate = 1011"}]
let body = await utils.fetchAPI(jsonQuery)
let pageList = await body[0].dataHashes
for(i=0;i<pageList.length;i++){
var page = pageList[i]
wallsData[page.title.fr] = [page.difficultyList,page.wallType]
}
return wallsData
throw new Error("WOOPS")
}
try{
const wallsData = fetchWalls()
console.log(wallsData)
exports.wallsData = wallsData
}catch(err){
console.log(err)
}
The output of console.log(wallsData) shows Promise { <pending> }, therefore it is not resolved and the configuration file keep being executed without the data in wallsData...
What do I miss ?
Thanks,
Cheers
A promise is a special object that either succeeds with a result or fails with a rejection. The async-await-syntax is syntactic sugar to help to deal with promises.
If you define a function as aync it always will return a promise.
Even a function like that reads like
const foo = async() => {
return "hello";
}
returns a promise of a string, not only a string. And you need to wait until it's been resolved or rejected.
It's analogue to:
const foo = async() => {
return Promise.resolve("Hello");
}
or:
const foo = async() => {
return new Promise(resolve => resolve("Hello"));
}
Your fetchWalls similarly is a promise that will remain pending for a time. You'll have to make sure it either succeeds or fails by setting up the then or catch handlers in your outer scope:
fetchWalls()
.then(console.log)
.catch(console.error);
The outer scope is never async, so you cannot use await there. You can only use await inside other async functions.
I would also not use your try-catch for that outer scope promise handling. I think you are confusing the try-catch approach that is intended to be used within async functions, as there it helps to avoid nesting and reads like synchronous code:
E.g. you could do inside your fetchWalls defintion:
const fetchWalls = async function (){
var jsonQuery = [{ "recordType": "page","query": "pageTemplate = 1011"}]
try {
let body = await utils.fetchAPI(jsonQuery)
} catch(e) {
// e is the reason of the promise rejection if you want to decide what to do based on it. If you would not catch it, the rejection would chain through to the first error handler.
}
...
}
Can you change the statements like,
try{
const wallsData = fetchWalls();
wallsData.then((result) => {
console.log(result);
});
exports.wallsData = wallsData; // when importing in other file this returns as promise and we should use async/await to handle this.
}catch(err){
console.log(err)
}

Question about end of request for node/JS request package

I'm trying to understand what .on('end', ...) does in the node package request.
My code:
const fs = require('fs');
const request = require('request');
function downloadAsset(relativeAssetURL, fileName) {
return new Promise((resolve, reject) => {
try {
let writeStream = fs.createWriteStream(fileName);
var remoteImage = request(`https:${relativeAssetURL}`);
remoteImage.on('data', function(chunk) {
writeStream.write(chunk);
});
remoteImage.on('end', function() {
let stats = fs.statSync(fileName);
resolve({ fileName: fileName, stats: stats });
});
} catch (err) {
reject(err);
}
});
}
What I'm trying to do is download a remote image, get some file statistics, and then resolve the promise so my code can do other things.
What I'm finding is that the promise doesn't always resolve after the file has been downloaded; it may resolve a little before then. I thought that's what .on('end', ... ) was for.
What can I do to have this promise resolve after the image has been downloaded in full?
As the docs say:
The writable.write() method writes some data to the stream, and calls the supplied callback once the data has been fully handled.
So, writable.write() is asynchronous. Just because your last writeStream.write has been called does not necessarily mean that all write operations have been completed. You probably want to call the .end method, which means:
Calling the writable.end() method signals that no more data will be written to the Writable. The optional chunk and encoding arguments allow one final additional chunk of data to be written immediately before closing the stream. If provided, the optional callback function is attached as a listener for the 'finish' event.
So, try calling writeStream.end when the remoteImage request ends, and pass a callback to writeStream.end that resolves the Promise once the writing is finished:
function downloadAsset(relativeAssetURL, fileName) {
return new Promise((resolve, reject) => {
try {
const writeStream = fs.createWriteStream(fileName);
const remoteImage = request(`https:${relativeAssetURL}`);
remoteImage.on('data', function(chunk) {
writeStream.write(chunk);
});
remoteImage.on('end', function() {
writeStream.end(() => {
const stats = fs.statSync(fileName);
resolve({ fileName: fileName, stats: stats });
});
});
} catch (err) {
reject(err);
}
});
}
(also try not to mix var and let/const - in an ES6+ environment, prefer const, which is generally easier to read and has fewer problems, like hoisting)

Resources