Return an array from glob in Node.js

The issue
I'm using the answer from here: Get all files recursively in directories NodejS.
However, when I assign the result to a constant so I can have the directories available in an array, nothing comes back. I have looked through glob's documentation (https://github.com/isaacs/node-glob) for an answer with no success. I have tried using glob.end(), and I have also console.logged the folds variable below; I can see the list of available methods and have tried some of them with no success. Does anyone know how to return the array like in the code example below? Thank you!
const glob = require('glob');
const src = 'assets';

function getFiles(err, res) {
  if (err) {
    console.log('Error', err);
  } else {
    return res;
  }
}

let folds = glob(src + '/**/*', getFiles);

I had the same problem.
glob() is asynchronous and that can make returning the end result somewhat complicated.
Use glob.sync() instead (where .sync stands for synchronous).
Example:
const files = glob.sync(src + '/**/*');
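If you would rather keep things asynchronous, one option (a hedged sketch, not from the original answer) is to wrap glob's callback API in a promise with util.promisify and await the result:

const glob = require('glob');
const { promisify } = require('util');

// glob's callback signature is (err, matches), so promisify works directly
const globP = promisify(glob);

async function getFiles(src) {
  const files = await globP(src + '/**/*');
  return files; // array of matched paths
}

getFiles('assets').then(files => console.log(files));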

Related

Why doesn't await within an async function work for fs modules?

I am trying to read a sample.json file through my JS code. First the program checks for sample.json within every folder in the specified path, then it reads sample.json if available and fetches the data. But the await doesn't work as expected: the empty object is passed to the calling function before the async function completes its execution. I have attached an image of the issue.
async function getAvailableJson(filesPath) {
  let detectedJson = {};
  let folders = await fs.promises.readdir(filesPath);
  folders.forEach(async function (folder) {
    await fs.promises.access(path.join(filesPath, folder, "Sample.json")).then(async function () {
      jsonData = await fs.promises.readFile(path.join(filesPath, folder, "Sample.json"));
      const directory = JSON.parse(jsonData);
      const hashvalue = Hash.MD5(jsonData);
      detectedJson[directory["dirName"]] = {
        name: directory["dirName"],
        version: directory["dirVersion"],
        hash: hashvalue
      };
      console.log(detectedJson);
    }).catch(function (err) {
      if (err.code === "ENOENT") {}
    });
  });
  return detectedJson;
}
I don't want to use any sync functions, since they create unnecessary locks. I have also tried the fs.readdir, fs.access and fs.readFile functions. Could someone point out what I am doing wrong here? I am new to Node.js, thanks in advance.
Change your .forEach() to use for/of instead and generally simplify by not mixing await and .then().
async function getAvailableJson(filesPath) {
  let detectedJson = {};
  let folders = await fs.promises.readdir(filesPath);
  for (let folder of folders) {
    let file = path.join(filesPath, folder, "Sample.json");
    try {
      let jsonData = await fs.promises.readFile(file);
      const directory = JSON.parse(jsonData);
      const hashvalue = Hash.MD5(jsonData);
      detectedJson[directory["dirName"]] = {
        name: directory["dirName"],
        version: directory["dirVersion"],
        hash: hashvalue
      };
    } catch (err) {
      // silently skip any directories that don't have Sample.json in them;
      // otherwise, log and rethrow the error to stop further processing
      if (err.code !== "ENOENT") {
        console.log(`Error on file ${file}`, err);
        throw err;
      }
    }
    console.log(detectedJson);
  }
  return detectedJson;
}
Summary of Changes:
Replace .forEach() with for/of.
Remove .then() and use only await.
Remove .catch() and use only try/catch.
Remove the call to fs.promises.access(), since the error can just be handled on fs.promises.readFile().
Add logging if the error is not ENOENT so you can see what the error is and what file it's on. You pretty much never want to silently eat an error with no logging. Though you may want to skip some particular errors, others must be logged. Rethrow errors that are not ENOENT so the caller will see them.
Declare and initialize all variables in use here as local variables.
.forEach() is not promise-aware so using await inside it does not pause the outer function at all. Instead, use a for/of loop which doesn't create the extra function scope and will allow await to pause the parent function.
Also, I consider .forEach() to be pretty much obsolete these days. It's not promise-aware. for/of is a more efficient and more generic way to iterate. And, there's no longer a need to create a new function scope using the .forEach() callback because we have block-scoped variables with let and const. I don't use it any more.
Also, I see no reason why you're preflighting things with fs.promises.access(). That just creates a race condition; you may as well handle whatever error you get from fs.promises.readFile(), which accomplishes the same thing without the race condition.
See also a related answer on a similar issue.
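If the folders do not need to be processed strictly one at a time, a hedged variation (not part of the original answer) is to map each folder to a promise and wait for all of them with Promise.all(), which keeps the reads concurrent while still pausing the outer function:

async function getAvailableJson(filesPath) {
  const folders = await fs.promises.readdir(filesPath);
  const detectedJson = {};
  await Promise.all(folders.map(async (folder) => {
    const file = path.join(filesPath, folder, "Sample.json");
    try {
      const jsonData = await fs.promises.readFile(file);
      const directory = JSON.parse(jsonData);
      detectedJson[directory["dirName"]] = {
        name: directory["dirName"],
        version: directory["dirVersion"],
        hash: Hash.MD5(jsonData) // assumes the same Hash helper as the question
      };
    } catch (err) {
      if (err.code !== "ENOENT") throw err; // skip folders without Sample.json
    }
  }));
  return detectedJson;
}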

Can anyone help me understand the use of .spread in the Bluebird library?

I'm working on my first Node.js application. Someone else developed this application before me, and while trying to fix an issue I'm having trouble understanding the following.
return Promise.join(
    findStagingAdvanced(stagingQuery),
    findDboAdvanced(dboQuery)
)
.spread((stagingIssues, dboIssues) => _.concat(dboIssues, stagingIssues))
.then(....)
If you have a promise that is fulfilled with an array and that array has a known length, then you can use .spread() to convert the array to individual function arguments. It is a substitute for .then() that converts the arguments from an array to individual arguments before calling your handler.
So, instead of this:
someFunction().then(function(arrayOfArgs) {
    let arg1 = arrayOfArgs[0];
    let arg2 = arrayOfArgs[1];
});
You can do this:
someFunction().spread(function(arg1, arg2) {
    // can directly access arg1 and arg2 here
});
So, in your specific code example, Promise.join() already offers a callback that separates out the individual results, so .spread() should not be needed at all. You could just do this:
return Promise.join(
    findStagingAdvanced(stagingQuery),
    findDboAdvanced(dboQuery),
    (stagingIssues, dboIssues) => _.concat(dboIssues, stagingIssues)
).then(allIssues => {
    // allIssues contains combined results of both functions above
});
What this code is doing is collecting the results from findStagingAdvanced() and findDboAdvanced() and merging those results together into a single array of results.
It could be written in standard ES6 (e.g. without Bluebird's extra capabilities) like this:
return Promise.all([findStagingAdvanced(stagingQuery), findDboAdvanced(dboQuery)])
    .then(results => results[0].concat(results[1]))
    .then(allIssues => {
        // allIssues contains combined results of both functions above
    });
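As a further side note (a hedged sketch, not part of the original answer): modern JavaScript lets you destructure the array that Promise.all() resolves with, which reads much like .spread() without needing Bluebird:

return Promise.all([findStagingAdvanced(stagingQuery), findDboAdvanced(dboQuery)])
    .then(([stagingIssues, dboIssues]) => stagingIssues.concat(dboIssues))
    .then(allIssues => {
        // allIssues contains combined results of both functions above
    });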
It allows you to get the results of findStagingAdvanced and findDboAdvanced and merge them together without an intermediate variable.
Without spread you would need an extra variable that gets mutated:
var staging;
findStagingAdvanced(stagingQuery)
    .then(stagingResult => {
        staging = stagingResult; // not that good practice
        return findDboAdvanced(dboQuery);
    })
    .then(dboResult => {
        var merged = [staging, dboResult];
        return ... // another promise that uses staging and dboResult together
    })

How to use a Node.js forEach loop with an event listener

I am not sure where I am going wrong, but I think the event listener is getting invoked multiple times and parsing the files multiple times.
I have five files in the directory and they are getting parsed. However, the pdf file at index 0 gets parsed once, the next one twice, and the third one three times.
I want each file in the directory to be parsed once, creating a text file by extracting the data from the pdf.
The idea is to parse the pdf, get the content as text, and convert the text into JSON in a specific format.
To make it simple, the plan is to complete one task first, then use the output of the code below to perform the next task.
I hope someone can help and point out where I am going wrong, and explain my mistake a bit so I understand it. (I'm new to JS and Node.)
Using the module from here:
https://github.com/modesty/pdf2json
var fs = require('fs')
PDFParser = require('C:/Users/Administrator/node_modules/pdf2json/PDFParser')

var pdfParser = new PDFParser(this, 1)

fs.readdir('C:/Users/Administrator/Desktop/Project/Input/', function (err, pdffiles) {
    //console.log(pdffiles)
    pdffiles.forEach(function (pdffile) {
        console.log(pdffile)
        pdfParser.once("pdfParser_dataReady", function () {
            fs.writeFile('C:/Users/Administrator/Desktop/Project/Jsonoutput/' + pdffile, pdfParser.getRawTextContent())
            pdfParser.loadPDF('C:/Users/Administrator/Desktop/Project/Input/' + pdffile)
        })
    })
})
As mentioned in the comments, these are just work-around ideas for the OP to temporarily resolve this issue.
Assuming performance is not an issue, you should be able to asynchronously parse the pdf files in a sequential manner, that is, only parse the next file when the previous one is done.
Unfortunately I have never used the npm module PDFParser before, so it is really difficult for me to test the code below. Pardon me, as it may require some minor tweaks to make it work; syntactically it should be fine, as it was written using an IDE.
Example:
var fs = require('fs');
var PDFParser = require('C:/Users/Administrator/node_modules/pdf2json/PDFParser');

var parseFile = function (files, done) {
    var pdfFile = files.pop();
    if (pdfFile) {
        var pdfParser = new PDFParser();
        pdfParser.on("pdfParser_dataError", errData => { return done(errData); });
        pdfParser.on("pdfParser_dataReady", pdfData => {
            fs.writeFile("C:/Users/Administrator/Desktop/Project/Jsonoutput/" + pdfFile, JSON.stringify(pdfData), err => {
                if (err) { return done(err); }
                // only move on to the next file once this one has been written
                parseFile(files, done);
            });
        });
        pdfParser.loadPDF('C:/Users/Administrator/Desktop/Project/Input/' + pdfFile);
    }
    else {
        return done(null, "All pdf files parsed.");
    }
};

fs.readdir('C:/Users/Administrator/Desktop/Project/Input/', function (err, pdffiles) {
    parseFile(pdffiles, (err, message) => {
        if (err) { console.error(err.parseError || err); }
        else { console.log(message); }
    });
});
In the code above, I have isolated the parsing logic into a separate function called parseFile. This function pops the next file off the queue; if there is one it starts parsing it, otherwise it invokes the callback function done.
When parsing is done, it recursively calls parseFile until the last file has been parsed.
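For readers on newer Node.js versions, here is a hedged sketch (not part of the original answer) of the same sequential flow written with async/await, wrapping each parse in a Promise around the same pdf2json events; the paths and the .txt suffix are only illustrative:

const fs = require('fs').promises;
const path = require('path');
const PDFParser = require('pdf2json');

// Wrap one parse in a Promise so it can be awaited.
function parsePdf(file) {
    return new Promise((resolve, reject) => {
        const pdfParser = new PDFParser(null, 1); // second argument enables raw text, as in the question
        pdfParser.on('pdfParser_dataError', reject);
        pdfParser.on('pdfParser_dataReady', () => resolve(pdfParser.getRawTextContent()));
        pdfParser.loadPDF(file);
    });
}

// Parse the files one at a time, writing each result before starting the next.
async function parseAll(inputDir, outputDir) {
    const files = await fs.readdir(inputDir);
    for (const file of files) {
        const text = await parsePdf(path.join(inputDir, file));
        await fs.writeFile(path.join(outputDir, file + '.txt'), text);
    }
}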

Node.js: list or remove all files and directories asynchronously, passing only a start path

I have been going through Stack Overflow topics to find anything useful, and there is really nothing. What I would need is (probably) some module which you can call like this:
someModule('/start/path/', 'list', function (err, list) {
    // list contains a properly structured object of all subdirectories and files
});

also this

someModule('/start/path/', 'remove', function (err, doneFlag) {
    // doneFlag contains something like true so I can run a callback
});
I need the above functionality to create a mini web-based ftp/code editor for my students.
It is important that the listing includes the correct structure of not only the files but also the subdirectories they are in. It doesn't really have to be as easy as in my example; most important is that the functionality is there. Thank you for all recommendations.
I made a module for my own needs which may help you. Look at alinex-fs. It is an extension of the node.js fs module and can be used as a replacement.
Additionally, it has a very powerful fs.find() method which will search recursively and match files like the Linux find command. What to search for is specified with an easy configuration hash.
Then you may loop over the result and remove everything (also recursively).
An example use may look like:
// include the module
var fs = require('alinex-fs');

// search asynchronously
fs.find('/tmp/some/directory', {
    include: 'test*',
    type: 'dir',
    modifiedBefore: 'yesterday 12:00'
    // and many more possibilities...
}, function (err, list) {
    if (err) return console.error(err);
    // async is included here for readability, but would mostly be moved to the top
    var async = require('async');
    // parallel loop over the list
    async.each(list, function (file, cb) {
        // remove file or dir
        return fs.remove(file, cb);
    }, function (err) {
        if (err) return console.log(err);
        console.log('done');
    });
});
If you already have the list of entries you need to remove, you can also use only the inner part of the above code.
I hope that helps you come a step further. If not, please make your question more specific.
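As an aside (a hedged sketch, not from the original answer): on current Node.js versions the built-in fs module can cover both needs without an extra dependency, for example:

const fs = require('fs').promises;
const path = require('path');

// List every file and subdirectory under startPath (recursive walk).
async function listAll(startPath) {
    const entries = await fs.readdir(startPath, { withFileTypes: true });
    const result = [];
    for (const entry of entries) {
        const full = path.join(startPath, entry.name);
        result.push(full);
        if (entry.isDirectory()) {
            result.push(...await listAll(full));
        }
    }
    return result;
}

// Remove a directory and everything inside it.
async function removeAll(startPath) {
    await fs.rm(startPath, { recursive: true, force: true });
}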

Node.js/Express: iterating files into an array or object with fs is failing

So I'm trying to use the Node.js fs module in my Express app to iterate a directory, store each filename in an array which I can pass to my Express view and iterate through the list, but I'm struggling to do so. When I do a console.log within the files.forEach loop, it prints the filename just fine, but as soon as I try to do anything such as:
var myfiles = [];
var fs = require('fs');

fs.readdir('./myfiles/', function (err, files) {
    if (err) throw err;
    files.forEach(function (file) {
        myfiles.push(file);
    });
});

console.log(myfiles);
it fails and just logs an empty array. So I'm not sure exactly what is going on; I think it has to do with callback functions, but if someone could walk me through what I'm doing wrong, why it's not working, and how to make it work, it would be much appreciated.
The myfiles array is empty because the callback hasn't been called before you call console.log().
You'll need to do something like:
var fs = require('fs');

fs.readdir('./myfiles/', function (err, files) {
    if (err) throw err;
    files.forEach(function (file) {
        // do something with each file HERE!
    });
});
// because trying to do something with files here won't work because
// the callback hasn't fired yet.
Remember, everything in node happens at the same time, in the sense that, unless you're doing your processing inside your callbacks, you cannot guarantee asynchronous functions have completed yet.
One way around this problem for you would be to use an EventEmitter:
var fs = require('fs'),
    EventEmitter = require('events').EventEmitter,
    filesEE = new EventEmitter(),
    myfiles = [];

// this event will be called when all files have been added to myfiles
filesEE.on('files_ready', function () {
    console.dir(myfiles);
});

// read all files from current directory
fs.readdir('.', function (err, files) {
    if (err) throw err;
    files.forEach(function (file) {
        myfiles.push(file);
    });
    filesEE.emit('files_ready'); // trigger files_ready event
});
As several have mentioned, you are using an async method, so you have a nondeterministic execution path.
However, there is an easy way around this. Simply use the Sync version of the method:
var myfiles = [];
var fs = require('fs');

var arrayOfFiles = fs.readdirSync('./myfiles/');
// Yes, the following is not super-smart, but you might want to process the files. This is how:
arrayOfFiles.forEach(function (file) {
    myfiles.push(file);
});

console.log(myfiles);
That should work as you want. However, using sync statements is not good, so you should not do it unless it is vitally important for it to be sync.
Read more here: fs.readdirSync
fs.readdir is asynchronous (as with many operations in node.js). This means that the console.log line is going to run before readdir has a chance to call the function passed to it.
You need to either:
Put the console.log line within the callback function given to readdir, i.e:
fs.readdir('./myfiles/', function (err, files) {
    if (err) throw err;
    files.forEach(function (file) {
        myfiles.push(file);
    });
    console.log(myfiles);
});
Or simply perform some action with each file inside the forEach.
I think it has to do with callback functions,
Exactly.
fs.readdir makes an asynchronous request to the file system for that information, and calls the callback at some later time with the results.
So function (err, files) { ... } doesn't run immediately, but console.log(myfiles) does.
At some later point in time, myfiles will contain the desired information.
You should note BTW that files is already an Array, so there is really no point in manually appending each element to some other blank array. If the idea is to put together the results from several calls, then use .concat; if you just want to get the data once, then you can just assign myfiles = files directly.
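For instance (a small illustrative sketch of the two options just mentioned):

fs.readdir('./myfiles/', function (err, files) {
    if (err) throw err;
    myfiles = myfiles.concat(files); // accumulate results across several calls
    // ...or, if you only need the data from this one call:
    // myfiles = files;
    console.log(myfiles);
});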
Overall, you really ought to read up on "Continuation-passing style".
I faced the same problem, and based on the answers given in this post I've solved it with Promises, which seem to be a perfect fit in this situation:
router.get('/', (req, res) => {
    var viewBag = {}; // It's just my little habit from .NET MVC ;)

    var readFiles = new Promise((resolve, reject) => {
        fs.readdir('./myfiles/', (err, files) => {
            if (err) {
                reject(err);
            } else {
                resolve(files);
            }
        });
    });

    // showcase, just in case you need to implement more async operations before the route responds
    var anotherPromise = new Promise((resolve, reject) => {
        doAsyncStuff((err, anotherResult) => {
            if (err) {
                reject(err);
            } else {
                resolve(anotherResult);
            }
        });
    });

    Promise.all([readFiles, anotherPromise]).then((values) => {
        viewBag.files = values[0];
        viewBag.otherStuff = values[1];
        console.log(viewBag.files); // logs e.g. [ 'file.txt' ]
        res.render('your_view', viewBag);
    }).catch((errors) => {
        res.render('your_view', { errors: errors }); // you can use the 'errors' property to render errors in the view or implement a different error handling scheme
    });
});
Note: you don't have to push the found files into a new array, because you already get an array from fs.readdir()'s callback. According to the node docs:
The callback gets two arguments (err, files) where files is an array
of the names of the files in the directory excluding '.' and '..'.
I believe this is a very elegant and handy solution, and most of all, it doesn't require you to bring in and handle new modules in your script.
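As a final hedged aside (not from any of the answers above): on current Node.js versions the same route can be written with fs.promises and async/await, which avoids the explicit Promise wrapper entirely:

const fs = require('fs');

router.get('/', async (req, res) => {
    try {
        const files = await fs.promises.readdir('./myfiles/'); // already an array of names
        res.render('your_view', { files: files });
    } catch (errors) {
        res.render('your_view', { errors: errors });
    }
});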

Resources