How to scan an entire directory tree with node.js? - node.js

Often I want to scan an entire directory tree: a directory and everything inside it, including files, subdirectories, their contents, their subdirectories' contents, and so on recursively.
How would one accomplish this with Node? The requirements are that it should be asynchronous, to take advantage of fast non-blocking I/O, and that it shouldn't crash when processing too many files at once.

-- I've updated this answer in 2017 for the progress since 2012 --
Ended up creating these to accomplish it:
https://github.com/bevry/safefs - which now uses https://npmjs.org/package/graceful-fs (which didn't exist before)
https://github.com/bevry/scandirectory - there is also now a vast array of similar projects
I also created this which is lightweight and super fast:
https://github.com/bevry/readdir-cluster

You can use the npm module dree if you want to achieve that. It returns a JSON object describing the directory tree, and it also lets you specify a fileCallback and a dirCallback, so you can do this:
Here is the code:
const dree = require('dree');

const fileCb = function (file) {
  // do what you want
};

const dirCb = function (directory) {
  // do what you want
};

dree.scan('path-to-directory', { extensions: ['html', 'js'] }, fileCb, dirCb);

If you want to stick with the built-in 'fs' module, you can write a recursive function to get them.
Here's a function I made recently to get the tree of a directory.
const fs = require("fs");

// dir is the directory path, depth is how far into a directory it will read.
function treeFiles(dir, depth = 1000) {
  if (depth < 1) return;
  let sitesList = {};
  fs.readdirSync(dir).forEach((file) => {
    let base = dir + '/' + file;
    let stats = fs.statSync(base); // stat once, reuse below
    // Add file to sitesList object.
    sitesList[file] = { "stats": stats, "dir": false };
    // Recurse to get the directory and its tree of files.
    if (stats.isDirectory()) {
      sitesList[file]["dir"] = true;
      sitesList[file]["ls"] = treeFiles(base, depth - 1);
    }
  });
  return sitesList;
}
So if I have a file structure which looks like
nodejs_app >
- app.js
- config.js
- images >
- - logo.png
Then the final output of my function reading the nodejs_app directory will look like
{
  "app.js": { "stats": {}, "dir": false },
  "config.js": { "stats": {}, "dir": false },
  "images": {
    "stats": {},
    "dir": true,
    "ls": {
      "logo.png": { "stats": {}, "dir": false }
    }
  }
}
Then just call the function with the directory and, if you want, a depth.
let dir = require("path").join(__dirname, "nodejs_app");
let tree = treeFiles(dir);
console.log(tree);
Of course, change paths and names to fit your code. I included the depth parameter to reduce the time it takes to finish reading a large directory.

Related

NodeJS - Find the last instance of a specific subdirectory and create a similar one next to it (Closed)

Edit: this issue was solved in another question I posted Here; this one is closed.
I figured this would be an easy one, but I can't seem to find anything on how this is done, on Google or on Stack Overflow.
Basically, I am using the following code built with fs.readdir...
var workDir = 'C:/Users/user/path/to/MyFolder';
fs.readdir(workDir, function (err, filesPath) {
  if (err) throw err;
  var result = filesPath.map(function (filePath) {
    return filePath;
  });
  console.log(result);
});
...which allows us to read a specified working directory and then print the first set of contents (i.e subdirectories and files) inside of this working directory like so...
[
'EmptyFile.txt',
'Empty.txt',
'emptyDir',
'dirEmpty',
'noContent',
'folder',
'folder_classes2',
'folder_classes3'
]
What I want to do from here is retrieve the very last of the folder_classes** subdirectories, and then create a new folder_classes** subdirectory that continues the number pattern from the name of that last one. Unfortunately, I don't know how I would achieve this; Google was no help, nor did I find anything on Stack Overflow.
It's important to note that these folder_classes** directories follow a number pattern of 2 through 1000.
For those that don't understand, here's an example of what I'm trying to achieve...
Example 1
<MyFolder>
|— folder
|— folder_classes2
|— folder_classes3 // <= function finds this subdir to be last subdirectory
|— folder_classes4 // <= so this new directory
// is created by the function
// following the number pattern
// in the previous `folder_classes**` subdirectory title
Example 2
<MyFolder>
|— folder
|— folder_classes2
|— folder_classes3
|— folder_classes4 // <= function finds this subdir to be last subdirectory
|— folder_classes5 // <= so this new directory
// is created by the function
// following the number pattern
// in the previous "folder_classes**" subdirectory title
I hope everyone understands what I'm trying to achieve, any sort of help on this problem would be highly appreciated.
If the pattern is always fixed (i.e. folder_classesN, where N is a number), then you can get all directories, convert each name, and build an array of numbers. From this array you can find the maximum and predict the next candidate number.
This function can give you an idea of what goes on:
const { readdir, mkdir } = require('fs/promises');

const createDirectory = async source => {
  const folderNumbers = (await readdir(source, { withFileTypes: true }))
    // Only numbered folder_classesN directories; skips plain 'folder' etc.
    .filter(dirent => dirent.isDirectory() && /^folder_classes\d+$/.test(dirent.name))
    .map(dirent => parseInt(dirent.name.replace('folder_classes', ''), 10));
  // If none exist yet, start at 2 (the question says the pattern starts there).
  const next = folderNumbers.length ? Math.max(...folderNumbers) + 1 : 2;
  return mkdir(`${source}/folder_classes${next}`);
};

Loop through nested folder structure and change a line in all files with node.js

I want to write a script that loops through a nested folder structure and changes, for instance, console.log('hello world') in each JavaScript file. I simplified the task; basically I can't use the built-in VS Code search-and-replace tool. The folder structure looks like this:
My try:
function changeLine(folderPaths) {
  folderPaths.forEach(folderPath => {
    const results = fs.readdirSync(folderPath);
    const folders = results.filter(res => fs.lstatSync(path.resolve(folderPath, res)).isDirectory());
    const innerFolderPaths = folders.map(folder => path.resolve(folderPath, folder));
    if (innerFolderPaths.length === 0) {
      return;
    }
    innerFolderPaths.forEach(inner => {
      if (fs.lstatSync(inner).isDirectory()) {
        // go deeper and change individual file
        let individualFile = fs.readFileSync(<go to individual file>).toString();
        individualFile = fileString.replace(/console.log('hello world')/, 'line changed');
      }
    });
  });
}

changeLine([path.resolve(__dirname, 'src')]);
I don't have much experience with node.js, and that's what I've come up with so far. The if condition inside the forEach loop is not correct; I hope someone can make sense of it. Thanks for reading!

After switching from a filesystem to mongodb approach, how can I recreate nested directory iteration?

I was storing files in a nested file structure like
data/category/year/month/day/type/subtype/0.json
and to get a list of files / directories within any folder I'd simply use a function like:
getDirectoryContentsSync(pathName) {
  let exists = this.exists(pathName);
  if (exists) {
    let files = fs.readdirSync(pathName);
    return files;
  } else {
    return [];
  }
}
Now I'm switching to storing files in mongodb, identifying the files by their original file-path string like:
store(data, path) {
  this.db.collection('fs').insertOne({
    path,
    data
  });
}

load(path) {
  let data = this.db.collection('fs').find({
    path
  })[0].data;
  return data;
}
But I'm struggling to figure out a way to keep iterating through the structure the way I used to. The approach I'm considering is pretty gross: assigning a separate value for each sub-path pointing to its children. I think it's going to be a really bad approach, something like:
store(data, path) {
  this.db.collection('fs').insertOne({
    path,
    data
  });
  let pathParts = path.split("/");
  let subPath = "";
  pathParts.forEach(dir => {
    subPath += dir;
    this.db.collection('dir').insertOne({
      path: subPath,
      data: true,
      children: []
    });
  });
}
That's not the full code for the concept, because I realized it seemed like an overly complicated way to do it and decided to stop and ask. I'm new to MongoDB, and I bet there's a much better way to handle this, but I have no idea where to start. What's a good way to do what I want?
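A common alternative, sketched here under the assumption that paths are stored exactly as in the store() above: keep only the flat 'fs' collection and derive directory listings from the path strings, rather than maintaining a second 'dir' collection. In MongoDB the candidate documents can be fetched with an anchored regex query such as db.collection('fs').find({ path: { $regex: '^data/category/' } }), and the immediate children reduced in application code. getDirectoryContents below is a hypothetical helper showing that reduction over plain strings:

```javascript
// Given the stored path strings and a directory path, return the immediate
// children (files or subdirectories) directly under that path.
function getDirectoryContents(paths, dirPath) {
  const prefix = dirPath.endsWith('/') ? dirPath : dirPath + '/';
  const children = new Set(); // Set de-duplicates shared subdirectory names
  for (const p of paths) {
    if (!p.startsWith(prefix)) continue;
    // The first path segment below the prefix is the immediate child.
    children.add(p.slice(prefix.length).split('/')[0]);
  }
  return [...children];
}
```

This mirrors the old getDirectoryContentsSync behavior without duplicating data on every store().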

Conditional settings for Gulp plugins dependent on source file

The plugin gulp-pug allows passing global variables to pug files via the data property.
What if we don't need the full data set in each .pug file? To implement conditional data injection, we need access to the current vinyl file instance inside pipe(this.gulpPlugins.pug({})), or at least to know the source file's absolute path. Is this possible?
const dataSetForTopPage = {
  foo: "alpha",
  bar: "bravo"
};

const dataSetForAboutPage = {
  baz: "charlie",
  hoge: "delta"
};

gulp.src(sourceFileGlobsOrAbsolutePath)
  .pipe(gulpPlugins.pug({
    data: /*
      if path is 'top.pug'        -> 'dataSetForTopPage',
      else if path is 'about.pug' -> 'dataSetForAboutPage',
      else                        -> empty object */
  }))
  .pipe(Gulp.dest("output"));
I am using the gulp-intercept plugin. But how do I synchronize it with gulpPlugins.pug?
gulp.src(sourceFileGlobsOrAbsolutePath)
  .pipe(this.gulpPlugins.intercept(vinylFile => {
    // I can compute the conditional data set here,
    // but how to execute gulpPlugins.pug() here?
  }))
// ...
That was just one example, but we face the same problem whenever we need conditional plugin options for other gulp plugins, too. E.g.:
.pipe(gulpPlugins.htmlPrettify({
  indent_char: " ",
  indent_size: // if source file in 'admin/**' -> 2, else if in 'auth/**' -> 3, else 4
}))
You'll need to modify the stream manually - through2 is probably the most used package for this purpose. Once in the through2 callback, you can pass the stream to your gulp plugins (as long as their transform functions are exposed) and conditionally pass them options. For example, here is a task:
pugtest = () => {
  const dataSet = {
    'top.pug': {
      foo: "alpha",
      bar: "bravo"
    },
    'about.pug': {
      foo: "charlie",
      bar: "delta"
    }
  };

  return gulp.src('src/**/*.pug')
    .pipe(through2.obj((file, enc, next) =>
      gulpPlugins.pug({
        // Grab the filename, and set pug data to the value found in dataSet by that name
        data: dataSet[file.basename] || {}
      })._transform(file, enc, next)
    ))
    .pipe(through2.obj((file, enc, next) => {
      const options = {
        indent_char: ' ',
        indent_size: 4
      };
      if (file.relative.match(/admin\//)) {
        options.indent_size = 2;
      } else if (file.relative.match(/auth\//)) {
        options.indent_size = 3;
      }
      // Buffer.from is a static method; `new Buffer.from(...)` is incorrect.
      file.contents = Buffer.from(html.prettyPrint(String(file.contents), options), enc);
      next(null, file);
    }))
    .pipe(gulp.dest('output'));
}
For the pug step, we call through2.obj and create the pug plugin, passing it data grabbed from our object literal, indexed by filename in this example. So now the data passed into the compiler comes from that object literal.
For the html step you mention, gulp-html-prettify doesn't expose its transform function, so we can't reach into it and pass the transform back to the stream. But in this case that's OK, if you look at the source it's just a wrapper to prettyPrint in the html package. That's quite literally all it is doing. So we can just rig up our step using through2 to do the same thing, but changing our options based on the vinyl file's relative path.
That's it! For a working example see this repo: https://github.com/joshdavenport/stack-overflow-61314141-gulp-pug-conditional

Dom_munger issue with Node 7.7.3 - Path must be a string

I'm trying to update an application to support Node v7.7.3. But when I run the grunt task dom_munger as below:
dom_munger: {
  read: {
    options: {
      read: [
        { selector: 'script[data-concat!="false"]', attribute: 'src', writeto: 'appjs', isPath: true },
        { selector: 'link[rel="stylesheet"][data-concat!="false"]', attribute: 'href', writeto: 'appcss' }
      ]
    },
    src: 'app/index.html'
  }
}
I receive the error:
Warning: Path must be a string. Received [ 'app/index.html' ] Use --force to continue.
I wonder if there is a way to rewrite the above grunt task, or if there might be a good alternative to dom_munger. Any help would be appreciated.
Per the grunt-dom-munger GitHub:
When isPath is true, the extracted values are assumed to be file references and their path is made relative to the Gruntfile.js rather than the file they're read from.
Try removing the isPath property, or altering it to match the path from your Gruntfile to the index.html file.
Remove isPath: true, and make sure that the path in the src attribute is relative to the Gruntfile.js rather than the file it's read from.
If needed, do a replace on the path:
dom_munger: {
  replacePath: {
    options: {
      callback: function ($, file) {
        var scripts = $('script[data-concat!="false"]');
        // NOTE: path is made relative to the Gruntfile.js rather than the file they're read from
        for (var i = 0, s, il = scripts.length; i < il; i++) {
          s = scripts[i];
          if (s.attribs.src) {
            s.attribs.src = s.attribs.src.replace('../', '');
          }
        }
      }
    },
    src: 'temp/index.html'
  },
  read: {
    options: {
      read: [
        { selector: 'script[data-concat!="false"]', attribute: 'src', writeto: 'appjs' },
        { selector: 'link[rel="stylesheet"][data-concat!="false"]', attribute: 'href', writeto: 'appcss' }
      ]
    },
    src: 'temp/index.html'
  }
}
Thank you! But this only seems to work if the Gruntfile and index.html are in the same folder structure. My structure looks like this:
- /app
-index.html
- gruntfile.js
And without the isPath attribute, dom_munger will look for js files in the same directory as the one where the Gruntfile is placed.
