Node.js archiver: need syntax for excluding file types via glob

Using archiver.js (for Node.js), I need to exclude images from a recursive (multi-subdir) archive. Here is my code:
const zip = archiver('zip', { zlib: { level: 9 } });
const output = await fs.createWriteStream(`backup/${fileName}.zip`);
res.setHeader('Content-disposition', `attachment; filename=${fileName}.zip`);
res.setHeader('Content-type', 'application/download');
output.on('close', function () {
  res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
output.on('end', function () {
  res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
zip.pipe(output);
zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['*.jpg', '*.png', '*.webp', '*.bmp'],
  },
  {}
);
zip.finalize();
The problem is that it did not exclude the ignore files. How can I correct the syntax?

Archiver uses Readdir-Glob for globbing which uses minimatch to match.
The matching in Readdir-Glob (node-readdir-glob/index.js#L147) is done against the full filename including the pathname, and it does not allow us to apply the matchBase option, which would match just the basename of the full path.
To make it work, you have two options:
1. Make your glob exclude the file extensions
You can convert your glob expression to exclude all the file extensions you don't want in your archive by using the glob negation !(...). It will include everything except what matches the negation expression:
zip.glob(
  '**/!(*.jpg|*.png|*.webp|*.bmp)',
  {
    cwd: 'user_uploads',
  },
  {}
);
2. Make minimatch work with the full file pathname
To make minimatch work without being able to set the matchBase option, we have to include the directory-matching glob in each ignore pattern:
zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['**/*.jpg', '**/*.png', '**/*.webp', '**/*.bmp'],
  },
  {}
);
Behaviour
This behaviour of Readdir-Glob is a bit confusing regarding the ignore option:
Options
ignore: Glob pattern or Array of Glob patterns to exclude matches. If a file or a folder matches at least one of the provided patterns, it's not returned. It doesn't prevent files from folder content to be returned.
This means that ignore items have to be actual glob expressions that cover the whole path/file expression. When we specify *.jpg, it matches files only in the root directory and not in the subdirectories. If we want to exclude JPG files deep in the directory tree, we have to combine the match-all-directories pattern with the file extension pattern, which gives **/*.jpg.
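To make the difference between *.jpg and **/*.jpg concrete, here is a minimal sketch using a toy matcher. It is not minimatch and not what Readdir-Glob runs internally; the helper name toyGlobToRegExp and the sample paths are made up for illustration, and only the two tokens `**/` and `*` are supported.

```javascript
// Toy glob matcher for illustration only; this is NOT minimatch.
// Supports two tokens: '**/' (any number of directories)
// and '*' (any characters except '/').
function toyGlobToRegExp(glob) {
  let source = '';
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith('**/', i)) {
      source += '(?:[^/]+/)*'; // zero or more directory segments
      i += 3;
    } else if (glob[i] === '*') {
      source += '[^/]*'; // stays inside a single path segment
      i += 1;
    } else {
      // Escape characters that are special in regular expressions.
      source += glob[i].replace(/[.+?^${}()|[\]\\]/g, '\\$&');
      i += 1;
    }
  }
  return new RegExp('^' + source + '$');
}

// Hypothetical paths, relative to the cwd option.
const paths = ['a.jpg', 'photos/b.jpg', 'photos/2019/c.jpg', 'notes/readme.txt'];

// '*.jpg' is tested against the full relative path, so it only hits the root.
const rootOnly = paths.filter((p) => toyGlobToRegExp('*.jpg').test(p));
// '**/*.jpg' also covers every subdirectory.
const anyDepth = paths.filter((p) => toyGlobToRegExp('**/*.jpg').test(p));

console.log(rootOnly); // [ 'a.jpg' ]
console.log(anyDepth); // [ 'a.jpg', 'photos/b.jpg', 'photos/2019/c.jpg' ]
```

The real libraries handle many more cases (braces, extglobs, dotfiles), so treat this only as a picture of why the pattern must describe the whole path.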
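Exclude only in subdirectories
If you want to exclude some file extensions only inside specific subdirectories, add the subdirectory name to the ignore pattern like this:
// The glob pattern '**/Subdir/*.jpg' will exclude all JPG files
// that sit directly inside any 'Subdir/' subdirectory.
zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['**/Subdir/*.jpg'],
  },
  {}
);
Note that the negated form '**/!(Subdir)/*.jpg' does the opposite: it matches JPG files whose parent directory is not named 'Subdir', so as an ignore pattern it keeps the JPG files inside 'Subdir/' and excludes those in other subdirectories.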

The following code works with this directory structure:
node-app
|
|_ upload
   |_ subdir1
   |_ subdir2
   |_ ...
In the code, __dirname is the node-app directory (the directory where your app resides). The code is an adaptation of the Quick Start code on https://www.archiverjs.com/.
// require modules
const fs = require('fs');
const archiver = require('archiver');
// create a file to stream archive data to.
const output = fs.createWriteStream(__dirname + '/example.zip');
const archive = archiver('zip', {
  zlib: { level: 9 } // Sets the compression level.
});
// listen for all archive data to be written
// 'close' event is fired only when a file descriptor is involved
output.on('close', function() {
  console.log(archive.pointer() + ' total bytes');
  console.log('archiver has been finalized and the output file descriptor has closed.');
});
// This event is fired when the data source is drained no matter what was the data source.
// It is not part of this library but rather from the NodeJS Stream API.
// #see: https://nodejs.org/api/stream.html#stream_event_end
output.on('end', function() {
  console.log('Data has been drained');
});
// good practice to catch warnings (ie stat failures and other non-blocking errors)
archive.on('warning', function(err) {
  if (err.code === 'ENOENT') {
    // log warning
  } else {
    // throw error
    throw err;
  }
});
// good practice to catch this error explicitly
archive.on('error', function(err) {
  throw err;
});
// pipe archive data to the file
archive.pipe(output);
archive.glob('**', {
  cwd: __dirname + '/upload',
  ignore: ['*.png', '*.jpg']
});
// finalize the archive (ie we are done appending files but streams have to finish yet)
// 'close', 'end' or 'finish' may be fired right after calling this method so register to them beforehand
archive.finalize();
glob is an abbreviation for 'global', so you use wildcards like * in the filenames (https://en.wikipedia.org/wiki/Glob_(programming)). So one possible wildcard expression is *.jpg, *.png, ... depending on the file type you want to exclude. In general, the asterisk wildcard * stands for an arbitrary number of literal characters or an empty string in the context of file systems (file and directory names, https://en.wikipedia.org/wiki/Wildcard_character).
See also node.js - Archiving folder using archiver generate an empty zip

Related

How can you archive with tar in NodeJS while only storing the subdirectory you want?

Basically I want to do the equivalent of this How to strip path while archiving with TAR but with the tar commands imported to NodeJS, so currently I'm doing this:
const gzip = zlib.createGzip();
const pack = new tar.Pack(prefix="");
const source = Readable.from('public/images/');
const destination = fs.createWriteStream('public/archive.tar.gz');
pipeline(source, pack, gzip, destination, (err) => {
  if (err) {
    console.error('An error occurred:', err);
    process.exitCode = 1;
  }
});
But doing so leaves me with files like: "/public/images/a.png" and "public/images/b.png", when what I want is files like "/a.png" and "/b.png". I want to know how I can add to this process to strip out the unneeded directories, while keeping the files where they are.
You need to change working directory:
// cwd The current working directory for creating the archive. Defaults to process.cwd().
new tar.Pack({ cwd: "./public/images" });
const source = Readable.from('');
Source: documentation of node-tar
Example: https://github.com/npm/node-tar/blob/main/test/pack.js#L93

fs.writeFile not working when the file path has a folder with timestamp

I am trying to write data to a file inside a folder whose name contains a timestamp.
fs.writeFileSync(`.files/${process.env.TIMESTAMP}/data.txt`, "Welcome",
  "utf8", function (err) {
    if (err) {
      return console.log(err);
    }
  });
and I set the timestamp as
`${new Date().toLocaleDateString()}_${new Date().toLocaleTimeString()}`;
No error is displayed, but the file is not written. If I remove the timestamp and use the path below, it works:
fs.writeFileSync('.files/data.txt', "Welcome",
  "utf8", function (err) {
    if (err) {
      return console.log(err);
    }
  });
Please help me understand how to write to a path that includes the timestamp.
In the first case, the reason is that you are trying to write into a folder that does not exist. There is no folder inside .files named ${process.env.TIMESTAMP}.
First create a directory with the required name, then write the file into that folder.
Do something like this:
var dir = `.files/${process.env.TIMESTAMP}`;
if (!fs.existsSync(dir)){
  fs.mkdirSync(dir);
}
// dir already contains the `.files/` prefix.
// writeFileSync throws on error and takes no callback.
fs.writeFileSync(dir + `/data.txt`, "Welcome", "utf8");
You have a couple of errors in your code:
1) writeFileSync(file, data[, options]) doesn't take a callback argument. A callback exists only in the async version of this method, writeFile(file, data[, options], callback).
The sync version throws an exception if an error occurs:
fs.writeFileSync(`.files/${process.env.TIMESTAMP}/data.txt`, "Welcome", "utf8");
2) This expression can produce an invalid folder name:
`${new Date().toLocaleDateString()}_${new Date().toLocaleTimeString()}`
In my browser it produced:
"6/25/2019_2:01:44 PM"
But the rules for folder and file names on UNIX systems say:
In short, filenames may contain any character except / (root directory), which is reserved as the separator between files and directories in a pathname. You cannot use the null character.
You should build a safer directory name. Use this approach, where d is a Date object (getMonth() is zero-based, hence the + 1):
`${d.getFullYear()}_${d.getMonth() + 1}_${d.getDate()}__${d.getHours()}_${d.getMinutes()}`
_ is an allowed character in folder and file names.
3) You need to create the directory using mkdirSync() before creating files in it.
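Putting the three points together, a minimal sketch could look like the following. The helper name safeTimestamp and the base directory under os.tmpdir() are assumptions made for the demo (the question used .files), and the recursive option of mkdirSync assumes Node 10.12 or newer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Pads a number to two digits, e.g. 7 -> "07".
const pad = (n) => String(n).padStart(2, '0');

// Builds a name such as "2019_06_25__14_01" that contains no "/" or ":".
function safeTimestamp(d) {
  return `${d.getFullYear()}_${pad(d.getMonth() + 1)}_${pad(d.getDate())}` +
    `__${pad(d.getHours())}_${pad(d.getMinutes())}`;
}

// Hypothetical base directory for the demo; the question used ".files".
const baseDir = path.join(os.tmpdir(), 'files-demo');
const dir = path.join(baseDir, safeTimestamp(new Date(2019, 5, 25, 14, 1)));

// recursive: true creates missing parents and does not throw
// if the directory already exists.
fs.mkdirSync(dir, { recursive: true });

const target = path.join(dir, 'data.txt');
// The sync call takes no callback; it throws if the write fails.
fs.writeFileSync(target, 'Welcome', 'utf8');

console.log(fs.readFileSync(target, 'utf8')); // Welcome
```

Zero-padding is optional, but it keeps the folder names sortable.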

Nodejs readdir - only find files

When reading a directory, I currently have this:
fs.readdir(tests, (err, items) => {
  if (err) {
    return cb(err);
  }
  const cmd = items.filter(v => fs.lstatSync(tests + '/' + v).isFile());
  k.stdin.end(`${cmd}`);
});
First of all, I would need a try/catch around fs.lstatSync, which I don't want to add. Is there a way to use fs.readdir to find only files?
Something like:
fs.readdir(tests, {type:'f'}, (err, items) => {});
Does anyone know how?
Starting from Node v10.10.0, you can pass withFileTypes in the options parameter to get fs.Dirent objects instead of strings.
// or use fs.promises.readdir to get a promise
const subPaths = fs.readdirSync(YOUR_BASE_PATH, {
  withFileTypes: true
});
// subPaths is fs.Dirent[] type
const files = subPaths
  .filter((dirent) => dirent.isFile())
  .map((dirent) => dirent.name);
// files is string[] type
more info is located at node documentation:
fs.Dirent
fs.readdirSync
fs.readdir
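Applied to the question, the same idea can be wrapped in a small helper so that no per-entry lstatSync call (and no try/catch around it) is needed. This is only a sketch: the function name listFiles and the temporary demo directory are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Returns the names of regular files directly inside `dir` (no recursion).
function listFiles(dir) {
  return fs.readdirSync(dir, { withFileTypes: true })
    .filter((dirent) => dirent.isFile())
    .map((dirent) => dirent.name);
}

// Demo in a throwaway temp directory with two files and one subdirectory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'readdir-demo-'));
fs.writeFileSync(path.join(demoDir, 'a.test.js'), '');
fs.writeFileSync(path.join(demoDir, 'b.test.js'), '');
fs.mkdirSync(path.join(demoDir, 'fixtures'));

const onlyFiles = listFiles(demoDir).sort();
console.log(onlyFiles); // [ 'a.test.js', 'b.test.js' ]
```

Symbolic links are reported as links rather than files by Dirent, so they are skipped here; whether that is desirable depends on the use case.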
Unfortunately, fs.readdir doesn't have an option to specify that you're only looking for files, not folders/directories (per docs). Filtering the results from fs.readdir to knock out the directories is your best bet.
https://nodejs.org/dist/latest-v10.x/docs/api/fs.html#fs_fs_readdir_path_options_callback
The optional options argument can be a string specifying an
encoding, or an object with an encoding property specifying the
character encoding to use for the filenames passed to the callback. If
the encoding is set to 'buffer', the filenames returned will be
passed as Buffer objects.
Yeah fs.readdir can't do this currently (only read files or only read dirs).
I filed an issue with Node.js and looks like it may be a good feature to add.
https://github.com/nodejs/node/issues/21804
If your use case is scripting/automation, you might try the fs-jetpack library. It can find files in a folder for you, and it can also be configured for much more sophisticated searches.
const jetpack = require("fs-jetpack");
// Find all files in my_folder
const filesInFolder = jetpack.find("my_folder", { recursive: false });
console.log(filesInFolder);
// Example of more sophisticated search:
// Find all `.js` files in the folder tree, with modify date newer than 2020-05-01
const borderDate = new Date("2020-05-01");
const found = jetpack.find("foo", {
  matching: "*.js",
  filter: (file) => {
    return file.modifyTime > borderDate;
  }
});
console.log(found);

check the type of files is present or not using nodejs

I want to check whether files of a given type are present. I am using Node.js and fs. Here is my code:
var location = '**/*.js';
log(fs.statSync(location).isFile());
which always throws this error:
Error: ENOENT, no such file or directory '**/*.js'
How do I find out whether such files are present? Thanks in advance.
Node doesn't have support for globbing (**/*.js) built in. You'll need to either recursively walk the directories and iterate over the array of file names to find the file types you want, or use something like node-glob.
Using recursive-readdir-sync:
var recursiveReadSync = require('recursive-readdir-sync'),
  files;
files = recursiveReadSync('./');
files.forEach(function (fileName) {
  if (fileName.search(/\.js$/g) !== -1) {
    console.log("Found a *.js file");
  }
});
Using node-glob:
var glob = require("glob");
glob("**/*.js", function (er, files) {
  files.forEach(function (fileName) {
    if (fileName.search(/\.js$/g) !== -1) {
      console.log("Found a *.js file");
    }
  });
});
Node.js does not support "glob" wildcards by default. You can use an external package like this one
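For the "recursively walk the directories" route mentioned above, a dependency-free sketch might look like this. The function name findByExtension and the temporary demo tree are hypothetical, and the withFileTypes option assumes Node 10.10 or newer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Recursively collects files under `dir` whose name ends with `ext`.
function findByExtension(dir, ext) {
  const found = [];
  for (const dirent of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, dirent.name);
    if (dirent.isDirectory()) {
      found.push(...findByExtension(full, ext)); // descend into subdirectory
    } else if (dirent.isFile() && dirent.name.endsWith(ext)) {
      found.push(full);
    }
  }
  return found;
}

// Demo tree in a throwaway temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-demo-'));
fs.mkdirSync(path.join(root, 'lib', 'util'), { recursive: true });
fs.writeFileSync(path.join(root, 'index.js'), '');
fs.writeFileSync(path.join(root, 'lib', 'util', 'helper.js'), '');
fs.writeFileSync(path.join(root, 'README.md'), '');

const jsFiles = findByExtension(root, '.js');
const hasJs = jsFiles.length > 0;
console.log(hasJs, jsFiles.length); // true 2
```

For large trees, skipping directories such as node_modules inside the loop would keep the walk fast.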

Gulp copying empty directories

In my gulp build I've made a task that runs after all compiling, uglifying and minification has occurred. This task simply copies everything from the src into the dest directory that hasn't been touched/processed by earlier tasks. The little issue I'm having is that this results in empty directories in the dest directory.
Is there a way to tell the gulp.src glob to only include files in the pattern matching (like providing the 'is_file' flag)?
Thanks.
Fixed it by adding a filter to the pipeline:
var es = require('event-stream');
// Passes files through and drops directories.
var onlyFiles = function(es) {
  return es.map(function(file, cb) {
    if (file.stat.isFile()) {
      return cb(null, file);
    } else {
      return cb();
    }
  });
};
// ...
var s = gulp.src(globs)
  .pipe(onlyFiles(es))
  .pipe(gulp.dest(folders.dest + '/' + module.folder));
// ...
I know I'm late to the party on this one, but for anyone else stumbling upon this question, there is another way to do this that seems pretty elegant in my eyes. I found it in this question.
To exclude the empty folders, I added { nodir: true } after the glob pattern.
Your general pattern could look like this (using the variables from Nick's answer):
gulp.src(globs, { nodir: true })
  .pipe(gulp.dest(folders.dest + '/' + module.folder));
Mine was as follows:
gulp.src(['src/**/*', '!src/scss/**/*.scss', '!src/js/**/*.js'], { nodir: true })
  .pipe(gulp.dest('dev/'));
This selects all the files from the src directory that are not scss or js files, and does not copy any empty folders from those two directories either.

Resources