Compress directory with Node.js but exclude some files - node.js

I'm trying to compress a directory with Archiver. I want to exclude certain directories or files such as node_modules recursively.
For example, if I have a directory structure like this:
folder-to-compress
| node_modules
| sub-folder
| | ignored-file-name
| | included-file-name
| ignored-file-name
The script below only excludes entries at the root level. So ignored-file-name in the root will not be included in the zip, but sub-folder/ignored-file-name will be. Is there a way to exclude recursively?
const fs = require('fs');
const archiver = require('archiver');
const output = fs.createWriteStream(__dirname);
const archive = archiver('zip', { zlib: { level: 9 } });
archive.pipe(output);
archive.glob('*/**', {
cwd: __dirname,
ignore: ['mode_modules', 'ignored-file-name', '*.zip']
});
archive.finalize();

You can ignore files using glob patterns.
In your example:
ignore: ['mode_modules', 'ignored-file-name', '*.zip']
Should be (I corrected the misspelt node_modules)
ignore: ['node_modules/**', '**/ignored-file-name.*', '*.zip']
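Whenever you want to exclude an entire directory, you need to append a globstar (/**) to its name. A pattern without a leading **/ is only matched from the root, so prefix it with **/ (for example **/node_modules/**) to exclude that name at any depth.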
Gulp has a good article explaining globs with additional resources at the bottom: https://gulpjs.com/docs/en/getting-started/explaining-globs/
Just a few points for completeness in case anyone comes across this post.
You're not defining a file name. You are just defining the directory.
const output = fs.createWriteStream(__dirname);
Should be
const output = fs.createWriteStream(__dirname + "/example.zip");
Which is how it is defined in the example: https://github.com/archiverjs/node-archiver#quick-start

Related

How to delete all files and subdirectories in a directory with Node.js

I am working with Node.js and need to empty a folder. I have read a lot about deleting files or folders, but I didn't find an answer on how to delete all files AND folders in my folder Test without deleting the folder Test itself.
I am trying to find a solution with fs or fs-extra. Happy for some help!
EDIT 1: Hey @Harald, you should use the del library that @ziishaned posted, because it's much cleaner and more scalable. Use my answer to learn how it works under the hood :)
EDIT 2 (Dec 26 2021): I didn't know that there is an fs method named fs.rm that you can use to accomplish the task with just one line of code.
fs.rm(path_to_delete, { recursive: true }, callback)
// or use the synchronous version
fs.rmSync(path_to_delete, { recursive: true })
The above code is analogous to the Linux shell command rm -r path_to_delete, so it also removes path_to_delete itself.
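To keep the Test folder and delete only its contents with fs.rm, one option is to call it on each entry rather than on the folder. The following is a minimal sketch, assuming Node.js 14.14 or later (where fs.rmSync is available); the helper name emptyDirKeepRoot is made up for illustration.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: deletes everything inside dirPath but keeps dirPath itself.
function emptyDirKeepRoot(dirPath) {
  for (const entry of fs.readdirSync(dirPath)) {
    // recursive: true removes sub-directories together with their contents;
    // force: true ignores entries that no longer exist.
    fs.rmSync(path.join(dirPath, entry), { recursive: true, force: true });
  }
}
```

Calling emptyDirKeepRoot("./Test") would then leave an empty Test folder behind.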
We use fs.unlink and fs.rmdir to remove files and empty directories respectively. To check if a path represents a directory we can use fs.stat().
So we've to list all the contents in your test directory and remove them one by one.
By the way, I'll be using the synchronous version of fs methods mentioned above (e.g., fs.readdirSync instead of fs.readdir) to make my code simple. But if you're writing a production application then you should use asynchronous version of all the fs methods. I leave it up to you to read the docs here Node.js v14.18.1 File System documentation.
const fs = require("fs");
const path = require("path");
const DIR_TO_CLEAR = "./trash";
emptyDir(DIR_TO_CLEAR);
function emptyDir(dirPath) {
  const dirContents = fs.readdirSync(dirPath); // List dir content
  for (const fileOrDirPath of dirContents) {
    try {
      // Get full path
      const fullPath = path.join(dirPath, fileOrDirPath);
      const stat = fs.statSync(fullPath);
      if (stat.isDirectory()) {
        // It's a sub-directory: if it is not empty, remove its contents first (recursively)
        if (fs.readdirSync(fullPath).length) emptyDir(fullPath);
        // Now that it is empty, remove the directory itself
        fs.rmdirSync(fullPath);
      } else fs.unlinkSync(fullPath); // It's a file
    } catch (ex) {
      console.error(ex.message);
    }
  }
}
Feel free to ask me if you don't understand anything in the code above :)
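Since the answer recommends the asynchronous fs methods for production code, the same traversal could look roughly like this with fs.promises. This is only a sketch under that assumption; emptyDirAsync is a hypothetical name, and error handling is left to the caller.

```javascript
const fsp = require("fs").promises;
const path = require("path");

// Hypothetical async counterpart of emptyDir: removes everything inside
// dirPath and leaves dirPath itself in place.
async function emptyDirAsync(dirPath) {
  const entries = await fsp.readdir(dirPath, { withFileTypes: true });
  await Promise.all(
    entries.map(async (entry) => {
      const fullPath = path.join(dirPath, entry.name);
      if (entry.isDirectory()) {
        await emptyDirAsync(fullPath); // clear the sub-directory first
        await fsp.rmdir(fullPath); // then remove the now-empty directory
      } else {
        await fsp.unlink(fullPath); // anything that is not a directory
      }
    })
  );
}
```

Entries of one directory are removed concurrently here; awaiting them one by one in a for loop would work as well and puts less load on the file system.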
You can use the del package to delete files and folders within a directory recursively without deleting the parent directory:
Install the required dependency:
npm install del
Use the code below to delete the subdirectories and files within the Test directory without deleting the Test directory itself (this is the CommonJS API of del v6 and earlier; from v7 on the package is ESM-only and exports deleteSync instead):
const del = require("del");
del.sync(['Test/**', '!Test']);

How can you archive with tar in NodeJS while only storing the subdirectory you want?

Basically I want to do the equivalent of "How to strip path while archiving with TAR", but with the tar commands imported into NodeJS. Currently I'm doing this:
const gzip = zlib.createGzip();
const pack = new tar.Pack(prefix="");
const source = Readable.from('public/images/');
const destination = fs.createWriteStream('public/archive.tar.gz');
pipeline(source, pack, gzip, destination, (err) => {
if (err) {
console.error('An error occurred:', err);
process.exitCode = 1;
}
});
But doing so leaves me with files like: "/public/images/a.png" and "public/images/b.png", when what I want is files like "/a.png" and "/b.png". I want to know how I can add to this process to strip out the unneeded directories, while keeping the files where they are.
You need to change working directory:
// cwd The current working directory for creating the archive. Defaults to process.cwd().
new tar.Pack({ cwd: "./public/images" });
const source = Readable.from('');
Source: documentation of node-tar
Example: https://github.com/npm/node-tar/blob/main/test/pack.js#L93

Node.js archiver Need syntax for excluding file types via glob

Using archiver.js (for Node.js), I need to exclude images from a recursive (multi-subdir) archive. Here is my code:
const zip = archiver('zip', { zlib: { level: 9 } });
const output = await fs.createWriteStream(`backup/${fileName}.zip`);
res.setHeader('Content-disposition', `attachment; filename=${fileName}.zip`);
res.setHeader('Content-type', 'application/download');
output.on('close', function () {
res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
output.on('end', function () {
res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
zip.pipe(output);
zip.glob('**/*',
{
cwd: 'user_uploads',
ignore: ['*.jpg', '*.png', '*.webp', '*.bmp'],
},
{});
zip.finalize();
The problem is that it did not exclude the ignored files. How can I correct the syntax?
Archiver uses Readdir-Glob for globbing which uses minimatch to match.
The matching in Readdir-Glob (node-readdir-glob/index.js#L147) is done against the full filename including the pathname, and it does not allow us to apply the matchBase option, which would match just the basename of the full path.
To make it work, you have 2 options:
1. Make your glob to exclude the file extensions
You can just convert your glob expression to exclude all the file extensions you don't want to be in your archive file using the glob negation !(...) and it will include everything except what matches the negation expression:
zip.glob(
'**/!(*.jpg|*.png|*.webp|*.bmp)',
{
cwd: 'user_uploads',
},
{}
);
2. Make minimatch work with the full file pathname
Since we cannot set the matchBase option, each ignore pattern has to include a glob for the directory part of the path:
zip.glob(
'**/*',
{
cwd: 'user_uploads',
ignore: ['**/*.jpg', '**/*.png', '**/*.webp', '**/*.bmp'],
},
{}
);
Behaviour
This behaviour of Readdir-Glob is a bit confusing regarding the ignore option:
Options
ignore: Glob pattern or Array of Glob patterns to exclude matches. If a file or a folder matches at least one of the provided patterns, it's not returned. It doesn't prevent files from folder content to be returned.
This means that ignore items have to be actual glob expressions that cover the whole path/file expression. When we specify *.jpg, it will match files only in the root directory and not in the subdirectories. If we want to exclude JPG files deep in the directory tree, we have to combine the match-all-directories pattern with the file extension pattern, which gives **/*.jpg.
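The difference between *.jpg and **/*.jpg can be illustrated with a toy matcher. This is not how minimatch or Readdir-Glob are implemented; it is a simplified model that only understands * and **/ and, like Readdir-Glob, tests the pattern against the full relative path. The names globToRegExp, files and ignoredBy are made up for this sketch.

```javascript
// Simplified model: "*" matches within one path segment, "**/" matches
// zero or more leading directories. Real minimatch supports far more syntax.
function globToRegExp(glob) {
  const source = glob
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*\//g, "\u0000") // mark "**/" with a placeholder
    .replace(/\*/g, "[^/]*") // "*" never crosses a "/"
    .replace(/\u0000/g, "(?:.*/)?"); // "**/" spans any number of directories
  return new RegExp("^" + source + "$");
}

const files = ["a.jpg", "notes.txt", "sub/b.jpg", "sub/deep/c.jpg"];
const ignoredBy = (pattern) => files.filter((f) => globToRegExp(pattern).test(f));

console.log(ignoredBy("*.jpg")); // [ 'a.jpg' ]
console.log(ignoredBy("**/*.jpg")); // [ 'a.jpg', 'sub/b.jpg', 'sub/deep/c.jpg' ]
```

Because the pattern is anchored to the whole path, *.jpg can never match sub/b.jpg, which is why the original ignore list only affected the root directory.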
Exclude only in subdirectories
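If you want to exclude some file extensions only in certain subdirectories, you can add the subdirectory to the ignore pattern. Note that the negation !(Subdir) matches any directory name other than Subdir:
// The glob pattern '**/!(Subdir)/*.jpg' will exclude all JPG files
// whose parent directory is NOT named 'Subdir'. To exclude JPG files
// only inside 'Subdir/' directories, use '**/Subdir/*.jpg' instead.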
zip.glob(
'**/*',
{
cwd: 'user_uploads',
ignore: ['**/!(Subdir)/*.jpg'],
},
{}
);
The following code works with this directory structure:
node-app
|
|_ upload
   |_subdir1
   |_subdir2
   |_...
In the code, __dirname is the node-app directory (the directory where your app resides). The code is adapted from the Quick Start section on https://www.archiverjs.com/
// require modules
const fs = require('fs');
const archiver = require('archiver');
// create a file to stream archive data to.
const output = fs.createWriteStream(__dirname + '/example.zip');
const archive = archiver('zip', {
zlib: { level: 9 } // Sets the compression level.
});
// listen for all archive data to be written
// 'close' event is fired only when a file descriptor is involved
output.on('close', function() {
console.log(archive.pointer() + ' total bytes');
console.log('archiver has been finalized and the output file descriptor has closed.');
});
// This event is fired when the data source is drained no matter what was the data source.
// It is not part of this library but rather from the NodeJS Stream API.
// @see: https://nodejs.org/api/stream.html#stream_event_end
output.on('end', function() {
console.log('Data has been drained');
});
// good practice to catch warnings (ie stat failures and other non-blocking errors)
archive.on('warning', function(err) {
if (err.code === 'ENOENT') {
// log warning
} else {
// throw error
throw err;
}
});
// good practice to catch this error explicitly
archive.on('error', function(err) {
throw err;
});
// pipe archive data to the file
archive.pipe(output);
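archive.glob('**',
  {
    cwd: __dirname + '/upload',
    // patterns are matched against the full relative path, so prefix them
    // with **/ to also exclude images inside subdir1, subdir2, ...
    ignore: ['**/*.png', '**/*.jpg']
  }
);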
// finalize the archive (ie we are done appending files but streams have to finish yet)
// 'close', 'end' or 'finish' may be fired right after calling this method so register to them beforehand
archive.finalize();
glob is an abbreviation for 'global', and a glob pattern uses wildcards such as * in file names (https://en.wikipedia.org/wiki/Glob_(programming)). So one possible wildcard expression is *.jpg, *.png, ... depending on the file type you want to exclude. In general, in the context of file systems (file and directory names), the asterisk wildcard * stands for an arbitrary number of literal characters or an empty string (https://en.wikipedia.org/wiki/Wildcard_character).
See also node.js - Archiving folder using archiver generate an empty zip

Read/Write files from relative paths after TypeScript compilation in node

I have the following folder structure:
/
/src/
file.ts
one.txt
/resources/
two.txt
and in file.ts I want to read contents of one.txt and two.txt by doing something like the following:
import fs from 'fs';
import path from 'path';
// sync is bad.
fs.readFileSync('one.txt');
fs.readFileSync(path.resolve(__dirname, '../resources/two.txt'));
Everything works just fine when using ts-node.
The problem is that when I run tsc it compiles all files to /dist (which I have told the compiler to do in tsconfig.json by setting outDir to ./dist), but both fs.readFileSync(...) fail, because the *.txt files are not copied to /dist, so fs can not find the files.
Now, my question is: is there a clean way to handle this, so that fs reads and writes work as expected both when using ts-node and after compiling with tsc?
I've managed to handle this in several projects by doing something weird like:
// file.ts
const getResourcesDir = () => {
  const dir = path.basename(path.dirname(__dirname));
  if (dir === 'dist') {
    return path.resolve(__dirname, '../../resources');
  } else {
    return path.resolve(__dirname, '../resources');
  }
}
But this seems just wrong. I believe that there should be a nicer solution, but I can't find it.
const dir = path.basename(path.dirname(__dirname));
const certDir = (dir === 'dist' ? './dist/cert': '../cert');
console.log('Cert file', certDir, `${certDir}/server-key.pem`);
console.log('Key file', certDir, `${certDir}/server-key.pem`);
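An alternative to checking for the dist folder name is to resolve resources from the project root instead of from __dirname. The sketch below assumes that a package.json sits at the project root next to src, dist and resources; the helper names findProjectRoot and resourcePath are hypothetical. It is written in plain JavaScript but should carry over to TypeScript unchanged apart from type annotations.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: walks up from startDir until a directory containing
// package.json is found, and returns that directory.
function findProjectRoot(startDir) {
  let dir = path.resolve(startDir);
  while (!fs.existsSync(path.join(dir, "package.json"))) {
    const parent = path.dirname(dir);
    if (parent === dir) {
      throw new Error("No package.json found above " + startDir);
    }
    dir = parent;
  }
  return dir;
}

// Resolves a file inside <project root>/resources, regardless of whether the
// calling module lives under src/ (ts-node) or under dist/ (after tsc).
function resourcePath(fromDir, fileName) {
  return path.join(findProjectRoot(fromDir), "resources", fileName);
}
```

With this, fs.readFileSync(resourcePath(__dirname, 'two.txt')) would point at the same file before and after compilation. Files that live inside src, such as one.txt, would still need to be copied to dist by the build or addressed relative to the project root in the same way.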

NodeJS - Copy and Rename all contents in existing directory recursively

I have a directory with folders and files within. I want to copy the entire directory with all its contents to a different location while renaming all the files to something more meaningful. I want to use nodejs to complete this series of operations. What is an easy way to do it, other than moving it one by one and renaming it one by one?
Thanks.
-- Thanks for the comment! So here is an example directory that I have in mind:
-MyFridge
- MyFood.txt
- MyApple.txt
- MyOrange.txt
- ...
- MyDrinks
- MySoda
- MyDietCoke.txt
- MyMilk.txt
- ...
- MyDesserts
- MyIce
...
I want to replace "My" with "Tom," for instance, and I also would like to rename "My" to Tom in all the text files. I am able to copy the directory to a different location using node-fs-extra, but I am having a hard time with renaming the file names.
Define your own tools
const fs = require('fs');
const path = require('path');
function renameFilesRecursive(dir, from, to) {
  fs.readdirSync(dir).forEach(it => {
    const itsPath = path.resolve(dir, it);
    const itsStat = fs.statSync(itsPath);
    // Match against the entry name rather than the full path,
    // so that anchored patterns such as /^My/ can match.
    const newPath = path.resolve(dir, it.replace(from, to));
    if (newPath !== itsPath) {
      fs.renameSync(itsPath, newPath);
    }
    if (itsStat.isDirectory()) {
      renameFilesRecursive(newPath, from, to);
    }
  });
}
Usage
const dir = path.resolve(__dirname, 'src/app');
renameFilesRecursive(dir, /^My/, 'Tom');
renameFilesRecursive(dir, /\.txt$/, '.class');
fs-jetpack has a pretty nice API to deal with problems like that...
const jetpack = require("fs-jetpack");
// Create two fs-jetpack contexts that point
// to source and destination directories.
const src = jetpack.cwd("path/to/source/folder");
const dst = jetpack.cwd("path/to/destination");
// List all files (recursively) in the source directory.
src.find().forEach(path => {
const content = src.read(path, "buffer");
// Transform the path however you need...
const transformedPath = path.replace("My", "Tom");
// Write the file content under new name in the destination directory.
dst.write(transformedPath, content);
});
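If you would rather avoid an extra dependency, copying and renaming can also be combined in one recursive pass with the built-in fs module. This is a minimal sketch, not a drop-in replacement for either answer: copyRenamed is a hypothetical helper, it treats every entry that is not a directory as a regular file, and it only renames file and folder names (it does not touch the text inside the files).

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: copies srcDir into dstDir recursively, passing every
// file and folder name through rename() on the way.
function copyRenamed(srcDir, dstDir, rename) {
  fs.mkdirSync(dstDir, { recursive: true });
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    const from = path.join(srcDir, entry.name);
    const to = path.join(dstDir, rename(entry.name));
    if (entry.isDirectory()) {
      copyRenamed(from, to, rename); // descend into the sub-directory
    } else {
      fs.copyFileSync(from, to); // copy the file under its new name
    }
  }
}
```

For the example tree, copyRenamed("MyFridge", "TomFridge", (name) => name.replace(/^My/, "Tom")) would produce TomFridge/TomDrinks/TomSoda/TomDietCoke.txt and so on, leaving the source directory untouched. Because the callback sees one name at a time, the ^ anchor applies to every path segment.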

Resources