Node.js - Zip/Unzip a folder

I'm diving into the zlib module of Node.js. I was able to compress and uncompress files using the provided examples (http://nodejs.org/api/zlib.html#zlib_examples), but I wasn't able to find anything about doing the same for folders.
One possibility (which I consider a workaround) is to use the node-zip module and add all the files of the folder one by one. But then I'll face a problem when uncompressing: I will lose the folder hierarchy.
Any idea how to compress (and then uncompress) a whole folder (respecting the sub-folder hierarchy) using Node.js?
Thanks.

I've finally got it, with the help of @generalhenry (see the comments on the question).
As mentioned in those comments, we need to compress the folder in two steps:
Convert the folder into a .tar file
Compress the .tar file
In order to perform the first step, I needed two node.js modules:
npm install tar
npm install fstream
The first one allows us to create .tar files. The source code is available at https://github.com/isaacs/node-tar
The second module helps us read a folder and write a file. As for the basic fs module, I don't know whether it can read a directory as a whole (I'm not talking about getting all the files in an array using fs.readdir, but handling all the files together with their organization in folders).
Once the folder is converted to a .tar file, I can compress it using zlib's Gzip stream.
Here is the final code:
var fstream = require('fstream'),
    tar = require('tar'),
    zlib = require('zlib');

fstream.Reader({ 'path': 'path/to/my/dir/', 'type': 'Directory' }) /* Read the source directory */
  .pipe(tar.Pack())                                                /* Convert the directory to a .tar file */
  .pipe(zlib.createGzip())                                         /* Compress the .tar file */
  .pipe(fstream.Writer({ 'path': 'compressed_folder.tar.gz' }));   /* Give the output file name */
This is how I compressed an entire folder using Node.js.
Two more things:
As you can see, the tar module lacks documentation. I hope this improves soon, since the two examples that are provided only show how to extract content from a .tar file.
I used the fstream module to help me handle the source directory. Can this be bypassed using fs? I don't know (please comment if you have an idea).
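For the reverse direction, here is a minimal sketch using the same era of modules; it assumes the legacy node-tar API, where tar.Extract is the extraction stream (no fstream is needed on this side):
var tar = require('tar'),
    zlib = require('zlib'),
    fs = require('fs');

fs.createReadStream('compressed_folder.tar.gz') /* Read the compressed file */
  .pipe(zlib.createGunzip())                    /* Decompress back into a .tar stream */
  .pipe(tar.Extract({ 'path': 'output/dir' })); /* Unpack the tar into a directory */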

Providing an updated answer, as the tar package has been updated since 2013.
To achieve the same result, the code is now much simpler and more straightforward:
const tar = require("tar"); // version ^6.0.1
const fs = require("fs");

tar.c(
  {
    gzip: true // this will perform the compression too
  },
  ["path/to/my/dir/"]
).pipe(fs.createWriteStream("path/to/my/dir.tgz"));
No need to explicitly use zlib.
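Extraction is just as short with the same package; a minimal sketch, assuming the modern tar API, where tar.x gunzips and unpacks in one go and returns a promise when given a file:
const tar = require("tar");

tar.x({
  file: "path/to/my/dir.tgz",
  C: "path/to/output/" // target directory; it must already exist
}).then(() => console.log("extracted"));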

You can use the tar-stream module to create a tar archive. It's much more flexible and simpler than node-tar in that:
You can add files to the archive (not just directories, which is a limitation of node-tar)
It works with normal node filesystem streams (node-tar strangely requires the use of the fstream module)
It's pretty fully documented (node-tar isn't well documented)
You can create an archive without hitting the filesystem
Create the tar archive, then compress it using zlib, then write it wherever you want (network, filesystem, etc.):
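A minimal sketch of that pipeline, assuming the documented tar-stream pack API (the entry name and contents are made up for illustration):
const tar = require('tar-stream');
const zlib = require('zlib');
const fs = require('fs');

const pack = tar.pack(); // a readable stream that produces the tar archive

// Add an entry entirely in memory -- no filesystem access needed.
pack.entry({ name: 'hello.txt' }, 'Hello, world!\n');
pack.finalize(); // no more entries

pack
  .pipe(zlib.createGzip())                       // compress the tar stream
  .pipe(fs.createWriteStream('archive.tar.gz')); // or any other writable stream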

Related

Unzip single file from zip archive using node.js zlib module

Let's say I have a zip archive test.zip which contains two files:
test1.txt and test2.txt
I want to extract only test1.txt using the built-in node zlib module.
How do I do that?
I don't want to install any package.
You could run a shell command to unzip, assuming that unzip is installed on your system. (It very likely is.)
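For example, a minimal sketch that shells out to unzip (which accepts the names of the members to extract); the archive and file names are taken from the question:
const { execFile } = require('child_process');

// Extract only test1.txt from test.zip into the current directory.
// Assumes the unzip CLI is available on the PATH.
execFile('unzip', ['-o', 'test.zip', 'test1.txt'], (err, stdout) => {
  if (err) throw err;
  console.log(stdout);
});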
As far as I can tell, there is no zip functionality within node.js without installing a package.
You can use zlib to help you with the decompression part, but you will have to write your own code to interpret the zip format. You can use zlib.inflateRaw to decompress the raw deflate compressed data of a zip entry. You have to first find where that compressed data starts by reading and interpreting the zip file headers.
The zip format is documented in PKWARE's APPNOTE.TXT.
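A minimal sketch of that approach, assuming a simple archive whose sizes are recorded in each local file header (no data descriptors, no Zip64); a robust reader should locate entries through the central directory instead:
const fs = require('fs');
const zlib = require('zlib');

const buf = fs.readFileSync('test.zip');
let offset = 0;
while (buf.readUInt32LE(offset) === 0x04034b50) { // local file header signature
  const method = buf.readUInt16LE(offset + 8);          // 0 = stored, 8 = deflate
  const compressedSize = buf.readUInt32LE(offset + 18); // 0 if a data descriptor is used
  const nameLength = buf.readUInt16LE(offset + 26);
  const extraLength = buf.readUInt16LE(offset + 28);
  const name = buf.toString('utf8', offset + 30, offset + 30 + nameLength);
  const dataStart = offset + 30 + nameLength + extraLength;
  const data = buf.slice(dataStart, dataStart + compressedSize);
  if (name === 'test1.txt') {
    const content = method === 8 ? zlib.inflateRawSync(data) : data;
    fs.writeFileSync('test1.txt', content);
    break;
  }
  offset = dataStart + compressedSize;
}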

How do I read an ini file from a node executable (Electron app)

I have a file structure like so:
ElectronAppExecutable.exe
File1.dll
File2.dll
File3.dll
Config.ini
How would I get my node executable to read a Config.ini file in the same folder?
You would need to import a module to read files; most commonly in Node you would use the fs module. However, newer versions of Electron might require you to enable node integration to access it. < (Done some digging and this may not apply anymore, or it may just have been my use case)
Just to note, fs reads the file as a string, so you would need to parse the values you need out of Config.ini, or you can put your faith in a package like this > https://www.npmjs.com/package/ini
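A minimal sketch combining the two, assuming the ini package from that link; process.execPath points at the running executable, so the file next to it can be resolved relative to that:
const fs = require('fs');
const path = require('path');
const ini = require('ini'); // npm install ini

// Resolve Config.ini next to the executable rather than the current working directory.
const configPath = path.join(path.dirname(process.execPath), 'Config.ini');
const config = ini.parse(fs.readFileSync(configPath, 'utf8'));
console.log(config);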

Untar file in specific directory

I'm trying to download a tar.gz file and uncompress it in /tmp/apps.
However, I don't want to uncompress it if the directory already exists.
Whether the directory exists or not, the file is downloaded and uncompressed every time.
I can't tell whether my code is missing a parameter in my exec block or I made a mistake somewhere else.
I'm using Puppet 3.8.
Gist file of my puppet
Use the puppet/archive module, https://forge.puppet.com/puppet/archive. It will download the archive, check for existing files and even tidy up after itself.
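A minimal sketch of such a resource, assuming the parameter names documented on that Forge page (the source URL is hypothetical):
archive { '/tmp/apps.tar.gz':
  ensure       => present,
  source       => 'https://example.com/apps.tar.gz', # hypothetical URL
  extract      => true,
  extract_path => '/tmp',
  creates      => '/tmp/apps', # skip download/extraction when this path already exists
  cleanup      => true,        # remove the downloaded archive afterwards
}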

NodeJS archive manager

I need to get the content listing of archives, and then I want to uncompress a selected one; but I don't want to uncompress the archives just to know what's in them. I'd like to list and uncompress at least zip and rar, but (if that's possible) I don't want to be limited to only these two.
Can you advise good npm modules or other projects to achieve this?
Here's what I came up with:
zip
I found node-zip can only unzip files, but not list archive content.
rar
The best solution seems to be node-rar, but I can't install it on Windows.
node-uncompress
This does what it says: it's a "Command-line wrapper for uncompressing various file types." So there is again no way to list archive content.
Currently I'm trying to get node-uncompress to list files, hoping it will never have to run cross-platform.
Solution:
I am now using 7-Zip with the node module node-7z instead of trying to get every archive format working on its own. The corresponding site is: https://www.npmjs.com/package/node-7z
This library uses the OS-independent archive manager 7-Zip. On Windows, 7za is used: "7za.exe (a = alone) is a standalone version of 7-Zip". I've tested it on Windows and Ubuntu and it works great.
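A minimal sketch, assuming the streaming API of a recent node-7z release (a 7-Zip binary such as 7za must be reachable, as the updates below explain):
const Seven = require('node-7z'); // npm install node-7z

// List the archive contents without extracting anything.
const listing = Seven.list('archive.rar');
listing.on('data', entry => console.log(entry.file));

// Extract fully once you have decided you want the contents.
listing.on('end', () => {
  Seven.extractFull('archive.rar', 'output/dir')
    .on('end', () => console.log('Extracted.'));
});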
Update:
On Windows: somehow I only got it working by adding 7za to the PATH variable, not by adding 7za.exe to "the same directory of your package.json file" like the description says.
Update 2:
On Windows, the 7za referred to in the node-7z post cannot handle .rar archives. So I'm using the regular 7-Zip instead of 7za.exe: I just renamed the command-line 7z.exe to 7za.exe and added the 7-Zip folder to the PATH variable.

Zip files corrupt over 4 gigabytes - No warnings or errors - Did I lose my data?

I created a bunch of zip files on my computer (Mac OS X) using a command like this:
zip -r bigdirectory.zip bigdirectory
Then, I saved these zip files somewhere and deleted the original directories.
Now, when I try to extract the zip files, I get this kind of error:
$ unzip -l bigdirectory.zip
Archive: bigdirectory.zip
warning [bigdirectory.zip]: 5162376229 extra bytes at beginning or within zipfile
(attempting to process anyway)
error [bigdirectory.zip]: start of central directory not found;
zipfile corrupt.
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
I have since discovered that this could be because zip can't handle files over a certain size, maybe 4 gigs. At least I read that somewhere.
But why would the zip command let me create these files? The zip file in question is 9457464293 bytes and it let me make many more like this with absolutely no errors.
So clearly it can create these files.
I really hope my files aren't lost. I've learned my lesson and in the future I will check my archives before deleting the original files, and I'll probably also use another file format like tar/gzip.
For now though, what can I do? I really need my files.
Update
Some people have suggested that my unzip tool did not support big enough files (which is weird, because I used the built-in OS X zip and unzip). At any rate, I installed a new unzip from Homebrew, and lo and behold, I do get a different error now:
$ unzip -t bigdirectory.zip
testing: bigdirectory/1.JPG OK
testing: bigdirectory/2.JPG OK
testing: bigdirectory/3.JPG OK
testing: bigdirectory/4.JPG OK
:
:
file #289: bad zipfile offset (local header sig): 4294967295
(attempting to re-compensate)
file #289: bad zipfile offset (local header sig): 4294967295
file #290: bad zipfile offset (local header sig): 9457343448
file #291: bad zipfile offset (local header sig): 9457343448
file #292: bad zipfile offset (local header sig): 9457343448
file #293: bad zipfile offset (local header sig): 9457343448
:
:
This is really worrisome because I need these files back. And there were definitely no errors upon creation of this zip file using the system zip tool. In fact, I made several of these at the same time and now they are all exhibiting the same problem.
If the file really is corrupt, how do I fix it?
Or, if it is not corrupt, how do I extract it?
unzip versions below 6 seemingly fail; use
jar -xf <zipfile>
if you have Java installed, or try yet another unzip before you write the file off.
See: https://serverfault.com/questions/235139/how-to-unzip-files-bigger-than-4gb
Try 7z x
I had the same issue with unzip on Linux for a .zip file larger than 4GB, compounded with an "only DEFLATED entries can have EXT descriptor" error.
The command 7z x resolved all my issues.
Be careful though: 7z x will extract all files with a path rooted in the current directory. The -o option lets you specify an output directory.
I had a similar problem backing up a 12GB directory before performing a hard disk format. Funnily enough I used the same command as you.
I read around and found suggestions to run:
zip -F
and
zip -FF
to try to fix the file.
Unfortunately these did not work and I still received errors.
After looking around some more, I found the ditto command and it worked perfectly against my original (untouched) zip file:
ditto -x -k original-file.zip dst-directory
-x extracts an archive
-k specifies it to be a PKZip archive instead of the default CPIO
After using this command, I successfully extracted all of the files.
The built-in macOS Archive Utility (which is the default used when you select something in Finder and go to File -> Compress "<item>") also creates "corrupt" archives when a file in the archive is over 4 gigabytes in size, the size of the archive itself is over 4 gigabytes or you are trying to compress more than 65536 files into a single zip. This happens because it doesn't use the Zip64 extension format.
This is mentioned on https://apple.stackexchange.com/questions/221020/large-zip-files-created-in-os-x-cannot-be-opened-in-windows and is well covered in the "Apple Archive Utility (and ditto) and very large ZIP archives" 2009 blog post for the now defunct Springy utility. You can also see the 7-Zip folks are aware of the Apple tools creating corrupt zips issue too.
But why would the zip command let me create these files?
Strictly speaking, the original zip format only supports archives up to 2^32 bytes (4GiB), does not allow member files that were themselves larger than 4GiB, and limits an archive to fewer than 65535 files. Because the command-line version of the Info-ZIP tools shipped with OS X up to version 10.11 (El Capitan) was no newer than 5.52, it could only produce non-conformant archives if you forced it to exceed the original zip format limits. Info-ZIP 6.0 and above know how to make Zip64 archives, and that standard has much higher limits. The Info-ZIP 6.0 command-line tools started shipping with macOS 10.12 (Sierra). In 2014, when the question was originally asked, the newest OS X was 10.10 (Yosemite).
As stated above, even in macOS 10.15 (Catalina) the GUI Archive Utility still creates such "corrupt" zips.
If the file really is corrupt, how do I fix it?
It's corrupt in the sense that it's non-conformant and will cause a lot of conformant tools to choke. You could extract it (see below) and then compress it again with a tool that knows how to make Zip64 files...
Or, if it is not corrupt, how do I extract it?
Technically, all of the data from the files that have been compressed is still in the archive, but the headers that allow fast listing of the zip's content are broken: the 32-bit offset fields overflow, which is why the errors above report offsets like 4294967295 (0xFFFFFFFF, the largest 32-bit value). Such zips can be a struggle to work with in other tools (even testing such a zip with the command-line unzip tool on the same version of macOS can report issues like invalid compressed data to inflate / bad zipfile offset (local header sig)).
To get at the files of such zips you need to use a program that will quietly just extract whatever was compressed without checking for conformance or trying to check/list the files. Examples of tools that can do this are:
macOS Archive Utility GUI tool
macOS command line tool ditto
7-zip
Java's jar tool
Info-ZIP-based tools won't be able to work with or repair such zip files once you've made such a problem zip file.
You can use
zip -FF corrupted.zip --out fixed.zip
Replace corrupted.zip with your zip with issues.
Replace fixed.zip with the name of the new, fixed .zip file.
I faced exactly the same issue when I tried to unzip zip files of huge sizes (~7GB). I was damn sure that there was no error while copying the zip files to the server (I double-checked it with rsync).
Depending on your situation, the solution is:
1) If you're doing this on a local machine, right-click on the zip file and choose Extract Here; this works for .zip files of any size.
2) If your zip files are on a remote server, first mount the server filesystem locally using sftp (sftp://username@server.url.address.com). After that, just navigate to the directory and do the same thing as in (1), i.e. right-click on the zip file and extract it.
Might not be the best solution, but that's one way of doing it.
