Unzip single file from zip archive using node.js zlib module - node.js

Let's say I have a zip archive, test.zip, which contains two files:
test1.txt and text2.txt
I want to extract only test1.txt using Node's built-in zlib module.
How can I do that?
I don't want to install any package.

You could run a shell command to unzip, assuming that unzip is installed on your system. (It very likely is.)
As far as I can tell, there is no zip functionality within node.js without installing a package.
You can use zlib to help you with the decompression part, but you will have to write your own code to interpret the zip format. You can use zlib.inflateRaw to decompress the raw deflate compressed data of a zip entry. You have to first find where that compressed data starts by reading and interpreting the zip file headers.
The zip format is documented in PKWARE's APPNOTE.TXT specification.
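The approach described above (read the headers, locate the compressed data, then inflate it) can be sketched roughly as follows. This is a minimal illustration, not a complete zip reader: the helper name extractEntry is made up for this example, and it assumes the whole archive fits in memory, that the archive is not Zip64, split, or encrypted, and that entries are either stored or deflated. It reads sizes from the central directory rather than from the local header, because the local header may leave them as zero when a data descriptor is used.

```javascript
const zlib = require('zlib');

// Record signatures defined by the zip format.
const EOCD_SIG = 0x06054b50;    // end of central directory record
const CENTRAL_SIG = 0x02014b50; // central directory file header
const LOCAL_SIG = 0x04034b50;   // local file header

// Hypothetical helper: returns the contents of one entry as a Buffer.
function extractEntry(zipBuffer, wantedName) {
  // The end-of-central-directory record is 22 bytes plus an optional
  // comment, so scan backwards from the end for its signature.
  let eocd = zipBuffer.length - 22;
  while (eocd >= 0 && zipBuffer.readUInt32LE(eocd) !== EOCD_SIG) eocd--;
  if (eocd < 0) throw new Error('end of central directory not found');

  const entryCount = zipBuffer.readUInt16LE(eocd + 10);
  let pos = zipBuffer.readUInt32LE(eocd + 16); // start of central directory

  for (let i = 0; i < entryCount; i++) {
    if (zipBuffer.readUInt32LE(pos) !== CENTRAL_SIG) {
      throw new Error('bad central directory header');
    }
    const method = zipBuffer.readUInt16LE(pos + 10);
    const compressedSize = zipBuffer.readUInt32LE(pos + 20);
    const nameLen = zipBuffer.readUInt16LE(pos + 28);
    const extraLen = zipBuffer.readUInt16LE(pos + 30);
    const commentLen = zipBuffer.readUInt16LE(pos + 32);
    const localOffset = zipBuffer.readUInt32LE(pos + 42);
    const name = zipBuffer.toString('utf8', pos + 46, pos + 46 + nameLen);

    if (name === wantedName) {
      if (zipBuffer.readUInt32LE(localOffset) !== LOCAL_SIG) {
        throw new Error('bad local file header');
      }
      // The local header has its own name and extra-field lengths,
      // which may differ from the central directory's.
      const localNameLen = zipBuffer.readUInt16LE(localOffset + 26);
      const localExtraLen = zipBuffer.readUInt16LE(localOffset + 28);
      const dataStart = localOffset + 30 + localNameLen + localExtraLen;
      const data = zipBuffer.subarray(dataStart, dataStart + compressedSize);

      if (method === 0) return Buffer.from(data);         // stored
      if (method === 8) return zlib.inflateRawSync(data); // deflate
      throw new Error('unsupported compression method ' + method);
    }
    // Skip to the next central directory header.
    pos += 46 + nameLen + extraLen + commentLen;
  }
  throw new Error('entry not found: ' + wantedName);
}
```

With the file names from the question, usage would look something like fs.writeFileSync('test1.txt', extractEntry(fs.readFileSync('test.zip'), 'test1.txt')). The sketch does not verify the CRC-32 stored in the headers, which a careful reader would want to add.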

Related

Some zip files not unarchiving in Python

I am downloading a zip file from an API and trying to unzip it using Python shutil.
shutil.unpack_archive(file_name)
The behaviour is inconsistent: for some files it works, but for others it shows the following error:
name.zip is not a zip file
There is no issue with the downloaded file, I am able to unarchive it manually.
Any help here would be appreciated.
You should use the zipfile module (for zip archives) or the tarfile module (for tar archives).

Unzip split zip files

Is there any Node library that is able to unzip split zip files? Or is there any other way? I have tried a lot of Node libraries but had no luck.
I am receiving files like below.
test.z01
test.z02
test.zip
I need to extract these files.

Unable to extract shape_predictor_68_face_landmarks.dat for bz

I am trying to run some face frontalization code (using Python 3 on Windows 10). The code uses opencv and dlib and requires a file called shape_predictor_68_face_landmarks.dat. The code tries to download the file automatically and then unzip it, but the unzip step fails with an "unexpected end of archive" error. I also tried downloading the file manually from http://sourceforge.net/projects/dclib/files/dlib/v18.10/shape_predictor_68_face_landmarks.dat.bz2 and repairing it with WinRAR, but WinRAR says it can only repair .zip and .rar files.
Does anyone know where I can download the uncompressed .dat file from? Or alternatively how I can repair a damaged .bz file in Windows?
The file is available at
http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
I downloaded it and verified that extraction works. The file is smaller than the one used in the previous version, but I think that is due to improvements.
In case this does not work, let me (or Davis King, who maintains the dlib blog) know so that you can get the uncompressed version.
Downloading using the CLI is a lot easier.
wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
To decompress the compressed file you just downloaded, use the following command
bzip2 -d shape_predictor_68_face_landmarks.dat.bz2
As mentioned above, download shape_predictor_68_face_landmarks.dat
from here. The download may fail partway through (I faced this issue). If you hit the same problem, I recommend downloading it from the command line instead:
$ wget link

Zip files corrupt over 4 gigabytes - No warnings or errors - Did I lose my data?

I created a bunch of zip files on my computer (Mac OS X) using a command like this:
zip -r bigdirectory.zip bigdirectory
Then, I saved these zip files somewhere and deleted the original directories.
Now, when I try to extract the zip files, I get this kind of error:
$ unzip -l bigdirectory.zip
Archive: bigdirectory.zip
warning [bigdirectory.zip]: 5162376229 extra bytes at beginning or within zipfile
(attempting to process anyway)
error [bigdirectory.zip]: start of central directory not found;
zipfile corrupt.
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
I have since discovered that this could be because zip can't handle files over a certain size, maybe 4 gigs. At least I read that somewhere.
But why would the zip command let me create these files? The zip file in question is 9457464293 bytes and it let me make many more like this with absolutely no errors.
So clearly it can create these files.
I really hope my files aren't lost. I've learned my lesson and in the future I will check my archives before deleting the original files, and I'll probably also use another file format like tar/gzip.
For now though, what can I do? I really need my files.
Update
Some people have suggested that my unzip tool did not support big enough files (which is weird, because I used the builtin OS X zip and unzip). At any rate, I installed a new unzip from homebrew, and lo and behold, I do get a different error now:
$ unzip -t bigdirectory.zip
testing: bigdirectory/1.JPG OK
testing: bigdirectory/2.JPG OK
testing: bigdirectory/3.JPG OK
testing: bigdirectory/4.JPG OK
:
:
file #289: bad zipfile offset (local header sig): 4294967295
(attempting to re-compensate)
file #289: bad zipfile offset (local header sig): 4294967295
file #290: bad zipfile offset (local header sig): 9457343448
file #291: bad zipfile offset (local header sig): 9457343448
file #292: bad zipfile offset (local header sig): 9457343448
file #293: bad zipfile offset (local header sig): 9457343448
:
:
This is really worrisome because I need these files back. And there were definitely no errors upon creation of this zip file using the system zip tool. In fact, I made several of these at the same time and now they are all exhibiting the same problem.
If the file really is corrupt, how do I fix it?
Or, if it is not corrupt, how do I extract it?
unzip versions below 6 seemingly fail on such archives. Use
jar -xf <zipfile>
if you have Java installed, or try yet another unzip tool before you write the file off.
See: https://serverfault.com/questions/235139/how-to-unzip-files-bigger-than-4gb
Try 7z x
I had the same issue with unzip %x on Linux for a .zip file larger than 4GB, compounded by an "only DEFLATED entries can have EXT descriptor" error.
The command 7z x resolved all my issues though.
Be careful though: the command 7z x extracts all files with their paths rooted in the current directory. The -o option lets you specify an output directory.
I had a similar problem backing up a 12GB directory before performing a hard disk format. Funnily enough I used the same command as you.
I read around and found suggestions to run:
zip -F
and
zip -FF
to try to fix the file.
Unfortunately these did not work and I still received errors.
After looking around some more, I found the ditto command and it worked perfectly against my original (untouched) zip file:
ditto -x -k original-file.zip dst-directory
-x to extract an archive
-k Specifies it to be a PKZip archive instead of the default CPIO
After using this command, I successfully extracted all of the files.
The built-in macOS Archive Utility (the default used when you select something in Finder and go to File -> Compress "<item>") also creates "corrupt" archives when a file in the archive is over 4 gigabytes in size, when the archive itself is over 4 gigabytes, or when you try to compress more than 65536 files into a single zip. This happens because it doesn't use the Zip64 extension format.
This is mentioned on https://apple.stackexchange.com/questions/221020/large-zip-files-created-in-os-x-cannot-be-opened-in-windows and is well covered in the "Apple Archive Utility (and ditto) and very large ZIP archives" 2009 blog post for the now defunct Springy utility. You can also see the 7-Zip folks are aware of the Apple tools creating corrupt zips issue too.
But why would the zip command let me create these files?
Strictly speaking, the original zip format only supports archives up to 2^32 bytes (4GiB) that do not contain files originally larger than 4GiB, and there must be fewer than 65535 files. Because the command line Infozip tools shipped with OSX up to version 10.11 (El Capitan) were no newer than 5.52, they could only produce non-conformant archives if you forced them to exceed the original zip format limits. Infozip 6.0 and above know how to make Zip64 archives, and that standard has much higher limits. The Infozip 6.0 command line tools started shipping with macOS 10.12 (Sierra). In 2014, when the question was originally asked, the newest OSX was 10.10 (Yosemite).
As stated above, even in macOS 10.15 (Catalina) the GUI Archive Utility still creates such "corrupt" zips.
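The limits described above can be expressed as a rough pre-flight check. This is only a sketch: the helper name needsZip64 and its parameter names are made up for illustration. It treats the maximum field values themselves as out of range, on the assumption that Zip64-aware tools reserve 0xFFFFFFFF and 0xFFFF as "look in the Zip64 record" markers. Note that the offset 4294967295 reported in the question's unzip -t output is exactly 0xFFFFFFFF, the largest value a 32-bit field can hold.

```javascript
// Limits of the original (non-Zip64) zip format: sizes and offsets are
// stored in 32-bit fields and the entry count in a 16-bit field.
const MAX_UINT32 = 0xffffffff; // 4294967295
const MAX_UINT16 = 0xffff;     // 65535

// Hypothetical helper: would an archive with these properties need Zip64?
// All sizes are in bytes.
function needsZip64({ entryCount, largestFileSize, archiveSize }) {
  return entryCount >= MAX_UINT16 ||
    largestFileSize >= MAX_UINT32 ||
    archiveSize >= MAX_UINT32;
}
```

For the 9457464293-byte archive from the question, needsZip64 returns true regardless of the other two values, which is consistent with a pre-Zip64 tool producing a non-conformant file.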
If the file really is corrupt, how do I fix it?
It's corrupt in the sense that it's non-conformant and will cause a lot of conformant tools to choke. You could extract it (see below) and then compress it again with a tool that knows how to make Zip64 files...
Or, if it is not corrupt, how do I extract it?
Technically, all of the data from the files that have been compressed is still in the archive but the headers that allow fast listing of the zip's content are broken. Such zips can be a struggle to work with when using other tools (even testing such a zip with the command line unzip tool on the same version of macOS can indicate issues like invalid compressed data to inflate / bad zipfile offset (local header sig)).
To get at the files of such zips you need to use a program that will quietly just extract whatever was compressed without checking for conformance or trying to check/list the files. Examples of tools that can do this are:
macOS Archive Utility GUI tool
macOS command line tool ditto
7-zip
Java's jar tool
Infozip based tools won't be able to work with or repair such zip files once you've made such a problem zip file.
You can use
zip -FF corrupted.zip --out fixed.zip
Replace corrupted.zip with the name of your problem zip file.
Replace fixed.zip with the name of the new, repaired zip file.
I have faced exactly the same issue when I tried to unzip zip files of huge sizes (~7GB). I was damn sure that there was no error while copying the zip files to the server. (I double-checked it with rsync).
Depending on your situation, the solution is:
1) If you're doing this on a local machine, right click on the zip file and choose Extract Here. This will work for .zip files of any size.
2) If your zip files are on a remote server, first mount the server filesystem locally using sftp (sftp://username@server.url.address.com). After that, navigate to the directory and do the same thing as in (1), i.e. right click on the zip file and extract it.
Might not be the best solution but that's one way of doing it.

Node.js - Zip/Unzip a folder

I'm diving into the zlib module of Node.js. I was able to compress and uncompress files using the provided examples (http://nodejs.org/api/zlib.html#zlib_examples), but I wasn't able to find out how to do the same for folders.
One possibility (which I consider tinkering) is to use the node-zip module and add all the files of the folder one by one. But I'll face a problem when uncompressing (I will lose the folders in this case).
Any idea how to compress (and then uncompress) a whole folder (respecting the sub-folder hierarchy) using Node.js?
Thanks.
I've finally got it, with the help of @generalhenry (see the comments on the question).
As mentioned in the comments, we need to compress the folder in two steps:
Convert the folder into a .tar file
Compress the .tar file
In order to perform the first step, I needed two node.js modules:
npm install tar
npm install fstream
The first one allows us to create .tar files. You can have access to the source code here https://github.com/isaacs/node-tar
The second node module will help us to read a folder and write a file. Concerning the basic fs node.js module, I don't know if it is possible to read a directory (I'm not talking about getting all the files in an array, using fs.readdir, but handling all the files and their organization in folders).
Then, when I convert the folder to .tar file, I can compress it using Gzip() of Zlib.
Here is the final code:
var fstream = require('fstream'),
    tar = require('tar'),
    zlib = require('zlib');

fstream.Reader({ 'path': 'path/to/my/dir/', 'type': 'Directory' }) /* Read the source directory */
    .pipe(tar.Pack()) /* Convert the directory to a .tar file */
    .pipe(zlib.Gzip()) /* Compress the .tar file */
    .pipe(fstream.Writer({ 'path': 'compressed_folder.tar.gz' })); /* Give the output file name */
This helped me to compress an entire folder using node.js
2 more things:
As you can see, the tar module lacks documentation. I hope this will improve soon, since the two examples that were provided only cover how to extract content from a .tar file.
I used the fstream module to help me handle the source directory. Can this be bypassed using fs? I don't know (please, comment if you have an idea).
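On the open question of whether the plain fs module can walk a folder tree: the following is a minimal sketch, assuming a Node.js version in which fs.readdirSync accepts the withFileTypes option. The helper name listFiles is made up for this example. It only collects relative file paths while keeping the sub-folder hierarchy in the names; it does not produce a .tar file, so a tar writer would still be needed to turn the list into an archive.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: list every regular file under `root` as a path
// relative to `root`, so the folder hierarchy is preserved in the names.
function listFiles(root, dir = '') {
  const results = [];
  const entries = fs.readdirSync(path.join(root, dir), { withFileTypes: true });
  for (const entry of entries) {
    const rel = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Recurse into sub-folders.
      results.push(...listFiles(root, rel));
    } else if (entry.isFile()) {
      results.push(rel);
    }
  }
  // Sort for a stable order; readdir order is not guaranteed.
  return results.sort();
}
```

Symbolic links and other special entries are skipped here, which is a simplification rather than a recommendation.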
Providing an updated answer as package tar has been updated since 2013.
To achieve the same result, the code is much simpler and straightforward:
const tar = require("tar"); // version ^6.0.1
const fs = require("fs");

tar.c(
  {
    gzip: true // this will perform the compression too
  },
  ["path/to/my/dir/"]
).pipe(fs.createWriteStream('path/to/my/dir.tgz'));
No need to explicitly use zlib.
You can use the tar-stream module to create a tar archive. It's much more flexible and simpler than node-tar in that:
You can add files to the archive (not just directories, which is a limitation of node-tar)
It works with normal node filesystem streams (node-tar strangely requires the use of the fstream module)
It's pretty fully documented (node-tar isn't well documented)
You can create an archive without hitting the filesystem
Create the tar archive, then compress it using zlib, then write it wherever you want (network, filesystem, etc).
