decompressing a gz file using zlib package in NodeJS - node.js

I am trying to decompress a .gz file using zlib, one of Node.js's built-in libraries, but decompression throws an incorrect header check error. I am using the following code:
import fs from 'fs';
import zlib from 'zlib';

const rStream = fs.createReadStream('./path-to-gz-file');
const wStream = fs.createWriteStream('./path-to-gz-file'.replace('.gz', ''));

rStream
  .pipe(zlib.createGunzip())
  .on('error', err => { console.log(err); })
  .pipe(wStream)
  .on('error', err => { console.log(err); });
One solution I found online suggested changing the encoding of the read and write streams to binary, but that doesn't work either. I've tried almost every solution to this issue that's available online, and nothing works.
If anyone has any further questions, please let me know and I will clarify as soon as possible.
PS: The same file extracts as expected when decompressed with gzip, the default compression tool on Linux.

Related

How to download zip and directly extract zip via node?

I was wondering if it is possible to use https.get() from the Node standard library to download a zip and directly extract it into a subfolder.
I have found many solutions that download the zip first and extract it afterwards. But is there a way to do it directly?
This was my attempt:
const zlib = require("node:zlib");
const fs = require("fs");
const { pipeline } = require("node:stream");
const https = require("https");

const DOWNLOAD_URL = "https://downloadserver.com/data.zip";

const unzip = zlib.createUnzip();
const output = fs.createWriteStream("folderToExtract");

https.get(DOWNLOAD_URL, (res) => {
  pipeline(res, unzip, output, (error) => {
    if (error) console.log(error);
  });
});
But I get this error:
Error: incorrect header check
at Zlib.zlibOnError [as onerror] (node:zlib:189:17) {
errno: -3,
code: 'Z_DATA_ERROR'
}
I am curious, is this even possible?
Most unzippers start at the end of the zip file, reading the central directory there and using that to find the entries to unzip. This requires having the entire zip file accessible at the start.
What you'd need is a streaming unzipper, which starts at the beginning of the zip file. You can try unzip-stream and see if it meets your needs.
I think this is similar to Simplest way to download and unzip files in Node.js cross-platform?
An answer in that discussion, using the same package:
You're getting the error because zlib only supports gzip (and deflate/zlib) streams, not zip archives.

NodeJS Zlib incorrect header check

OSX 10.12.6
node v12.2.0
gzip 1.10
I gzipped some plaintext and I'm trying to read it:
const fs = require('fs');
const zlib = require('zlib');

fs.createReadStream(filepath, { encoding: 'UTF-8' })
  .pipe(zlib.createGunzip()) // createUnzip behaves similarly.
  .pipe(somethingelse())
  .on('finish', function () {
    console.log('finished reading');
  });
This shows
Thrown:
Error: incorrect header check
errno -3
I hadn't realized that setting an encoding changes what the stream emits: with { encoding: 'UTF-8' }, the read stream decodes the raw gzip bytes into strings, corrupting them before they ever reach zlib. Removing the encoding option lets the zlib step decompress correctly, and my next step can consume directly from the stream.

gunzip partials read from read-stream

I use Node.JS to fetch files from my S3 bucket.
The files over there are gzipped (gz).
I know that the contents of each file is composed by lines, where each line is a JSON of some record that failed to be put on Kinesis.
Each file consists of ~12K such records, and I would like to be able to process the records while the file is being downloaded.
If the file was not gzipped, that could be easily done using streams and readline module.
So, the only thing that stopping me from doing this is the gunzip process which, to my knowledge, needs to be executed on the whole file.
Is there any way of gunzipping a partial of a file?
Thanks.
EDIT 1: (bad example)
Trying what @Mark Adler suggested:
const fileStream = s3.getObject(params).createReadStream();
const lineReader = readline.createInterface({ input: fileStream });

lineReader.on('line', line => {
  const gunzipped = zlib.gunzipSync(line);
  console.log(gunzipped);
});
I get the following error:
Error: incorrect header check
at Zlib._handle.onerror (zlib.js:363:17)
Yes. node.js has a complete interface to zlib, which allows you to decompress as much of a gzip file at a time as you like.
A working example that solves the above problem
The following solves the problem in the above code:
const fileStream = s3.getObject(params).createReadStream().pipe(zlib.createGunzip());
const lineReader = readline.createInterface({ input: fileStream });

lineReader.on('line', gunzippedLine => {
  console.log(gunzippedLine);
});

Unzipping with zlib in Node.js results in incorrect header error

In short, I'm trying to read a .zip file from my file system, inflate it, and then stream it with xml-stream to do some things with the contents of the file.
I thought this would be fairly simple and started with this:
var fs = require('fs')
  , XmlStream = require('xml-stream')
  , zlib = require('zlib');

//- read the file and buffer it.
var path = '../path/to/some.zip';
var fileBuffer = fs.readFileSync(path, { encoding: 'utf8' });

//- use zlib to unzip it
zlib.gunzip(fileBuffer, function (err, buffer) {
  if (!err) {
    console.log(buffer.toString());
  }
  console.log(err);
});
But this results in a
{ [Error: incorrect header check] errno: -3, code: 'Z_DATA_ERROR' }
Changing the encoding or the method (.unzip, .gunzip or .inflate) isn't working either.
What am I missing here?
Gzip is not zip. They're different compression formats, just like RAR is. The error indicates that what you're trying to read is not a gzipped file.
You can use a different library, such as JSZip.
I'm using zlib.unzip instead of zlib.gunzip.

JPEG File Encoding and writeFile in Node JS

I'm using http.request to download JPEG file. I am then using fs.writeFile to try to write the JPEG file out to the hard drive.
None of my JPEG files can be opened; they all show an error (although they do have a file size). I have tried all of the different encodings with fs.writeFile.
What am I messing up in this process?
(Screenshots comparing the raw contents of a working file and the broken fs.writeFile output were attached here.)
Figured it out, needed to use res.setEncoding('binary'); on my http.request.
Thank you! Looking at the previous response, I was able to save the media correctly:
fs.writeFile(
  filepath + fileName + extension,
  mediaReceived,
  { encoding: "binary" }, // this is what made it work
  (err) => {
    if (err) {
      console.log("An error occurred while writing the media file.");
      return console.log(err);
    }
  }
);
