Node.js piping gzipped streams in CSV module

I have a gzipped CSV file that I would like to read, perform some transformations, and write back somewhere gzipped. I am using the node-csv module for CSV transformations.
A simplified version of the code looks like this:
// dependencies
var fs = require('fs'),
    zlib = require('zlib'),
    csv = require('csv'); // http://www.adaltas.com/projects/node-csv/

// filenames
var sourceFileName = process.argv[2] || 'foo.csv.gz',
    targetFileName = process.argv[3] || 'bar.csv.gz';

// streams
var reader = fs.createReadStream(sourceFileName),
    writer = fs.createWriteStream(__dirname + '\\' + targetFileName),
    gunzip = zlib.createGunzip(),
    gzip = zlib.createGzip();

csv()
  .from.stream( reader.pipe(gunzip) )
  .to.stream( gzip.pipe(writer) ) // <-- the output stream
  .transform( function(row) {
    // some operation here
    return row;
  });
The problem is that this code does write a file with the specified name, but it is not gzipped: if the .gz extension is removed, the file can be opened as a regular CSV.
The question then is, how can the csv().to.stream() be passed an output stream that gzips the data and pipes it to a writer?
Thanks!

You're piping the csv output to the writer because .pipe returns its argument for chaining.
You need to change:
.to.stream( gzip.pipe(writer) ) // <-- the output stream
To:
.to.stream( gzip ) // <-- the output stream
. . .
gzip.pipe(writer);
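Put together, a minimal sketch of the corrected pipeline (assuming the same node-csv from()/to() API used in the question) looks like this:
var fs = require('fs'),
    zlib = require('zlib'),
    csv = require('csv');

var reader = fs.createReadStream('foo.csv.gz'),
    writer = fs.createWriteStream('bar.csv.gz'),
    gunzip = zlib.createGunzip(),
    gzip = zlib.createGzip();

// hand the gzip stream to node-csv as the output stream...
csv()
  .from.stream(reader.pipe(gunzip))
  .to.stream(gzip)
  .transform(function(row) {
    // some operation here
    return row;
  });

// ...and pipe it on to the file separately, so the writer
// receives gzipped bytes instead of raw CSV
gzip.pipe(writer);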

Related

Write multiple files to http response with streams in nodejs

I have an array of files that I have to pack into a gzipped tar archive and send through the http response on the fly. That means I can't hold the whole archive in memory, yet I have to pipe the files into tar entries one after another or everything is going to break.
const tar = require('tar-stream'); // lib for tar stream
const { createGzip } = require('zlib'); // lib for gzip stream
const fs = require('fs');

// large list of huge files
const files = [ 'file1', 'file2', 'file3', ..., 'file99999' ];
...

// http request handler:
const pack = tar.pack(); // tar stream, creates .tar
const gzipStream = createGzip(); // gzip stream so we can reduce the size

// pipe archive data through the gzip stream
// and send it to the client on the fly
pack.pipe(gzipStream).pipe(response);

// The issue comes here, when I need to pass multiple files to pack.entry
files.forEach(name => {
  const src = fs.createReadStream(name);    // create stream from file
  const size = fs.statSync(name).size;      // determine its size
  const entry = pack.entry({ name, size }); // create tar entry

  // and this ruins everything because if two different streams
  // write something into entry, it'll fail and throw an error
  src.pipe(entry);
});
Basically I need the pipe to finish sending its data before moving on (something like await src.pipe(entry);), but pipes in Node.js don't work that way. Is there any way around it?
Never mind, just don't use forEach in this case.
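For completeness, a minimal sketch of one way to do it sequentially (assuming the same tar-stream setup from the question, an async request handler, and a Node.js version new enough to have events.once):
const fs = require('fs');
const { once } = require('events');

// inside an async http request handler, after pack.pipe(gzipStream).pipe(response):
for (const name of files) {
  const size = fs.statSync(name).size;
  const entry = pack.entry({ name, size });
  fs.createReadStream(name).pipe(entry);
  // wait until this entry has been fully written before starting the next one
  await once(entry, 'finish');
}
pack.finalize(); // no more entries, close the tar stream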

Cannot read File from fs with FileReader

Hi, I am trying to read a file and I am having trouble with the FileReader readAsArrayBuffer function in Node.js.
var fs = require('fs');
var forge = require('node-forge');
var FileReader = require("filereader");

let p12_path = __dirname + "/file.p12";
var p12xxx = fs.readFileSync(p12_path, "utf-8");

var reader = new FileReader();
reader.readAsArrayBuffer(p12xxx); // The problem is here
reader.onloadend = function() {
  arrayBuffer = reader.result;
  var arrayUint8 = new Uint8Array(arrayBuffer);
  var p12B64 = forge.util.binary.base64.encode(arrayUint8);
  var p12Der = forge.util.decode64(p12B64);
  var p12Asn1 = forge.asn1.fromDer(p12Der);
  ............
}
The error:
Error: cannot read as File: "0�6�\.............
You are reading a .p12 (PKCS#12) file, which is a binary format and should not have an encoding specified. As per the fs docs, "If the encoding option is specified then this function returns a string", but because it is mostly binary data it ends up reading invalid UTF-8 characters. When you exclude the encoding it gives you a Buffer object instead, which is most likely what you want.
According to the npm filereader docs, reading the file with fs.readFileSync(p12_path, "utf-8") only makes sense if the file is actually UTF-8 encoded; otherwise it cannot be read that way.
The printed "0�6�\............. shows the file is obviously not UTF-8 and is therefore not readable as text.
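In practice the FileReader detour can be skipped entirely. A minimal sketch (assuming node-forge, as in the question): read the .p12 without an encoding so fs returns a Buffer, then hand its bytes to forge as a binary string:
var fs = require('fs');
var forge = require('node-forge');

var p12Path = __dirname + "/file.p12";
var p12Buffer = fs.readFileSync(p12Path); // no encoding -> Buffer, not a mangled string

// forge.asn1.fromDer accepts a binary-encoded string, so convert the Buffer directly
var p12Der = p12Buffer.toString('binary');
var p12Asn1 = forge.asn1.fromDer(p12Der);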

Compress an uncompressed xlsx file using node.js (Electron)

I have an unzipped xlsx file, and I edit some of its files in order to generate a new xlsx file containing new data.
On Linux, to recompress the files back into an xlsx, I just go into the folder where the xlsx files are and type in the terminal:
find . -type f | xargs zip ../newfile.xlsx
The question now is: how can I do this using Node.js?
The solution is to compress a direct list of the files contained in the xlsx; for some reason, if we try to compress the folder itself, the resulting file ends up corrupted.
The code looks like this if you use JSZip:
var fs = require('fs');
var JSZip = require("jszip");
var zip = new JSZip();
var file = [];
file.push("_rels/.rels");
file.push("docProps/core.xml");
file.push("docProps/app.xml");
file.push("docProps/custom.xml");
file.push("[Content_Types].xml");
file.push("xl/_rels/workbook.xml.rels");
file.push("xl/styles.xml");
file.push("xl/pivotTables/_rels/pivotTable3.xml.rels");
file.push("xl/pivotTables/_rels/pivotTable1.xml.rels");
file.push("xl/pivotTables/_rels/pivotTable2.xml.rels");
file.push("xl/pivotTables/pivotTable3.xml");
file.push("xl/pivotTables/pivotTable1.xml");
file.push("xl/pivotTables/pivotTable2.xml");
file.push("xl/workbook.xml");
file.push("xl/worksheets/_rels/sheet2.xml.rels");
file.push("xl/worksheets/_rels/sheet1.xml.rels");
file.push("xl/worksheets/_rels/sheet3.xml.rels");
file.push("xl/worksheets/sheet4.xml");
file.push("xl/worksheets/sheet1.xml");
file.push("xl/worksheets/sheet3.xml");
file.push("xl/worksheets/sheet2.xml");
file.push("xl/sharedStrings.xml");
file.push("xl/pivotCache/_rels/pivotCacheDefinition1.xml.rels");
file.push("xl/pivotCache/pivotCacheDefinition1.xml");
file.push("xl/pivotCache/pivotCacheRecords1.xml");
for (var i = 0; i < file.length; i++) {
  zip.file(file[i], fs.readFileSync("/home/user/xlsx_FILES/" + file[i]));
}

zip.generateAsync({type: "blob"}).then(function(content) {
  // see FileSaver.js
  saveAs(content, "yourfile.xlsx");
});
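Note that saveAs comes from FileSaver.js and only exists in a browser/renderer context. If the code runs in plain Node.js (for example Electron's main process), one option, sketched below, is to ask JSZip for a Node Buffer instead and write it with fs (the output path is just illustrative):
zip.generateAsync({type: "nodebuffer"}).then(function(content) {
  // write the rebuilt workbook straight to disk
  fs.writeFileSync("/home/user/yourfile.xlsx", content);
});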
Take a look at archiver, a compression library for Node.js. The docs for the library look comprehensive. It also lets you append to archives and take advantage of streaming APIs for appending to and creating new archives.
Here is an example snippet from their docs which shows how to use the library.
// require modules
var fs = require('fs');
var archiver = require('archiver');

// create a file to stream archive data to.
var output = fs.createWriteStream(__dirname + '/example.zip');
var archive = archiver('zip', {
  store: true // Sets the compression method to STORE.
});

// listen for all archive data to be written
output.on('close', function() {
  console.log(archive.pointer() + ' total bytes');
  console.log('archiver has been finalized and the output file descriptor has closed.');
});

// good practice to catch this error explicitly
archive.on('error', function(err) {
  throw err;
});

// pipe archive data to the file
archive.pipe(output);
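The snippet above only wires up the pipe; to actually produce the workbook you still have to append the entries and finalize the archive. A rough sketch for the file list from the question (assuming archiver's append()/finalize() API and the same /home/user/xlsx_FILES layout used earlier):
// append every part of the unzipped workbook under its original relative path
file.forEach(function(name) {
  archive.append(fs.createReadStream("/home/user/xlsx_FILES/" + name), { name: name });
});

// finalize the archive; the 'close' handler above fires once everything is written
archive.finalize();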

Node.js - ZLIB Gunzip returns empty file

I'm just testing zlib in Node.js but quickly ran into strange results. Here is my script (inspired by the example in the Node.js manual, http://nodejs.org/api/zlib.html#zlib_examples):
var zlib = require('zlib'),
    fs = require('fs'),
    inp1 = fs.createReadStream('file.txt'),
    out1 = fs.createWriteStream('file.txt.gz'),
    inp2 = fs.createReadStream('file.txt.gz'),
    out2 = fs.createWriteStream('output.txt');

inp1.pipe(zlib.createGzip()).pipe(out1);   /* Compress to a .gz file */
inp2.pipe(zlib.createGunzip()).pipe(out2); /* Uncompress the .gz file */
In this example, before executing the script, I created a file called file.txt and filled it with some sample text (say, a Lorem Ipsum).
The script successfully creates the .gz file, which I can unzip from the Finder (I'm on Mac OS X), but the uncompressed output.txt file is empty.
Why? Do you have any idea?
Node streams are asynchronous, so both of your pipelines run at the same time. That means that when inp2 is first opened, file.txt.gz is still empty, because the other pipeline hasn't written anything to it yet.
var zlib = require('zlib'),
    fs = require('fs');

var src = 'file.txt',
    zip = 'file.txt.gz',
    dst = 'output.txt';

var inp1 = fs.createReadStream(src);
var out1 = fs.createWriteStream(zip);
inp1.pipe(zlib.createGzip()).pipe(out1);

// only start decompressing once the compressed file has been fully written
out1.on('close', function() {
  var inp2 = fs.createReadStream(zip);
  var out2 = fs.createWriteStream(dst);
  inp2.pipe(zlib.createGunzip()).pipe(out2);
});
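On newer Node.js versions (10+) the same sequencing can also be expressed with stream.pipeline, which forwards errors for you; a minimal sketch under that assumption:
const zlib = require('zlib');
const fs = require('fs');
const { pipeline } = require('stream');

pipeline(fs.createReadStream('file.txt'), zlib.createGzip(), fs.createWriteStream('file.txt.gz'), function(err) {
  if (err) throw err;
  // only start decompressing once file.txt.gz has been fully written
  pipeline(fs.createReadStream('file.txt.gz'), zlib.createGunzip(), fs.createWriteStream('output.txt'), function(err) {
    if (err) throw err;
  });
});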

How to pipe one readable stream into two writable streams at once in Node.js?

The goal is to:
1. Create a file read stream.
2. Pipe it to gzip (zlib.createGzip()).
3. Then pipe the zlib output to:
   1) the HTTP response object
   2) a writable file stream, to save the gzipped output.
Right now I can get as far as step 3.1:
var gzip = zlib.createGzip(),
    sourceFileStream = fs.createReadStream(sourceFilePath),
    targetFileStream = fs.createWriteStream(targetFilePath);

response.setHeader('Content-Encoding', 'gzip');
sourceFileStream.pipe(gzip).pipe(response);
... which works fine, but I also need to save the gzipped data to a file, so that I don't have to re-gzip it every time and can stream the already-gzipped data directly as the response.
So how do I pipe one readable stream into two writable streams at once in Node?
Would sourceFileStream.pipe(gzip).pipe(response).pipe(targetFileStream); work in Node 0.8.x?
Pipe chaining/splitting doesn't work the way you're trying to use it here, sending the output of one step to two different subsequent steps:
sourceFileStream.pipe(gzip).pipe(response);
However, you can pipe the same readable stream into two writable streams, e.g.:
var fs = require('fs');
var source = fs.createReadStream('source.txt');
var dest1 = fs.createWriteStream('dest1.txt');
var dest2 = fs.createWriteStream('dest2.txt');
source.pipe(dest1);
source.pipe(dest2);
I found that zlib returns a readable stream which can be later piped into multiple other streams. So I did the following to solve the above problem:
var sourceFileStream = fs.createReadStream(sourceFile);
// Even though we could chain like
// sourceFileStream.pipe(zlib.createGzip()).pipe(response);
// we need a stream with a gzipped data to pipe to two
// other streams.
var gzip = sourceFileStream.pipe(zlib.createGzip());
// This will pipe the gzipped data to response object
// and automatically close the response object.
gzip.pipe(response);
// Then I can pipe the gzipped data to a file.
gzip.pipe(fs.createWriteStream(targetFilePath));
You can use the "readable-stream-clone" package:
const fs = require("fs");
const ReadableStreamClone = require("readable-stream-clone");
const readStream = fs.createReadStream('text.txt');
const readStream1 = new ReadableStreamClone(readStream);
const readStream2 = new ReadableStreamClone(readStream);
const writeStream1 = fs.createWriteStream('sample1.txt');
const writeStream2 = fs.createWriteStream('sample2.txt');
readStream1.pipe(writeStream1)
readStream2.pipe(writeStream2)
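Without the extra dependency, the same effect can be achieved with Node's built-in PassThrough stream; a minimal sketch using only core modules:
const fs = require("fs");
const { PassThrough } = require("stream");

const readStream = fs.createReadStream('text.txt');
const branch1 = new PassThrough();
const branch2 = new PassThrough();

// every chunk the source emits is pushed into both branches
readStream.pipe(branch1).pipe(fs.createWriteStream('sample1.txt'));
readStream.pipe(branch2).pipe(fs.createWriteStream('sample2.txt'));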
