nodejs/fs: writing a tar to memory buffer - node.js

I need to be able to tar a directory, and send this to a remote endpoint via HTTP PUT.
I could of course create the tar, save it to disk, then read it again and send it.
But I'd rather like to create the tar, then pipe it to some buffer and send it immediately. I haven't been able to achieve this.
Code so far:
var tar = require('tar');
var fs = require("fs");
var path = "/home/me/uploaddir";
function getTar(path, cb) {
var buf = new Buffer('');
var wbuf = fs.createWriteStream(buf);
wbuf.on("finish", function() {
cb(buf);
});
tar.c({file:""},[path]).
pipe(wbuf);
}
getTar(path, function(tar) {
//send the tar over http
});
This code results in:
fs.js:575
binding.open(pathModule._makeLong(path),
^
TypeError: path must be a string
at TypeError (native)
at Object.fs.open (fs.js:575:11)
I've also tried using an array as buffer, no joy.

The following solution
creates the tar, then pipes it to some buffer and sends it immediately
and with great speed thanks to the tar-fs library:
First install the libraries request for simplified requests and tar-fs, which provides filesystem bindings for tar-stream: npm i -S tar-fs request
var tar = require('tar-fs')
var request = require('request')
var fs = require('fs')
// pack specific files in the directory
function packTar (folderName, pathsArr) {
return tar.pack(folderName, {
entries: pathsArr
})
}
// return put stream
function makePutReq (url) {
return request.put(url)
}
packTar('./testFolder', ['test.txt', 'test1.txt'])
.pipe(makePutReq('https://www.example.com/put'))
I have renamed the function names to be super verbose.

Related

Write multiple files to http response with streams in nodejs

I have an array of files that I have to pack into a gzip archive and send them through http response on the fly. That means I can't store the whole file in the memory yet I have to synchronously pipe them into tar.entry or everything is going to break.
const tar = require('tar-stream'); //lib for tar stream
const { createGzip } = require('zlib'); //lib for gzip stream
//large list of huge files.
const files = [ 'file1', 'file2', 'file3', ..., 'file99999' ];
...
//http request handler:
const pack = tar.pack(); //tar stream, creates .tar
const gzipStream = createGzip(); //gzip stream so we could reduce the size
//pipe archive data trough gzip stream
//and send it to the client on the fly
pack.pipe(gzipStream).pipe(response);
//The issue comes here, when I need to pass multiple files to pack.entry
files.forEach(name => {
const src = fs.createReadStream(name); //create stream from file
const size = fs.statSync(name).size; //determine it's size
const entry = pack.entry({ name, size }); //create tar entry
//and this ruins everything because if two different streams
//writes smth into entry, it'll fail and throw an error
src.pipe(entry);
});
Basically I need for the pipe to complete sending data (smth like await src.pipe(entry);), but pipes in nodejs don't do that. So is there any way I could get around it?
Nevermind, just don't use forEach in this case

Why am I getting a NOENT using Node core module 'fs'

This a repeat question (not yet answered) but I have revised and tightened up the code. And, I have included the specific example. I am sorry to keep beating this drum, but I need help.
This is a Node API. I need to read and write JSON data. I am using the Node core module 'fs', not the npm package by the same name (or fs-extra). I have extracted the particular area of concern onto a standalone module that is shown here:
'use strict';
/*==================================================
This service GETs the list of ids to the json data files
to be processed, from a json file with the id 'ids.json'.
It returns and exports idsList (an array holding the ids of the json data files)
It also calls putIdsCleared to clear the 'ids.json' file for the next batch of processing
==================================================*/
// node modules
const fs = require('fs');
const config = require('config');
const scheme = config.get('json.scheme')
const jsonPath = config.get('json.path');
const url = `${scheme}${jsonPath}/`;
const idsID = 'ids.json';
const uri = `${url}${idsID}`;
let idsList = [];
const getList = async (uri) => {
await fs.readFile(uri, 'utf8', (err, data) => {
if (err) {
return(console.log( new Error(err.message) ));
}
return jsonData = JSON.parse(data);
})
}
// The idea is to get the empty array written back to 'ids.json' before returning to 'process.js'
const clearList = async (uri) => {
let data = JSON.stringify({'ids': []});
await fs.writeFile(uri, data, (err) => {
if (err) {
return (console.log( new Error(err.message) ));
}
return;
})
}
getList(uri);
clearList(uri)
console.log('end of idsList',idsList);
module.exports = idsList;
Here is the console output from the execution of the module:
Error: ENOENT: no such file or directory, open 'File:///Users/doug5solas/sandbox/libertyMutual/server/api/ids.json'
at ReadFileContext.fs.readFile [as callback]
(/Users/doug5solas/sandbox/libertyMutual/server/.playground/ids.js:24:33)
at FSReqWrap.readFileAfterOpen [as oncomplete] (fs.js:235:13)
Error: ENOENT: no such file or directory, open 'File:///Users/doug5solas/sandbox/libertyMutual/server/api/ids.json'
at fs.writeFile
(/Users/doug5solas/sandbox/libertyMutual/server/.playground/ids.js:36:34)
at fs.js:1167:7
at FSReqWrap.oncomplete (fs.js:141:20)
I am being told there is no such file or directory. However I can copy the uri (as shown in the error message)
File:///Users/doug5solas/sandbox/libertyMutual/server/api/ids.json
into the search bar of my browser and this is what is returned to me:
{
"ids": [
"5sM5YLnnNMN_1540338527220.json",
"5sM5YLnnNMN_1540389571029.json",
"6tN6ZMooONO_1540389269289.json"
]
}
This result is the expected result. I do not "get" why I can get the data manually but I cannot get it programmatically, using the same uri. What am I missing? Help appreciated.
Your File URI is in the wrong format.
It shouldn't contain the File:// protocol (that's a browser-specific thing).
I'd imagine you want C://Users/doug5solas/sandbox/libertyMutual/server/api/ids.json.
I solved the problem by going to readFileSync. I don't like it but it works and it is only one read.

No such file or directory when piping from request in Node

I need to request a .zip file from a URL then pass the contents to AdmZip
When attempting to pipe the output of the request library:
const zipFilePath = path.join(batchPath, this.zipFile.filename);
const out = fs.createWriteStream(zipFilePath);
const req = request.get(this.zipFile.url);
req.pipe(out);
req.on('end', function() {
console.log("I should be here, but I'm not");
});
I receive:
Error: ENOENT: no such file or directory, open
'C:\Users\Brandon\work\keystone4-projects\html-email\batch-content\5996588a3bc30010502bfa9e\test.zip'
What am I doing wrong?
Edit:
I added:
if (!fs.existsSync(batchPath)) {
fs.mkdirSync(batchPath);
}
before attempting to pipe the output and my function completed successfully.
Typically when you receive this error when writing a file, it means that the path leading up to the file being written does not exist.

node.js zlib returning 'Check Headers' error when gunzipping stream

So I am working on the nodeschool.io stream-adventure tutorial track and I'm having trouble with the last problem. The instructions say:
An encrypted, gzipped tar file will be piped in on process.stdin. To beat this
challenge, for each file in the tar input, print a hex-encoded md5 hash of the
file contents followed by a single space followed by the filename, then a
newline.
You will receive the cipher name as process.argv[2] and the cipher passphrase as
process.argv[3]. You can pass these arguments directly through to
`crypto.createDecipher()`.
The built-in zlib library you get when you `require('zlib')` has a
`zlib.createGunzip()` that returns a stream for gunzipping.
The `tar` module from npm has a `tar.Parse()` function that emits `'entry'`
events for each file in the tar input. Each `entry` object is a readable stream
of the file contents from the archive and:
`entry.type` is the kind of file ('File', 'Directory', etc)
`entry.path` is the file path
Using the tar module looks like:
var tar = require('tar');
var parser = tar.Parse();
parser.on('entry', function (e) {
console.dir(e);
});
var fs = require('fs');
fs.createReadStream('file.tar').pipe(parser);
Use `crypto.createHash('md5', { encoding: 'hex' })` to generate a stream that
outputs a hex md5 hash for the content written to it.
This is my attempt so far to work on it:
var tar = require('tar');
var crypto = require('crypto');
var zlib = require('zlib');
var map = require('through2-map');
var cipherAlg = process.argv[2];
var passphrase = process.argv[3];
var cryptoStream = crypto.createDecipher(cipherAlg, passphrase);
var parser = tar.Parse(); //emits 'entry' events per file in tar input
var gunzip = zlib.createGunzip();
parser.on('entry', function(e) {
e.pipe(cryptoStream).pipe(map(function(chunk) {
console.log(chunk.toString());
}));
});
process.stdin
.pipe(gunzip)
.pipe(parser);
I know it's not complete yet, but my issue is that when I try to run this, the input never gets piped to the tar file parsing part. It seems to hang up on the piping to gunzip. This is my exact error:
events.js:72
throw er; // Unhandled 'error' event
^
Error: incorrect header check
at Zlib._binding.onerror (zlib.js:295:17)
I'm totally stumped because the node documentation for Zlib has no mention of headers except for when it has examples with the http/request modules. There are a number of other questions regarding this error with node, but most use buffers rather than streams, so I couldn't find a relevant answer to my problem. All help is greatly appreciated
I actually figured it out, I was supposed to decrypt the stream before unzipping it.
So instead of:
process.stdin
.pipe(gunzip)
.pipe(parser);
it should be:
process.stdin
.pipe(cryptoStream)
.pipe(gunzip)
.pipe(parser);

Downloading a Gzip'd file

I'm attempting to download a file using the http module in node. While the file seems to download sucessfully, the resultant file cannot be opened using gzip. I've tried downloading the file through other methods, and that works, and I've tried using multiple ways to open the resultant gzip'd file, but all of those produce the same error.
I did attempt to use the request module, but there seemed to be no way of accessing the returned HTTP headers before the file was finished downloading, which I need because I'd like to offer some sort of visual indicator as to how long this file is going to take to download.
This is (roughly) the code that I've got so far.
var http = require('http');
var fs = require('fs');
var progress = 0;
downloadFile = function() {
http.get(FILE_URL, function(response) {
var maxBytes = parseInt(response.headers['content-length'], 10);
var dumpFile = fs.createWriteStream(FILENAME + '.dl');
response.pipe(dumpFile);
response
.on('data', function(chunk) {
progress += chunk.length;
// progressbar-type code here
})
.on('end', function() {
// pass
})
dumpFile.on('finish', function() {
dumpFile.close();
fs.rename(FILENAME + '.dl', FILENAME);
});
}
So my question: How would you advise I download a file, bearing in mind it's a large file and I need some sort of visual indicator for download progress? Should I give up on http? Or am I doing something monumentally stupid?
Thanks!

Resources