Downloading a Gzip'd file - node.js

I'm attempting to download a file using the http module in node. While the file seems to download successfully, the resultant file cannot be opened using gzip. Downloading the file through other methods works fine, and I've tried multiple ways of opening the resultant gzip'd file, but all of them produce the same error.
I did attempt to use the request module, but there seemed to be no way of accessing the returned HTTP headers before the file had finished downloading. I need those headers because I'd like to offer some sort of visual indicator of how long the download is going to take.
This is (roughly) the code that I've got so far:
var http = require('http');
var fs = require('fs');

var progress = 0;

var downloadFile = function() {
    http.get(FILE_URL, function(response) {
        var maxBytes = parseInt(response.headers['content-length'], 10);
        var dumpFile = fs.createWriteStream(FILENAME + '.dl');

        response.pipe(dumpFile);
        response
            .on('data', function(chunk) {
                progress += chunk.length;
                // progressbar-type code here
            })
            .on('end', function() {
                // pass
            });

        dumpFile.on('finish', function() {
            dumpFile.close();
            fs.rename(FILENAME + '.dl', FILENAME, function(err) {
                if (err) throw err;
            });
        });
    });
};
So my question: How would you advise I download a file, bearing in mind it's a large file and I need some sort of visual indicator for download progress? Should I give up on http? Or am I doing something monumentally stupid?
Thanks!
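A quick way to check whether the downloaded bytes are valid gzip at all is to decompress the stream as it arrives, using node's built-in zlib module. A minimal sketch using the same FILE_URL and FILENAME placeholders as above; if this pipeline errors, the data was corrupt before it ever reached the disk:
var http = require('http');
var fs = require('fs');
var zlib = require('zlib');

http.get(FILE_URL, function(response) {
    var gunzip = zlib.createGunzip();
    gunzip.on('error', function(err) {
        // fires if the downloaded bytes are not valid gzip data
        console.error('gunzip failed:', err.message);
    });
    // write the decompressed contents to disk as a sanity check
    response.pipe(gunzip).pipe(fs.createWriteStream(FILENAME + '.unzipped'));
});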

Redirect Readable object stdout process to file in node

I use an NPM library to parse markdown to HTML like this:
var Markdown = require('markdown-to-html').Markdown;
var md = new Markdown();
...
md.render('./test', opts, function(err) {
    md.pipe(process.stdout);
});
This outputs the result to my terminal as intended.
However, I need the result inside the execution of my node program. I thought about writing the output stream to a file and then reading it back in later, but I can't figure out a way to write the output to a file in the first place.
I tried playing around with var file = fs.createWriteStream('./test.html');, but node.js streams give me more headaches than results.
I've also looked into the library's repo and Markdown inherits from Readable via util like this:
var util = require('util');
var Readable = require('stream').Readable;
util.inherits(Markdown, Readable);
Any resources or advice would be highly appreciated. (I would also take another library for parsing the markdown, but this gave me the best results so far)
Actually, creating a writable file-stream and piping the markdown to this stream should work just fine. Try it with:
const fs = require('fs');

const writeStream = fs.createWriteStream('./output.html');

// in case of errors you should handle them
writeStream.on('error', function (err) {
    console.log(err);
});

md.render('./test', opts, function(err) {
    md.pipe(writeStream);
});
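If you want the rendered HTML in memory rather than on disk, you can also skip the file entirely and collect the stream into a string, since Markdown inherits from Readable. A minimal sketch, following the same render/pipe pattern as the question:
md.render('./test', opts, function(err) {
    if (err) throw err;
    var html = '';
    md.on('data', function(chunk) {
        // chunks arrive as Buffers; accumulate them as text
        html += chunk.toString();
    });
    md.on('end', function() {
        // html now holds the complete rendered document
        console.log(html);
    });
});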

nodejs/fs: writing a tar to memory buffer

I need to be able to tar a directory, and send this to a remote endpoint via HTTP PUT.
I could of course create the tar, save it to disk, then read it again and send it.
But I'd rather like to create the tar, then pipe it to some buffer and send it immediately. I haven't been able to achieve this.
Code so far:
var tar = require('tar');
var fs = require("fs");

var path = "/home/me/uploaddir";

function getTar(path, cb) {
    var buf = new Buffer('');
    var wbuf = fs.createWriteStream(buf);
    wbuf.on("finish", function() {
        cb(buf);
    });
    tar.c({file: ""}, [path]).pipe(wbuf);
}

getTar(path, function(tar) {
    //send the tar over http
});
This code results in:
fs.js:575
binding.open(pathModule._makeLong(path),
^
TypeError: path must be a string
at TypeError (native)
at Object.fs.open (fs.js:575:11)
I've also tried using an array as a buffer, but no joy.
The following solution creates the tar, then pipes it to a buffer and sends it immediately, with great speed thanks to the tar-fs library.
First install request (for simplified HTTP requests) and tar-fs (which provides filesystem bindings for tar-stream): npm i -S tar-fs request
var tar = require('tar-fs')
var request = require('request')
var fs = require('fs')

// pack specific files in the directory
function packTar (folderName, pathsArr) {
    return tar.pack(folderName, {
        entries: pathsArr
    })
}

// return put stream
function makePutReq (url) {
    return request.put(url)
}

packTar('./testFolder', ['test.txt', 'test1.txt'])
    .pipe(makePutReq('https://www.example.com/put'))
I have made the function names deliberately verbose.
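If you really do need the tar bytes in an in-memory Buffer first (say, to compute a Content-Length header before the PUT), a minimal sketch that collects the pack stream's chunks; this is plain readable-stream handling, nothing tar-fs specific:
function getTarBuffer(folderName, cb) {
    var chunks = [];
    tar.pack(folderName)
        .on('data', function(chunk) {
            chunks.push(chunk);
        })
        .on('end', function() {
            // hand back the whole archive as one Buffer
            cb(Buffer.concat(chunks));
        });
}

getTarBuffer('./testFolder', function(buf) {
    // send buf over http
});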

Compress an uncompressed xlsx file using node.js (Electron)

I have an unzipped xlsx file; in it I edit some files in order to generate a new xlsx file containing new data.
On Linux, to recompress the files into an xlsx, I just open a terminal in the folder where the xlsx's files are and type
find . -type f | xargs zip ../newfile.xlsx
The question now is: how can I do this using node.js?
The solution is to compress a direct list of the files contained in the xlsx; for some reason, if we try to compress the folder itself, the resulting file is corrupted.
The code looks like this if you use JSZip:
var fs = require('fs');
var JSZip = require("jszip");

var zip = new JSZip();

// the member files of the xlsx, listed explicitly
var file = [
    "_rels/.rels",
    "docProps/core.xml",
    "docProps/app.xml",
    "docProps/custom.xml",
    "[Content_Types].xml",
    "xl/_rels/workbook.xml.rels",
    "xl/styles.xml",
    "xl/pivotTables/_rels/pivotTable3.xml.rels",
    "xl/pivotTables/_rels/pivotTable1.xml.rels",
    "xl/pivotTables/_rels/pivotTable2.xml.rels",
    "xl/pivotTables/pivotTable3.xml",
    "xl/pivotTables/pivotTable1.xml",
    "xl/pivotTables/pivotTable2.xml",
    "xl/workbook.xml",
    "xl/worksheets/_rels/sheet2.xml.rels",
    "xl/worksheets/_rels/sheet1.xml.rels",
    "xl/worksheets/_rels/sheet3.xml.rels",
    "xl/worksheets/sheet4.xml",
    "xl/worksheets/sheet1.xml",
    "xl/worksheets/sheet3.xml",
    "xl/worksheets/sheet2.xml",
    "xl/sharedStrings.xml",
    "xl/pivotCache/_rels/pivotCacheDefinition1.xml.rels",
    "xl/pivotCache/pivotCacheDefinition1.xml",
    "xl/pivotCache/pivotCacheRecords1.xml"
];

for (var i = 0; i < file.length; i++) {
    zip.file(file[i], fs.readFileSync("/home/user/xlsx_FILES/" + file[i]));
}

zip.generateAsync({type: "blob"}).then(function(content) {
    // see FileSaver.js
    saveAs(content, "yourfile.xlsx");
});
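In the Electron main process (or plain node), where Blob and saveAs are not available, JSZip can instead emit a node Buffer that you write with fs; a small sketch of that variant (the output path is just an example):
zip.generateAsync({type: "nodebuffer"}).then(function(content) {
    // write the zip straight to disk instead of going through FileSaver.js
    fs.writeFileSync("/home/user/newfile.xlsx", content);
});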
Take a look at archiver, a compression library for nodejs. The library's docs look comprehensive. It also allows you to append archives and take advantage of streaming APIs for appending to and creating new archives.
Here is an example snippet from their docs which shows how to use the library:
// require modules
var fs = require('fs');
var archiver = require('archiver');

// create a file to stream archive data to.
var output = fs.createWriteStream(__dirname + '/example.zip');
var archive = archiver('zip', {
    store: true // Sets the compression method to STORE.
});

// listen for all archive data to be written
output.on('close', function() {
    console.log(archive.pointer() + ' total bytes');
    console.log('archiver has been finalized and the output file descriptor has closed.');
});

// good practice to catch this error explicitly
archive.on('error', function(err) {
    throw err;
});

// pipe archive data to the file
archive.pipe(output);
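Note that the snippet above only wires up the pipeline; nothing is written until entries are appended and finalize() is called. A rough sketch of how that could look for the xlsx case, reusing the file list and base path from the JSZip answer above:
// append each xlsx member file under its in-archive name
file.forEach(function(name) {
    archive.file("/home/user/xlsx_FILES/" + name, { name: name });
});

// finalize the archive; the 'close' handler above fires when writing is done
archive.finalize();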

I want to pipe a readable css file to the http response

I have an issue with outputting a readable stream to the http response.
Behind the scenes there are regular request and response streams coming from the generic http createServer. I check whether req.url ends in .css, and if so I create a readable stream of that file. I see the css contents in console.log, with the right css code I expect. Then I try to pipe the readable css file stream to the response, but in Chrome the response is blank when I inspect it (it is a 200 response, though). Any thoughts at first glance? I've tried different variations of where I have code commented out.
router.addRoute("[a-aA-z0-9]{1,50}.css$", function(matches) {
    var cssFile = matches[0];
    var pathToCss = process.cwd() + "/" + cssFile;
    // takes care of os diffs regarding path delimiters and such
    pathToCss = path.normalize(pathToCss);
    console.log(matches);
    console.log("PATH TO CSS");
    console.log(pathToCss);

    var readable = fs.createReadStream(pathToCss);

    var write = function(chunk) {
        this.queue(chunk.toString());
        console.log(chunk.toString());
    };
    var end = function() {
        this.queue(null);
    };
    var thru = through(write, end);

    //req.on("end",function(){
    res.pipe(readable.pipe(thru)).pipe(res);
    //res.end();
    //});
});
You need to pipe your readable stream into your through-stream, and then pipe that into the response:
readable.pipe(thru).pipe(res);
Edit: for preparing your css path, just use path.join instead of concatenating the path and normalizing it:
var pathToCss = path.join(process.cwd(), cssFile);
I separated this css route out from my normal html-producing routes. The problem was that the normal routes in my router object returned strings, like res.end(compiled_html_str), and the css file's readable stream was going through that same routing function. I fixed it by isolating the css handling from my router:
var cssMatch = [];
if (cssMatch = req.url.match(/.+\/(.+\.css$)/)) {
    // writeHead takes a status code first, then the headers
    res.writeHead(200, {"Content-Type": "text/css"});
    var cssFile = cssMatch[1];
    var pathToCss = process.cwd() + "/" + cssFile;
    // takes care of os diffs regarding path delimiters and such
    pathToCss = path.normalize(pathToCss);
    console.log(cssMatch);
    console.log("PATH TO CSS");
    console.log(pathToCss);

    var readable = fs.createReadStream(pathToCss);
    var cssStr = "";
    readable.on("data", function(chunk) {
        cssStr += chunk.toString();
    });
    readable.on("end", function() {
        res.end(cssStr);
    });
}
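Buffering the whole file into cssStr works, but since the goal was piping, the same route can stream the file straight to the response once the header is written; a minimal sketch:
if (cssMatch = req.url.match(/.+\/(.+\.css$)/)) {
    res.writeHead(200, {"Content-Type": "text/css"});
    var pathToCss = path.join(process.cwd(), cssMatch[1]);
    // pipe ends the response automatically when the file is exhausted
    fs.createReadStream(pathToCss).pipe(res);
}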

Why append rather than write when using knox / node.js to grab file from Amazon s3

I'm experimenting with the knox module for node.js as a way of managing some small files in an Amazon S3 bucket. Everything works fine stand-alone: I can upload a file, download a file, etc. However, I want to be able to download a file on a recurring schedule. When I modify the code to run on an interval, each downloaded file is appended to the previous one instead of overwriting it.
I'm not sure if I've made a mistake in the file write code or in the knox handling code. I've tried several different write approaches (writeFile, writeStream, etc.) and I've looked at the knox source code. Nothing obvious to me stands out as a problem. Here's the code I'm using:
knox = require('knox');
fs = require('fs');

var downFile = DOWNFILE;
var downTxt = '';
var timer = INTERVAL;
var path = S3PATH + downFile;

setInterval(function() {
    var s3client = knox.createClient({
        key: '********************',
        secret: '**********************************',
        bucket: '********'
    });
    s3client.get(path).on('response', function(response) {
        response.setEncoding('ascii');
        response.on('data', function(chunk) {
            downTxt += chunk;
        });
        response.on('end', function() {
            fs.writeFileSync(downFile, downTxt, 'ascii');
        });
    }).end();
}, timer);
The problem is the placement of var downTxt = '';. That is the only place downTxt is set to blank, so each new request appends its data to what was fetched before, because the accumulator is never cleared. The simplest fix is to move that line to just before the setEncoding line.
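Concretely, the minimal change to the question's code is just moving that one declaration (only the relevant part shown):
s3client.get(path).on('response', function(response) {
    var downTxt = ''; // start from empty on every download
    response.setEncoding('ascii');
    response.on('data', function(chunk) {
        downTxt += chunk;
    });
    response.on('end', function() {
        fs.writeFileSync(downFile, downTxt, 'ascii');
    });
}).end();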
However, the way you are processing the data is unnecessarily complicated. Try something like this instead. You don't need to recreate the client every time, and setting the encoding will just break things if you are downloading non-text files (it makes no difference for text files). Next, you shouldn't collect the data manually; you can start writing it to the file as soon as you receive it. Lastly, since the response is a standard readable stream, you don't need to monitor the 'data' event, because you can just use pipe.
var knox = require('knox'),
    fs = require('fs'),
    downFile = DOWNFILE,
    timer = INTERVAL,
    path = S3PATH + downFile,
    s3client = knox.createClient({
        key: '********************',
        secret: '**********************************',
        bucket: '********'
    });

(function downloadFile() {
    var str = fs.createWriteStream(downFile);
    s3client.get(path).on('response', function(response) {
        // the response is a plain readable stream, so pipe it straight to disk
        response.pipe(str);
    }).end();
    str.on('close', function() {
        setTimeout(downloadFile, timer);
    });
})();
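One caveat worth adding: neither stream above handles errors, so a dropped connection or a disk failure would crash the process. A sketch of the guards you might add around the same pipe:
s3client.get(path).on('response', function(response) {
    response.on('error', function(err) {
        // network error mid-download; a partial file may be left on disk
        console.error('download failed:', err);
    });
    response.pipe(str);
}).end();

str.on('error', function(err) {
    // disk error while writing
    console.error('write failed:', err);
});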
