I have an endpoint on my Hapi server which builds a (sometimes) large zip file.
I want to return that file to the frontend, but I can't figure out how to make that work in the current version of Hapi (18). There were some major API changes recently, and all the examples I can find are completely out of date.
Some code that writes a test zip is below:
handler: async (request, h) => {
  let zip = new JSZip();
  zip.file('test.txt', 'derp derp');

  let stream = zip.generateNodeStream({streamFiles: true})
    .pipe(fs.createWriteStream('out.zip'))
    .on('finish', function () {
      // JSZip generates a readable stream with an "end" event,
      // but it is piped here into a writable stream, which emits a "finish" event.
      console.log("out.zip written.");
    });
But I can't seem to figure out what to do after that. I have tried all of the following so far:
// vanilla hapi
return zip;
// from Inert:
return h.file(zip);
// from Toys
return Toys.stream(zip);
// vanilla Hapi, take 2:
return h.reply(zip).type('application/zip');
But no luck. Ideally I'd like it to stream as the file is being built, which I have seen examples of people doing, but those examples were all for either an outdated version of Hapi or another framework altogether.
Thanks in advance!
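For anyone landing here, a minimal sketch of the direction that seems worth trying in Hapi 17+/18 (not a verified answer): hand the JSZip readable stream straight to h.response() instead of writing it to disk first.

handler: async (request, h) => {
  const zip = new JSZip();
  zip.file('test.txt', 'derp derp');

  // generateNodeStream() returns a Node readable stream; no need to write out.zip first
  const stream = zip.generateNodeStream({ streamFiles: true });

  return h.response(stream)
    .type('application/zip')
    .header('content-disposition', 'attachment; filename=out.zip;');
}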
Related
I'm trying to use the unzipper node module to extract and process a number of files (exact number is unknown). However, I can't seem to figure out how to know when all the files are processed. So far, my code looks like this:
s3.getObject(params).createReadStream()
  .pipe(unzipper.Parse())
  .on('entry', async (entry) => {
    var fileName = entry.path;
    if (fileName.match(someRegex)) {
      await processEntry(entry);
      console.log("Uploaded");
    } else {
      entry.autodrain();
      console.log("Drained");
    }
  });
I'm trying to figure out how to know that unzipper has gone through all the files (i.e., no more entry events are forthcoming) and all the entry handlers have finished so that I know I've finished processing all the files I care about.
I've tried experimenting with the close and finish events, but both trigger before console.log("Uploaded"); has printed, so that doesn't seem right.
Help?
Directly from the docs:
The parser emits finish and error events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.
Example:
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', entry => entry.autodrain())
  .promise()
  .then(() => console.log('done'), e => console.log('error', e));
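To also wait for the asynchronous entry handlers from the question (this part is not from the docs, just a sketch built on the snippet above): collect the processEntry promises and await them after the parser's promise resolves.

const promises = [];

s3.getObject(params).createReadStream()
  .pipe(unzipper.Parse())
  .on('entry', entry => {
    if (entry.path.match(someRegex)) {
      // keep the promise instead of awaiting inside the handler
      promises.push(processEntry(entry));
    } else {
      entry.autodrain();
    }
  })
  .promise()
  .then(() => Promise.all(promises))
  .then(() => console.log('All entries processed'), e => console.log('error', e));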
I have a gulp task that downloads a few JSON files from GitHub, then prompts the user for values to replace in those files. For example, I have an .ftpconfig that gets downloaded, and then the user is asked to enter hostname, username, password, and path.
Because the file first needs to be downloaded before it can be configured, and each file needs to be configured sequentially, I'm using quite a few nested callbacks. I'd like to change this "callback hell" system so that it utilizes async/await and/or promises instead, but I'm having a lot of difficulty understanding exactly why my code isn't working; it seems that promises fire their .then() functions asynchronously, which doesn't make sense to me.
My goals are as follows:
Download all config files asynchronously
Wait for all config files to finish downloading
Read existing settings from the config files
Prompt the user for changed settings in each config file synchronously
I've tried a number of approaches, none of which worked. I've discarded that code, but here's a rough recreation of the things I've tried:
Attempt #1:
return new Promise((resolve) => {
  // download files...
}).then((resolve) => {
  // configure first file...
}).then((resolve) => {
  // configure second file...
}).then((resolve) => {
  // configure third file...
});
Attempt #2:
const CONFIG_FILES = async () => {
  const bs_download = await generate_config("browsersync");
  const ftp_download = await generate_config("ftp");
  const rsync_download = await generate_config("rsync");
  return new Promise(() => {
    configure_json("browsersync");
  }).then(() => {
    configure_json("ftp");
  }).then(() => {
    configure_json("rsync");
  });
};
I'm sure I'm doing something very obviously wrong, but I'm not adept enough at JavaScript to see the problem. Any help would be greatly appreciated.
My gulp task can be found here:
gulpfile.js
gulp-tasks/config.js
Thanks to #EricB, I was able to figure out what I was doing wrong. It was mostly a matter of making my functions return promises as well.
https://github.com/JacobDB/new-site/blob/d119b8b3c22aa7855791ab6b0ff3c2e33988b4b2/gulp-tasks/config.js
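For readers who don't want to open the link, a rough sketch of that shape (assuming generate_config and configure_json are rewritten to return promises, which is the actual fix):

const CONFIG_FILES = async () => {
  // download all config files in parallel and wait for every download to finish
  await Promise.all([
    generate_config("browsersync"),
    generate_config("ftp"),
    generate_config("rsync")
  ]);

  // then configure each file one at a time, so the user is prompted sequentially
  await configure_json("browsersync");
  await configure_json("ftp");
  await configure_json("rsync");
};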
I am not sure where I am going wrong, but I think the event listener is getting invoked multiple times and parsing the files multiple times.
I have five files in the directory and they are getting parsed. However, the first PDF file (array index 0) gets parsed once, the next one twice, and the third one three times.
I want each file in the directory to be parsed once, creating a text file from the data extracted from the PDF.
The idea is to parse the PDF, get the content as text, and convert the text into JSON in a specific format.
To keep it simple, the plan is to complete one task first, then use the output of the code below to perform the next task.
I hope someone can help, point out where I am going wrong, and explain my mistake a bit so I understand it. (I'm new to JS and Node.)
Regards,
Jai
Using the module from here:
https://github.com/modesty/pdf2json
var fs = require('fs')
PDFParser = require('C:/Users/Administrator/node_modules/pdf2json/PDFParser')

var pdfParser = new PDFParser(this, 1)

fs.readdir('C:/Users/Administrator/Desktop/Project/Input/', function (err, pdffiles) {
  //console.log(pdffiles)
  pdffiles.forEach(function (pdffile) {
    console.log(pdffile)
    pdfParser.once("pdfParser_dataReady", function () {
      fs.writeFile('C:/Users/Administrator/Desktop/Project/Jsonoutput/' + pdffile, pdfParser.getRawTextContent())
      pdfParser.loadPDF('C:/Users/Administrator/Desktop/Project/Input/' + pdffile)
    })
  })
})
As mentioned in the comment, I'm just contributing some workaround ideas for the OP to temporarily resolve this issue.
Assuming performance is not an issue, you should be able to asynchronously parse the PDF files in a sequential manner, that is, only parse the next file when the previous one is done.
Unfortunately, I have never used the npm module pdf2json before, so it is difficult for me to test the code below. Pardon me if it requires some minor tweaks to work; syntactically it should be fine, as it was written in an IDE.
Example:
var fs = require('fs');
var PDFParser = require('C:/Users/Administrator/node_modules/pdf2json/PDFParser');

var parseFile = function (files, done) {
  var pdfFile = files.pop();
  if (pdfFile) {
    var pdfParser = new PDFParser();
    pdfParser.on("pdfParser_dataError", errData => { return done(errData); });
    pdfParser.on("pdfParser_dataReady", pdfData => {
      fs.writeFile('C:/Users/Administrator/Desktop/Project/Jsonoutput/' + pdfFile, JSON.stringify(pdfData), err => {
        if (err) { console.error(err); }
      });
      parseFile(files, done);
    });
    pdfParser.loadPDF('C:/Users/Administrator/Desktop/Project/Input/' + pdfFile);
  }
  else {
    return done(null, "All pdf files parsed.");
  }
};

fs.readdir('C:/Users/Administrator/Desktop/Project/Input/', function (err, pdffiles) {
  parseFile(pdffiles, (err, message) => {
    if (err) { console.error(err.parseError); }
    else { console.log(message); }
  });
});
In the code above, I have isolated the parsing logic into a separate function called parseFile. This function pops the next file off the array; if there are no files left, it invokes the done callback, otherwise it starts parsing the popped file.
When parsing is done, it recursively calls parseFile until the last file has been parsed.
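For reference (not part of the original answer), the same sequential idea can be written with promises and async/await. This is only a sketch that reuses the event names and paths from the code above and assumes a Node version with util.promisify:

var fs = require('fs');
var util = require('util');
var PDFParser = require('C:/Users/Administrator/node_modules/pdf2json/PDFParser');

var readdir = util.promisify(fs.readdir);
var writeFile = util.promisify(fs.writeFile);

// Wrap a single parse in a promise; a fresh parser per file avoids shared listeners.
function parseOne(pdfFile) {
  return new Promise(function (resolve, reject) {
    var pdfParser = new PDFParser();
    pdfParser.on('pdfParser_dataError', reject);
    pdfParser.on('pdfParser_dataReady', function (pdfData) {
      writeFile('C:/Users/Administrator/Desktop/Project/Jsonoutput/' + pdfFile, JSON.stringify(pdfData))
        .then(resolve, reject);
    });
    pdfParser.loadPDF('C:/Users/Administrator/Desktop/Project/Input/' + pdfFile);
  });
}

async function parseAll() {
  var pdffiles = await readdir('C:/Users/Administrator/Desktop/Project/Input/');
  for (var pdffile of pdffiles) {
    await parseOne(pdffile); // one file at a time, as in the callback version above
  }
  console.log('All pdf files parsed.');
}

parseAll().catch(function (err) { console.error(err); });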
I'm using node and express to handle file uploads and I'm streaming them directly to conversion services using multiparty/busboy and request.
Is there a way to verify that the streams have some certain filetypes before sending them to the corresponding providers? I tried https://github.com/mscdex/mmmagic to get the MIME type out of the first chunk(s) and it worked nicely. I was wondering if the following workflow might work somehow:
Buffer the file upload stream and check the incoming data for the MIME type.
When the first few chunks are checked and the MIME type is correct, empty the buffer into the request stream.
When the mime type turns out not to be correct, send an error message and return.
I tried to get this working but I seem to have some stream compatibility issues (node 0.8.x vs. node 0.10.x streams, which are not supported by the request library).
Are there any best-practices to solve this problem? Am I looking at it the wrong way?
EDIT: Thanks to Paul I came up with this code:
https://gist.github.com/chmanie/8520572
Besides checking the Content-Type header of the client's request, I'm not aware of a better or more clever way to check MIME types.
You can implement the solution you described above using a Transform stream. In this example, the transform stream buffers some arbitrary amount of data, then sends it to your MIME checking library. If everything is fine, it re-emits data. The subsequent chunks will be emitted as-is.
var stream = require('readable-stream');
var mmm = require('mmmagic');
var mimeChecker = new stream.Transform();
mimeChecker.data = [];
mimeChecker.mimeFound = false;
mimeChecker._transform = function (chunk, encoding, done) {
  var self = this;
  if (self.mimeFound) {
    self.push(chunk);
    return done();
  }
  self.data.push(chunk);
  if (self.data.length < 10) {
    return done();
  }
  else if (self.data.length === 10) {
    var buffered = Buffer.concat(this.data);
    new mmm.Magic(mmm.MAGIC_MIME_TYPE).detect(buffered, function (err, result) {
      if (err) return self.emit('error', err);
      if (result !== 'text/plain') return self.emit('error', new Error('Wrong MIME'));
      self.data.map(self.push.bind(self));
      self.mimeFound = true;
      return done();
    });
  }
};
You can then pipe this transform stream to any other stream, like a request stream (which fully supports Node 0.10 streams, by the way).
// Usage example
var fs = require('fs');
fs.createReadStream('input.txt').pipe(mimeChecker).pipe(fs.createWriteStream('output.txt'));
Edit: To be clearer about the incompatibility you encountered between Node 0.8 and 0.10 streams: when you attach a .on('data') listener to a stream, it switches into flowing mode (the old 0.8-style behavior), which means it will emit data even if the destination isn't listening. This is what could happen if you launch an asynchronous call to Magic.detect(): the data keeps flowing even while you're waiting for the result.
In Meteor, on the server side, I want to use the .find() function on a Collection and then get a Node ReadStream interface from the cursor that is returned. I've tried using .stream() on the cursor as described in the MongoDB docs seen here. However, I get the error "Object [object Object] has no method 'stream'", so it looks like Meteor collections don't have this option. Is there a way to get a stream from a Meteor Collection's cursor?
I am trying to export some data to CSV, and I want to pipe the data directly from the collection's stream into a CSV parser and then into the response going back to the user. I am able to get the response stream from the Router package we are using, and it's all working except for getting a stream from the collection. Fetching the array from find() and pushing it into the stream manually would defeat the purpose of a stream, since it would put everything in memory. I guess my other option is to use forEach on the collection and push the rows into the stream one by one, but this seems dirty when I could pipe the stream directly through the parser with a transform on it.
Here's some sample code of what I am trying to do:
response.writeHead(200, {'content-type': 'text/csv'});

// Set up a future
var fut = new Future();

var users = Users.find({}).stream();

CSV().from(users)
  .to(response)
  .on('end', function (count) {
    log.verbose('finished csv export');
    response.end();
    fut.ret();
  });

return fut.wait();
Have you tried creating a custom writable stream and piping to it?
Though this would only work if Users.find() supported .pipe() (again, only if the cursor returned by Users.find() inherits from a Node.js streamable object).
Kind of like
var stream = require('stream')
var util = require('util')

function StreamReader() {
  stream.Writable.call(this)
  this.data = ''
}
util.inherits(StreamReader, stream.Writable)

StreamReader.prototype._write = function (chunk, encoding, callback) {
  this.data = this.data + chunk.toString('utf8')
  callback()
}

var reader = new StreamReader()
reader.on('finish', function () {
  // reader.data contains the raw data in a string, so do what you need to make it
  // usable, i.e. split on ',' or whatever your data requires
  console.log(reader.data)
  db.close()
})

Users.find({}).pipe(reader)
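If Users.find() turns out not to support .pipe() at all, the forEach fallback the asker mentioned could look roughly like this sketch (toCsvRow is a hypothetical helper that formats a single user document as a CSV line):

response.writeHead(200, {'content-type': 'text/csv'});

// Fallback sketch: iterate the cursor and write rows one by one instead of piping a stream.
// toCsvRow is a hypothetical helper, not part of the question's code.
Users.find({}).forEach(function (user) {
  response.write(toCsvRow(user) + '\n');
});

response.end();
log.verbose('finished csv export');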