Streaming a zip download from cloud functions - node.js

I have a firebase cloud function that uses express to streams a zip file of images to the client. When I test the cloud function locally it works fine. When I upload to firebase I get this error:
Error: Can't set headers after they are sent.
What could be causing this error? Memory limit?
export const zipFiles = async(name, params, response) => {
const zip = archiver('zip', {zlib: { level: 9 }});
const [files] = await storage.bucket(bucketName).getFiles({prefix:`${params.agent}/${params.id}/deliverables`});
if(files.length){
response.attachment(`${name}.zip`);
response.setHeader('Content-Type', 'application/zip');
response.setHeader('Access-Control-Allow-Origin', '*')
zip.pipe(output);
response.on('close', function() {
return output.send('OK').end(); // <--this is the line that fails
});
files.forEach((file, i) => {
const reader = storage.bucket(bucketName).file(file.name).createReadStream();
zip.append(reader, {name: `${name}-${i+1}.jpg`});
});
zip.finalize();
}else{
output.status(404).send('Not Found');
}

What Frank said in comments is true. You need to decide all your headers, including the HTTP status response, before you start sending any of the content body.
If you intend to express that you're sending a successful response, simply say output.status(200) in the same way that you did for your 404 error. Do that up front. When you're piping a response, you don't need to do anything to close the response in the end. When the pipe is done, the response will automatically be flushed and finalized. You're only supposed to call end() when you want to bail out early without sending a response at all.
Bear in mind that Cloud Functions only supports a maximum payload of 10MB (read more about limits), so if you're trying to zip up more than that total, it won't work. In fact, there is no "streaming" or chunked responses at all. The entire payload is being built in memory and transferred out as a unit.

Related

busboy wait for field data before accepting file upload

Is something like this even possible, or are there better ways to do this? Is what Im doing even a good idea, or is this a bad approach?
What I want to do is upload a file to my nodejs server. Along with the file I want to send some meta data. The meta data will determine if the file can be saved and the upload accepted, or if it should be rejected and sending a 403 response.
I am using busboy and I am sending FormData from my client side.
The example below is very much simplified:
Here is a snippet of the client side code.
I am appending the file as well as the meta data to the form
const formData = new FormData();
formData.append('name', JSON.stringify({name: "John Doe"}));
formData.append('file', this.selectedFile, this.selectedFile.name);
Here is the nodejs side:
exports.Upload = async (req, res) => {
try {
var acceptUpload = false;
const bb = busboy({ headers: req.headers });
bb.on('field', (fieldname, val) => {
//Verify data here before accepting file upload
var data = JSON.parse(val);
if (val.name === 'John Doe') {
acceptUpload = true;
} else {
acceptUpload = false;
}
});
bb.on('file', (fieldname, file, filename, encoding, mimetype) => {
if (acceptUpload) {
const saveTo = '/upload/file.txt'
file.pipe(fs.createWriteStream(saveTo));
}else{
response = {
message: 'Not Authorized'
}
res.status(403).json(response);
}
});
bb.on('finish', () => {
response = {
message: 'Upload Successful'
}
res.status(200).json(response);
});
req.pipe(bb);
} catch (error) {
console.log(error)
response = {
message: error.message
}
res.status(500).json(response);
}
}
So basically, is it even possible for the 'field' event-handler to wait for the 'file' event handler? How could one verify some meta data before accepting a file upload?
How can I do validation of all data in the form data object, before accepting the file upload? Is this even possible, or are there other ways of uploading files with this kind of behaviour? I am considering even adding data to the request header, but this does not seem like the ideal solution.
Update
As I suspected, nothing is waiting. Which ever way I try, the upload first has to be completed, only then after is it rejected with a 403
Another Update
Ive tried the same thing with multer and have similar results. Even when I can do the validation, the file is completely uploaded from the client side. Once the upload is complete, only then the request is rejected. The file, however, never gets stored, even though it is uploaded in its entirety.
With busboy, nothing is written to the server if you do not execute the statement file.pipe(fs.createWriteStream(saveTo));
You can prevent more data from even being uploaded to the server by executing the statement req.destroy() in the .on("field", ...) or the .on("file", ...) event handler, even after you have already evaluated some of the fields. Note however, that req.destroy() destroys not only the current HTTP request but the entire TCP connection, which might otherwise have been reused for subsequent HTTP requests. (This applies to HTTP/1.1, in HTTP/2 the relationship between connections and requests is different.)
At any rate, it has no effect on the current HTTP request if everything has already been uploaded. Therefore, whether this saves any network traffic depends on the size of the file. And if the decision whether to req.destroy() involves an asynchronous operation, such as a database lookup, then it may also come too late.
Compare
> curl -v -F name=XXX -F file=#<small file> http://your.server
* We are completely uploaded and fine
* Empty reply from server
* Closing connection 0
curl: (52) Empty reply from server
with
> curl -v -F name=XXX -F file=#<large file> http://your.server
> Expect: 100-continue
< HTTP/1.1 100 Continue
* Send failure: Connection was reset
* Closing connection 0
curl: (55) Send failure: Connection was reset
Note that the client sets the Expect header before uploading a large file. You can use that fact in connection with a special request header name in order to block the upload completely:
http.createServer(app)
.on("checkContinue", function(req, res) {
if (req.headers["name"] === "John Doe") {
res.writeContinue(); // sends HTTP/1.1 100 Continue
app(req, res);
} else {
res.statusCode = 403;
res.end("Not authorized");
}
})
.listen(...);
But for small files, which are uploaded without the Expect request header, you still need to check the name header in the app itself.

When an Express.js middleware modifies the chunk that goes into res.end(), response times go up by 10x

Node.js version: 14.16, Express version: 4.17
We use express-winston for logging our express responses. Since we want additional information to go into our logs, that we don't want our end-users to see, we decided to include a middleware that wraps res.end(), so as to intercept the chunk sent by express-winston, and remove the additional data.
For reference here's the line of code in express-winston that calls res.end(), whose chunk we want to replace before the response is sent to the end-user, without altering the chunk that is logged:
https://github.com/bithavoc/express-winston/blob/bdba1d39965f83b003178646d213cd974b090326/index.js#L317
Here is a sample middleware that we wrote:
module.exports.responseMiddleware = (req, res, next) => {
const { end } = res;
res.end = (chunk, encoding) => {
res.end = end;
// If alterBody is enabled, the chunk sent to res.end is modified
const resultChunk = req.body.alterBody === 'yes'
? Buffer.from(JSON.stringify({}))
: chunk;
res.end(resultChunk, encoding);
};
next();
};
The original response is sent with a call to res.status(...).json(...)
What we found was that when we enable alterBody, the response time goes up by 10x (from 500ms to 5s).
What could be the reason for this? Is there a way that we can maintain the original response time, while also logging and sending two different chunks?

Node.js - Stream Binary Data Straight from Request to Remote server

I've been trying to stream binary data (PDF, images, other resources) directly from a request to a remote server but have had no luck so far. To be clear, I don't want to write the document to any filesystem. The client (browser) will make a request to my node process which will subsequently make a GET request to a remote server and directly stream that data back to the client.
var request = require('request');
app.get('/message/:id', function(req, res) {
// db call for specific id, etc.
var options = {
url: 'https://example.com/document.pdf',
encoding: null
};
// First try - unsuccessful
request(options).pipe(res);
// Second try - unsuccessful
request(options, function (err, response, body) {
var binaryData = body.toString('binary');
res.header('content-type', 'application/pdf');
res.send(binaryData);
});
});
Putting both data and binaryData in a console.log show that the proper data is there but the subsequent PDF that is downloaded is corrupt. I can't figure out why.
Wow, never mind. Found out Postman (Chrome App) was hijacking the request and response somehow. The // First Try example in my code excerpt works properly in browser.

node request pipe hanging after a few hours

I have an endpoint in a node app which is used to download images
var images = {
'car': 'http://someUrlToImage.jpg',
'boat': 'http://someUrlToImage.jpg',
'train': 'http://someUrlToImage.jpg'
}
app.get('/api/download/:id', function(req, res){
var id = req.params.id;
res.setHeader("content-disposition", "attachment; filename=image.jpg");
request.get(images[id]).pipe(res);
});
Now this code works fine, but after a few hours of the app running, the endpoint just hangs.
I am monitoring the memory usage of the app, which remains consistent, and any other endpoints which just return some JSON respond as normal so it is not as if the event loop is somehow being blocked. Is there a gotcha of some kind that I am missing when using the request module to pipe a response? Or is there a better solution to achieve this?
I am also using the Express module.
You should add an error listener on your request because errors are not passed in pipes. That way, if your request has an error, it will close the connection and you'll get the reason.
request
.get(...)
.on('error', function(err) {
console.log(err);
res.end();
})
.pipe(res)

hitting a multipart url in nodejs

I have a client code using form-data module to hit a url that returns a content-type of image/jpeg. Below is my code
var FormData = require('form-data');
var fs = require('fs');
var form = new FormData();
//form.append('POLICE', "hello");
//form.append('PAYSLIP', fs.createReadStream("./Desert.jpg"));
console.log(form);
//https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfp1/v/t1.0- 1/c8.0.50.50/p50x50/10934065_1389946604648669_2362155902065290483_n.jpg?oh=13640f19512fc3686063a4703494c6c1&oe=55ADC7C8&__gda__=1436921313_bf58cbf91270adcd7b29241838f7d01a
form.submit({
protocol: 'https:',
host: 'fbcdn-profile-a.akamaihd.net',
path: '/hprofile-ak-xfp1/v/t1.0-1/c8.0.50.50/p50x50/10934065_1389946604648669_2362155902065290483_n.jpg?oh=13640f19512fc3686063a3494c6c1&oe=55ADCC8&__gda__=1436921313_bf58cbf91270adcd7b2924183',
method: 'get'
}, function (err, res) {
var data = "";
res.on("data", function (chunks) {
data += chunks;
});
res.on("end", function () {
console.log(data);
console.log("Response Headers - " + JSON.stringify(res.headers));
});
});
I'm getting some chunk data and the response headers i received was
{"last-modified":"Thu, 12 Feb 2015 09:49:26 GMT","content-type":"image/jpeg","timing-allow-origin":"*","access-control-allow-origin":"*","content-length":"1443","cache-control":"no-transform, max-age=1209600","expires":"Thu, 30 Apr 2015 07:05:31 GMT","date":"Thu, 16 Apr 2015 07:05:31 GMT","connection":"keep-alive"}
I am now stuck as how to process the response that i received to a proper image.I tried base64 decoding but it seemed to be a wrong approach any help will be much appreciated.
I expect that data, once the file has been completely downloaded, contains a Buffer.
If that is the case, you should write the buffer as is, without any decoding, to a file:
fs.writeFile('path/to/file.jpg', data, function onFinished (err) {
// Handle possible error
})
See fs.writeFile() documentation - you will see that it accepts either a string or a buffer as data input.
Extra awesomeness by using streams
Since the res object is a readable stream, you can simply pipe the data directly to a file, without keeping it in memory. This has the added benefit that if you download really large file, Node.js will not have to keep the whole file in memory (as it does now), but will write it to the filesystem continuously as it arrives.
form.submit({
// ...
}, function (err, res) {
// res is a readable stream, so let's pipe it to the filesystem
var file = fs.createWriteStream('path/to/file.jpg')
res.on('end', function writeDone (err) {
// File is saved, unless err happened
})
.pipe(file) // Send the incoming file to the filesystem
})
The chunk you got is the raw image. Do whatever it is you want with the image, save it to disk, let the user download it, whatever.
So if I understand your question clearly, you want to download a file from an HTTP endpoint and save it to your computer, right? If so, you should look into using the request module instead of using form-data.
Here's a contrived example for downloading things using request:
var fs = require('fs');
var request = require('request')
request('http://www.example.com/picture.jpg')
.pipe(fs.createWriteStream('picture.jpg'))
Where 'picture.jpg' is the location to save to disk. You can open it up using a normal file browser.

Resources