I am using Node.js with some additional modules to do web page scraping and media item identification for a set of websites.
The Node server basically returns JSON describing all the items identified on the page along with their associated metadata. The JSON data is generated correctly (I can see it in the server logs), but when I write it to the client the JSON response is truncated for some reason.
I tested this with all browsers and with REST clients, and it seems to point to an issue with response.write(response, 'utf-8'), which may not be sending the whole payload, or the connection is getting closed for some reason.
I verified that there is no chunking involved for my test cases, so there is no question of the connection being aggressively closed by the client while it is still waiting for the next chunk of data; response.write in this case returns true, which implies that all the data has been written out to the client.
Any pointers as to what could be causing the connection to be terminated or the response to be truncated? JSON responses of smaller sizes are received correctly by the client.
Code:
return parseDOM(page, url, function(err, response){
    if(err){
        res.writeHeader(200, {'Content-Type':'application/json'});
        res.end('Error Parsing DOM from ' + url);
        err.message = 'Error Parsing DOM';
        callback(err, req, res, targetUrl);
        return;
    }
    else {
        if(response){
            res.writeHeader(200, {'Content-Type':'application/json', 'Content-Length':response.length});
            console.log(response);
            res.write(response, 'UTF-8');
            res.end();
            callback(null, req, res, targetUrl);
            return;
        }
    }
});
Sorry, my bad. I see that the content length is wrong. Identified the solution via this issue:
Node.js cuts off files when serving over HTTPS
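For anyone hitting the same thing: if the response contains multi-byte UTF-8 characters, String.prototype.length counts characters rather than bytes, so the declared Content-Length comes out too small and the body gets cut off. A minimal sketch of the fix, assuming the same variables as above, is to compute the byte length (or simply omit Content-Length and let Node handle it):

if (response) {
    res.writeHead(200, {
        'Content-Type': 'application/json',
        // Content-Length must be the byte length, not the character count,
        // otherwise multi-byte UTF-8 responses get truncated by the client.
        'Content-Length': Buffer.byteLength(response, 'utf-8')
    });
    res.write(response, 'utf-8');
    res.end();
    callback(null, req, res, targetUrl);
    return;
}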
Is something like this even possible, or are there better ways to do this? Is what I'm doing even a good idea, or is this a bad approach?
What I want to do is upload a file to my Node.js server. Along with the file I want to send some metadata. The metadata will determine whether the file can be saved and the upload accepted, or whether it should be rejected with a 403 response.
I am using busboy and I am sending FormData from my client side.
The example below is very much simplified:
Here is a snippet of the client side code.
I am appending the file as well as the metadata to the form:
const formData = new FormData();
formData.append('name', JSON.stringify({name: "John Doe"}));
formData.append('file', this.selectedFile, this.selectedFile.name);
Here is the nodejs side:
const busboy = require('busboy');
const fs = require('fs');

exports.Upload = async (req, res) => {
    try {
        var acceptUpload = false;
        const bb = busboy({ headers: req.headers });
        bb.on('field', (fieldname, val) => {
            // Verify data here before accepting the file upload
            var data = JSON.parse(val);
            if (data.name === 'John Doe') {
                acceptUpload = true;
            } else {
                acceptUpload = false;
            }
        });
        bb.on('file', (fieldname, file, filename, encoding, mimetype) => {
            if (acceptUpload) {
                const saveTo = '/upload/file.txt';
                file.pipe(fs.createWriteStream(saveTo));
            } else {
                const response = {
                    message: 'Not Authorized'
                };
                res.status(403).json(response);
            }
        });
        bb.on('finish', () => {
            const response = {
                message: 'Upload Successful'
            };
            res.status(200).json(response);
        });
        req.pipe(bb);
    } catch (error) {
        console.log(error);
        const response = {
            message: error.message
        };
        res.status(500).json(response);
    }
};
So basically, is it even possible to make the 'file' event handler wait for the 'field' event handler? How could one verify some metadata before accepting a file upload?
How can I validate all of the data in the FormData object before accepting the file upload? Is this even possible, or are there other ways of uploading files with this kind of behaviour? I am even considering adding the data to the request headers, but this does not seem like the ideal solution.
Update
As I suspected, nothing is waiting. Whichever way I try, the upload has to complete first; only then is it rejected with a 403.
Another Update
I've tried the same thing with multer and got similar results. Even when I can do the validation, the file is still uploaded in its entirety from the client side; only once the upload is complete is the request rejected. The file, however, never gets stored, even though it is uploaded completely.
With busboy, nothing is written to the server if you do not execute the statement file.pipe(fs.createWriteStream(saveTo));
You can prevent more data from even being uploaded to the server by executing the statement req.destroy() in the .on("field", ...) or the .on("file", ...) event handler, even after you have already evaluated some of the fields. Note however, that req.destroy() destroys not only the current HTTP request but the entire TCP connection, which might otherwise have been reused for subsequent HTTP requests. (This applies to HTTP/1.1, in HTTP/2 the relationship between connections and requests is different.)
At any rate, it has no effect on the current HTTP request if everything has already been uploaded. Therefore, whether this saves any network traffic depends on the size of the file. And if the decision whether to req.destroy() involves an asynchronous operation, such as a database lookup, then it may also come too late.
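As a minimal sketch of that early rejection, reusing the field names from the question (the JSON parsing and the exact check here are assumptions):

bb.on('field', (fieldname, val) => {
    var data = {};
    try { data = JSON.parse(val); } catch (e) { /* ignore malformed metadata */ }
    if (fieldname === 'name' && data.name !== 'John Doe') {
        // Destroys the whole TCP connection (see the caveat above); the client
        // sees a reset rather than a 403, but no further file data is transferred.
        req.destroy();
    }
});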
Compare
> curl -v -F name=XXX -F file=@<small file> http://your.server
* We are completely uploaded and fine
* Empty reply from server
* Closing connection 0
curl: (52) Empty reply from server
with
> curl -v -F name=XXX -F file=@<large file> http://your.server
> Expect: 100-continue
< HTTP/1.1 100 Continue
* Send failure: Connection was reset
* Closing connection 0
curl: (55) Send failure: Connection was reset
Note that the client sets the Expect header before uploading a large file. You can use that fact in connection with a special request header name in order to block the upload completely:
http.createServer(app)
    .on("checkContinue", function(req, res) {
        if (req.headers["name"] === "John Doe") {
            res.writeContinue(); // sends HTTP/1.1 100 Continue
            app(req, res);
        } else {
            res.statusCode = 403;
            res.end("Not authorized");
        }
    })
    .listen(...);
But for small files, which are uploaded without the Expect request header, you still need to check the name header in the app itself.
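For completeness, that app-level check could be a small piece of middleware; the route path here is just an example:

app.use('/upload', (req, res, next) => {
    // Small uploads arrive without Expect: 100-continue, so the
    // checkContinue handler above never runs; repeat the check here.
    if (req.headers['name'] !== 'John Doe') {
        return res.status(403).json({ message: 'Not Authorized' });
    }
    next();
});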
I have a Firebase Cloud Function that uses Express to stream a zip file of images to the client. When I test the cloud function locally it works fine, but when I deploy it to Firebase I get this error:
Error: Can't set headers after they are sent.
What could be causing this error? Memory limit?
export const zipFiles = async(name, params, response) => {
    const zip = archiver('zip', {zlib: { level: 9 }});
    const [files] = await storage.bucket(bucketName).getFiles({prefix:`${params.agent}/${params.id}/deliverables`});
    if(files.length){
        response.attachment(`${name}.zip`);
        response.setHeader('Content-Type', 'application/zip');
        response.setHeader('Access-Control-Allow-Origin', '*');
        zip.pipe(output);
        response.on('close', function() {
            return output.send('OK').end(); // <-- this is the line that fails
        });
        files.forEach((file, i) => {
            const reader = storage.bucket(bucketName).file(file.name).createReadStream();
            zip.append(reader, {name: `${name}-${i+1}.jpg`});
        });
        zip.finalize();
    }else{
        output.status(404).send('Not Found');
    }
}
What Frank said in the comments is true. You need to decide on all your headers, including the HTTP status code, before you start sending any of the content body.
If you intend to express that you're sending a successful response, simply say output.status(200) in the same way that you did for your 404 error. Do that up front. When you're piping a response, you don't need to do anything to close the response in the end. When the pipe is done, the response will automatically be flushed and finalized. You're only supposed to call end() when you want to bail out early without sending a response at all.
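Applied to the function above, a sketch of the fix could look like this (keeping the question's variable names and assuming output and response refer to the same Express response object, written as response throughout):

export const zipFiles = async (name, params, response) => {
    const zip = archiver('zip', { zlib: { level: 9 } });
    const [files] = await storage.bucket(bucketName).getFiles({ prefix: `${params.agent}/${params.id}/deliverables` });
    if (files.length) {
        // Decide status and headers up front, before any body bytes are written.
        response.status(200);
        response.attachment(`${name}.zip`);
        response.setHeader('Content-Type', 'application/zip');
        response.setHeader('Access-Control-Allow-Origin', '*');
        // Pipe the archive straight into the response; no 'close' handler and
        // no end() call are needed, the pipe finishes the response by itself.
        zip.pipe(response);
        files.forEach((file, i) => {
            const reader = storage.bucket(bucketName).file(file.name).createReadStream();
            zip.append(reader, { name: `${name}-${i + 1}.jpg` });
        });
        zip.finalize();
    } else {
        response.status(404).send('Not Found');
    }
};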
Bear in mind that Cloud Functions only supports a maximum payload of 10MB (read more about limits), so if you're trying to zip up more than that total, it won't work. In fact, there is no "streaming" or chunked responses at all. The entire payload is being built in memory and transferred out as a unit.
I am developing an API which takes XML input containing IDs for media and returns XML output with details of the given IDs. I am facing a problem when sending the response to the second of two simultaneous requests; the second request just keeps showing "Loading" in Postman.
What I am doing is calling a function in app.post which parses the media and returns the output in a callback, where I send it using res.send, but it only works for a single request.
When making parallel requests to the same API, it either hangs or gives "Can't set headers after they are sent". I am using res.send, and res.send is the only way I can send the response (even next doesn't work).
var getCompositeData = function(req, res, next){
    abc.getData(req.body, function(err, xmlOutput){
        if(err){
            console.log("error");
        } else {
            xmlData = xmlOutput
            return next()
        }
    });
};

app.post(apiUrl, [
    rawBodyParser({
        type: 'application/xml'
    }),
    app.oauth.authorise()
], getCompositeData, function (req, res) {
    res.setHeader('Content-Type', 'application/xml');
    res.send(xmlData);
});
There are several issues with your code:
if (err) {
    console.log("error");
}
If an error occurs, you still need to make sure a response will be sent back, otherwise the request will stall until a timeout happens. You can pass an error to next, and Express will handle it:
if (err) {
    return next(err);
}
Next problem:
xmlData = xmlOutput
xmlData is an undeclared variable, which gets overwritten with each request. If two requests happen at (almost) the same time, it's likely that one client gets back an incorrect response (remember, Node.js runs JS code in a single thread; there is no thread-local storage, so xmlData gets shared between all requests).
A good place to "store" this sort of data is in res.locals:
res.locals.xmlData = xmlOutput;
return next();
// and later:
res.send(res.locals.xmlData);
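Putting both fixes together, the handler could look roughly like this:

var getCompositeData = function (req, res, next) {
    abc.getData(req.body, function (err, xmlOutput) {
        if (err) {
            return next(err); // let Express's error handling answer the request
        }
        res.locals.xmlData = xmlOutput; // per-request storage instead of a shared global
        return next();
    });
};

app.post(apiUrl, [
    rawBodyParser({ type: 'application/xml' }),
    app.oauth.authorise()
], getCompositeData, function (req, res) {
    res.setHeader('Content-Type', 'application/xml');
    res.send(res.locals.xmlData);
});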
I've been trying to stream binary data (PDF, images, other resources) directly from a request to a remote server but have had no luck so far. To be clear, I don't want to write the document to any filesystem. The client (browser) will make a request to my node process which will subsequently make a GET request to a remote server and directly stream that data back to the client.
var request = require('request');

app.get('/message/:id', function(req, res) {
    // db call for specific id, etc.

    var options = {
        url: 'https://example.com/document.pdf',
        encoding: null
    };

    // First try - unsuccessful
    request(options).pipe(res);

    // Second try - unsuccessful
    request(options, function (err, response, body) {
        var binaryData = body.toString('binary');
        res.header('content-type', 'application/pdf');
        res.send(binaryData);
    });
});
Putting both data and binaryData in a console.log shows that the proper data is there, but the PDF that is subsequently downloaded is corrupt. I can't figure out why.
Wow, never mind. It turns out Postman (the Chrome app) was hijacking the request and response somehow. The // First try example in my code excerpt works properly in the browser.
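For reference, the working variant boils down to piping the upstream response straight through to the client; a small sketch, still using the request module from the snippet above, with the upstream content type forwarded as an extra nicety:

app.get('/message/:id', function (req, res) {
    var options = {
        url: 'https://example.com/document.pdf',
        encoding: null
    };
    request(options)
        .on('response', function (upstream) {
            // Forward the content type so the browser handles the PDF correctly.
            res.setHeader('Content-Type', upstream.headers['content-type'] || 'application/pdf');
        })
        .pipe(res);
});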
I have an endpoint in a Node app which is used to download images:
var images = {
    'car': 'http://someUrlToImage.jpg',
    'boat': 'http://someUrlToImage.jpg',
    'train': 'http://someUrlToImage.jpg'
};

app.get('/api/download/:id', function(req, res){
    var id = req.params.id;
    res.setHeader("content-disposition", "attachment; filename=image.jpg");
    request.get(images[id]).pipe(res);
});
Now this code works fine, but after a few hours of the app running, the endpoint just hangs.
I am monitoring the memory usage of the app, which remains consistent, and any other endpoints that just return some JSON respond as normal, so it is not as if the event loop is somehow being blocked. Is there a gotcha of some kind that I am missing when using the request module to pipe a response? Or is there a better way to achieve this?
I am also using the Express module.
You should add an error listener on your request, because errors are not forwarded through pipes. That way, if your request has an error, the connection gets closed and you'll get the reason logged.
request
    .get(...)
    .on('error', function(err) {
        console.log(err);
        res.end();
    })
    .pipe(res);
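Applied to the endpoint from the question, that would look roughly like this:

app.get('/api/download/:id', function (req, res) {
    var id = req.params.id;
    res.setHeader("content-disposition", "attachment; filename=image.jpg");
    request
        .get(images[id])
        .on('error', function (err) {
            // Without this, a failing upstream request leaves the client hanging forever.
            console.log(err);
            res.end();
        })
        .pipe(res);
});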