ERR_CONNECTION_RESET when upload large file nodejs multer - node.js

I'm writing a web application that allows users to upload very large files (up to several GB). My technical stack includes Node.js, Express, Multer and plain HTML. It works fine for small files, but when I upload a big file (127 MB) I get ERR_CONNECTION_RESET after waiting a while (about 2 minutes).
I tried extending the response time on the server using both req.setTimeout and res.setTimeout, but it didn't help. It may be that the frontend is waiting too long for a response.
Thank you all.

Increasing the res-timeout for the corresponding upload-route should definitely work. Try doing it like this:
function extendTimeout(req, res, next) {
  // Adjust the value for the timeout, here it's set to 3 minutes.
  res.setTimeout(180000, () => {
    // You can handle the timeout error here.
  });
  next();
}

app.post('/your-upload-route', extendTimeout, upload.single('your-file'), (req, res, next) => {
  // handle file upload
});

Related

Cancel File Upload: Multer, MongoDB

I can't seem to find any up-to-date answers on how to cancel a file upload using Mongo, NodeJS & Angular. I've only come across some tutorials on how to delete a file, but that is NOT what I am looking for. I want to be able to cancel the file uploading process by clicking a button on my front-end.
I am storing my files directly in MongoDB in chunks using the Mongoose, Multer & GridFSBucket packages. I know that I can stop a file's uploading process on the front-end by unsubscribing from the subscribable responsible for the upload, but the upload process keeps going in the back-end when I unsubscribe (yes, I have double and triple checked: all the chunks keep getting uploaded until the file is fully uploaded).
Here is my Angular code:
ngOnInit(): void {
  // Upload the file.
  this.sub = this.mediaService.addFile(this.formData).subscribe((event: HttpEvent<any>) => {
    console.log(event);
    switch (event.type) {
      case HttpEventType.Sent:
        console.log('Request has been made!');
        break;
      case HttpEventType.ResponseHeader:
        console.log('Response header has been received!');
        break;
      case HttpEventType.UploadProgress:
        // Update the upload progress!
        this.progress = Math.round(event.loaded / event.total * 100);
        console.log(`Uploading! ${this.progress}%`);
        break;
      case HttpEventType.Response:
        console.log('File successfully uploaded!', event.body);
        this.body = 'File successfully uploaded!';
    }
  },
  err => {
    this.progress = 0;
    this.body = 'Could not upload the file!';
  });
}
CANCEL THE UPLOAD
cancel() {
  // Unsubscribe from the upload method.
  this.sub.unsubscribe();
}
Here is my NodeJS (Express) code:
...
// Configure a strategy for uploading files.
const multerUpload = multer({
  // Set the storage strategy.
  storage: storage,
  // Limit the size of an uploaded file to 120 MB.
  limits: { fileSize: 1024 * 1024 * 120 },
  // Set the file filter.
  fileFilter: fileFilter
});

// Add new media to the database.
router.post('/add', [multerUpload.single('file')], async (req, res) => {
  return res.status(200).send();
});
What is the right way to cancel the upload without leaving any chunks in the database?
So I have been trying to get to the bottom of this for 2 days now and I believe I have found a satisfying solution:
First, in order to cancel the file upload and delete any chunks that have already been uploaded to MongoDB, you need to adjust the fileFilter in your multer configuration so that it detects when the request has been aborted and the upload stream has ended, and then rejects the upload by passing an error to fileFilter's callback:
// Adjust what files can be stored.
const fileFilter = function(req, file, callback) {
  console.log('The file being filtered', file);
  req.on('aborted', () => {
    file.stream.on('end', () => {
      console.log('Cancel the upload');
      callback(new Error('Cancel.'), false);
    });
    file.stream.emit('end');
  });
};
NOTE THAT: when canceling a file upload, you must wait for the changes to show up in your database. The chunks that were already in flight will finish uploading before the canceled file gets deleted. This might take a while depending on your internet speed and how many bytes were sent before canceling the upload.
Finally, you might want to set up a route in your backend to delete any chunks from files that have not been fully uploaded to the database (due to some error that might have occurred during the upload). To do that you'll need to fetch all the file IDs from your .chunks collection (by following the method specified on this link), separate the IDs of the files whose chunks were only partially uploaded from the IDs of the files that were fully uploaded, and then call GridFSBucket's delete() method on those IDs to get rid of the redundant chunks. This step is purely optional and for database maintenance reasons.
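A rough sketch of what such a maintenance route might look like, assuming the default fs.files/fs.chunks collection names, a native MongoDB db handle and an Express router (all assumptions, since the answer doesn't show them). Incomplete uploads show up as chunks whose files_id has no matching file document, because GridFSBucket only writes the file document once an upload finishes. GridFSBucket's delete() may report FileNotFound when that document is missing, so this sketch removes the leftover chunks straight from the chunks collection instead:
// Purely optional maintenance route (sketch): remove chunks left behind by
// uploads that never completed. Collection names assume the default bucket.
router.delete('/maintenance/orphaned-chunks', async (req, res, next) => {
  try {
    // Every file ID referenced by stored chunks.
    const referencedIds = await db.collection('fs.chunks').distinct('files_id');
    // IDs of files that finished uploading (they have a file document).
    const finishedDocs = await db.collection('fs.files')
      .find({}, { projection: { _id: 1 } })
      .toArray();
    const finished = new Set(finishedDocs.map(doc => String(doc._id)));
    // Chunks referencing a file without a file document belong to a canceled
    // or failed upload.
    const orphanIds = referencedIds.filter(id => !finished.has(String(id)));
    // Delete the leftover chunks of each partially uploaded file.
    await db.collection('fs.chunks').deleteMany({ files_id: { $in: orphanIds } });
    res.status(200).send({ cleanedUploads: orphanIds.length });
  } catch (err) {
    next(err);
  }
});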
Try using a try/catch approach. There are two ways it can be done:
1. Call an API that takes the file currently being uploaded as its parameter, and then on the backend delete and clear the chunks that are already present on the server.
2. Handle it via an exception: send the file size as a validation. If the backend API has received the file at its full size, keep it; if the received file is smaller (because the upload was cancelled in between), do the clearance steps, i.e. take the ID of the file's chunks in the Mongo database and clear them.

Node, Express, and parsing streamed JSON in endpoint without blocking thread

I'd like to provide an endpoint in my API to allow third-parties to send large batches of JSON data. I'm free to define the format of the JSON objects, but my initial thought is a simple array of objects:
[{"id":1, "name":"Larry"}, {"id":2, "name":"Curly"}, {"id":3, "name":"Moe"}]
As there could be any number of these objects in the array, I'd need to stream this data in, read each of these objects as they're streamed in, and persist them somewhere.
TL;DR: Stream a large array of JSON objects from the body of an Express POST request.
It's easy to get the most basic of examples out there working as all of them seem to demonstrate this idea using "fs" and working w/ the filesystem.
What I've been struggling with is the Express implementation of this. At this point, I think I've got this working using the "stream-json" package:
const express = require("express");
const router = express.Router();
const StreamArray = require("stream-json/streamers/StreamArray");

router.post("/filestream", (req, res, next) => {
  const stream = StreamArray.withParser();
  req.pipe(stream).on("data", ({ key, value }) => {
    console.log(key, value);
  }).on("finish", () => {
    console.log("FINISH!");
  }).on("error", e => {
    console.log("Stream error :(");
  });
  res.status(200).send("Finished successfully!");
});
I end up with a proper readout of each object as it's parsed by stream-json. The problem seems to be that the thread gets blocked while the processing is happening. I can hit this once and immediately get the 200 response, but a second hit blocks until the first batch finishes, at which point the second one begins.
Is there any way to do something like this without spawning a child process or something similar? I'm unsure what to do so that the endpoint can continue to receive requests while streaming/parsing the individual JSON objects.
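For what it's worth, one way to keep memory bounded and the event loop free is to pipe the parser into an object-mode Writable so each parsed object is persisted with backpressure, and to answer only once the stream finishes. A minimal sketch along those lines, where saveRecord() is a hypothetical stand-in for whatever async persistence call you use:
const express = require("express");
const { Writable } = require("stream");
const StreamArray = require("stream-json/streamers/StreamArray");
const router = express.Router();

router.post("/filestream", (req, res, next) => {
  const parser = StreamArray.withParser();
  // Object-mode sink: the parser (and the request socket) pause while each
  // record is persisted, so memory stays bounded and other requests can be
  // served between writes.
  const sink = new Writable({
    objectMode: true,
    write({ key, value }, _encoding, done) {
      saveRecord(value) // hypothetical async persistence call
        .then(() => done())
        .catch(done);
    }
  });
  parser.on("error", next);
  sink.on("error", next);
  sink.on("finish", () => res.status(200).send("Finished successfully!"));
  req.pipe(parser).pipe(sink);
});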

How do I make sure a promise has been returned before responding to an incoming request (Swagger/Express)

I'm trying to write a simple Swagger API that will allow me to sync a couple of systems on demand. The syncing is one-way, so basically the end goal is to send a request to both systems, see what's new/changed/removed on the origin, then update the destination. I've been trying to do this using Node.js instead of Java, which I'm more used to, as a learning experience, but I'm really having a hard time figuring out a key issue due to the async nature.
As a test, I constructed a simple Express node.js app on IntelliJ, where in one of the routes I'm calling a function exported from a different file and trying to get the response back. Unfortunately, this isn't working so well. What I've done is this:
getit.js - (this goes to the Ron Swanson generator to get a quote)
const rp = require('request-promise');

async function dorequest() {
  const response = await rp(uri);
  return Promise.resolve(response);
};

module.exports = {dorequest}
In the route I've done this:
var getit = require ('./getit.js');
/* GET users listing. */
router.get('/', function(req, res, next) {
  var ret = getit.dorequest();
  res.send(ret);
  console.log('res out' + ret);
});
What I get in the console is
res out[object Promise]
and the response is of course empty.
What am I doing wrong? I've been playing with this for a week now and tried various methods, but I keep getting similar results. I'm obviously missing something, and would appreciate some help.
Thanks!
The object is empty because it was written to the console before the Promise resolved. You have to wait until the Promise resolves and then send the response back, so try changing your code like this:
var getit = require ('./getit.js');
/* GET users listing. */
router.get('/', function(req, res, next) {
  getit.dorequest().then(function(data) {
    console.log('res out' + data);
    res.send(data);
  });
});
Since you are using the async/await approach, all you need to do is place await before getit.dorequest(), so the line becomes var ret = await getit.dorequest();
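For that to work, the surrounding route handler itself has to be declared async; a minimal sketch of the same route using async/await (with a try/catch so errors still reach Express):
var getit = require('./getit.js');

/* GET users listing. */
router.get('/', async function(req, res, next) {
  try {
    var ret = await getit.dorequest();
    console.log('res out' + ret);
    res.send(ret);
  } catch (err) {
    next(err);
  }
});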

What to do to stop Express.js from duplicating long multipart requests?

Consider the following code:
var routes = function(app) {
  app.post('/api/video', passport.authenticate('token', authentication), video.createVideo);
};

function createVideo(request, response) {
  logger.info('starting create video');
  upload(request, response, function(err) {
    logger.info('upload finished', err);
    // callback omitted for brevity
  });
}
Upload is multer with multer-s3 middleware:
var upload = multer({
  storage: s3({
    dirname: config.apis.aws.dirname,
    bucket: config.apis.aws.bucket,
    secretAccessKey: config.apis.aws.secretAccessKey,
    accessKeyId: config.apis.aws.accessKeyId,
    region: config.apis.aws.region,
    filename: function(req, file, cb) {
      cb(null, req.user._id + '/' + uuid.v4() + path.extname(file.originalname));
    }
  }),
  limits: {
    fileSize: 1000000000
  },
  fileFilter: function(req, file, cb) {
    if (!_.contains(facebookAllowedTypes, path.extname(file.originalname))) {
      return cb(new Error('Only following types are allowed: ' + facebookAllowedTypes));
    }
    cb(null, true);
  }
}).fields([{
  name: 'video',
  maxCount: 1
}]);
The code above does the following: it takes a file that is sent from somewhere and streams it to an AWS S3 bucket. multer-s3 uses s3fs in the background to create a write stream and send the file in 5 MB multipart chunks.
With big files, like 300 MB, it can take minutes to upload. And now something really strange happens. I can see in our frontend that it sends only one POST request to /api/video. I also tried making the request with Postman, not trusting our frontend.
It starts the upload, but after around 2 minutes a second upload starts! If I try to upload smaller files, like 2-100 MB, nothing of the sort happens. This is from my logs (from the code above):
{"name":"test-app","hostname":"zawarudo","pid":16953,"level":30,"msg":"starting create video","time":"2015-12-02T14:08:22.243Z","src":{"file":"/home/areinu/dev/projects/test-app-uploader/backend/app/services/videoService.js","line":169,"func":"createVideo"},"v":0}
{"name":"test-app","hostname":"zawarudo","pid":16953,"level":30,"msg":"starting create video","time":"2015-12-02T14:10:28.794Z","src":{"file":"/home/areinu/dev/projects/test-app-uploader/backend/app/services/videoService.js","line":169,"func":"createVideo"},"v":0}
{"name":"test-app","hostname":"zawarudo","pid":16953,"level":30,"msg":"upload finished undefined","time":"2015-12-02T14:12:46.433Z","src":{"file":"/home/areinu/dev/projects/test-app-uploader/backend/app/services/videoService.js","line":171},"v":0}
{"name":"test-app","hostname":"zawarudo","pid":16953,"level":30,"msg":"upload finished undefined","time":"2015-12-02T14:12:49.627Z","src":{"file":"/home/areinu/dev/projects/test-app-uploader/backend/app/services/videoService.js","line":171},"v":0}
As you can see, both uploads end a few ms after each other, but the second one starts after 2 minutes. The problem is that there should be only one upload!
All I did in Postman was set my access token (so passport authorizes me) and add a file. This should create only 1 upload, yet 2 happen, and both upload the same file.
Also, notice that both files get uploaded, both have different UUIDs (the filename function creates the file names from a UUID), both appear on S3, both have the proper size of 300 MB, and both can be downloaded and work.
If the upload is smaller, the duplication doesn't occur. What is the reason for this behavior? How do I fix it?
The problem was very simple (I only spent a whole day figuring it out). It's just the default timeout of Node requests: 2 minutes. I don't know why it started another upload nor why it actually worked, but setting the default timeout on my server to 10 minutes fixed the issue.
If someone knows why the timed-out requests actually did complete (and twice), please let me know. I'll improve the answer then.
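For reference, a minimal sketch of that fix, assuming the Express app is started with app.listen (the answer doesn't show its server setup):
// Raise the server timeout from Node's old 2-minute default to 10 minutes,
// matching the fix described above.
var server = app.listen(process.env.PORT || 3000);
server.setTimeout(10 * 60 * 1000);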

node request pipe hanging after a few hours

I have an endpoint in a node app which is used to download images
var images = {
  'car': 'http://someUrlToImage.jpg',
  'boat': 'http://someUrlToImage.jpg',
  'train': 'http://someUrlToImage.jpg'
};

app.get('/api/download/:id', function(req, res) {
  var id = req.params.id;
  res.setHeader("content-disposition", "attachment; filename=image.jpg");
  request.get(images[id]).pipe(res);
});
Now this code works fine, but after a few hours of the app running, the endpoint just hangs.
I am monitoring the memory usage of the app, which remains consistent, and any other endpoints that just return some JSON respond as normal, so it is not as if the event loop is somehow being blocked. Is there a gotcha of some kind that I am missing when using the request module to pipe a response? Or is there a better solution to achieve this?
I am also using the Express module.
You should add an error listener on your request because errors are not passed in pipes. That way, if your request has an error, it will close the connection and you'll get the reason.
request
  .get(...)
  .on('error', function(err) {
    console.log(err);
    res.end();
  })
  .pipe(res);
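As a side note, a sketch of the same route using Node's stream.pipeline (available since Node 10), which forwards an error from either stream to one callback and destroys both streams when something fails:
const { pipeline } = require('stream');

app.get('/api/download/:id', function(req, res) {
  res.setHeader('content-disposition', 'attachment; filename=image.jpg');
  pipeline(request.get(images[req.params.id]), res, function(err) {
    if (err) {
      // Both streams have already been destroyed by pipeline at this point;
      // log the reason so the hang is visible.
      console.log(err);
    }
  });
});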
