Multiple download requests from S3 signed URLs using Node.js

I have created around 500 signed URLs for objects located in S3. When I try to download those objects from the signed URLs in a loop like this:
await Promise.all(signedUrls.map(async (url) => {
  const val = await request(url, (error, response) => {
    if (!error) {
      console.log('Downloaded successfully');
    } else {
      console.log('error in downloading', error.message);
    }
  });
}));
I get this error for some of the URLs.
error in downloading getaddrinfo ENOTFOUND s3.amazonaws.com s3.amazonaws.com:443
I know all the signed URLs are correct, because I checked them individually, but I suspect there is an issue on the S3 side when downloading the files.
Is there any limit on S3 for requesting too many files at once?

S3 has no practical limit on the number of downloads or the number of concurrent downloads. In theory, there must be a limit because they have a finite amount of hardware in the AWS data centers, but that limit is so high that in practice you cannot reach it.
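The error in the question, getaddrinfo ENOTFOUND, comes from the DNS lookup on the client rather than from S3 itself, so one thing worth trying is to cap how many requests are in flight at once instead of starting all 500 together. The following is a minimal sketch, not a definitive fix: mapWithConcurrency is a hypothetical helper name, and the worker passed to it is assumed to be a function that returns a promise for one download.

```javascript
// Hypothetical helper (not from the question): run `worker` over `items`
// with at most `limit` calls in flight at any time.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index], index) };
      } catch (error) {
        // Record the failure instead of aborting the whole batch.
        results[index] = { ok: false, error };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

It could be called as mapWithConcurrency(signedUrls, 10, download), where download is an assumed promise-returning function and 10 is an arbitrary starting limit to tune. Failed entries come back with ok set to false, so they can be retried separately.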

Related

How to send a stream of data from a JS application to Lambda running on Node.js and return concurrently

I have an Angular JS application. From this application I send data to my AWS API endpoint:
/**
 * Bulk sync with master
 */
async syncDataWithMaster(): Promise<AxiosResponse<any> | void> {
  try {
    axios.defaults.headers.post.Authorization = token;
    const url = this.endpoint;
    return axios.post(url, compressed, {
      onUploadProgress: progressEvent => {
        console.log('uploading')
      },
      onDownloadProgress: progressEvent => {
        console.log('downloading')
      },
    }).then((response) => {
      if (response.data.status == 'success') {
        return response;
      } else {
        throw new Error('Could not authenticate user');
      }
    });
  } catch (e) {
  }
  return;
}
The API Gateway triggers my Lambda function (Node.js) with the data it received:
exports.handler = async (event) => {
  const localData = JSON.parse(event.body);
  /**
   Here get data from master and compare with local data and send back any new data
  **/
  const response = {
    statusCode: 200,
    body: JSON.stringify(newData),
  };
  return response;
};
The Lambda function calls the database and gets the master data for a user (not shown in the example). This data is then compared with the local data using various logic, which determines whether we need to send any new rows back to the local device to be stored or updated. (Before anyone asks, the nature of the application requires the full data.)
This principle works great for 90% of my users. However, some users have fairly large amounts of data; the current maximum is around 17 MB.
So my question is: is it possible to stream the data to and from the Lambda function? That is, stream the data to the function, process it, and stream the result back, so that it is not affected by the AWS payload limits?
Or is it possible to begin sending data to the function as a stream and, as data becomes available, have it start streaming data back at the same time?
(The data is in JSON format.)
I am also wondering what alternatives there are to this solution (it needs to be fairly quick as well, 30 seconds at most).
(One other idea I had was, for data above a certain size: first the client saves the data to S3 using a signed URL, then calls the API Gateway for the Lambda. The Lambda gets the saved file and compares it to the master. If the new data to be returned is over a certain size, it is saved to S3 and a signed URL is returned to the client. The client then downloads the new data and processes it.) However, I am not sure if this is cost-effective, and it sounds like the execution time may be long.
Thanks for any help; I have been trying to figure this out for a while now.
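The size-based idea in the last paragraph hinges on deciding, per payload, whether it can travel inline or has to go through S3. The sketch below shows only that decision. chooseTransport and INLINE_LIMIT_BYTES are hypothetical names, and the 5 MB threshold is an assumption chosen to stay under Lambda's 6 MB limit for synchronous request and response payloads; the S3 upload and the signed URL are deliberately left out.

```javascript
// Assumed threshold: a little under Lambda's 6 MB synchronous payload limit.
const INLINE_LIMIT_BYTES = 5 * 1024 * 1024;

// Hypothetical helper: decide how a JSON-serialisable value should travel.
// Returns the serialised body only when it is small enough to send inline.
function chooseTransport(data, limitBytes = INLINE_LIMIT_BYTES) {
  const body = JSON.stringify(data);
  // Measure bytes, not characters: non-ASCII text takes more than one byte each.
  const bytes = Buffer.byteLength(body, 'utf8');
  return bytes <= limitBytes
    ? { via: 'inline', bytes, body }
    : { via: 's3', bytes };
}
```

The same check could run on the client before uploading and in the Lambda before responding, so both directions follow one rule.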

Lambda stream encoded string

I stored the audio file in an array buffer and encoded it with base64. I need to send the data from the lambda to react client. For the larger audio files, I'm facing a lambda payload limit error.
Is there any way to stream the data in chunks from the Lambda to the client?
function readFile(filepath, callback) {
  // Uint8Array
  fs.readFile(filepath, (err, data) => {
    // Data is a Buffer object
    if (err) console.log(err);
    callback(data);
  });
}
readFile(`${outputFile}`, function (data) {
  try {
    let base64enc = base64.encode(data);
    responseBody.message = base64enc;
    status = statusCodes.OK;
    return sendResponse(status, responseBody); // Sending response
  } catch (err) {
    console.log("error " + err);
    reject(err);
  }
});
No, there isn't. Each Lambda function call returns a single payload, and Lambda function calls are independent of each other. I suggest two possible solutions.
1. Request a specific chunk. Call the Lambda function multiple times, with each call requesting a specific chunk of the data, and merge the chunks back into one file on the frontend.
2. Use S3. Media files are usually handled with S3. Assuming you already have the audio file on S3, generate a pre-signed GET URL for the object in the Lambda, return it, and use that URL to fetch the object on the frontend. (You can refer to the code in "Presigned URL generation code times-out as Lambda, works locally" or other sources.) You can also upload audio files by generating a pre-signed PUT URL in the Lambda and using it on the frontend to upload.
I would suggest the second solution because it is a more standard way of dealing with media files.
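For the first suggestion (one chunk per call), the core logic is slicing the file on the server and joining the decoded pieces on the client. This is a rough sketch under stated assumptions: getChunk and mergeChunks are hypothetical helpers, the chunk size is up to the caller, and the merge side uses Node's Buffer, so a browser client would need an equivalent built on atob and Uint8Array.

```javascript
// Hypothetical Lambda-side helper: return one base64-encoded slice of a Buffer,
// plus the total chunk count so the client knows how many calls to make.
function getChunk(buffer, index, chunkSize) {
  const total = Math.ceil(buffer.length / chunkSize);
  if (!Number.isInteger(index) || index < 0 || index >= total) {
    throw new RangeError(`chunk index ${index} out of range (0..${total - 1})`);
  }
  const start = index * chunkSize;
  const slice = buffer.subarray(start, start + chunkSize);
  return { index, total, data: slice.toString('base64') };
}

// Hypothetical client-side helper: decode each chunk and join them in index
// order, regardless of the order in which the responses arrived.
function mergeChunks(chunks) {
  const ordered = [...chunks].sort((a, b) => a.index - b.index);
  return Buffer.concat(ordered.map((c) => Buffer.from(c.data, 'base64')));
}
```

Base64 adds roughly a third to the size, so the chunk size should be chosen with the encoded length in mind, not the raw byte count.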

Cancel File Upload: Multer, MongoDB

I can't seem to find any up-to-date answers on how to cancel a file upload using Mongo, NodeJS & Angular. I've only come across some tutorials on how to delete a file, but that is NOT what I am looking for. I want to be able to cancel the file upload process by clicking a button on my front-end.
I am storing my files directly in MongoDB in chunks using the Mongoose, Multer & GridFSBucket packages. I know that I can stop a file's upload process on the front-end by unsubscribing from the subscription responsible for the upload, but the upload process keeps going in the back-end when I unsubscribe. (Yes, I have double and triple checked. All the chunks keep getting uploaded until the file is fully uploaded.)
Here is my Angular code:
ngOnInit(): void {
  // Upload the file.
  this.sub = this.mediaService.addFile(this.formData).subscribe((event: HttpEvent<any>) => {
    console.log(event);
    switch (event.type) {
      case HttpEventType.Sent:
        console.log('Request has been made!');
        break;
      case HttpEventType.ResponseHeader:
        console.log('Response header has been received!');
        break;
      case HttpEventType.UploadProgress:
        // Update the upload progress!
        this.progress = Math.round(event.loaded / event.total * 100);
        console.log(`Uploading! ${this.progress}%`);
        break;
      case HttpEventType.Response:
        console.log('File successfully uploaded!', event.body);
        this.body = 'File successfully uploaded!';
    }
  },
  err => {
    this.progress = 0;
    this.body = 'Could not upload the file!';
  });
}
// Cancel the upload.
cancel() {
  // Unsubscribe from the upload method.
  this.sub.unsubscribe();
}
Here is my NodeJS (Express) code:
...
// Configure a strategy for uploading files.
const multerUpload = multer({
  // Set the storage strategy.
  storage: storage,
  // Set the size limit for uploading a file to 120MB.
  // multer expects `limits` to be an object; `fileSize` is in bytes.
  limits: { fileSize: 1024 * 1024 * 120 },
  // Set the file filter.
  fileFilter: fileFilter
});
// Add new media to the database.
router.post('/add', [multerUpload.single('file')], async (req, res) => {
  return res.status(200).send();
});
What is the right way to cancel the upload without leaving any chunks in the database?
So I have been trying to get to the bottom of this for 2 days now and I believe I have found a satisfying solution:
First, in order to cancel the file upload and delete any chunks that have already been uploaded to MongoDB, you need to adjust the fileFilter in your multer configuration in such a way to detect if the request has been aborted and the upload stream has ended. Then reject the upload by throwing an error using fileFilter's callback:
// Adjust what files can be stored.
const fileFilter = function(req, file, callback) {
  console.log('The file being filtered', file)
  req.on('aborted', () => {
    file.stream.on('end', () => {
      console.log('Cancel the upload')
      callback(new Error('Cancel.'), false);
    });
    file.stream.emit('end');
  })
}
NOTE THAT: When canceling a file upload, you must wait for the changes to show up on your database. The chunks that have already been sent to the database will first have to be uploaded before the canceled file gets deleted from the database. This might take a while depending on your internet speed and the bytes that were sent before canceling the upload.
Finally, you might want to set up a route in your backend to delete any chunks from files that have not been fully uploaded to the database (due to some error that might have occurred during the upload). In order to do that, you'll need to fetch all the file IDs from your .chunks collection (by following the method specified on this link) and separate the IDs of the files whose chunks have been only partially uploaded from the IDs of the files that have been fully uploaded. Then you'll need to call GridFSBucket's delete() method on those IDs in order to get rid of the redundant chunks. This step is purely optional and is for database maintenance reasons.
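The "separate the IDs" step can be sketched as a plain function, independent of how the IDs are fetched. This assumes (and it is only an assumption about how the upload behaves) that a partially uploaded file appears as chunks whose files_id has no matching document in the .files collection. findOrphanedFileIds is a hypothetical helper name; IDs are compared by their string form so that ObjectId values and plain strings both work.

```javascript
// Hypothetical helper.
// `fileIds`      : the _id values read from the .files collection.
// `chunkFileIds` : the files_id values read from the .chunks collection
//                  (one entry per chunk, so duplicates are expected).
// Returns each files_id that has chunks but no .files document, once.
function findOrphanedFileIds(fileIds, chunkFileIds) {
  const known = new Set(fileIds.map(String));
  const orphaned = new Map();
  for (const id of chunkFileIds) {
    const key = String(id);
    if (!known.has(key) && !orphaned.has(key)) {
      orphaned.set(key, id); // keep the original value, not the string key
    }
  }
  return [...orphaned.values()];
}
```

The returned IDs are the candidates for the cleanup step described above. Running this while uploads are still in progress would also flag those files, so it is safer to schedule it for a quiet period.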
Try handling it with try/catch.
There are a few ways this can be done:
By calling an API that takes the file currently being uploaded as its parameter and then, on the backend, deleting the file and clearing the chunks that are present on the server.
By handling it in an exception handler.
By sending the file size for validation: if the backend API has received the file at its full size, the file is kept; if the received size is smaller because the upload was cancelled partway through, run the cleanup steps, where you take the file's ID and use Mongoose to clear that file's chunks from the database.

AWS/Node.js S3 Signed URL Denied

I'm using Amazon's Node.js aws-sdk to create expiring pre-signed S3 URLs for digital product downloads, and struggling with the result. I've got the SDK configured with my keys successfully, and I've tried both a synchronous approach (not shown) and an async approach (shown) at collecting signed urls. The calls work, I never hit any errors and I am successfully returned signed URLs. Here's the twist: the URLs I get back don't work.
const promises = skus.map(function(sku) {
  const key = productKeys[sku];
  return new Promise((resolve, reject) => {
    s3.getSignedUrl('getObject', {
      Bucket: 'my-products',
      Key: key,
      Expires: 60 * 60 * 24, // Time in seconds; 24 hours
    }, function(err, res) {
      if (err) {
        reject(err);
      } else {
        resolve({
          text: productNames[sku],
          url: res,
        });
      }
    });
  });
});
I had assumed it was an error with the keys I had allocated, which I had assigned to an IAM User who has full S3 bucket access. So, I tried using a root level keypair and I get the same access denied result. Interestingly: the URLs I get back take the form https://my-bucket.s3.amazonaws.com/Path/To/My/Product.zip?AWSAccessKeyId=blahblahMyKey&Expires=43914919&Signature=blahblahmysig&x-amz-security-token=hugelongstring. I've not seen this x-amz-security-token thing before, and if I try just removing that query param, I get Access Denied but for a different reason: the AWSAccessKeyId is one that is not associated with any of my accounts. It's not the one I've configured the SDK with and it's not one I've allocated on my S3 account. No idea where it comes from, and no idea how that relates to the x-amz-security-token param.
Anyway, I'm stumped. I just want a working pre-signed url... what gives? Thanks for your help.

Streaming a zip download from cloud functions

I have a Firebase Cloud Function that uses Express to stream a zip file of images to the client. When I test the Cloud Function locally it works fine. When I upload it to Firebase I get this error:
Error: Can't set headers after they are sent.
What could be causing this error? Memory limit?
export const zipFiles = async (name, params, output) => {
  const zip = archiver('zip', {zlib: { level: 9 }});
  const [files] = await storage.bucket(bucketName).getFiles({prefix: `${params.agent}/${params.id}/deliverables`});
  if (files.length) {
    output.attachment(`${name}.zip`);
    output.setHeader('Content-Type', 'application/zip');
    output.setHeader('Access-Control-Allow-Origin', '*');
    zip.pipe(output);
    output.on('close', function() {
      return output.send('OK').end(); // <--this is the line that fails
    });
    files.forEach((file, i) => {
      const reader = storage.bucket(bucketName).file(file.name).createReadStream();
      zip.append(reader, {name: `${name}-${i+1}.jpg`});
    });
    zip.finalize();
  } else {
    output.status(404).send('Not Found');
  }
};
What Frank said in comments is true. You need to decide all your headers, including the HTTP status response, before you start sending any of the content body.
If you intend to express that you're sending a successful response, simply say output.status(200) in the same way that you did for your 404 error. Do that up front. When you're piping a response, you don't need to do anything to close the response in the end. When the pipe is done, the response will automatically be flushed and finalized. You're only supposed to call end() when you want to bail out early without sending a response at all.
Bear in mind that Cloud Functions only supports a maximum payload of 10MB (read more about limits), so if you're trying to zip up more than that in total, it won't work. In fact, there are no "streaming" or chunked responses at all. The entire payload is built in memory and transferred out as a unit.
