Multiple images are not fully uploaded to S3 upon first Lambda call - Node.js

I have an issue with uploading multiple files to S3.
What my Lambda does:
1. Upload a single file to S3 (this always works).
2. Resize the file to 4 new sizes (using sharp).
3. Upload the resized files to S3.
The problem: sometimes only 2 or 3 out of the 4 resized files are uploaded.
The surprising thing is that on the next upload, the missing files from the previous upload appear.
I get no errors. I thought this could be an async issue, so I added await in the places I thought would make the flow sequential.
I will appreciate any help.
async function uploadImageArrToS3(resizeImagesResponse) {
return new Promise(async function (resolve, reject) {
var params = {
Bucket: bucketName,
ACL: 'public-read'
};
let uploadImgArr = resizeImagesResponse.map(async (buffer) => {
params.Key = buffer.imgParamsArray.Key;
params.Body = buffer.imgParamsArray.Body;
params.ContentType = buffer.imgParamsArray.ContentType;
let filenamePath = await s3.putObject(params, (e, d) => {
if (e) {
reject(e);
} else {
d.name = params.ContentType;
return (d.name);
}
}).params.Key
let parts = filenamePath.split("/");
let fileName = parts[parts.length - 1];
return {
fileName: fileName,
width: buffer.width
};
});
await Promise.all(uploadImgArr).then(function (resizedFiles) {
console.log('succesfully resized the image!');
resolve(resizedFiles);
});
})
}
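A likely culprit in the code above: s3.putObject(params, callback) returns an AWS.Request immediately, so await ....params.Key does not wait for the upload to finish, and every iteration of the map shares and mutates the same params object. A minimal sketch of a fix (same inputs and names as above, assuming the AWS SDK v2 s3 client and bucketName from the original code, untested):
async function uploadImageArrToS3(resizeImagesResponse) {
    const uploads = resizeImagesResponse.map(async (buffer) => {
        // A fresh params object per file, so concurrent iterations
        // cannot overwrite each other's Key/Body/ContentType.
        const params = {
            Bucket: bucketName,
            ACL: 'public-read',
            Key: buffer.imgParamsArray.Key,
            Body: buffer.imgParamsArray.Body,
            ContentType: buffer.imgParamsArray.ContentType
        };
        // .promise() resolves only after S3 has stored the object.
        await s3.putObject(params).promise();
        const parts = params.Key.split('/');
        return { fileName: parts[parts.length - 1], width: buffer.width };
    });
    // Resolves once every upload has actually completed (rejects on the first failure).
    return Promise.all(uploads);
}
Because the function now resolves only after all four puts complete, the Lambda should no longer finish with uploads still in flight, which would also explain why the "missing" files appeared on the next (warm) invocation.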

Related

Slow upload speed when uploading to Amazon S3

I'm trying to upload 50 GB of data into an S3 bucket using NestJS (Node.js) and Angular 13.
I have thousands of wedding images plus raw files, and uploading these files to the S3 bucket takes a long time.
I have also enabled Transfer Acceleration in the S3 configuration, and all files are uploaded as multipart uploads.
Angular process:
The user selects thousands of files using a file input.
Then I call one upload API per file in a loop; for example, if there are 5 files, it calls the API 5 times, one by one.
Backend process:
The file is uploaded as multipart data (using multipart upload).
After the file is uploaded to the bucket, I store the object data in the database.
Can anyone tell me how I can improve the upload speed to the S3 bucket?
NestJS code
async create(request, photos) {
try {
let allPromises = [];
photos.forEach(async (photo) => {
let promise = new Promise<void>((resolve, reject) => {
let file = new ChangeFileName().changeName(photo);
this.s3fileUploadService.upload(file, `event-gallery-photos/${request.event_id}`).then(async (response: any) => {
console.log(response)
if (response.Location) {
await this.eventPhotoEntity.save({
studio_id: request.studio_id,
client_id: request.client_id,
event_id: request.event_id,
file_name: file.originalname,
original_name: file.userFileName,
file_size: file.size
});
}
resolve();
}).catch((error) => {
console.log(error);
this.logger.error(`s3 file upload error : ${error.message}`);
reject();
})
});
allPromises.push(promise);
});
return Promise.all(allPromises).then(() => {
return new ResponseFormatter(HttpStatus.OK, "Created successfully");
}).catch(() => {
return new ResponseFormatter(HttpStatus.INTERNAL_SERVER_ERROR, "Something went wrong", {});
})
} catch (error) {
console.log(error);
this.logger.error(`event photo create : ${error.message}`);
return new ResponseFormatter(HttpStatus.INTERNAL_SERVER_ERROR, "Something went wrong", {});
}
}
Upload function
async upload(file, bucket) {
return new Promise(async (resolve, reject) => {
bucket = 'photo-swipes/' + bucket
const chunkSize = 1024 * 1024 * 5; // chunk size is set to 5 MB
const iterations = Math.ceil(file.buffer.length / chunkSize); // number of chunks the file is broken into
let arr = [];
for (let i = 1; i <= iterations; i++) {
arr.push(i)
}
try {
let uploadId: any = await this.startUpload(file, bucket);
uploadId = uploadId.UploadId;
const parts = await Promise.allSettled(
arr.map(async (item, index) => {
return await this.uploadPart(
file.originalname,
file.buffer.slice((item - 1) * chunkSize, item * chunkSize),
uploadId,
item,
bucket
)
})
)
const failedParts = parts
.filter((part) => part.status === "rejected")
.map((part: any) => part.reason);
const succeededParts = parts
.filter((part) => part.status === "fulfilled")
.map((part: any) => part.value);
let retriedParts = [];
if (!failedParts.length) // if some parts got failed then retry
retriedParts = await Promise.all(
failedParts.map((item, index) => {
this.uploadPart(
file.originalname,
file.buffer.slice((item) * chunkSize, item * chunkSize),
uploadId,
item,
bucket
)
})
);
const data = await this.completeUpload(
file.originalname,
uploadId,
succeededParts, // needs sorted array
bucket
);
resolve(data);
} catch (err) {
console.error(err);
reject(err)
}
});
}
Bandwidth
This is the typical slowness issue when you upload a large number of small files to S3.
One large 50 GB file will upload much faster than many small files totaling 50 GB.
The best approach that can work for you is to parallelize the upload; if you can use the AWS CLI, that would be very fast with multiple parallel connections.
Multipart upload is for a single large file.
Transfer Acceleration is also not going to help much here.
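If the upload has to stay inside the NestJS service, one way to act on that advice there is to upload in parallel batches instead of strictly one by one. A rough sketch (the batch size and key layout are placeholders, and it assumes an AWS SDK v2 S3 client named s3):
const BATCH_SIZE = 10; // placeholder: tune to the available bandwidth
async function uploadAllInBatches(files, bucket) {
    const results = [];
    for (let i = 0; i < files.length; i += BATCH_SIZE) {
        const batch = files.slice(i, i + BATCH_SIZE);
        // Uploads within a batch run concurrently; batches run one after another.
        const uploaded = await Promise.all(
            batch.map((file) =>
                s3.upload({
                    Bucket: bucket,
                    Key: `event-gallery-photos/${file.originalname}`, // placeholder key layout
                    Body: file.buffer
                }).promise()
            )
        );
        results.push(...uploaded);
    }
    return results;
}
Note that s3.upload in SDK v2 already switches to multipart uploads for large bodies, so the manual part-splitting shown in the question is only needed if you want fine-grained control over the parts.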

Unable to upload multiple images to AWS S3 if I don't first upload one image through an AWS Node.js Lambda endpoint using Promises

I have the code below on AWS Lambda as an endpoint exposed through API Gateway. The point of this endpoint is to upload images to an S3 bucket. I've been experiencing an interesting bug and could use some help. This code is unable to upload multiple images to S3 if it does not first upload one image. I've listed the scenarios below. The reason I want to use Promises is that I intend to insert data into a MySQL table in the same endpoint. Any advice or feedback will be greatly appreciated!
The code successfully uploads multiple images when I:
Pass one image to the endpoint to upload to S3 first
Pass several images to the endpoint to upload to S3 after uploading one image first
The code fails to upload images when I:
Pass several images to the endpoint to upload to S3 first. A random number of images might be uploaded, but it consistently fails to upload all of them. A 502 error code is returned because not all images were uploaded.
Code
const AWS = require('aws-sdk');
const s3 = new AWS.S3({});
function uploadAllImagesToS3(imageMap) {
console.log('in uploadAllImagesToS3')
return new Promise((resolve, reject) => {
awaitAll(imageMap, uploadToS3)
.then(results => {
console.log('awaitAllFinished. results: ' + results)
resolve(results)
})
.catch(e => {
console.log("awaitAllFinished error: " + e)
reject(e)
})
})
}
function awaitAll(imageMap, asyncFn) {
const promises = [];
imageMap.forEach((value, key) => {
promises.push(asyncFn(key, value));
})
console.log('promises length: ' + promises.length)
return Promise.all(promises)
}
function uploadToS3(key, value) {
return new Promise((resolve, reject) => {
console.log('Promise uploadToS3 | key: ' + key)
// [key, value] = [filePath, Image]
var params = {
"Body": value,
"Bucket": "userpicturebucket",
"Key": key
};
s3.upload(params, function (err, data) {
console.log('uploadToS3. s3.upload. data: ' + JSON.stringify(data))
if (err) {
console.log('error when uploading to s3 | error: ' + err)
reject(JSON.stringify(["Error when uploading data to S3", err]))
} else {
let response = {
"statusCode": 200,
"headers": {
"Access-Control-Allow-Origin": "http://localhost:3000"
},
"body": JSON.stringify(data),
"isBase64Encoded": false
};
resolve(JSON.stringify(["Successfully Uploaded data to S3", response]))
}
});
})
}
exports.handler = (event, context, callback) => {
if (event !== undefined) {
let jsonObject = JSON.parse(event.body)
let pictures = jsonObject.pictures
let location = jsonObject.pictureLocation
let imageMap = new Map()
for (let i = 0; i < pictures.length; i++) {
let base64Image = pictures[i].split('base64,', 2)
let decodedImage = Buffer.from(base64Image[1], 'base64'); // image string is after 'base64'
let base64Metadata = base64Image[0].split(';', 3) // data:image/jpeg,name=coffee.jpg,
let imageNameData = base64Metadata[1].split('=', 2)
let imageName = imageNameData[1]
var filePath = "test/" + imageName
imageMap.set(filePath, decodedImage)
}
const promises = [uploadAllImagesToS3(imageMap)]
Promise.all(promises)
.then(([uploadS3Response]) => {
console.log('return promise!! | uploadS3Response: ' + JSON.stringify([uploadS3Response]))
let res = {
body: JSON.stringify(uploadS3Response),
headers: {
"Access-Control-Allow-Origin": "http://localhost:3000"
}
};
callback(null, res);
})
.catch((err) => {
callback(err);
});
} else {
callback("No pictures were uploaded")
}
};
Reason for the problem and solution:
After several hours of debugging this issue I realized what the error was! My Lambda endpoint was timing out early. The reason I was able to upload multiple images after first uploading one image was that the Lambda was being executed from a warm start, as it was already up and running. The scenario where I was unable to upload multiple images only occurred when I tried to do so after not executing the endpoint for 10+ minutes, and therefore from a cold start. The solution was to increase the Timeout from the default of 3 seconds. I increased it to 20 seconds, but might need to play around with that time.
How to increase the Lambda timeout:
Open the Lambda function
Scroll down to Basic Settings and select Edit
Increase the Timeout value
TLDR
This error was occurring because the Lambda would time out. The solution is to increase the Lambda timeout.
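If the function is managed outside the console, the same setting can be changed from the AWS CLI, for example (the function name here is a placeholder):
aws lambda update-function-configuration --function-name my-image-upload-function --timeout 20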

JSON files do not contain all the results in AWS Lambda using Node.js

I'm currently working on a project using AWS S3, Rekognition and Lambda. I'm writing in Node.js and created a working solution for what I want to achieve. In short, the workflow is: an image of a face is uploaded to an S3 bucket, then the 'searchFacesByImage' API is called to see if that face has been indexed into the Master collection in the past. If it is a new face, the result will be false, and the 'indexFaces' API is called to index that face into the Master collection. Once that is done, I write the output to 3 separate JSON files in the same S3 bucket, called 'metadata.json', 'indexing.json', and 'rekognition.json'.
The 'metadata.json' file only contains the ExternalImageID (that I create myself), the date and time of indexing, the filename that was indexed, and a count of how many times that face has been indexed in the past.
The 'indexing.json' file contains the same ExternalImageID, the same date and time of indexing, and the response from the 'searchFacesByImage' API.
The 'rekognition.json' file contains the same ExternalImageID and date and time, as well as the response from the 'indexFaces' API.
The problem is that when I load one image at a time, the 3 JSON files populate accordingly, but as soon as I load more than a few images at once (I've tested it with 7), all 7 images run through the workflow and the response data is written out to each file according to the CloudWatch logs, but when I actually go to view the JSON files, not all the response data is there for all 7 images. Sometimes the data of 5 images is in the JSON, other times it's 4 images. The data doesn't have to be in any specific order, it just has to be there. I've also tested it by uploading 18 images at once, and only the response of 10 images was in the JSON.
I believe the problem is that I'm calling the 'getObject' API on the JSON files, then I append the response data, and then I call the 'putObject' API to put them back into the S3 bucket. While the first image is going through this process, the next image wants to do the same, but there is no file to call 'getObject' on because it is busy with the previous image, so it just skips over that image, even though the CloudWatch logs say it has been added to the files.
I have no idea how to work around this. I believe the answer lies in asynchronous JavaScript (which I don't know much about, so I have no idea where to begin).
My apologies for the long post. Here is my code below:
const AWS = require('aws-sdk');
const s3 = new AWS.S3({apiVersion: "2006-03-01"});
const rekognition = new AWS.Rekognition();
//const docClient = new AWS.DynamoDB.DocumentClient();
const uuidv4 = require('uuid/v4');
let bucket, key;
let dataSaveDate = new Date();
console.log('Loading function');
//-----------------------------------Exports Function---------------------------
exports.handler = function(event, context) {
bucket = event.Records[0].s3.bucket.name;
key = event.Records[0].s3.object.key;
console.log(bucket);
console.log(key);
searchingFacesByImage(bucket, key);
};
//---------------------------------------------------------------------------
// Search for a face in an input image
function searchingFacesByImage(bucket, key) {
let params = {
CollectionId: "allFaces",
FaceMatchThreshold: 95,
Image: {
S3Object: {
Bucket: bucket,
Name: key
}
},
MaxFaces: 5
};
const searchingFace = rekognition.searchFacesByImage(params, function(err, searchdata) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
// console.log(JSON.stringify(searchdata, null, '\t'));
// if data.FaceMatches > 0 : There that face in the image exists in the collection
if (searchdata.FaceMatches.length > 0) {
console.log("Face is a match");
} else {
console.log("Face is not a match");
let mapping_id = createRandomId();
console.log(`Created mapping_id: ${mapping_id}`);
console.log("Start indexing face to 'allFaces'");
indexToAllFaces(mapping_id, searchdata, bucket, key);
}
}
});
return searchingFace;
}
//---------------------------------------------------------------------------
// If face is not a match in 'allFaces', index face to 'allFaces' using mapping_id
function indexToAllFaces(mapping_id, searchData, bucket, key) {
let params = {
CollectionId: "allFaces",
DetectionAttributes: ['ALL'],
ExternalImageId: mapping_id,
Image: {
S3Object: {
Bucket: bucket,
Name: key
}
}
};
const indexFace = rekognition.indexFaces(params, function(err, data) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
console.log("INDEXING TO 'allFaces'");
//console.log(JSON.stringify(data, null, '\t'));
logAllData(mapping_id, bucket, key, searchData, data);
}
});
return indexFace;
}
//---------------------------------------------------------------------------
// Counting how many times a face has been indexed and logging ALL data in a single log
function logAllData(mapping_id, bucket, key, searchData, data) {
let params = {
CollectionId: mapping_id,
MaxResults: 20
};
const faceDetails = rekognition.listFaces(params, function(err, facedata) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
//console.log(JSON.stringify(facedata, null, '\t'));
metadata(mapping_id, bucket, key, facedata);
indexing(mapping_id, bucket, searchData);
rekognitionData(mapping_id, bucket, data);
}
});
return faceDetails;
}
//-----------------------------------------------------------------------------
function metadata(mapping_id, bucket, key, faceData) {
let body = [
{
"mapping_id": mapping_id,
"time": dataSaveDate,
"image_name": key,
"indexing_count": faceData.Faces.length - 1
}
];
//console.log(JSON.stringify(body, null, '\t'));
logData("metadata.json", bucket, body);
}
//------------------------------------------------------------------------------
function indexing(mapping_id, bucket, searchData) {
let body = [
{
"mapping_id": mapping_id,
"time": dataSaveDate,
"IndexingData": searchData
}
];
logData("indexing.json", bucket, body);
}
//------------------------------------------------------------------------------
function rekognitionData(mapping_id, bucket, data) {
let body = [
{
"mapping_id": mapping_id,
"time": dataSaveDate,
"rekognition": data
}
];
logData("rekognition.json", bucket, body);
}
//------------------------------------------------------------------------------
// Function to log all data to JSON files
function logData(jsonFileName, bucket, body) {
let params = {
Bucket: bucket,
Key: jsonFileName
};
const readFile = s3.getObject(params, function(err, filedata) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
console.log(`READING ${jsonFileName} CONTENTS`);
// Read data from 'jsonFileName'
let raw_content = filedata.Body.toString();
let content = JSON.parse(raw_content);
// Add new data to 'jsonFileName'
content.push(...body);
// Put new data back into jsonFileName
s3.putObject(
{
Bucket: bucket,
Key: jsonFileName,
Body: JSON.stringify(content, null, '\t'),
ContentType: "application/json"
},
function(err, res) {
if (err) {
console.log(err);
} else {
console.log(`DATA SAVED TO ${jsonFileName}`);
}
}
);
}
});
return readFile;
}
//----------------------------------SCRIPT ENDS---------------------------------
When a Node.js Lambda reaches the end of its main handler, any asynchronous work that is still in flight can be frozen or dropped.
To make sure that the Lambda does not prematurely terminate that work, wait until the promise is complete by using await.
The functions s3.getObject and s3.putObject can be made into a Promise like this:
await s3.getObject(params).promise()
await s3.putObject(params).promise()
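Applied to the code above, a minimal sketch of that idea (same function and variable names as in the question, error handling omitted, untested against the original setup) makes the handler async and awaits each step, so the Lambda stays alive until the JSON files are rewritten:
exports.handler = async function(event) {
    const bucket = event.Records[0].s3.bucket.name;
    const key = event.Records[0].s3.object.key;
    // Wait for the face search before deciding whether to index.
    const searchdata = await rekognition.searchFacesByImage({
        CollectionId: "allFaces",
        FaceMatchThreshold: 95,
        Image: { S3Object: { Bucket: bucket, Name: key } },
        MaxFaces: 5
    }).promise();
    if (searchdata.FaceMatches.length === 0) {
        const mapping_id = createRandomId();
        const data = await rekognition.indexFaces({
            CollectionId: "allFaces",
            DetectionAttributes: ['ALL'],
            ExternalImageId: mapping_id,
            Image: { S3Object: { Bucket: bucket, Name: key } }
        }).promise();
        // logAllData (and logData inside it) would likewise need to await
        // s3.getObject(...).promise() and s3.putObject(...).promise() and return that promise.
        await logAllData(mapping_id, bucket, key, searchdata, data);
    }
};
Note that even with await, two concurrent invocations can still interleave their getObject/putObject read-modify-write of the shared JSON files, which is the other effect described in the question.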

Serverless Lambda trigger to read a JSON file

I have a Lambda (Node) with a trigger that fires when a new JSON file is added to our S3 bucket. Here is my Lambda code:
module.exports.bookInfo = (event, context) => {
console.log('Events ', JSON.stringify(event));
event.Records.forEach((record) => {
const filename = record.s3.object.key;
const bucketname = record.s3.bucket.name;
let logMsg = [];
const s3File = `BucketName: [${bucketname}] FileName: [${filename}]`;
console.log(s3File)
logMsg.push(`Lambda execution started for ${s3File}, Trying to download file from S3`);
try {
s3.getObject({
Bucket: bucketname,
Key: filename
}, function(err, data) {
logMsg.push('Data is ', JSON.stringify(data.Body))
if (err) {
logMsg.push('Generate Error :', err);
console.log(logMsg)
return null;
}
logMsg.push(`File downloaded successfully. Processing started for ${s3File}`);
logMsg.push('Data is ', JSON.stringify(data.Body))
});
} catch (e) {console.log(e)}
});
}
When I run this, I don't get the file content, and I suspect that the Lambda finishes execution before the file read operation completes. I tried async/await without success. What am I missing here? I was able to read a small file of 1 KB, but when the file grows to something like 100 MB, it causes issues.
Thanks in advance
I was able to do it through async/await. Here is my code
module.exports.bookInfo = (event, context) => {
event.Records.forEach(async(record) => {
const filename = record.s3.object.key;
const bucketname = record.s3.bucket.name;
const s3File = `BucketName: [${bucketname}] FileName: [${filename}]`;
logMsg.push(`Lambda execution started for ${s3File}, Trying to download file from S3`);
let response = await s3.getObject({
Bucket: bucketname,
Key: filename
}).promise();
})
}
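One caveat with this version (the adjustment below is only a sketch, not verified against the original setup): forEach does not wait for its async callbacks, so the handler can still return before every getObject finishes, and logMsg is never declared. Making the handler itself async and iterating with for...of keeps the Lambda alive until all reads complete:
module.exports.bookInfo = async (event) => {
    for (const record of event.Records) {
        const filename = record.s3.object.key;
        const bucketname = record.s3.bucket.name;
        const logMsg = [];
        const s3File = `BucketName: [${bucketname}] FileName: [${filename}]`;
        logMsg.push(`Lambda execution started for ${s3File}, Trying to download file from S3`);
        // Awaiting here ensures the handler does not return early.
        const response = await s3.getObject({
            Bucket: bucketname,
            Key: filename
        }).promise();
        logMsg.push(`File downloaded successfully, body size: ${response.Body.length} bytes`);
        console.log(logMsg);
    }
};
For files approaching 100 MB, also check the function's memory and timeout settings, since getObject().promise() buffers the whole object in memory.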

How can I delete a folder on S3 with Node.js?

Yes, I know: there is no folder concept in S3 storage. But I really want to delete a specific folder from S3 with Node.js. I tried two solutions, but neither worked.
My code is below:
Solution 1:
Deleting the folder directly.
var key='level/folder1/folder2/';
var strReturn;
var params = {Bucket: MyBucket};
var s3 = new AWS.S3(params);
s3.client.listObjects({
Bucket: MyBucket,
Key: key
}, function (err, data) {
if(err){
strReturn="{\"status\":\"1\"}";
}else{
strReturn=+"{\"status\":\"0\"}";
}
res.send(returnJson);
console.log('error:'+err+' data:'+JSON.stringify(data));
});
Actually, I have a lot of files under folder2. I can delete a single file from folder2 if I define the key like this:
var key='level/folder1/folder2/file1.txt', but it didn't work when I tried to delete a folder (key='level/folder1/folder2/').
Solution 2:
I tried to set an expiration on the object when I uploaded the file or folder to S3. The code is below:
s3.client.putObject({
Bucket: Camera_Bucket,
Key: key,
ACL:'public-read',
Expires: 60
}
But that didn't work either. After the upload finished, I checked the properties of that file; it showed no value for the expiry date:
Expiry Date:none
Expiration Rule:N/A
How can I delete a folder on S3 with Node.js?
Here is an implementation in ES7 with an async function and using listObjectsV2 (the revised List Objects API):
async function emptyS3Directory(bucket, dir) {
const listParams = {
Bucket: bucket,
Prefix: dir
};
const listedObjects = await s3.listObjectsV2(listParams).promise();
if (listedObjects.Contents.length === 0) return;
const deleteParams = {
Bucket: bucket,
Delete: { Objects: [] }
};
listedObjects.Contents.forEach(({ Key }) => {
deleteParams.Delete.Objects.push({ Key });
});
await s3.deleteObjects(deleteParams).promise();
if (listedObjects.IsTruncated) await emptyS3Directory(bucket, dir);
}
To call it:
await emptyS3Directory(process.env.S3_BUCKET, 'images/')
You can use the aws-sdk module to delete a folder. Because you can only delete a folder when it is empty, you should first delete the files in it. I'm doing it like this:
function emptyBucket(bucketName,callback){
var params = {
Bucket: bucketName,
Prefix: 'folder/'
};
s3.listObjects(params, function(err, data) {
if (err) return callback(err);
if (data.Contents.length == 0) return callback();
params = {Bucket: bucketName};
params.Delete = {Objects:[]};
data.Contents.forEach(function(content) {
params.Delete.Objects.push({Key: content.Key});
});
s3.deleteObjects(params, function(err, data) {
if (err) return callback(err);
if (data.IsTruncated) {
emptyBucket(bucketName, callback);
} else {
callback();
}
});
});
}
A much simpler way is to fetch all objects (keys) at that path and delete them. Each call fetches up to 1,000 keys, and s3.deleteObjects can also delete up to 1,000 keys per request. Do that recursively to achieve the goal.
Written in TypeScript:
/**
* delete a folder recursively
* @param bucket
* @param path - without end /
*/
deleteFolder(bucket: string, path: string) {
return new Promise((resolve, reject) => {
// get all keys and delete objects
const getAndDelete = (ct: string = null) => {
this.s3
.listObjectsV2({
Bucket: bucket,
MaxKeys: 1000,
ContinuationToken: ct,
Prefix: path + "/",
Delimiter: "",
})
.promise()
.then(async (data) => {
// params for delete operation
let params = {
Bucket: bucket,
Delete: { Objects: [] },
};
// add keys to Delete Object
data.Contents.forEach((content) => {
params.Delete.Objects.push({ Key: content.Key });
});
// delete all keys
await this.s3.deleteObjects(params).promise();
// check if ct is present
if (data.NextContinuationToken) getAndDelete(data.NextContinuationToken);
else resolve(true);
})
.catch((err) => reject(err));
};
// init call
getAndDelete();
});
}
According to the doc at https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html:
A response can contain CommonPrefixes only if you specify a delimiter.
CommonPrefixes contains all (if there are any) keys between Prefix and the next occurrence of the string specified by the delimiter.
Omitting the Delimiter parameter will make ListObjects return all keys starting with the Prefix parameter.
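For illustration, a fragment to run inside an async function (bucket and prefix names are placeholders, AWS SDK v2 style):
// With a delimiter, immediate "subfolders" are grouped into CommonPrefixes.
const grouped = await s3.listObjectsV2({ Bucket: 'my-bucket', Prefix: 'level/folder1/', Delimiter: '/' }).promise();
// grouped.CommonPrefixes -> e.g. [{ Prefix: 'level/folder1/folder2/' }]
// Without a delimiter, every key under the prefix is returned in Contents,
// including keys inside nested "subfolders".
const flat = await s3.listObjectsV2({ Bucket: 'my-bucket', Prefix: 'level/folder1/' }).promise();
// flat.Contents -> every object key under level/folder1/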
Based on the accepted answer, I created a promise-returning function, so you can chain it.
function emptyBucket(bucketName){
let currentData;
let params = {
Bucket: bucketName,
Prefix: 'folder/'
};
return S3.listObjects(params).promise().then(data => {
if (data.Contents.length === 0) {
throw new Error('List of objects empty.');
}
currentData = data;
params = {Bucket: bucketName};
params.Delete = {Objects:[]};
currentData.Contents.forEach(content => {
params.Delete.Objects.push({Key: content.Key});
});
return S3.deleteObjects(params).promise();
}).then(() => {
if (currentData.Contents.length === 1000) {
return emptyBucket(bucketName);
} else {
return true;
}
});
}
The accepted answer throws an error when used in TypeScript. I made it work by modifying the code in the following way. I'm very new to TypeScript, but at least it is working now.
async function emptyS3Directory(prefix: string) {
const listParams = {
Bucket: "bucketName",
Prefix: prefix, // ex. path/to/folder
};
const listedObjects = await s3.listObjectsV2(listParams).promise();
if (listedObjects.Contents.length === 0) return;
const deleteParams = {
Bucket: bucketName,
Delete: { Objects: [] as any },
};
listedObjects.Contents.forEach((content: any) => {
deleteParams.Delete.Objects.push({ Key: content.Key });
});
await s3.deleteObjects(deleteParams).promise();
if (listedObjects.IsTruncated) await emptyS3Directory(prefix);
}
Better solution with @aws-sdk/client-s3 module:
private async _deleteFolder(key: string, bucketName: string): Promise<void> {
const DeletePromises: Promise<DeleteObjectCommandOutput>[] = [];
const { Contents } = await this.client.send(
new ListObjectsCommand({
Bucket: bucketName,
Prefix: key,
}),
);
if (!Contents) return;
Contents.forEach(({ Key }) => {
DeletePromises.push(
this.client.send(
new DeleteObjectCommand({
Bucket: bucketName,
Key,
}),
),
);
});
await Promise.all(DeletePromises);
}
ListObjectsCommand returns the keys of files in the folder, even within subfolders.
listObjectsV2 lists files only under the current directory Prefix, not under subfolder prefixes. If you want to delete a folder with its subfolders recursively, this is the source code: https://github.com/tagspaces/tagspaces-common/blob/develop/packages/common-aws/io-objectstore.js#L1060
deleteDirectoryPromise = async (path: string): Promise<Object> => {
const prefixes = await this.getDirectoryPrefixes(path);
if (prefixes.length > 0) {
const deleteParams = {
Bucket: this.config.bucketName,
Delete: { Objects: prefixes }
};
return this.objectStore.deleteObjects(deleteParams).promise();
}
return this.objectStore
.deleteObject({
Bucket: this.config.bucketName,
Key: path
})
.promise();
};
/**
* get recursively all aws directory prefixes
* @param path
*/
getDirectoryPrefixes = async (path: string): Promise<any[]> => {
const prefixes = [];
const promises = [];
const listParams = {
Bucket: this.config.bucketName,
Prefix: path,
Delimiter: '/'
};
const listedObjects = await this.objectStore
.listObjectsV2(listParams)
.promise();
if (
listedObjects.Contents.length > 0 ||
listedObjects.CommonPrefixes.length > 0
) {
listedObjects.Contents.forEach(({ Key }) => {
prefixes.push({ Key });
});
listedObjects.CommonPrefixes.forEach(({ Prefix }) => {
prefixes.push({ Key: Prefix });
promises.push(this.getDirectoryPrefixes(Prefix));
});
// if (listedObjects.IsTruncated) await this.deleteDirectoryPromise(path);
}
const subPrefixes = await Promise.all(promises);
subPrefixes.map(arrPrefixes => {
arrPrefixes.map(prefix => {
prefixes.push(prefix);
});
});
return prefixes;
};
You can try this:
import { s3DeleteDir } from '@zvs001/s3-utils'
import { S3 } from 'aws-sdk'
const s3Client = new S3()
await s3DeleteDir(s3Client, {
Bucket: 'my-bucket',
Prefix: `folder/`,
})
I like the list-objects-then-delete approach, which is what the AWS command line does behind the scenes, by the way. But I didn't want to await the list (a few seconds) before deleting, so I use this one-step (background) process, which I found slightly faster. You can await the child process if you really want to confirm deletion, but I found that took around 10 seconds, so I don't bother; I just fire and forget and check the logs instead. The entire API call, with other work included, now takes 1.5 s, which is fine for my situation.
var CHILD = require("child_process").exec;
function removeImagesAndTheFolder(folder_name_str, callback){
var cmd_str = "aws s3 rm s3://"
+ IMAGE_BUCKET_STR
+ "/" + folder_name_str
+ "/ --recursive";
if(process.env.NODE_ENV === "development"){
//When not on an EC2 with a role I use my profile
cmd_str += " " + "--profile " + LOCAL_CONFIG.PROFILE_STR;
}
// In my situation I return early for the user. You could make them wait tho'.
callback(null, {"msg_str": "Check later that these images were actually removed."});
//do not return yet still stuff to do
CHILD(cmd_str, function(error, stdout, stderr){
if(error || stderr){
console.log("Problem removing this folder with a child process:" + stderr);
}else{
console.log("Child process completed, here are the results", stdout);
}
});
}
I suggest you do it in 2 steps, so you can follow what is happening (with a progress bar, etc.):
Get all keys to remove
Remove keys
Of course, step 1 is a recursive function, such as:
https://gist.github.com/ebuildy/7ac807fd017452dfaf3b9c9b10ff3b52#file-my-s3-client-ts
import { ListObjectsV2Command, S3Client, S3ClientConfig } from "@aws-sdk/client-s3"
/**
* Get all keys recursively
* @param Prefix
* @returns
*/
public async listObjectsRecursive(Prefix: string, ContinuationToken?: string): Promise<
any[]
> {
// Get objects for current prefix
const listObjects = await this.client.send(
new ListObjectsV2Command({
Delimiter: "/",
Bucket: this.bucket.name,
Prefix,
ContinuationToken
})
);
let deepFiles, nextFiles
// Recursive call to get sub-prefixes
if (listObjects.CommonPrefixes) {
const deepFilesPromises = listObjects.CommonPrefixes.flatMap(({Prefix}) => {
return this.listObjectsRecursive(Prefix)
})
deepFiles = (await Promise.all(deepFilesPromises)).flatMap(t => t)
}
// If we must paginate
if (listObjects.IsTruncated) {
nextFiles = await this.listObjectsRecursive(Prefix, listObjects.NextContinuationToken)
}
return [
...(listObjects.Contents || []),
...(deepFiles || []),
...(nextFiles || [])
]
}
Then, delete all objects:
public async deleteKeys(keys: string[]): Promise<any[]> {
function spliceIntoChunks(arr: any[], chunkSize: number) {
const res = [];
while (arr.length > 0) {
const chunk = arr.splice(0, chunkSize);
res.push(chunk);
}
return res;
}
const allKeysToRemovePromises = keys.map(k => this.listObjectsRecursive(k))
const allKeysToRemove = (await Promise.all(allKeysToRemovePromises)).flatMap(k => k)
const allKeysToRemoveGroups = spliceIntoChunks(allKeysToRemove, 3)
const deletePromises = allKeysToRemoveGroups.map(group => {
return this.client.send(
new DeleteObjectsCommand({
Bucket: this.bucket.name,
Delete: {
Objects: group.map(({Key}) => {
return {
Key
}
})
}
})
)
})
const results = await Promise.all(deletePromises)
return results.flatMap(({$metadata, Deleted}) => {
return Deleted.map(({Key}) => {
return {
status: $metadata.httpStatusCode,
key: Key
}
})
})
}
Based on Emi's answer, I made an npm package so you don't need to write the code yourself. The code is also written in TypeScript.
See https://github.com/bingtimren/s3-commons/blob/master/src/lib/deleteRecursive.ts
You can delete an empty folder the same way you delete a file. In order to delete a non-empty folder on AWS S3, you'll need to empty it first by deleting all files and folders inside. Once the folder is empty, you can delete it as a regular file. The same applies to the bucket deletion. We've implemented it in this app called Commandeer so you can do it from a GUI.
