Delete file from S3 bucket based on metadata? - node.js

I've tried specifying metadata in parameters passed to DeleteObjectCommand, but the file gets deleted whether or not it has the provided metadata...
static async deleteFile(req, res) {
try {
const key = basename(req.body.key) //file that will be deleted
const params = {
Bucket: "my_bucket",
Key: key,
Metadata: { userId: req.user._id }
}
const data = await s3.send(new DeleteObjectCommand(params))
return res.send(data)
}
catch (err) {
console.log(err)
}
}
I have different users' files in the same bucket and want to prevent one app user from deleting another user's files. Is there a better way to do this?
Thanks in advance
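One possible approach, sketched under assumptions: as far as I know, DeleteObject has no Metadata condition, which would explain why the extra field has no effect. The helper below (the name deleteIfOwned is hypothetical) first reads the object's metadata with HeadObjectCommand and only sends DeleteObjectCommand when the stored userId matches. It assumes the file was uploaded with Metadata: { userId }, and that `s3` is an S3Client from @aws-sdk/client-s3. The client and command classes are passed in as arguments, so the sketch itself has no hard dependency on the SDK.

```javascript
// Sketch: delete an object only if its stored metadata names the caller as owner.
// `deps.s3` is assumed to be an S3Client; the command classes come from
// @aws-sdk/client-s3 and are injected by the caller (hypothetical design).
async function deleteIfOwned(deps, bucket, key, userId) {
  const { s3, HeadObjectCommand, DeleteObjectCommand } = deps;

  // HeadObject returns the object's metadata without downloading the body.
  const head = await s3.send(new HeadObjectCommand({ Bucket: bucket, Key: key }));

  // S3 returns user-defined metadata keys in lowercase, so `userId` becomes `userid`.
  const owner = head.Metadata ? head.Metadata.userid : undefined;

  // Metadata values are strings, so compare against the stringified user id.
  if (owner !== String(userId)) {
    return { deleted: false, reason: "not-owner" };
  }

  await s3.send(new DeleteObjectCommand({ Bucket: bucket, Key: key }));
  return { deleted: true };
}
```

In the route handler this could be called as deleteIfOwned({ s3, HeadObjectCommand, DeleteObjectCommand }, "my_bucket", key, req.user._id), responding with 403 when deleted is false. The check and the delete are two separate requests, so a common alternative is to store each user's files under a per-user key prefix, or to keep ownership in the application database.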

Related

How to delete files with NodeJS? [duplicate]

I have two async functions:
One creates an object in PostgreSQL and uploads the files named in that object.
The other deletes this entity and deletes its files from the folder.
I don't know how to extract the file names from the PostgreSQL entity and delete those files from my 'static' folder.
The entity in PostgreSQL looks like:
Car:
{
'name': 'Nissan 350Z',
'description': 'new',
'image': '123wejdsefberfkj.jpg',
'video': '23rusdjf8ioysdfs.mp4'
}
Create function:
Here I get the files from form-data, create a unique name for each file and save the names in PostgreSQL, then save the files in the "static" folder.
let videoName = uuid.v4() + '.mp4';
let imageName = uuid.v4() + '.jpg';
let {name,description} = req.body;
const {video, image} = req.files;
const car = await Car.create( {name, description, video: videoName,image: imageName})
.then(video.mv(path.resolve(__dirname,'..', 'static', videoName)))
.then(image.mv(path.resolve(__dirname,'..', 'static', imageName)))
Delete function:
Here I need to extract the file names from the database and delete the files from the folder.
Some pseudocode:
async delete(req, res) {
try {
const {id} = req.params;
await Car.findOne({where:{id}})
.then( async data => {
if(data) {
await let videoName = Car.#extract_video_name# ({where: {id}})
.then(mv(path.delete(__dirname,'..','static',videoName)))
await Car.destroy({where:{id}}).then(() => {
return res.json("Car deleted");
})
} else {
return res.json("This Car doesn't exist in database");
}
})
} catch (e) {
console.error(e)
}
}
You can use fs; require it with const fs = require('fs');
There is no npm package to install, since it is included in Node.js.
This code saves a file into a directory, creating the folder first if it doesn't exist:
let path = String(`./directoryName/${fileName}`);
fs.mkdirSync('./directoryName', { recursive: true });
fs.writeFileSync(path, data);
And you can delete the file using
fs.unlink(path, (err) => {
if (err) throw err; // handle the error the way you want to
console.log(`${path} was deleted`); // reached only when the file was deleted
});
Reference: https://nodejs.org/api/fs.html
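Applied to the question's delete function, the idea could look like the sketch below. It assumes the record has `image` and `video` fields holding file names, as in the example entity. The helper name removeCarFiles and the staticDir parameter are hypothetical. It uses fs.rm with force: true from fs/promises, so a file that is already missing does not raise an error.

```javascript
const fs = require("fs/promises");
const path = require("path");

// Sketch: remove the media files named on a record from the static folder.
// Returns how many file names were processed.
async function removeCarFiles(staticDir, car) {
  // Skip fields that are empty or missing on the record.
  const names = [car.image, car.video].filter(Boolean);

  await Promise.all(
    names.map((name) =>
      // basename() keeps a stored name from pointing outside staticDir;
      // force: true makes a missing file a no-op instead of an error.
      fs.rm(path.join(staticDir, path.basename(name)), { force: true })
    )
  );
  return names.length;
}
```

Assuming Car is a Sequelize-style model, as the question suggests, the delete handler could load the record with Car.findOne({ where: { id } }), call removeCarFiles(path.resolve(__dirname, '..', 'static'), car), and then call Car.destroy({ where: { id } }).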

how to make formidable not save to var/folders on nodejs and express app

I'm using formidable to parse incoming files and store them on AWS S3.
While debugging the code I found out that formidable first saves the file to disk at /var/folders/, and over time unnecessary files pile up on disk, which could become a big problem later.
It was silly of me to use code without fully understanding it, and now
I have to figure out how to either remove the parsed file after saving it to S3 or save it to S3 without storing it on disk.
But the question is: how do I do it?
I would appreciate it if someone could point me in the right direction.
This is how I handle the files:
import formidable, { Files, Fields } from 'formidable';
const form = new formidable.IncomingForm();
form.parse(req, async (err: any, fields: Fields, files: Files) => {
let uploadUrl = await util
.uploadToS3({
file: files.uploadFile,
pathName: 'myPathName/inS3',
fileKeyName: 'file',
})
.catch((err) => console.log('S3 error =>', err));
});
This is how I solved this problem:
When I parse the incoming multipart form data, I have access to all the details of the files, because they have already been parsed and saved to local disk on the server (my computer). So, using the path given to me by formidable, I remove the file with Node's built-in fs.unlink function. Of course, I remove the file only after saving it to AWS S3.
This is the code:
import fs from 'fs';
import formidable, { Files, Fields } from 'formidable';
const form = new formidable.IncomingForm();
form.multiples = true;
form.parse(req, async (err: any, fields: Fields, files: Files) => {
const pathArray = [];
try {
const s3Url = await util.uploadToS3(files);
// do something with the s3Url
pathArray.push(files.uploadFileName.path);
} catch(error) {
console.log(error)
} finally {
pathArray.forEach((element: string) => {
fs.unlink(element, (err: any) => {
if (err) console.error('error:',err);
});
});
}
})
I also found a solution which you can take a look at here, but because of my architecture I found it slightly hard to implement without changing my original code (or let's just say I didn't fully understand the given implementation).
I think I found it. According to the docs (see options.fileWriteStreamHandler), "you need to have a function that will return an instance of a Writable stream that will receive the uploaded file data. With this option, you can have any custom behavior regarding where the uploaded file data will be streamed for. If you are looking to write the file uploaded in other types of cloud storages (AWS S3, Azure blob storage, Google cloud storage) or private file storage, this is the option you're looking for. When this option is defined the default behavior of writing the file in the host machine file system is lost."
const form = formidable({
fileWriteStreamHandler: someFunction,
});
EDIT: My whole code
import formidable from "formidable";
import { Writable } from "stream";
import { Buffer } from "buffer";
import { v4 as uuidv4 } from "uuid";
export const config = {
api: {
bodyParser: false,
},
};
const formidableConfig = {
keepExtensions: true,
maxFileSize: 10_000_000,
maxFieldsSize: 10_000_000,
maxFields: 2,
allowEmptyFiles: false,
multiples: false,
};
// promisify formidable
function formidablePromise(req, opts) {
return new Promise((accept, reject) => {
const form = formidable(opts);
form.parse(req, (err, fields, files) => {
if (err) {
return reject(err);
}
return accept({ fields, files });
});
});
}
const fileConsumer = (acc) => {
const writable = new Writable({
write: (chunk, _enc, next) => {
acc.push(chunk);
next();
},
});
return writable;
};
// inside the handler
export default async function handler(req, res) {
const token = uuidv4();
try {
const chunks = [];
const { fields, files } = await formidablePromise(req, {
...formidableConfig,
// consume this, otherwise formidable tries to save the file to disk
fileWriteStreamHandler: () => fileConsumer(chunks),
});
// do something with the files
const contents = Buffer.concat(chunks);
const bucketRef = storage.bucket("your bucket");
const file = bucketRef.file(files.mediaFile.originalFilename);
await file
.save(contents, {
public: true,
metadata: {
contentType: files.mediaFile.mimetype,
metadata: { firebaseStorageDownloadTokens: token },
},
})
.then(() => {
file.getMetadata().then((data) => {
const fileName = data[0].name;
const media_path = `https://firebasestorage.googleapis.com/v0/b/${bucketRef?.id}/o/${fileName}?alt=media&token=${token}`;
console.log("File link", media_path);
});
});
} catch (e) {
// handle errors
console.log("ERR PREJ ...", e);
}
}
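As a variation on the handler above, the Writable could be a PassThrough stream that is handed straight to an uploader, so the file is not buffered in memory first. This is only a sketch: makeStreamingHandler and startUpload are hypothetical names, and startUpload is assumed to be a function that takes a readable stream and returns a Promise for the finished upload.

```javascript
const { PassThrough } = require("stream");

// Sketch: build a fileWriteStreamHandler that forwards each uploaded file
// to `startUpload` as a stream instead of collecting chunks in memory.
function makeStreamingHandler(startUpload) {
  const uploads = [];

  // formidable calls this once per file and writes the file data into
  // whatever Writable it returns.
  const handler = () => {
    const pass = new PassThrough();
    // The uploader starts reading right away; keep its Promise for later.
    uploads.push(startUpload(pass));
    return pass;
  };

  // Call after form.parse has finished, to wait for every upload to settle.
  const done = () => Promise.all(uploads);

  return { handler, done };
}
```

It would be wired up as formidable({ fileWriteStreamHandler: handler }), followed by await done() once parsing has completed. For S3, startUpload could wrap the Upload class from @aws-sdk/lib-storage, for example (Body) => new Upload({ client, params: { Bucket, Key, Body } }).done(), where client, Bucket and Key are placeholders you would supply.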

Retrieve the attributes from a file using nodejs?

Hi, I'm looking for Node.js code that returns the attributes of each file in a folder. I wrote code to retrieve all the file names in a folder, and other code to list the data of a file whose name we provide. But what I actually need is to return all the file names in a folder together with their column names. I'm new to Node.js, so could someone please help?
LISTING DATA CODE:
const AWS = require('aws-sdk');
const neatCsv = require('neat-csv');
var s3 = new AWS.S3({});
exports.handler = (event,context,callback)=>{
const params = {
Bucket:'ml-framework-api',
Key: 'wavicle.csv'
};
s3.getObject(params,async(err, result) => {
if (err){
return console.error(err);
}
neatCsv(result.Body).then((parsedData) => {
callback(null,parsedData);
})
})
}
LISTING FILE IN S3:
const AWS = require('aws-sdk')
const s3 = new AWS.S3({
accessKeyId:'-------------',
secretAccessKey:'-------------------',
region:'ap-south-1'
})
const params = {
Bucket:'wavicle'
}
s3.listObjects(params,(err,data)=>{
if(err){
return console.log(err)
}
console.log(data)
})
It's best to start with Node's file system API documentation.
Here is a simple example of how to get information about the files in a folder (there are many ways; this one is adapted from the example in the documentation above):
const fsp = require("fs/promises");
const path = require("path");
async function dirFilesInfo(dirPath) {
const dir = await fsp.opendir(dirPath);
for await (const dirEntry of dir) {
// resolve each entry against the directory being listed, not the current working directory
const fileInfo = await fsp.stat(path.join(dirPath, dirEntry.name));
console.log(dirEntry.name, fileInfo);
}
}
dirFilesInfo("./").catch(console.error);
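If the files live in S3, as in the question's two snippets, listing and reading could be combined roughly as below. This is a sketch under assumptions: `s3` is an AWS.S3 instance from the v2 SDK used in the question, the function name listCsvColumns is hypothetical, and the header row is split on plain commas, so quoted column names that contain commas are not handled. A single listObjectsV2 call returns at most 1000 keys, so larger buckets would need pagination.

```javascript
// Sketch: map every .csv key in a bucket to the column names in its first line.
// `s3` is assumed to be an AWS.S3 (SDK v2) instance supplied by the caller.
async function listCsvColumns(s3, bucket) {
  const listing = await s3.listObjectsV2({ Bucket: bucket }).promise();
  const result = {};

  for (const { Key } of listing.Contents || []) {
    // Only CSV files have a header row worth reading.
    if (!Key.toLowerCase().endsWith(".csv")) continue;

    const obj = await s3.getObject({ Bucket: bucket, Key }).promise();

    // Take the first line only and split it naively on commas.
    const firstLine = obj.Body.toString("utf8").split(/\r?\n/, 1)[0];
    result[Key] = firstLine.split(",").map((col) => col.trim());
  }
  return result;
}
```

Inside the Lambda handler this could be used as const columns = await listCsvColumns(s3, 'ml-framework-api'), which gives an object keyed by file name. For large files it downloads more than it needs; a ranged getObject or a streaming CSV parser would be the next step.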

Reading a ZIP archive from S3, and writing uncompressed version to new bucket

I have an app where users can upload a ZIP archive of resources. My app handles the upload and saves it to S3. At some point I want to run a transformation that reads the archive from this S3 bucket, unzips it, and writes the result to a new S3 bucket. This all happens in a Node service.
I am using the unzipper library to handle unzipping. Here is my initial code.
async function downloadFromS3() {
let s3 = new AWS.S3();
try {
const object = s3
.getObject({
Bucket: "zip-bucket",
Key: "Archive.zip"
})
.createReadStream();
object.on("error", err => {
console.log(err);
});
await streaming_unzipper(object, s3);
} catch (e) {
console.log(e);
}
}
async function streaming_unzipper(s3ObjectStream, s3) {
await s3.createBucket({ Bucket: "unzip-bucket" }).promise();
const unzipStream = s3ObjectStream.pipe(unzipper.Parse());
unzipStream.pipe(
stream.Transform({
objectMode: true,
transform: function(entry, e, next) {
const fileName = entry.path;
const type = entry.type; // 'Directory' or 'File'
const size = entry.vars.uncompressedSize; // There is also compressedSize;
if (type === "File") {
s3.upload(
{ Bucket: "unzip-bucket", Body: entry, Key: entry.path },
{},
function(err, data) {
if (err) console.error(err);
console.log(data);
entry.autodrain();
}
);
next();
} else {
entry.autodrain();
next();
}
}
})
);
}
This code works, but I feel like it could be optimized. Ideally I would like to pipe the download stream -> unzipper stream -> uploader stream, so that chunks are uploaded to S3 as they get unzipped, instead of waiting for the full file to finish unzipping and then uploading.
The problem I am running into is that I need the file name (to set as the S3 key) before I can start the upload, and I only have it after unzipping.
Is there any good way to create a streaming upload to S3 that is initiated with a temporary ID and gets renamed to the final name after the full stream is finished?
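One way to think about it, as a sketch under assumptions: each entry emitted by the parser already carries entry.path before its data has been read, as the question's own code shows, so the key can be set when the upload starts and no temporary ID is needed. The helper below (uploadEntries and the upload callback are hypothetical names) walks the entries one at a time and waits for each upload before moving on, so the unzip does not run ahead of the uploads.

```javascript
// Sketch: upload each file entry of an archive as it is parsed.
// `entries` is any async iterable of entries with `type`, `path` and
// `autodrain()`; `upload(key, body)` is supplied by the caller and
// returns a Promise that resolves once the body has been consumed.
async function uploadEntries(entries, upload) {
  const keys = [];

  for await (const entry of entries) {
    if (entry.type === "File") {
      // The entry itself is the readable stream of uncompressed data.
      await upload(entry.path, entry);
      keys.push(entry.path);
    } else {
      // Directories carry no data to upload; discard them.
      entry.autodrain();
    }
  }
  return keys;
}
```

With unzipper, the iterable could be s3ObjectStream.pipe(unzipper.Parse({ forceStream: true })), assuming the forceStream option described in the unzipper README, and upload could be (Key, Body) => s3.upload({ Bucket: "unzip-bucket", Key, Body }).promise(), which accepts a stream as Body.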

JSON files does not contain all the results in AWS Lambda using NodeJS

I'm currently working on a project using AWS S3, Rekognition and Lambda. I'm writing in NodeJS and have created a working solution for what I want to achieve. The workflow in short is: an image of a face is loaded onto an S3 bucket, then the 'searchFacesByImage' API is called to see if that face has been indexed to the Master collection in the past. If it is a new face, the result will be false, and the 'indexFaces' API is called to index that face to the Master collection. Once that is done, I write the output to 3 separate JSON files that are in the same S3 bucket, called: 'metadata.json', 'indexing.json', 'rekognition.json'.
The 'metadata.json' file only contains the ExternalImageID (that I create myself), the date and time of indexing, the filename that was indexed, and a count that counts how many times that face has been indexed in the past.
The 'indexing.json' file contains the same ExternalImageID, the same date and time of indexing, and the response from the 'searchFacesByImage' API.
The 'rekognition.json' file contains the same ExternalImageID and date and time, as well as the response from the 'indexFaces' API.
The problem is that when I load one image at a time, the 3 JSON files populate as expected, but as soon as I load more than a few images at the same time (I've tested it with 7), all 7 images run through the workflow and, according to the Cloudwatch logs, the response data is written out to each file, yet when I actually view the JSON files, not all the response data is there for all 7 images. Sometimes the data of 5 images is in the JSON, other times it's 4 images. The data doesn't have to be in any specific order, it just has to be there. I've also tested uploading 18 images at once, and only the responses of 10 images were in the JSON.
I believe the problem is that I call the 'getObject' API on the JSON files, append the response data to those files, and then call the 'putObject' API to put them back into the S3 bucket. While the first image is going through this process, the next image wants to do the same, but there is no file to call 'getObject' on because it is busy with the previous image, so it just skips over the image, although the Cloudwatch logs said it had been added to the files.
I have no idea how to work around this. I believe the answer lies in asynchronous JavaScript (which I don't know much about, so I have no idea where to begin).
My apologies for the long post. Here is my code below:
const AWS = require('aws-sdk');
const s3 = new AWS.S3({apiVersion: "2006-03-01"});
const rekognition = new AWS.Rekognition();
//const docClient = new AWS.DynamoDB.DocumentClient();
const uuidv4 = require('uuid/v4');
let bucket, key;
let dataSaveDate = new Date();
console.log('Loading function');
//-----------------------------------Exports Function---------------------------
exports.handler = function(event, context) {
bucket = event.Records[0].s3.bucket.name;
key = event.Records[0].s3.object.key;
console.log(bucket);
console.log(key);
searchingFacesByImage(bucket, key);
};
//---------------------------------------------------------------------------
// Search for a face in an input image
function searchingFacesByImage(bucket, key) {
let params = {
CollectionId: "allFaces",
FaceMatchThreshold: 95,
Image: {
S3Object: {
Bucket: bucket,
Name: key
}
},
MaxFaces: 5
};
const searchingFace = rekognition.searchFacesByImage(params, function(err, searchdata) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
// console.log(JSON.stringify(searchdata, null, '\t'));
// if data.FaceMatches > 0 : There that face in the image exists in the collection
if (searchdata.FaceMatches.length > 0) {
console.log("Face is a match");
} else {
console.log("Face is not a match");
let mapping_id = createRandomId();
console.log(`Created mapping_id: ${mapping_id}`);
console.log("Start indexing face to 'allFaces'");
indexToAllFaces(mapping_id, searchdata, bucket, key);
}
}
});
return searchingFace;
}
//---------------------------------------------------------------------------
// If face is not a match in 'allFaces', index face to 'allFaces' using mapping_id
function indexToAllFaces(mapping_id, searchData, bucket, key) {
let params = {
CollectionId: "allFaces",
DetectionAttributes: ['ALL'],
ExternalImageId: mapping_id,
Image: {
S3Object: {
Bucket: bucket,
Name: key
}
}
};
const indexFace = rekognition.indexFaces(params, function(err, data) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
console.log("INDEXING TO 'allFaces'");
//console.log(JSON.stringify(data, null, '\t'));
logAllData(mapping_id, bucket, key, searchData, data);
}
});
return indexFace;
}
//---------------------------------------------------------------------------
// Counting how many times a face has been indexed and logging ALL data in a single log
function logAllData(mapping_id, bucket, key, searchData, data) {
let params = {
CollectionId: mapping_id,
MaxResults: 20
};
const faceDetails = rekognition.listFaces(params, function(err, facedata) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
//console.log(JSON.stringify(facedata, null, '\t'));
metadata(mapping_id, bucket, key, facedata);
indexing(mapping_id, bucket, searchData);
rekognitionData(mapping_id, bucket, data);
}
});
return faceDetails;
}
//-----------------------------------------------------------------------------
function metadata(mapping_id, bucket, key, faceData) {
let body = [
{
"mapping_id": mapping_id,
"time": dataSaveDate,
"image_name": key,
"indexing_count": faceData.Faces.length - 1
}
];
//console.log(JSON.stringify(body, null, '\t'));
logData("metadata.json", bucket, body);
}
//------------------------------------------------------------------------------
function indexing(mapping_id, bucket, searchData) {
let body = [
{
"mapping_id": mapping_id,
"time": dataSaveDate,
"IndexingData": searchData
}
];
logData("indexing.json", bucket, body);
}
//------------------------------------------------------------------------------
function rekognitionData(mapping_id, bucket, data) {
let body = [
{
"mapping_id": mapping_id,
"time": dataSaveDate,
"rekognition": data
}
];
logData("rekognition.json", bucket, body);
}
//------------------------------------------------------------------------------
// Function to log all data to JSON files
function logData(jsonFileName, bucket, body) {
let params = {
Bucket: bucket,
Key: jsonFileName
};
const readFile = s3.getObject(params, function(err, filedata) {
if (err) {
console.log(err, err.stack); // an error occurred
} else {
console.log(`READING ${jsonFileName} CONTENTS`);
// Read data from 'jsonFileName'
let raw_content = filedata.Body.toString();
let content = JSON.parse(raw_content);
// Add new data to 'jsonFileName'
content.push(...body);
// Put new data back into jsonFileName
s3.putObject(
{
Bucket: bucket,
Key: jsonFileName,
Body: JSON.stringify(content, null, '\t'),
ContentType: "application/json"
},
function(err, res) {
if (err) {
console.log(err);
} else {
console.log(`DATA SAVED TO ${jsonFileName}`);
}
}
);
}
});
return readFile;
}
//----------------------------------SCRIPT ENDS---------------------------------
When a Node.js Lambda handler finishes, asynchronous work that is still pending may be cut off before it completes.
To make sure the Lambda does not end before that work is done, wait for each Promise to complete by using await.
The functions s3.getObject and s3.putObject can be turned into Promises like this:
await s3.getObject(params).promise()
await s3.putObject(params).promise()
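Put together, the question's log function could be rewritten along these lines. This is a sketch, not the author's code: appendToJsonLog is a hypothetical name, and `s3` is assumed to be the AWS.S3 (SDK v2) client from the question, passed in as an argument.

```javascript
// Sketch: read a JSON array from S3, append new records, and write it back,
// awaiting every step so the caller knows when the write has finished.
async function appendToJsonLog(s3, jsonFileName, bucket, body) {
  const file = await s3.getObject({ Bucket: bucket, Key: jsonFileName }).promise();

  // The stored file is expected to contain a JSON array.
  const content = JSON.parse(file.Body.toString());
  content.push(...body);

  await s3
    .putObject({
      Bucket: bucket,
      Key: jsonFileName,
      Body: JSON.stringify(content, null, "\t"),
      ContentType: "application/json",
    })
    .promise();

  // Number of records now stored in the file.
  return content.length;
}
```

For this to help, the handler itself would need to be async and await each call down the chain. Awaiting does not make the read-then-write sequence atomic, though: two invocations running at the same time can both read the same version of the file, and the later putObject then overwrites the earlier one. Writing one small JSON object per image, or keeping the records in a database, would avoid that.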
