Reading multiple files and uploading to AWS S3 - node.js

Requirement:
I have multiple files in a folder on my Express server. I pass these file names in an API call; the backend function needs to read all of these files, upload them to AWS S3, and then return an array of public URLs.
const aws = require("aws-sdk");
const fs = require("fs");
const s3 = new aws.S3();

module.exports = {
  upload: async function (req, res, next) {
    console.log("Inside upload controller");
    let publicUrls = [];
    const result = await Promise.all(
      req.body.files.map(async (uploadFile) => {
        fs.promises.readFile(uploadFile).then((file) => {
          let body = fs.createReadStream(uploadFile);
          const s3PutParams = {
            Bucket: process.env.S3_BUCKET_NAME,
            Key: uploadFile.substr(15),
            Body: body,
            ACL: "public-read",
          };
          s3.upload(s3PutParams)
            .promise()
            .then((response) => {
              console.log(response.Location);
              publicUrls.push(response.Location);
            });
        });
      })
    );
    if (result) {
      console.log("Result", result);
      res.json(publicUrls);
    }
  },
};
Observed Output:
Inside upload controller
Result [ undefined, undefined, undefined, undefined ]
https://xxx.s3.amazonaws.com/2_30062022.pdf
https://xxx.s3.amazonaws.com/1_30062022.pdf
https://xxx.s3.amazonaws.com/1_30062022.pdf
https://xxx.s3.amazonaws.com/2_30062022.pdf
I am passing an array of 4 file names, hence the four "undefined" entries when logging "result".
Issue:
The code is not actually waiting for the uploads to finish: the map callback never returns the inner promise, so Promise.all resolves to an array of undefined values.
The JSON response is therefore sent right away, while publicUrls is still an empty array.
How can this be resolved?

Solved by referring to NodeJS write file to AWS S3 - Promise.All with async/await not waiting
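For reference, here is a minimal sketch of one way to fix the controller, assuming the same req.body.files array and S3_BUCKET_NAME environment variable as above: each map callback returns its upload promise, so Promise.all resolves to the array of public URLs directly.
const aws = require("aws-sdk");
const fs = require("fs");
const s3 = new aws.S3();

module.exports = {
  upload: async function (req, res, next) {
    try {
      const publicUrls = await Promise.all(
        req.body.files.map(async (uploadFile) => {
          const s3PutParams = {
            Bucket: process.env.S3_BUCKET_NAME,
            Key: uploadFile.substr(15),
            Body: fs.createReadStream(uploadFile),
            ACL: "public-read",
          };
          // Awaiting here means each mapped promise resolves to the public URL
          const response = await s3.upload(s3PutParams).promise();
          return response.Location;
        })
      );
      res.json(publicUrls);
    } catch (err) {
      next(err);
    }
  },
};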

Related

Delivering image from S3 to React client via Context API and Express server

I'm trying to download a photo from an AWS S3 bucket via an express server to serve to a react app but I'm not having much luck. Here are my (unsuccessful) attempts so far.
The workflow is as follows:
Client requests the photo after retrieving its key from the database via the Context API
Request is sent to an Express server route (important so as to hide the true location from the client)
Express server route requests the blob file from the AWS S3 bucket
Express server parses the image to base64 and serves it to the client (see the sketch after this list)
Client updates its state with the new image
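Here is a minimal sketch of steps 3 and 4, under my assumption that the route reuses the s3 client and bucketName from the library shown further down and encodes the S3 Body buffer directly as base64 (rather than round-tripping through UTF-8):
// Hypothetical route: fetch the object and return it as a base64 data URL
router.get("/mediaitem/:key", async (req, res, next) => {
  try {
    const data = await s3.getObject({ Bucket: bucketName, Key: req.params.key }).promise();
    // data.Body is a Buffer; encoding it as base64 avoids the lossy utf-8 conversion
    const base64 = data.Body.toString("base64");
    res.status(200).json({ newTest: `data:${data.ContentType};base64,${base64}` });
  } catch (err) {
    next(err);
  }
});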
React Client
const [profilePic, setProfilePic] = useState('');

useEffect(() => {
  await actions.getMediaSource(tempPhoto.key)
    .then(resp => {
      console.log('server resp: ', resp.data.data.newTest) // returns ����\u0000�\u0000\b\u0006\
      const url = window.URL || window.webkitURL;
      const blobUrl = url.createObjectURL(resp.data.data.newTest);
      console.log("blob ", blobUrl);
      setProfilePic({ ...profilePic, image : resp.data.data.newTest });
    })
    .catch(err => errors.push(err));
}
Context API - just axios wrapped into its own library
getMediaContents = async ( key ) => {
  return await this.API.call(`http://localhost:5000/${MEDIA}/mediaitem/${key}`, "GET", null, true, this.state.accessToken, null);
}
Express server route
router.get("/mediaitem/:key", async (req, res, next) => {
try{
const { key } = req.params;
// Attempt 1 was to try with s3.getObject(downloadParams).createReadStream();
const readStream = getFileStream(key);
readStream.pipe(res);
// Attempt 2 - attempt to convert response to base 64 encoding
var data = await getFileStream(key);
var test = data.Body.toString("utf-8");
var container = '';
if ( data.Body ) {
container = data.Body.toString("utf-8");
} else {
container = undefined;
}
var buffer = (new Buffer.from(container));
var test = buffer.toString("base64");
require('fs').writeFileSync('../uploads', test); // it never wrote to this directory
console.log('conversion: ', test); // prints: 77+977+977+977+9AO+/vQAIBgYH - this doesn't look like base64 to me.
delete buffer;
res.status(201).json({ newTest: test });
} catch (err){
next(ApiError.internal(`Unexpected error > mediaData/:id GET -> Error: ${err.message}`));
return;
}
});
AWS S3 Library - I made my own library for using the s3 bucket as I'll need to use more functionality later.
const getFileStream = async (fileKey) => {
  const downloadParams = {
    Key: fileKey,
    Bucket: bucketName
  }
  // This was attempt 1's return, without async in the parameter
  return s3.getObject(downloadParams).createReadStream();
  // Attempt 2's intention was just to wait for the promise to be fulfilled.
  return await s3.getObject(downloadParams).promise();
}
exports.getFileStream = getFileStream;
If you've gotten this far you may have realised that I've tried a couple of things from different sources and documentation but I'm not getting any further. I would really appreciate some pointers and advice on what I'm doing wrong and what I could improve on.
If any further information is needed then just let me know.
Thanks in advance for your time!
Maybe this will be useful for you; this is how I get an image from S3 and process it on the server.
Create a temporary directory
createTmpDir(): Promise<string> {
  return mkdtemp(path.join(os.tmpdir(), 'tmp-'));
}
Gets the file
readStream(path: string) {
  return this.s3
    .getObject({
      Bucket: this.awsConfig.bucketName,
      Key: path,
    })
    .createReadStream();
}
How I process the file
async MainMethod(fileName) {
  const dir = await this.createTmpDir();
  const serverPath = path.join(
    dir,
    fileName
  );
  await pipeline(
    this.readStream(fileName),
    fs.createWriteStream(serverPath + '.jpg')
  );
  const createFile = await sharp(serverPath + '.jpg')
    .jpeg()
    .resize({
      width: 640,
      fit: sharp.fit.inside,
    })
    .toFile(serverPath + '.jpeg');
  const imageBuffer = fs.readFileSync(serverPath + '.jpeg');
  //my manipulations
  fs.rmSync(dir, { recursive: true, force: true }); //delete temporary folder
}
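As a usage sketch (my assumption, not part of the answer above): if MainMethod were changed to return imageBuffer, the question's Express route could serve the processed image directly.
router.get("/mediaitem/:key", async (req, res, next) => {
  try {
    // mediaService is a hypothetical instance of the class shown above
    const imageBuffer = await mediaService.MainMethod(req.params.key);
    res.set("Content-Type", "image/jpeg");
    res.status(200).send(imageBuffer); // or imageBuffer.toString("base64") for a JSON payload
  } catch (err) {
    next(err);
  }
});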

How to pipe a s3 getSignedUrl

I'm trying to pipe a signed URL of an image I have stored in a bucket in S3.
When using the regular getObject method I can do it like this:
app.get("/images/:key", (req, res) => {
const key = req.params.key;
const downloadParams = {
Key: key,
Bucket: bucketName,
};
const readStream = s3.getObject(downloadParams).createReadStream();
readStream.pipe(res);
});
But when I try with getSignedUrlPromise, I can't use the createReadStream method because it says it's not a function.
const readStreamSigned = await s3
  .getSignedUrlPromise("getObject", downloadParams).createReadStream // throws createReadStream is not a function
readStreamSigned.pipe(res)
How can I achieve that with getSignedUrl or getSignedUrlPromise?
Found a solution! Inspired by this answer https://stackoverflow.com/a/65976684/4179240, at a high level what we want to do is this:
Get the key of the item to search.
Pass it with the params to getSignedUrlPromises.
Get the generated url from the promise.
Pipe the result from the callback of the get().
I ended up doing it like this
app.get("/images/:key", async (req, res) => {
const key = req.params.key;
const downloadParams = {
Key: key,
Bucket: bucketName,
};
const url = await s3.getSignedUrlPromise("getObject", downloadParams);
https.get(readStream, (stream) => {
stream.pipe(res);
});
});
If you find a better way let me know! 😃
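A slightly more defensive variant of the same idea (my sketch; it forwards the upstream status code and content type, and assumes the same s3 client, bucketName, and an https import):
const https = require("https");

app.get("/images/:key", async (req, res) => {
  const downloadParams = { Key: req.params.key, Bucket: bucketName };
  const url = await s3.getSignedUrlPromise("getObject", downloadParams);
  https.get(url, (upstream) => {
    // Mirror S3's status and content type before piping the body through
    res.status(upstream.statusCode);
    res.set("Content-Type", upstream.headers["content-type"]);
    upstream.pipe(res);
  }).on("error", (err) => res.status(500).send(err.message));
});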

Using a Cloud Function to download a JSON file from a URL and upload it to a Cloud Storage bucket: status 200, but the uploaded JSON file is only 20 bytes and empty

I'm trying to use a Cloud Function to download a JSON file from here: http://jsonplaceholder.typicode.com/posts? and then upload it to a Cloud Storage bucket.
The function execution log looks fine and the status returns 200. However, the JSON file uploaded to the bucket is only 20 bytes and empty (while the original file is ~27 KB).
Please let me know if I missed something. Here are the code and logs:
index.js
const {Storage} = require('@google-cloud/storage');

exports.writeToBucket = (req, res) => {
  const http = require('http');
  const fs = require('fs');

  const file = fs.createWriteStream("/tmp/post.json");
  const request = http.get("http://jsonplaceholder.typicode.com/posts?", function(response) {
    response.pipe(file);
  });
  console.log('file downloaded');

  // Imports the Google Cloud client library
  const {Storage} = require('@google-cloud/storage');

  // Creates a client
  const storage = new Storage();

  const bucketName = 'tft-test-48c87.appspot.com';
  const filename = '/tmp/post.json';

  // Uploads a local file to the bucket
  storage.bucket(bucketName).upload(filename, {
    gzip: true,
    metadata: {
      cacheControl: 'no-cache',
    },
  });

  res.status(200).send(`${filename} uploaded to ${bucketName}.`);
};
package.json
{
  "name": "sample-http",
  "version": "0.0.1",
  "dependencies": {
    "@google-cloud/storage": "^3.0.3"
  }
}
As pointed out by @DazWilkin, there are issues with the asynchronous code. You must wait for the write stream's 'finish' event to fire before proceeding, and the upload() method returns a promise too. Try refactoring your function in async/await syntax as shown below:
exports.writeToBucket = async (req, res) => {
  const http = require('http');
  const fs = require('fs');

  // Imports the Google Cloud client library
  const {Storage} = require('@google-cloud/storage');

  // Creates a client
  const storage = new Storage();

  const bucketName = 'tft-test-48c87.appspot.com';
  const filename = '/tmp/post.json';

  await downloadJson();

  // Uploads a local file to the bucket
  await storage.bucket(bucketName).upload(filename, {
    gzip: true,
    metadata: {
      cacheControl: 'no-cache',
    },
  });

  res.status(200).send(`${filename} uploaded to ${bucketName}.`);
}

const downloadJson = async () => {
  const Axios = require('axios')
  const fs = require("fs")

  const writer = fs.createWriteStream("/tmp/post.json")

  const response = await Axios({
    url: "http://jsonplaceholder.typicode.com/posts",
    method: 'GET',
    responseType: 'stream'
  })

  response.data.pipe(writer)

  return new Promise((resolve, reject) => {
    writer.on('finish', resolve)
    writer.on('error', reject)
  })
}
This example uses Axios but you can do the same with http.
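For instance, a downloadJson variant that uses only Node's built-in http module might look like this (a sketch under the same /tmp/post.json assumption):
const downloadJson = async () => {
  const http = require("http");
  const fs = require("fs");
  const writer = fs.createWriteStream("/tmp/post.json");

  return new Promise((resolve, reject) => {
    http.get("http://jsonplaceholder.typicode.com/posts", (response) => {
      response.pipe(writer);
      // Resolve only once the file has been fully flushed to disk
      writer.on("finish", resolve);
      writer.on("error", reject);
    }).on("error", reject);
  });
};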
Do note that you can directly upload the fetched JSON as a file like this:
exports.writeToBucket = async (req, res) => {
  const Axios = require("axios");
  const { Storage } = require("@google-cloud/storage");

  const storage = new Storage();
  const bucketName = "tft-test-48c87.appspot.com";
  const filename = "/tmp/post.json";

  const { data } = await Axios.get("http://jsonplaceholder.typicode.com/posts");
  const file = storage.bucket(bucketName).file("file.json");
  const contents = JSON.stringify(data);
  await file.save(contents);

  res.status(200).send(`${filename} uploaded to ${bucketName}.`);
};
You can read more about the save() method in the documentation.
I don't write much NodeJS, but I think your issue is with async code.
You create the stream and then issue the http.get, but you don't block on the callback (piping the file) completing before you start the GCS upload.
You may want to attach an .on("finish", () => {...}) handler to the write stream and, in that callback, upload the file to GCS.
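A sketch of that suggestion applied to the original function (my interpretation; it keeps the same bucket and /tmp/post.json path as the question):
exports.writeToBucket = (req, res) => {
  const http = require("http");
  const fs = require("fs");
  const { Storage } = require("@google-cloud/storage");

  const storage = new Storage();
  const bucketName = "tft-test-48c87.appspot.com";
  const filename = "/tmp/post.json";

  const file = fs.createWriteStream(filename);
  http.get("http://jsonplaceholder.typicode.com/posts?", (response) => {
    response.pipe(file);
    // Only upload once the local file has been completely written
    file.on("finish", async () => {
      await storage.bucket(bucketName).upload(filename, { gzip: true });
      res.status(200).send(`${filename} uploaded to ${bucketName}.`);
    });
  }).on("error", (err) => res.status(500).send(err.message));
};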
NOTE IIRC GCS has a method that will let you write a stream directly from memory rather than going through a file.
NOTE if you pull the storage object up into the global namespace, it will only be created when the function instance is created, not on every invocation.
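Both notes combined might look something like this (my sketch; the Storage client lives at module scope and the HTTP response is piped straight into the bucket via file.createWriteStream(), skipping /tmp entirely):
const http = require("http");
const { Storage } = require("@google-cloud/storage");

// Created once per function instance, not on every invocation
const storage = new Storage();
const bucketName = "tft-test-48c87.appspot.com";

exports.writeToBucket = (req, res) => {
  const gcsFile = storage.bucket(bucketName).file("post.json");
  http.get("http://jsonplaceholder.typicode.com/posts?", (response) => {
    response
      .pipe(gcsFile.createWriteStream({ metadata: { contentType: "application/json" } }))
      .on("finish", () => res.status(200).send(`post.json uploaded to ${bucketName}.`))
      .on("error", (err) => res.status(500).send(err.message));
  }).on("error", (err) => res.status(500).send(err.message));
};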
You don't need a write stream to get the URL data: fetch the URL, await the response to resolve, and call the appropriate JSON method on the response (e.g. response.json() with Fetch).
Personally, I prefer to use Fetch or Axios over http as they are cleaner to work with. But with Node's http you can do the following:
const https = require("https");

https.get(url, (res) => {
  let body = "";

  res.on("data", (chunk) => {
    body += chunk;
  });

  res.on("end", () => {
    try {
      let json = JSON.parse(body);
      // do something with JSON
    } catch (error) {
      console.error(error.message);
    };
  });
}).on("error", (error) => {
  console.error(error.message);
});
Once you have that, you can pass it directly to a storage method as a data blob or byte array.
const byteArray = Buffer.from(JSON.stringify(json), "utf-8");

Returning result of an async operation with a Node.js web server

I'm using Express to build a web API. In the following example, SVG data is converted to PNG and uploaded to S3.
const svg2png = require("svg2png");
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

app.post('/svg_to_png', function (req, res) {
  let params = req.body

  // STEP 1: Convert SVG to PNG:
  var outputBuffer = svg2png.sync(params.svg_data, {});

  // STEP 2: Upload to S3:
  let s3_params = {
    Bucket: params.bucket,
    Key: params.key,
    Body: outputBuffer,
    ContentType: 'image/png',
    ContentDisposition: 'inline',
    ACL: 'public-read'
  }

  result = s3.putObject(s3_params, function(err, data){
    if (err){
      return err;
    }
    return 'success';
  });

  // Return Image URL:
  let image_url = 'https://s3.amazonaws.com/' + params.bucket + '/' + params.key
  res.send(image_url)
})
I want the API to respond with the URL of the converted image, which the requesting client can then immediately download. The problem is, the S3 upload operation is async, and so when the response is delivered, the image does not yet exist at the URL location, forcing the client to poll for its existence.
Is there a way to get the web server to respond only once the S3 upload has completed?
What about something like this:
const putObjPromise = s3.putObject(params).promise();

putObjPromise
  .then(data => {
    // Return the URL here.
  })
  .catch(err => console.log(err))
AWS has this doc for Promises: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-promises.html
Hope this helps.
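Spelled out inside the route, that pattern might look like this (a sketch; it reuses params.bucket and params.key from the question and only sends the URL once the promise has resolved):
app.post('/svg_to_png', function (req, res) {
  const params = req.body;
  const outputBuffer = svg2png.sync(params.svg_data, {});
  const s3_params = {
    Bucket: params.bucket,
    Key: params.key,
    Body: outputBuffer,
    ContentType: 'image/png',
    ContentDisposition: 'inline',
    ACL: 'public-read',
  };

  s3.putObject(s3_params).promise()
    .then(() => {
      // The object exists now, so the URL is safe to hand back
      const image_url = 'https://s3.amazonaws.com/' + params.bucket + '/' + params.key;
      res.send(image_url);
    })
    .catch((err) => res.status(500).send(err.message));
});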
As @Brandon mentioned, you can return the response once the S3 callback has completed. You can also use s3.putObject(params).promise(). I prefer this since it improves readability.
app.post('/svg_to_png', async function (req, res) {
  let params = req.body
  ...

  // STEP 2: Upload to S3:
  let s3_params = {
    ...
  }

  try {
    const result = await s3.putObject(s3_params).promise();
    // Return Image URL:
    // image_url = "https://s3.amazonaws.com/" + params.bucket + "/" + params.key
    // res.body(....).end()
  } catch(err) {
    // return error response
  }
})

Javascript AWS SDK S3 upload method with Body stream generating empty file

I'm trying to use the upload method from S3 with a ReadableStream from the fs module.
The documentation says that a ReadableStream can be used at Bodyparam:
Body — (Buffer, Typed Array, Blob, String, ReadableStream) Object data.
Also the upload method description is:
Uploads an arbitrarily sized buffer, blob, or stream, using intelligent concurrent handling of parts if the payload is large enough.
Also, here: Upload pdf generated to AWS S3 using nodejs aws sdk, @shivendra says he can use a ReadableStream and it works.
This is my code:
const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')
const s3 = new S3()

const send = async () => {
  const rs = fs.createReadStream('/home/osman/Downloads/input.txt')

  rs.on('open', () => {
    console.log('OPEN')
  })

  rs.on('end', () => {
    console.log('END')
  })

  rs.on('close', () => {
    console.log('CLOSE')
  })

  rs.on('data', (chunk) => {
    console.log('DATA: ', chunk)
  })

  console.log('START UPLOAD')

  const response = await s3.upload({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: rs,
  }).promise()

  console.log('response:')
  console.log(response)
}

send().catch(err => { console.log(err) })
It's getting this output:
START UPLOAD
OPEN
DATA: <Buffer 73 6f 6d 65 74 68 69 6e 67>
END
CLOSE
response:
{ ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
  Location: 'https://test-bucket.s3.amazonaws.com/output.txt',
  key: 'output.txt',
  Key: 'output.txt',
  Bucket: 'test-bucket' }
The problem is that my file generated at S3 (output.txt) has 0 Bytes.
Does anyone know what I am doing wrong?
If I pass a Buffer as the Body, it works.
Body: Buffer.alloc(8 * 1024 * 1024, 'something'),
But that's not what I want to do. I'd like to use a stream to generate a file and pipe that stream to S3 as I generate it.
It's an API interface issue with NodeJS ReadableStreams.
Just commenting out the code that listens to the 'data' event solves the problem.
const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')
const s3 = new S3()

const send = async () => {
  const rs = fs.createReadStream('/home/osman/Downloads/input.txt')

  rs.on('open', () => {
    console.log('OPEN')
  })

  rs.on('end', () => {
    console.log('END')
  })

  rs.on('close', () => {
    console.log('CLOSE')
  })

  // rs.on('data', (chunk) => {
  //   console.log('DATA: ', chunk)
  // })

  console.log('START UPLOAD')

  const response = await s3.upload({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: rs,
  }).promise()

  console.log('response:')
  console.log(response)
}

send().catch(err => { console.log(err) })
Though it's a strange API, when we listen to the 'data' event, the ReadableStream switches to flowing mode (listening to an event changing the publisher's/EventEmitter's state? Yes, very error prone...). For some reason S3 needs a paused ReadableStream. If we attach rs.on('data', ...) after await s3.upload(...), it works. If we call rs.pause() after rs.on('data', ...) and before await s3.upload(...), it also works.
Now, why does that happen? I don't know yet...
But the problem was solved, even though it isn't completely explained.
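For reference, the rs.pause() variant described above would look roughly like this (a sketch; the logging listener stays, but the stream is put back into paused mode before the upload starts):
const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')
const s3 = new S3()

const send = async () => {
  const rs = fs.createReadStream('/home/osman/Downloads/input.txt')

  rs.on('data', (chunk) => {
    console.log('DATA: ', chunk)
  })
  // Switch the stream back to paused mode so s3.upload can consume it itself
  rs.pause()

  const response = await s3.upload({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: rs,
  }).promise()

  console.log('response:')
  console.log(response)
}

send().catch(err => { console.log(err) })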
Check whether the file /home/osman/Downloads/input.txt actually exists and is accessible by the node.js process.
Consider using the putObject method.
Example:
const fs = require('fs');
const S3 = require('aws-sdk/clients/s3');
const s3 = new S3();

s3.putObject({
  Bucket: 'test-bucket',
  Key: 'output.txt',
  Body: fs.createReadStream('/home/osman/Downloads/input.txt'),
}, (err, response) => {
  if (err) {
    throw err;
  }
  console.log('response:')
  console.log(response)
});
Not sure how this will work with async/await; better to make the upload to AWS S3 work first, then change the flow.
UPDATE:
Try implementing the upload directly via ManagedUpload:
const fs = require('fs');
const S3 = require('aws-sdk/clients/s3');
const s3 = new S3();

const upload = new S3.ManagedUpload({
  service: s3,
  params: {
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: fs.createReadStream('/home/osman/Downloads/input.txt')
  }
});

upload.send((err, response) => {
  if (err) {
    throw err;
  }
  console.log('response:')
  console.log(response)
});
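Regarding the async/await concern above: ManagedUpload also exposes a promise() method, so the same upload can simply be awaited (a sketch reusing the same imports and bucket/key as the block above):
const send = async () => {
  const upload = new S3.ManagedUpload({
    service: s3,
    params: {
      Bucket: 'test-bucket',
      Key: 'output.txt',
      Body: fs.createReadStream('/home/osman/Downloads/input.txt')
    }
  });
  // promise() starts the upload and resolves with the same response object
  const response = await upload.promise();
  console.log('response:');
  console.log(response);
};

send().catch((err) => { console.log(err); });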
