Is this logic for a profile image uploader scalable? - node.js

Would the following generic logic for implementing a profile upload feature be scalable in production?
1. Inside a web app, the user selects an image to upload
2. The image gets sent to the server, where it is stored in memory and validated using the Node.js package multer
3. If the image file is valid, a unique name is generated for the file
4. The image is resized to 150 x 150 using the Node.js package sharp
5. The image is streamed to Google Cloud Storage
6. Once the image is saved, a public URL of the image is saved under the user's profile inside the database
7. The image URL is sent back to the client and the image gets displayed
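A minimal sketch of steps 2-7 as an Express-style handler. The bucket name, form field name, and the saveAvatarUrl helper are placeholders, not part of the question:

const express = require('express');
const multer = require('multer');
const sharp = require('sharp');
const path = require('path');
const { v4: uuidv4 } = require('uuid');
const { Storage } = require('@google-cloud/storage');

const app = express();
const upload = multer({
  storage: multer.memoryStorage(),                      // step 2: hold the file in memory
  limits: { fileSize: 5 * 1024 * 1024 },                // reject anything over 5 MB
  fileFilter: (req, file, cb) => cb(null, /^image\/(jpeg|png)$/.test(file.mimetype)),
});
const bucket = new Storage().bucket('my-profile-images'); // placeholder bucket name

app.post('/profile-image', upload.single('image'), async (req, res) => {
  if (!req.file) return res.status(400).json({ error: 'invalid image' });

  // step 3: unique name = uuid + timestamp + last 5 characters of the original name
  const tail = path.basename(req.file.originalname).slice(-5);
  const name = `${uuidv4()}-${Date.now()}-${tail}.jpg`;

  // step 4: resize to 150 x 150 with sharp
  const resized = await sharp(req.file.buffer).resize(150, 150).jpeg().toBuffer();

  // step 5: write to Google Cloud Storage
  await bucket.file(`avatars/${name}`).save(resized, { metadata: { contentType: 'image/jpeg' } });

  // step 6: persist the public URL under the user's profile (placeholder DB helper)
  const url = `https://storage.googleapis.com/${bucket.name}/avatars/${name}`;
  // await saveAvatarUrl(req.userId, url);

  // step 7: return the URL so the client can display the image
  res.json({ url });
});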
Languages used to implement the above
This would be implemented using:
Firebase Cloud Functions running on Node.js as the backend
Google Cloud Storage for holding the images
Firebase database for saving the image URL for the user uploading the image
My current concerns with this are:
Holding an image in memory while it gets validated and processed could clog up the servers during heavy load
How would this way of generating a unique name for the image scale: uuidv4 + current date in milliseconds + last 5 characters of the image's original file name?

Both multer and sharp are efficient enough for your needs:
Assume uploaded 2 MB pictures get downsized to 150x150 resolution.
Let's say you'll have 100,000 users. Every user uploads a picture that gets downsized to a 20 kB image. That's ~1.907 GB, call it 2 GB of storage.
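A quick sanity check on that figure (assuming ~20 kB per resized image):

// back-of-the-envelope storage estimate
const users = 100000;
const bytesPerImage = 20 * 1024;                      // ~20 kB per resized avatar
const totalGB = (users * bytesPerImage) / 1024 ** 3;  // bytes -> GiB
console.log(totalGB.toFixed(3));                      // "1.907"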
Google Cloud Storage:
One instance in Frankfurt
2GB regional storage
= $0.55 a year
Google Cloud Functions:
With 100k users, for simplicity, let's assume uploads are evenly distributed, so we'll have 8,334 users per month uploading their images. And let's be extremely pessimistic and say that one resize function takes 3 s.
8334 resize operations per month
3s per function
Networking throughput 2MB per function
= $16.24 a year
So you're good to go. Happy coding!
This answer will get outdated eventually, so I'm including a screenshot of the Google price calculator.

Related

Torn between a couple practices for file upload to CDN

The platform I'm working on involves a client (ReactJS) and a server (NodeJS, Express), of course.
The major feature of this platform involves users uploading images constantly.
Everything has been set up successfully using multer to receive images as form data on my API server, and now it's time to create an "image management system".
The main problem I'll be tackling is the unpredictable file size of user uploads. The files are images, and they depend on the user's OS and activity, e.g. users taking pictures or taking screenshots.
The first solution is to determine a max file size and to transport the file to the API server using a compression algorithm. When the backend receives it successfully, the image is uploaded to a CDN (Cloudinary) and the link is stored in the database along with other related records.
The second, which I'm strongly leaning towards, is shifting this "upload to CDN" system to the client side: make the client connect to Cloudinary directly, then grab the secure link and insert it into the JSON that is sent to the server.
This eliminates the problem of grappling with file sizes, which is progress, but I'd like to know if it is a good practice.
Restricting the file size is possible when using the Cloudinary Upload Widget for client-side uploads.
You can include the 'maxFileSize' parameter when calling the widget, setting its value to 500000 (the value should be provided in bytes).
https://cloudinary.com/documentation/upload_widget
If the client tries to upload a larger file, they will get back an error stating the max file size was exceeded, and the upload will fail.
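A minimal sketch of calling the widget with that parameter (the cloud name, upload preset, and button id are placeholders):

// browser-side sketch; files larger than 500000 bytes are rejected with an error
const widget = cloudinary.createUploadWidget(
  {
    cloudName: 'demo-cloud',          // placeholder
    uploadPreset: 'unsigned_preset',  // placeholder
    maxFileSize: 500000,              // bytes
  },
  (error, result) => {
    if (!error && result && result.event === 'success') {
      console.log('Uploaded:', result.info.secure_url); // link to store in the database
    }
  }
);
document.getElementById('upload-btn').addEventListener('click', () => widget.open());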
Alternatively, you can choose to limit the dimensions of the image: if they are exceeded, instead of failing the upload with an error, the image is automatically scaled down to the given dimensions while retaining its aspect ratio, and the upload request succeeds.
However, this method doesn't guarantee the uploaded file will be below a certain desired size (e.g. 500 kB), as each image is different: one image scaled down to the given dimensions can result in a file size smaller than your threshold, while another may slightly exceed it.
This can be achieved using the limit cropping method as part of an incoming transformation.
https://cloudinary.com/documentation/image_transformations#limit
https://cloudinary.com/documentation/upload_images#incoming_transformations
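A sketch of such an incoming transformation via the Node SDK (the local filename and dimensions are just examples):

// server-side sketch; 'limit' scales the image down while keeping aspect ratio
const cloudinary = require('cloudinary').v2;

cloudinary.uploader.upload('local-photo.jpg', {
  transformation: [{ width: 1000, height: 1000, crop: 'limit' }],
}).then((result) => console.log(result.secure_url));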

Blur photos for unauthorized users

I want to create a simple feature on my site (React, Node, MongoDB). I have users who can upload their photos, and I want to show their faces blurred to unauthorized visitors. What is the best way of developing this functionality: saving blurred images separately in the DB, calling an API every time to blur images before responding from the backend, or blurring images in the frontend? How can I make it fast and safe? Any help is appreciated, thank you in advance.
Every approach has pros and cons.
1. Upload one photo and, using a flag in the data (such as the user object, or better yet inside an auth token), apply a blur filter to the image on the client. The downside: if someone is clever enough, they can get the real picture, e.g. by intercepting the download.
2. Upload one photo and, using a flag in the backend data models or user session, reduce the quality of the image on download. The downside: pulling images down will be slower, as the image has to be manipulated before it's sent to the front end.
3. Upload two images, one normal and one low quality. The downside: a longer initial upload, and you are now taking up more space in your image bucket, which will cost you more money.
There will be more approaches, but each has a trade-off between speed, security and cost/space. I would personally go with number three if cost is not an issue; if you use good compression and don't get snowballed with users, the cost difference should not be that much.
It depends on your use case. Blurring images in the frontend after calling an API to verify whether the user is authorised is the least secure. Saving two images on upload seems like a good idea, but it's a bit wasteful as you're saving the same image twice. I would go with blurring images on the backend.
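A minimal sketch of the backend-blur option using sharp. The getOriginalImageBuffer helper and the route are placeholders, and req.user is assumed to be set by your auth middleware:

// Express route sketch: authorised users get the original, everyone else a blurred copy
const express = require('express');
const sharp = require('sharp');

const app = express();

app.get('/photos/:id', async (req, res) => {
  const original = await getOriginalImageBuffer(req.params.id); // placeholder: load the stored photo
  if (req.user) {
    return res.type('image/jpeg').send(original);               // authorised: real picture
  }
  const blurred = await sharp(original).blur(25).jpeg().toBuffer(); // blur on the fly
  res.type('image/jpeg').send(blurred);
});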

Python Boto3 - upload images to S3 in one put request

I have a script built out in Python that uploads images to S3, one image (one PUT request) at a time. Is it possible to upload all images to S3 at the same time, using one PUT request, to save $$ on requests?
for image_id in list_of_images:
    # upload each image
    filename = id_prefix + "/" + '{0}.jpg'.format(image_id)
    s3.upload_fileobj(buffer, bucket_name, filename, ExtraArgs={"ContentType": "image/jpeg"})
No.
The Amazon S3 API only allows creation of one object per API call.
Your options are to loop through each file (as you have done) or, if you wish to make it faster, you could use multi-threading to upload multiple files simultaneously to take advantage of more networking bandwidth.
If your desire is simply to reduce request costs, do not panic. It is only $0.005 per 1,000 requests.
Good news - it looks like several companies have come out with boto3-like APIs, with much better storage + download (per GB) pricing.
As of a few days ago, Backblaze came out with S3-compatible storage + API.
We did a few tests on our application, and everything seems to be working as advertised!

What is the best flow to store an image in CDN

I have a website created using Node Express. This website provides functionality where a user can upload an image; it is stored locally in a server folder and the path is saved in the database.
The problem is the images are taking up too much space on the server disk, so I need to use a CDN as storage for those images and to show them to the user. The problem is I don't know the proper end-to-end flow to store these images in a CDN.
By end-to-end flow I mean: the customer uploads the picture, the server saves it, and it can be served again when the user needs to see it.
My thought is: when the user uploads the image, the server first saves it locally along with the image path, a cron job then stores the image in the CDN, and finally the local copy on the server is deleted once the image has been successfully stored in the CDN.
Is that the correct way, or is there another way to do this?
You can do something like this:
Store images on cheap long-term storage such as S3. This serves as the source of truth.
Configure the CDN to use the S3 URL or your server as the origin => you don't need to upload to the CDN at all.
Bonus: create an image resizer service to sit in front of the source and configure the CDN to use the image resizer service as the origin. This way, the CDN will reduce the load on your resizer service.
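A sketch of the first two steps, assuming S3 as the origin and a CDN domain already pointed at the bucket (the bucket name and CDN domain are placeholders):

// store the upload in S3 and hand back a CDN URL that fronts the bucket
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');

const s3 = new S3Client({ region: 'eu-central-1' });
const BUCKET = 'my-image-bucket';            // placeholder
const CDN_BASE = 'https://cdn.example.com';  // placeholder domain with the bucket as origin

async function storeImage(key, buffer) {
  await s3.send(new PutObjectCommand({
    Bucket: BUCKET,
    Key: key,
    Body: buffer,
    ContentType: 'image/jpeg',
  }));
  // save this URL in the database; the CDN pulls from S3 on the first request
  return `${CDN_BASE}/${key}`;
}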
In addition to Tran's answer, you can perform some optimizations.
For example, converting to the WebP image format can reduce the size.
Also, you can look at the image CDNs available, which are optimized for images. The following article can be helpful:
https://imagekit.io/blog/what-is-image-cdn-guide/

NodeJS, how to handle image uploading with MongoDB?

I would like to know what is the best way to handle image uploading and saving the reference to the database. What I'm mostly interested is what order do you do the process in?
Should you upload the images first in the front-end (say Cloudinary), and then call the API with result links to the images and save it to the database?
Or should you upload the images to the server first, and upload them from the back-end and save the reference afterwards?
OR, should you do the image uploading after you save the record in the database and then update it once the images were uploaded?
It really depends on the resources, timeline, and number of images you need to upload daily.
So basically, if you have very few images to upload, you can upload the image to your server and then upload it to whatever cloud storage (S3, Cloudinary, ...) you are using. This is very easy to implement (you can find code snippets all over the internet), and you can securely keep the secret keys/credentials for your cloud platform on the server side.
But in my opinion, the best way of doing this is something like the following; I'll take user registration as an example (see the sketch after this list).
Make a server call to get a temporary credential for uploading files to the cloud (generally, all providers offer this functionality, e.g. STS/signed URLs in AWS).
The user fills out the form and selects the image on the client side. When the user clicks the submit button, make one call to save the user in the database and start the upload with the credentials. If possible, keep a predictable upload path, like /users/:userId or something like that; this highly depends on your use case.
Now, when the upload finishes, make a server call for acknowledgment and store a flag in the database.
Now the advantages of this approach are:
You completely offload your server from handling file operations, which are pretty heavy and I/O blocking, and you distribute that load across all the clients.
If you want to post-process the files after upload, you can easily integrate this with serverless platforms and offload that as well.
You can easily provide a retry mechanism to your users in case the file upload fails, and they won't need to refill the data, just upload the image/file again.
You don't need to expose the storage URL directly to the client for file upload, as you are using temporary credentials.
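A minimal sketch of the first step, assuming S3 presigned URLs (the bucket name and key pattern are placeholders):

// server-side: hand the client a short-lived presigned PUT URL so the upload bypasses the API server
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

const s3 = new S3Client({ region: 'us-east-1' });

async function getUploadUrl(userId) {
  const command = new PutObjectCommand({
    Bucket: 'my-user-uploads',           // placeholder
    Key: `users/${userId}/avatar.jpg`,   // predictable per-user path
    ContentType: 'image/jpeg',
  });
  // URL expires after 5 minutes; the client PUTs the file straight to S3
  return getSignedUrl(s3, command, { expiresIn: 300 });
}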
If the significance of the images in your app is high, then ideally you should not complete the transaction until the image is saved. The approach should be to create an object in your code which you will eventually insert into MongoDB, start the upload of the image to the cloud, and then add the link to this object. Finally, insert this object into MongoDB in one go; do not make repeated calls. If anything fails before that, raise an error and catch the exception.
There can be many answers here.
If you are working with big files, greater than 16 MB, go with GridFS and multer
(converting the images to a different format and saving them to MongoDB).
If your files are actually less than 16 MB, you can try using this converter that changes a JPEG/PNG image into a format that can be saved to MongoDB, and you can see this as an easy alternative to GridFS.
Please check this GitHub repo for more details.
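A minimal sketch of the GridFS route with the official mongodb driver (the database handle, filename, and buffer are assumed to come from your own upload handling, e.g. multer memory storage):

// store an uploaded image buffer in GridFS
const { GridFSBucket } = require('mongodb');
const { Readable } = require('stream');

function saveToGridFS(db, filename, buffer) {
  const bucket = new GridFSBucket(db, { bucketName: 'images' });
  return new Promise((resolve, reject) => {
    Readable.from([buffer]) // single chunk
      .pipe(bucket.openUploadStream(filename, { metadata: { contentType: 'image/jpeg' } }))
      .on('finish', resolve)
      .on('error', reject);
  });
}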
