I'm building a website on express.js and I'm wondering where to store images. Certain 'static' website info like the team pages will be backed by a database. If new team members come onboard, we push new data to CouchDB and a new team page shows up on the site.
A team page will include a profile picture, which will be stored in CouchDB along with other data.
Should I serve the image through the web server, or just send a reference to where the image lives and have the client fetch it from the database directly, since CouchDB is itself an HTTP server?
I am not an expert on CouchDB, but here are my 2 cents. In general, hitting the DB for every image is going to increase the load. If the website is going to be accessed by many people, that will be a lot.
The ideal way is to serve images with a CDN, and have the CDN point to your resource server/web server.
You can store the profile pics (and any other files) as attachments to the docs. The load is the same as for any other web server.
Documentation:
attachment endpoint /db/doc/attachment
document endpoint /db/doc
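For illustration, reading an attachment is just an HTTP GET against that endpoint. A minimal sketch (Node 18+, run as an ES module so top-level await works; the database team, doc alice, and attachment avatar.jpg are made-up names):

    // Hedged sketch: fetch a profile picture straight from CouchDB's attachment endpoint.
    const res = await fetch('http://localhost:5984/team/alice/avatar.jpg');
    if (!res.ok) throw new Error(`CouchDB answered ${res.status}`);
    const image = Buffer.from(await res.arrayBuffer()); // raw image bytes
    console.log(res.headers.get('content-type'), image.length);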
CouchDB manages ETags for attachments as well as for docs and views. Clients that have already cached the pics will get a lightweight 304 response for every identical request. You can try it out with my CouchDB-based blog, lbl.io: open your favorite browser's developer tools and observe the image requests during multiple refreshes.
Hint 1: If you have the choice between an inline attachment upload (Base64-encoded in the doc; 1 request to create a doc with an attachment) or a standalone attachment upload (multipart/related in the original content type; 2 requests to create a doc with an attachment, or 1 request to create an attachment when the doc already exists), then choose the second. It is handled more efficiently by CouchDB.
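For example, the standalone upload from Hint 1 is a single PUT against the attachment endpoint. A sketch under the same assumptions as above; rev and imageBuffer are placeholders:

    // Hedged sketch: add "avatar.jpg" to the existing doc "alice" (2nd approach in Hint 1).
    // rev must be the doc's current revision; imageBuffer holds the raw JPEG bytes.
    const res = await fetch(`http://localhost:5984/team/alice/avatar.jpg?rev=${rev}`, {
      method: 'PUT',
      headers: { 'Content-Type': 'image/jpeg' },
      body: imageBuffer,
    });
    console.log(await res.json()); // e.g. { ok: true, id: 'alice', rev: '2-...' }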
Hint 2: You can configure CouchDB to apply gzip compression based on the content type of attachments. It reduces the load a lot.
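In CouchDB this is configured in local.ini; a sketch of the documented settings (the values shown are illustrative):

    [attachments]
    ; zlib level from 1 (fastest) to 9 (smallest); 0 disables compression
    compression_level = 8
    ; only these content types get compressed
    compressible_types = text/*, application/javascript, application/json, application/xml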
I just dump avatars in /web/images/avatars, store only the filename in CouchDB, and serve the folder with express.static().
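For completeness, that setup is just a couple of lines (the folder path is the one from this answer; the URL prefix is an assumption):

    // Hedged sketch: serve the avatar folder statically; CouchDB stores only the filename.
    const express = require('express');
    const app = express();
    app.use('/images/avatars', express.static('/web/images/avatars'));
    // a doc holding { avatar: 'jane.png' } renders as <img src="/images/avatars/jane.png">
    app.listen(3000);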
You certainly can use a CouchDB attachment.
You can also create an Amazon S3 bucket and save the absolute HTTPS path on your user objects.
I know the difference between the Azure CDN query string modes, and I have read a helpful example of the query string modes, but...
I don't understand the purpose of "Ignore query strings" or how it can be useful on a real dynamic website.
For example, suppose we have a product purchase website with a URL similar to www.myweb.com/products?id=3
If we use "Ignore query strings"... Does this mean that if an user later requests product 4 (www.myweb.com/products?id=4), he will receive the page for product 3?
I think I'm not understanding Azure CDN correctly. I see Azure CDN as a dynamic-content CDN; however, Azure CDN is only used for static content, as this article explains:
Standard content delivery network (CDN) capability includes the ability to cache files closer to end users to speed up delivery of static files.
Is this correct? Any help or example on the subject is welcome.
Yes. If you have selected the "Ignore query strings" query string caching behavior (this is the default), then in your case, after the initial request to www.myweb.com/products?id=3, that POP server will serve the same content for subsequent requests, no matter the query string value, until its cache period expires.
As for the second question: a CDN is all about serving static files. To my understanding, what the article describes is dynamic site acceleration, which is a bunch of techniques to optimize content-serving performance for dynamic websites. Unlike static websites, a dynamic website's assets (static files, e.g. images, JS, CSS, HTML) are loaded dynamically based on user behavior.
Now that it's clearer to me, I will answer my own question:
Azure CDN - Used to cache static content, even on dynamic web pages.
For the example in the question, all products must download the same JavaScript and CSS content; Azure CDN is used for those types of files. Real example using "Ignore query strings":
User A requests www.myweb.com/products?id=3: jquery-versionX.js and mystyles.css are not cached yet, so they are requested from the server and the user receives them.
User B requests www.myweb.com/products?id=4: since we are using "Ignore query strings", jquery-versionX.js and mystyles.css are already cached and are served to the user without requesting them from the server again.
User C requests www.myweb.com/products?id=3: same as for user B, the cached jquery-versionX.js and mystyles.css are served without requesting them from the server again.
Redis or similar - Used to cache dynamic content (database queries, for example).
For the example in the question, all the products have different information, which is obtained by doing a database query. We can store those query results or JSON objects in a Redis cache. Real example:
User A requests www.myweb.com/products?id=3: product 3 is not cached, so it is requested from the server and received by the user.
User B requests www.myweb.com/products?id=4: product 4 is not cached, so it is requested from the server and received by the user.
User C requests www.myweb.com/products?id=3: product 3 is now cached, so the server is not hit and the user receives it from the cache.
Summary:
Both methods can be used simultaneously: Azure CDN for static content, and Redis or similar for dynamic content.
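A minimal sketch of the Redis side in Node (ESM, node-redis client; getProductFromDb is a hypothetical database helper):

    // Hedged sketch: cache product queries in Redis so repeat requests skip the DB.
    import { createClient } from 'redis';

    const redis = createClient();
    await redis.connect();

    async function getProduct(id) {
      const key = `product:${id}`;
      const hit = await redis.get(key);            // user C's request for id=3 is served here
      if (hit) return JSON.parse(hit);
      const product = await getProductFromDb(id);  // hypothetical DB query (users A and B)
      await redis.set(key, JSON.stringify(product), { EX: 3600 }); // expire after 1 hour
      return product;
    }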
I'm building an app in Node.js on Ubuntu 20.
Mainly, the app has to handle a series of cars that each user submits to the server through a form. Through this form the user specifies the name of the car, the model, an image of the car, and other information ...
I manage the submitted images using GridFS and store them in MongoDB with all the other data.
Each time a user loads the site, a 30-row table with the uploaded cars is displayed, so for each site route request the server must handle:
reading the cars from the DB
rendering the HTML and its minified CSS/JS
35-40 requests to serve the images of the cars in the table, plus other images
I'm thinking that these 35-40 requests for reading/serving the images from MongoDB could be handled by a secondary Node.js server instead of the main one that runs the app.
I need to make the main app lighter to allow more users on the site, and to do so my idea was to have 2 Node applications:
The main app, which serves the HTML pages and reads/serves all the main information, like the car names etc.
The images app, which handles all the requests to upload/serve the images.
But my concern is: does this solution make sense, or will it only make the server busier to have two Node applications instead of one?
Node really is designed for this type of architecture, and breaking your services up into microservices is actually a good idea when it makes sense (some will argue this point, but personally I feel the separation of concerns works well when you can scale horizontally for increased load).
Just a few comments (take them with a grain of salt):
I would not 'store' the image in the DB (I take it that you are storing a Base64-encoded object for the image?)
Use an object store like s3 to store your images
Use a CDN for static resources
If you architect your stack in such a way and offload specific requests to other services, you can greatly reduce load AND increase scalability by not tethering the Node server to one file system, etc.
As an example, if you were to have a stack like:
Front-end server that uses say ejs to render things to the client
processing server to do image uploads
Object store to store / serve images (static resources too, js, css etc)
DB server (postgres, mongo etc)
CDN to cache static resources
Redis to cache DB queries / API responses ( if needed )
The front end server should be able to run stateless, meaning it does not have any dependencies to the server (e.g images, css etc)
The processing server: this endpoint will just handle processing images, sending them to the object store, and updating the DB
The object store will house all images
DB server to store data
The CDN will cache all requests to further reduce load on the servers
Redis cache to cache API calls; again, this is as needed and really depends on how you have things configured to pull your data. It may or may not be needed.
The idea here is that you are creating an application that can scale out horizontally and is now a great candidate for clustering, containers, etc. Because each server has no dependencies on a local file system, it can scale as needed to handle more load.
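As a concrete sketch of the processing-server piece (multer for the multipart form, AWS SDK v3 for the object store; the bucket name, region, and route are assumptions):

    // Hedged sketch: accept an image upload and push it to S3, keeping the server stateless.
    const express = require('express');
    const multer = require('multer');
    const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');

    const app = express();
    const upload = multer({ storage: multer.memoryStorage() }); // buffer the file in RAM, not on disk
    const s3 = new S3Client({ region: 'us-east-1' });

    app.post('/upload', upload.single('image'), async (req, res) => {
      const key = `cars/${Date.now()}-${req.file.originalname}`;
      await s3.send(new PutObjectCommand({
        Bucket: 'my-car-images',          // assumed bucket name
        Key: key,
        Body: req.file.buffer,
        ContentType: req.file.mimetype,
      }));
      res.json({ key }); // store this key in the DB; serve the image via the object store/CDN
    });
    app.listen(4000);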
I've been working on the front-end so far; now I'm going to create my first full-stack application. I want to use Node.js, Express, and AWS for this.
At the design stage, I already encountered a few problems. Therefore, I have a few questions and I am asking you for help:
Can I send a message (simple JSON or a database value) from the server to all clients who have already opened my home page, in a simple and cheap way?
I'm not talking about logged-in users, but all who downloaded the main page (GET '/').
Using the admin panel ('www.xxxxxxxxx/admin'), I want to send a message to the server once a day. Then I want to change the HTML to display this message. I was thinking of using EJS for this and fetching the message from the database.
Can I do better? If someone visits my home page (GET '/'), EJS will fetch the message from the database each time, even though its value stays the same for 24 hours! Can I fetch the value once and then use it until it changes? How should I store the message? As JSON on the server? Or maybe in the .env file?
If the user refreshes the page, do I have to pay for calling all AWS functions to build the page each time? Even if nothing has changed in the files?
How to check if the page has new content and then send it to the user, instead of sending the unchanged page files: .html, .js, .css, etc.?
Can I send the user only the changed, dynamically created html file, and not send again unchanged .js and .css files?
Does every user who opens the home page (GET, '/') create a new connection to the server using WebSocket / socket.io?
I will try to answer some of your questions:
Can I send a message (simple JSON or a database value) from the server to all clients who have already opened my home page, in a simple and cheap way? I'm not talking about logged-in users, but all who downloaded the main page (GET '/').
I guess you mean sending push notifications from the server to the user. This can be done with different services, depending on what you are trying to build.
If you are planning to use GraphQL, you already have GraphQL subscriptions out of the box. If you are using AWS, go for AppSync, which is the AWS service for GraphQL.
If you are using REST and a web app (not a mobile app), go for AWS IoT using Lambdas. Here is a good resource using the Serverless Framework (API Gateway + Lambdas + IoT) for unauthenticated users: https://www.serverless.com/blog/serverless-notifications-on-aws
If you are planning to use notifications in a mobile app, go for SNS, the "de facto" service for push notifications in the AWS world.
Using the admin panel ('www.xxxxxxxxx/admin'), I want to send a message to the server once a day. Then I want to change the HTML to display this message. I was thinking of using EJS for this and fetching the message from the database. Can I do better? If someone visits my home page (GET '/'), EJS will fetch the message from the database each time, even though its value stays the same for 24 hours! Can I fetch the value once and then use it until it changes? How should I store the message? As JSON on the server? Or maybe in the .env file?
Yes, this is the way it's expected to work. The HTML is changed dynamically using front-end JavaScript code, which makes calls (using axios, for example) to the backend every time you hit, e.g., the '/' path. You can store this data in front-end variables, or even use state management in the front end (Redux, Vuex, etc.). Remember, the front-end code will always run in your users' browsers, not on your servers!
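If you do render server-side with EJS as the question suggests, one cheap option (not the only one) is to cache the value in the Node process's memory for 24 hours. A sketch, where loadMessageFromDb is a hypothetical query helper and app an existing Express app:

    // Hedged sketch: fetch the daily message once and reuse it for 24 hours.
    let cached = null;
    let fetchedAt = 0;
    const TTL = 24 * 60 * 60 * 1000;

    async function getDailyMessage() {
      if (!cached || Date.now() - fetchedAt > TTL) {
        cached = await loadMessageFromDb(); // hypothetical DB call
        fetchedAt = Date.now();
      }
      return cached;
    }

    app.get('/', async (req, res) => {
      res.render('home', { message: await getDailyMessage() }); // EJS template
    });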
If the user refreshes the page, do I have to pay for calling all AWS functions to build the page each time? Even if nothing has changed in the files?
What you can do is store all your HTML, CSS, and JavaScript in an S3 bucket and serve it from there (this is super cheap, even free up to a certain limit). If you want to use Server-Side Rendering (SSR), then yes, you'll need to serve your users every time they make a GET request, for example. If you use Lambda, the first million requests per month are free. If you have an EC2 instance to serve your content, then a t2.micro is also free. If you need more than that, you'll need to pay.
How to check if the page has new content and then send it to the user, instead of sending the unchanged page files: .html, .js, .css, etc.?
I think you need to understand how JS (or frameworks like React, Vue, or Angular) does this. Basically, you download the JS code on the client, and the JS handles all the functionality to update the backend and frontend accordingly. To connect the frontend with the backend, use axios, for example.
Can I send the user only the changed, dynamically created html file, and not send again unchanged .js and .css files?
See the answer above. Using frameworks like React or Vue will help you a lot.
Does every user who opens the home page (GET, '/') create a new connection to the server using WebSocket / socket.io?
That depends on what you code. By default, the user will make a new GET request every time they access your domain, and that's it. (No persistent connection is established unless you tell your code to create one.)
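In other words, a WebSocket exists only if you wire it up yourself, e.g. with socket.io (a sketch; httpServer is your existing Node HTTP server):

    // Hedged sketch: without code like this, a plain GET '/' opens no persistent connection.
    const { Server } = require('socket.io');
    const io = new Server(httpServer); // attach to the existing HTTP server

    io.on('connection', (socket) => {
      console.log('client connected:', socket.id);
      socket.emit('daily-message', { text: 'hello' }); // push a message to this client
    });
    // client side: <script src="/socket.io/socket.io.js"></script> then: const socket = io();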
Hope this helps!! Happy coding!
I would like to know the best way to handle image uploading and saving the reference to the database. What I'm most interested in is the order in which to do the process.
Should you upload the images first from the front-end (say, to Cloudinary), then call the API with the resulting links and save them to the database?
Or should you send the images to the server first, upload them to the cloud from the back-end, and save the reference afterwards?
OR, should you do the image upload after you save the record in the database, and then update it once the images are uploaded?
It really depends on the resources, timeline, and number of images you need to upload daily.
So basically, if you have very few images to upload, you can upload the image to your server and then upload it to whatever cloud storage you are using (S3, Cloudinary, ...). This is very easy to implement (you can find code snippets all over the internet), and you can securely keep the secret keys/credentials for your cloud platform on the server side.
But in my opinion, the best way of doing this is something like the following. I'm taking user registration as an example:
Make a server call to get temporary credentials to upload files to the cloud (generally, all providers offer this functionality, i.e. STS/signed URLs in AWS; see the sketch after the list of advantages below).
The user fills out the form and selects the image on the client side. When the user clicks the submit button, make one call to save the user in the database and start the upload with those credentials. If possible, keep a predictable upload path, like /users/:userId or something like that; this highly depends on your use case.
Now, when the upload finishes, make a server call for acknowledgment and store a flag in the database.
The advantages of this approach are:
You completely offload your server from handling file operations, which are pretty heavy and I/O-blocking, and you distribute that load across all the clients.
If you want to post-process the files after upload, you can easily integrate this with serverless platforms and offload that work as well.
You can easily provide a retry mechanism to your users in case a file upload fails; they won't need to refill the form data, just upload the image/file again.
You don't need to expose a permanent upload URL to the client, as you are using temporary credentials.
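For AWS, step 1 of the flow above could hand out a presigned S3 PUT URL (AWS SDK v3; bucket name, key layout, and expiry are assumptions):

    // Hedged sketch: the server mints a short-lived upload URL; the client PUTs the file to it.
    const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
    const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

    const s3 = new S3Client({ region: 'us-east-1' });

    async function getUploadUrl(userId) {
      const cmd = new PutObjectCommand({
        Bucket: 'user-uploads',            // assumed bucket
        Key: `users/${userId}/avatar.jpg`, // predictable path, as suggested above
        ContentType: 'image/jpeg',
      });
      return getSignedUrl(s3, cmd, { expiresIn: 300 }); // valid for 5 minutes
    }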
If the images are highly significant in your app, then ideally you should not complete the transaction until the image is saved. The approach should be: create the object in your code that you will eventually insert into MongoDB, start the upload of the image to the cloud, and then add the link to this object. Finally, insert this object into MongoDB in one go. Do not make repeated calls. If anything fails before that, raise an error and catch the exception.
There can be many answers.
If you are working with big files greater than 16 MB, go with GridFS and multer (streaming the images into MongoDB).
If your files are actually less than 16 MB, you can try using a converter that changes a JPEG/PNG image into a format that can be saved directly in MongoDB; you can see this as an easy alternative to GridFS.
Please check this GitHub repo for more details.
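For the GridFS route, a minimal sketch with multer and the official mongodb driver's GridFSBucket (the database, bucket, and route names are assumptions):

    // Hedged sketch: stream an uploaded image into GridFS and return its id.
    const express = require('express');
    const multer = require('multer');
    const { MongoClient, GridFSBucket } = require('mongodb');

    const upload = multer({ storage: multer.memoryStorage() }); // buffer the file in RAM

    async function main() {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      const bucket = new GridFSBucket(client.db('app'), { bucketName: 'images' });
      const app = express();

      app.post('/images', upload.single('image'), (req, res) => {
        const stream = bucket.openUploadStream(req.file.originalname, {
          contentType: req.file.mimetype,
        });
        stream.end(req.file.buffer);
        stream.on('finish', () => res.json({ id: stream.id })); // save this id as the reference
        stream.on('error', (err) => res.status(500).json({ error: err.message }));
      });

      app.listen(3000);
    }
    main();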
I want to upload images from the client to the server. The client must see a list of all images he or she has and see the image itself (a thumbnail or something like that).
I've seen people use two methods (generally speaking):
1- Upload the image and save the binary data to MongoDB
2- Upload the image and move it to a folder, saving the path somewhere (the classic method, and the one I've implemented so far)
What are the pros and cons of each method, and how can I retrieve the data and show it in a template in each case (getting the path and writing it to the src attribute of an img tag, versus sending the binary data)?
Problems found so far: when I request foo.jpg (localhost:3000/uploads/foo.jpg), which I uploaded and the server moved to a known folder, my router (Iron Router) fails to figure out how to handle the request.
1- Upload the image and save the binary data to MongoDB
Either you limit the file size to 16 MB and use only basic MongoDB, or you use GridFS and can store anything (no size limit). There are several pros and cons to this method, but IMHO it is much better than storing files on the file system:
Files don't touch your file system; they are piped into your database
You get back all the benefits of Mongo, and you can scale up without worries
Files are chunked, and you can send only a specific byte range (useful for streaming or resuming downloads)
Files are accessed like any other Mongo document, so you can use the allow/deny functions, pub/sub, etc.
2- Upload the image and move it to a folder, saving the path somewhere (the classic method, and the one I've implemented so far)
In this case, either you store everything in your public folder and make it publicly accessible using the file names + paths, or you use a dedicated asset-delivery system, such as an nginx server. Either way, you will be using something less secure and maintainable than the first option.
That being said, have a look at the file-collection package. It is much simpler than collection-fs and will offer you everything you are looking for out of the box (including a file API, GridFS storage, resumable uploads, and many other things).
Problems found so far: when I request foo.jpg (localhost:3000/uploads/foo.jpg), which I uploaded and the server moved to a known folder, my router (Iron Router) fails to figure out how to handle the request.
Do you know that this path leads to the public/uploads/foo.jpg directory under your root folder? If you put the file there, you should be able to request it.