Saving images to a database - node.js

I am building a personal project in Node, and I want my users to be able to upload 'cover' photos. The approach I am using right now is to save these images to my filesystem and just add the path to that image in a MongoDB database. So if a user adds an image, I add that image to my images folder with a name like userID.jpg, and I save "/public/images/userID.jpg" as a string in my database. I have a feeling that this approach might not be the most efficient. Should I be saving the image directly to the database? What are the advantages and disadvantages?

Storing your images as files is actually pretty efficient (unless we're talking about tens of thousands of images, but even then). Also, on the serving side, the images would probably be handled by something like express.static() (assuming that you're using Express), which is also quite lightweight.
However, if you eventually want to be a bit more scalable, you can take a look at GridFS, which implements file-system-like storage on top of MongoDB. I use gridfs-stream for something similar (uploads/downloads) and it works fine.
If the images are small enough (under about 16 MB, which is the size limit for BSON documents), you might not even need GridFS and can just store the images using the Binary type.
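For reference, here is a minimal sketch of the files-plus-path setup described above (the path in MongoDB, the folder served with express.static()), assuming multer as the upload middleware and the official mongodb driver; the route, field, and collection names are made up for illustration:

    const express = require('express');
    const multer = require('multer'); // common Express upload middleware (my choice, not from the question)
    const { MongoClient } = require('mongodb');

    const app = express();
    const client = new MongoClient('mongodb://localhost:27017');
    const users = client.db('app').collection('users');

    // Write each upload to public/images/<userId>.jpg, as in the question.
    // The public/images folder must already exist; diskStorage won't create it.
    const storage = multer.diskStorage({
      destination: (req, file, cb) => cb(null, 'public/images'),
      filename: (req, file, cb) => cb(null, `${req.params.userId}.jpg`),
    });
    const upload = multer({ storage });

    // Serve the images folder statically, as suggested above.
    app.use('/public', express.static('public'));

    app.post('/users/:userId/cover', upload.single('cover'), async (req, res) => {
      // Store only the path string in the user document.
      await users.updateOne(
        { userId: req.params.userId },
        { $set: { coverPath: `/public/images/${req.params.userId}.jpg` } }
      );
      res.sendStatus(204);
    });

    client.connect().then(() => app.listen(3000));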

Related

How to send reduced/optimised images from node server to Angular client (but save original quality to database)

I'm working on a project which requires an image database. Many of the images are rather larger than they need to be for 90% of the application's use cases; however, there are cases where I need the original-quality image. Therefore, I cannot reduce the images before uploading/saving them to the database (MongoDB).
My plan was to reduce the size of the image after retrieving the original from MongoDB but before actually sending it from the Node server to the Angular client.
I cannot find an example of this using either Sharp or Jimp (which seem to be the leading contenders), and I'm struggling to make either work:
When I use Jimp, I get errors about the constructor.
When I use Sharp, it throws errors because it doesn't like the input: it seems to want the original JPEG or PNG, which I obviously don't have, since the image returned from the database is an object with a binary buffer field.
Could anyone point me in the right direction about how to achieve this?
In the end, after discussing multiple strategies with my superior, I saved each image as two base64 strings: one high-definition and one reduced. I was fortunate in that the only time I needed a high-definition image I was fetching a single image by id, so I could query the database to return only the string the front end required.
If anyone queries this or something similar in future, I used Sharp on the backend to do the image resizing.
The errors in Jimp and Sharp initially came from attempting to work with the raw binary. In the end it was simpler to convert the image to a string from the outset.
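For anyone searching later, here is a rough sketch of that resize step with Sharp, assuming the stored document looks like { _id, image: '<base64 string>' } (the field name is my assumption). The key point is that Sharp happily accepts a Buffer; the errors usually come from handing it the wrapping BSON object instead of the raw bytes:

    const sharp = require('sharp');

    // doc is assumed to look like { _id, image: '<base64 string>' }.
    async function reducedVersion(doc) {
      // Rebuild the raw bytes from the base64 string stored in MongoDB...
      const original = Buffer.from(doc.image, 'base64');

      // ...then resize. Sharp takes a Buffer of JPEG/PNG bytes directly.
      const reduced = await sharp(original)
        .resize({ width: 640, withoutEnlargement: true })
        .jpeg({ quality: 80 })
        .toBuffer();

      // Return it in the same string form the front end expects.
      return reduced.toString('base64');
    }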

Are there cons to using GridFS as a default with MongoDB?

I'm creating a RESTful API with Node, Express, and MongoDB, and the book I'm using as a reference recommends using GridFS (namely gridfs-stream) for cases where one needs to handle files larger than the MongoDB cut-off (16 MB).
I'm not sure if my app will ever need to handle files that size, but I'm wondering if there are cons to using it anyways in case I may need that feature later.
Are there any cons (i.e. significant unnecessary performance penalties, stability issues) that I should be aware of to help make this decision?
I'm also open to suggestions for alternate file management solutions that you may have.
Thanks!
Don't use GridFS for small binary data.
GridFS requires two queries: one to fetch a file's metadata and one to fetch its contents. Therefore, if you use GridFS to store small files, you are doubling the number of queries that your application has to do. GridFS is basically a way of breaking up large binary objects for storage in the database.
GridFS is for storing big data: anything larger than will fit in a single document. As a rule of thumb, anything that is too big to load all at once on the client is probably not something you want to load all at once on the server. Therefore, anything you're going to stream to a client is a good candidate for GridFS. Things that will be loaded all at once on the client, such as images, sounds, or even small video clips, should generally just be embedded in your main document.
Furthermore, if your files are all smaller than the 16 MB BSON document size limit, consider storing each file manually within a single document instead of using GridFS. You may use the BinData data type to store the binary data. See your driver's documentation for details on using BinData.
See https://docs.mongodb.com/manual/core/gridfs/ for details.
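As a rough sketch of both options with the official Node driver, which now ships a GridFSBucket class of its own (the database, collection, and file names below are made up):

    const fs = require('fs');
    const { MongoClient, GridFSBucket, Binary } = require('mongodb');

    async function main() {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      const db = client.db('app');

      // Small file (well under 16 MB): embed the bytes in a single document
      // using the BSON binary type. Reading it back is one query.
      const bytes = fs.readFileSync('avatar.jpg');
      await db.collection('files').insertOne({
        name: 'avatar.jpg',
        contentType: 'image/jpeg',
        data: new Binary(bytes),
      });

      // Large file: stream it through GridFS, which splits it across a
      // metadata collection and a chunks collection -- the two queries
      // mentioned above.
      const bucket = new GridFSBucket(db, { bucketName: 'uploads' });
      await new Promise((resolve, reject) => {
        fs.createReadStream('lecture.mp4')
          .pipe(bucket.openUploadStream('lecture.mp4'))
          .on('finish', resolve)
          .on('error', reject);
      });

      await client.close();
    }

    main();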

Difference between storing images in RMS or in a package

I am new to J2ME development, so I am having some problems with storing images in RMS.
What is the difference between storing an image in a package and storing it in RMS? Is it the same thing, or is there a difference? Moreover, if I store an image in RMS, do I ultimately have to keep it in the package too, so that it consumes space in two different places?
Please help me out with this issue and advise me on the best approach to use.
RMS is used to store data that needs to be loaded again later, e.g. values entered by the user, or high scores achieved in a game.
In my opinion it would only make sense to store images in RMS if they have been edited by the user. But even then, I'd probably go with saving them on the file system instead.
A MIDlet cannot ship with data pre-defined in RMS. In order to put data into RMS, you need to get that data from somewhere first, e.g. from the jar file's resource folder, or downloaded from the web.

How should I load the contents of a .txt file to serve on a website?

I am trying to build excerpts for each document returned as a search results on my website. I am using the Sphinx search engine and the Apache web server on Linux CentOS. The function within the Sphinx API that I'd like to use is called BuildExcerpts. This function requires you to pass an array of strings where each string contains the documents contents.
I'm wondering what the best practice is for retrieving the document contents in real time as I serve the results on the web. Currently, these documents are in text files on my system, spread across multiple drives. There are roughly 100 million of them, and they take up a few terabytes of space.
It's easy for me to call something like file_get_contents(), but that feels like the wrong way to do this. My databases are already gigantic (100 GB+), and I don't particularly want to throw the document contents in there along with the document attributes that already exist. Perhaps that is the best way to do this, however.
Suggestions?
Well, the source needs to be fetched from somewhere. If you don't want to duplicate it in your database, then you will need to fetch it from the filesystem (using file_get_contents() or similar).
That said, the BuildExcerpts function does give you one extra option, "load_files": pass filenames instead of contents, and Sphinx will read the data from the files for you.
What problem are you experiencing with reading from files? Is it too slow? If so, maybe put some caching in front, using memcached perhaps.

Storing lots of attachments in single CouchDB document

tl;dr: Should I store directories in CouchDB as a list of attachments, or as a single tarball?
I've been using CouchDB to store project documents. I just create documents via Futon and upload them directly from there. I've also written a script to bulk-upload directories. I am using it like a basic content repository. I replicate it, so other people on my team have a copy of the repository.
I noticed that saving directories as a series of files seems to have a lot of storage overhead, so instead I upload a .tar.gz file containing the directory. This significantly reduces the size of the document, but now any change to the directory requires replicating the entire tarball.
I am looking for thoughts or perspective on the matter.
It really depends on what you want to achieve. I will try to lay out some options for you to consider.
Storing one tar.gz will save you space, but it makes the contents harder to work with. If you are simply archiving, it may work for you.
Storing all the attachments on one document works well for couchapps. The workflow there is that you mess around with attachments until you are ready to release the application, and then there is not much replication overhead, because replication usually happens once. It is also nice that they are on one document, because they all move/replicate as one bundle. The downside of using this approach for a content management system is that you can accumulate a lot of history baggage that you have to compact away on your local couch. You will also get a lot of conflicts during replication between couches, and couch will keep those conflicts around for you to resolve. Therefore, if you choose this model, you should compact frequently to keep disk size down.
For a content management system, I would instead recommend one document per attachment. That gives you fewer conflicts. There is a slight overhead, since each doc has some space allocated for itself, but the savings from less frequent compaction and conflict resolution should outweigh it.
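To make the one-document-per-attachment option concrete, here is a minimal sketch against CouchDB's plain HTTP API from Node (18+, for the global fetch); the server URL, credentials, and database name are placeholders:

    const fs = require('fs/promises');

    const COUCH = 'http://localhost:5984'; // placeholder server
    const AUTH = 'Basic ' + Buffer.from('admin:secret').toString('base64');
    const DB = 'content';

    async function uploadAsOwnDoc(docId, name, path, contentType) {
      // Create a small holder document; CouchDB returns its revision.
      const res = await fetch(`${COUCH}/${DB}/${docId}`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json', Authorization: AUTH },
        body: JSON.stringify({ type: 'file', name }),
      });
      const { rev } = await res.json();

      // PUT the file bytes as an attachment on that document.
      await fetch(`${COUCH}/${DB}/${docId}/${encodeURIComponent(name)}?rev=${rev}`, {
        method: 'PUT',
        headers: { 'Content-Type': contentType, Authorization: AUTH },
        body: await fs.readFile(path),
      });
    }

    // e.g. uploadAsOwnDoc('readme', 'README.md', './project/README.md', 'text/plain');

Each file then replicates, conflicts, and compacts independently, which is the point of this layout.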
Hope that gives you some options to weigh out.
